TELUSA Archives and their FATE - Database & Search

Tue, 22 Apr 1997 14:37:15 -0400

Chy. Sreenivas Parachuri has been constantly raising this issue with me,
so I wrote a reply to him. Smt. Savitri raised a very genuine issue with
regard to searching, and I took it a bit more seriously. I think we should
do something about the search strategies. This is what I feel: let us put aside
what Smt. Savitri remarked and what we are going to say in reply to her message.
She left us something to think about and act upon.
Please read this.

What Smt. Savitri Machiraju mentioned about finding an article drowned in the
muddle of nonsensical posts is also my concern. We Telugus in this country
comprise a major chunk of the computer scientists in the US and certainly most
of them are in Telusa. They can bring an elephant onto our screens. They can
come up with a nice tool to help our readers/searchers/users search for and find
a topic of their interest/need.

With my experience in database building and search strategies, I propose the
following:

1) We need a controlled terminology to search the articles/messages archived in
   the Telusa database.
   Different authors call a "thing" or an "idea" by different names.
   e.g. "raamuDu", "daSaradha tanayuDu", "sItApati", "Bharata's Bro" - all should
   be equated to "raamudu" and "Rama". Then it is easy to search for the idea.
   e.g. "Sri Sri", "SrIramgam", "mahAkavi", "prajAkavi", "Sriirangam Sriinivaasa
   raavu" - should all be equated to "Sri Sri".
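
A minimal sketch of point 1, assuming the controlled terminology is kept as a
simple lookup table. The table name, the helper names, and the choice to ignore
case and extra spaces when matching are illustrative assumptions, not an
existing Telusa tool; the entries are just the examples given above.

```python
# Hypothetical synonym table: each variant an author might use maps to one
# controlled term. The entries are the examples from this post.
CONTROLLED_TERMS = {
    "raamuDu": "Rama",
    "daSaradha tanayuDu": "Rama",
    "sItApati": "Rama",
    "Bharata's Bro": "Rama",
    "Sri Sri": "Sri Sri",
    "SrIramgam": "Sri Sri",
    "mahAkavi": "Sri Sri",
    "prajAkavi": "Sri Sri",
    "Sriirangam Sriinivaasa raavu": "Sri Sri",
}


def _key(text):
    # Assumption: case and extra spaces are ignored when matching variants.
    return " ".join(text.lower().split())


_LOOKUP = {_key(variant): term for variant, term in CONTROLLED_TERMS.items()}


def controlled_term(name):
    """Return the controlled term for a variant, or None if it is unknown."""
    return _LOOKUP.get(_key(name))
```

With this table, controlled_term("sItApati") gives "Rama", and a name that is
not in the table gives None, which tells the indexer a new link is needed.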

2) Keeping that in mind, the natural language (terminology) should be
   considered to establish links with the controlled vocabulary.
   e.g. Sreenivas Parachuri says in his essay "telugu calanacitramlO palle
   paaTalu". That is his terminology. As a database builder, I would maintain an
   index :1) Folk songs in Telugu Movies/Motion Pictures/Cinema.
	  2) Colloquial hymns in Andhra Movies/Motion Pictures/Cinema.
	  3) Village songs in Telugu Movies/Motion Pictures/Cinema.
	  4) Ja(a)napada Telugu literature and music on Telugu Silver Screen.
	  5) Andhra Cinema and Folklore Vocal Music.
	  6) Composers of Common Folk Telugu and Andhra Music in Telugu Cine
	  7) Farmers and workers of Andhra, their songs and Telugu Movie

   See how many variations there are! The author of an essay has his/her liberty
   and may call the same thing by many names. If posts on the single
   topic "Telugu Folk Music in Films" are posted by Sreeni, and you search
   using the words "Andhra Janapada Songs in Telugu Cinema" - Good Luck! You
   will never find them.

   Proposal: What we have to do is:
	    1) Set up "Global Search Strategy".
	    2) Index topics under controlled terminology.
	       We have to come up with controlled terminology.
	       Controlled terminology should be above 50% in circulation/usage.
	    3) Set up links between author's "natural language" and
	       archive's "controlled terminology/vocabulary".
	    4) Build a Search Engine for that.
	       With HTML it is very easy to do.
	       Now there is a meaning for our archives.
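
The linking idea in point 2 can be sketched roughly as follows. This is only
an illustration under stated assumptions: the word links, the stop words, the
post ids, and the function names are all made up for the example, and a real
index would need a much larger vocabulary agreed upon by the indexers.

```python
# Hypothetical links from an author's "natural language" words to the
# archive's controlled words (all keys are lowercase).
WORD_LINKS = {
    "andhra": "telugu",
    "janapada": "folk",
    "village": "folk",
    "palle": "folk",
    "songs": "music",
    "paatalu": "music",
    "cinema": "film",
    "movies": "film",
    "films": "film",
}
STOP_WORDS = {"in", "on", "and", "of", "the"}

# Hypothetical archive: post id -> controlled topic heading.
ARCHIVE = {
    "post-001": "Telugu Folk Music in Films",
    "post-002": "Telugu Ramayana Woman Character",
}


def controlled_words(text):
    """Reduce a phrase to its set of controlled words."""
    words = (w.strip('.,"!?').lower() for w in text.split())
    return {WORD_LINKS.get(w, w) for w in words if w and w not in STOP_WORDS}


def search(query):
    """Return ids of posts whose topic covers every controlled word in the query."""
    wanted = controlled_words(query)
    if not wanted:
        return []
    return sorted(
        post_id
        for post_id, topic in ARCHIVE.items()
        if wanted <= controlled_words(topic)
    )
```

Here search("Andhra Janapada Songs in Telugu Cinema") returns ["post-001"],
even though none of the query words except "Telugu" appear in the topic
heading as written.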

3) For information purposes, there is no such thing as a "Good Posting" or a "Junk
   Posting". All are the same. The author of a posting should do the following:

	    1) Please give a short and condensed title.
	    2) Give Key words for your title and message.
	       e.g. "telugu raamaayaNaalalO strI citraNamu".
	       Keywords: telugu raamaayaNam strI SIlamu
			 Andhra raamaayaNam aaDavaari caaritra citraNa
			 Telugu Ramayana Woman Character
			 Andhra Ramayanam Female Character Literary Criticism
	       Please be liberal in using both English Equivalents and Telugu
	       Terms. It will be easy for those Non-Telugu speaking researchers
	       in Telugu and Andhra literature, culture, and history. We are
	       doing a favor to all of us and them.
	       This will also help the indexers of the archives to build links
	       between the natural language and controlled terminology.

4) Telusa List Operators (Chy. Ratnakar) should build a couple of fields:

	       In the message, at the top, the writer should do the following:

       Type:   essay/discussion/article/story/poem/culture/music/drama
       Keywords: pertinent keywords

       This will enable the indexer to build a search strategy.

       We should do this ASAP for the new postings.
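
As a rough sketch of how an indexer might read the two fields proposed in
point 4: the function name, the rule that a blank line ends the header, and
the whitespace-separated keyword list are assumptions made for this example,
not an agreed Telusa format.

```python
# The types are the ones listed in the proposal above.
ALLOWED_TYPES = {
    "essay", "discussion", "article", "story",
    "poem", "culture", "music", "drama",
}


def parse_header(message):
    """Read 'Type:' and 'Keywords:' lines from the top of a message.

    Assumption: the header block ends at the first blank line, and
    keywords are separated by whitespace.
    """
    fields = {"type": None, "keywords": []}
    for line in message.splitlines():
        if not line.strip():
            break  # blank line: the message body starts here
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        name = name.strip().lower()
        value = value.strip()
        if name == "type" and value.lower() in ALLOWED_TYPES:
            fields["type"] = value.lower()
        elif name == "keywords":
            fields["keywords"] = value.split()
    return fields
```

A posting that omits the fields, or gives a type outside the list, comes back
with type None, so the indexer can flag it for manual handling.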

5) Once everything above is set and done, we have to visit the old archives.
   Remove the non-Telugu postings.
   Keep all the discussions, essays, and other stuff.
   Categorize them.
   Keep them in packages in designated boxes (dockets).
   Name the boxes.
   Provide key words from the titles.
   Read the essays quickly and pull out the natural language terms.
   Give the controlled vocabulary terms for Telusa Archives.
   Set up links between natural language and controlled vocabulary.
   Build a search program.
   Come up with a search engine.
   Release it on HTML WWW.
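
One possible shape for the "boxes" and keywords in point 5 is an inverted
index, sketched below. The tuple layout (post id, docket name, keywords), the
sample ids, and the function names are hypothetical; this is a swapped-in
standard technique, not something the archive already has.

```python
from collections import defaultdict


def build_index(posts):
    """Build keyword -> set of (docket, post_id) from categorized posts.

    posts: iterable of (post_id, docket, keywords) tuples (hypothetical shape).
    """
    index = defaultdict(set)
    for post_id, docket, keywords in posts:
        for word in keywords:
            index[word.lower()].add((docket, post_id))
    return index


def lookup(index, word):
    """Return the sorted (docket, post_id) pairs filed under a keyword."""
    return sorted(index.get(word.lower(), ()))
```

Each old posting is filed once under every keyword given to it, so a search
by keyword shows both the posting and the box (docket) it was placed in.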

   Hope people will be interested in this. This is a Herculean task.
   Prasad Chodavarapu discussed this with me a long time ago. He is willing to help
   with it. Madan Parigi (who is enjoying the Purna Market Mangoes and Pulla
   Reddy Sweets now) has some exposure to what I talked about. Sreenivas
   Parachuri discussed it with me a number of times before, and he has hit me hard
   again now on this issue. Ram Sanka (where are you?) is also knowledgeable in this.
   Bhishmacharya Kanneganti Ramarao and Dronacharya, of Computer Science, are
   there to help us. Our Abhimanyu, Suresh Kolichala, who has already done a great
   deal for both Telusa and SCIT, knows not only how to get into the byte vyUha
   but also, seasoned as he now is, how to get out and get around the
   program-kauravaas.
   We have a wealth of Computer-Brains. Please do something.
   I am willing to provide the background and strategies to search the archives.

   Hope I make some sense.


   Disclaimer: Opinions are mine only.