This blog will act as a journal through which to track progress on our term project in DCCT*4090 - Information Retrieval.

Tuesday, July 1, 2008

Adding Synonym Search

After a quick post on the SomethingAwful.com forums' Cavern of COBOL sub-forum, it seems there's an NLP project called WordNet run by Princeton University. WordNet is a lexical database that supports many lexical features, like definition lookup; however, it also supports synonyms! Furthermore, there's the Java API for WordNet Searching (JAWS), an interface to the WordNet database that can be queried from Java applications. The URL for this API is http://engr.smu.edu/~tspell/. We will have to modify the index structure to use an ADT as the key instead of a String. The ADT should contain the lexical String keyword from the article, as well as a Vector of synonyms.
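Here is a rough sketch of what that key ADT might look like. The class name IndexKey and its methods are placeholders of my own, not anything from our project code or from JAWS, and I've used a List where the plan above says Vector. The one design assumption worth flagging: equals and hashCode look at the keyword only, so a key built without synonyms still finds the same index entry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical index key: the keyword as it appears in the article,
// plus the synonyms found for it.
final class IndexKey {
    private final String keyword;
    private final List<String> synonyms;

    IndexKey(String keyword, List<String> synonyms) {
        this.keyword = keyword;
        // Defensive copy so later changes to the caller's list cannot alter the key.
        this.synonyms = Collections.unmodifiableList(new ArrayList<>(synonyms));
    }

    String getKeyword() {
        return keyword;
    }

    List<String> getSynonyms() {
        return synonyms;
    }

    // Assumption: two keys are the same entry when their keywords match,
    // regardless of which synonyms each one carries.
    @Override
    public boolean equals(Object other) {
        return other instanceof IndexKey
                && keyword.equals(((IndexKey) other).keyword);
    }

    @Override
    public int hashCode() {
        return keyword.hashCode();
    }
}
```

With this in place, the index could be something like a Map from IndexKey to a list of article IDs, and a lookup with new IndexKey("car", Collections.emptyList()) would hit the entry stored under "car" with its full synonym list.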

http://wordnet.princeton.edu

Update:
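After some investigation, it looks like building the synonym list at the same time as the index is too expensive, since it means a synonym lookup for every keyword we index. I believe that if we instead generate the synonym list from the user's query when it's submitted, we might be OK: a query only has a handful of terms, so the number of lookups per search stays small.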
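A minimal sketch of the query-time idea follows. SynonymSource and QueryExpander are names I made up for illustration; in our project the SynonymSource would presumably be backed by a JAWS lookup against WordNet, but here it is left as an interface so the sketch stands alone.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical abstraction over whatever supplies synonyms (e.g. WordNet).
interface SynonymSource {
    List<String> synonymsOf(String word);
}

// Expands a user query into its original terms plus their synonyms.
final class QueryExpander {
    private final SynonymSource source;

    QueryExpander(SynonymSource source) {
        this.source = source;
    }

    Set<String> expand(String query) {
        // LinkedHashSet drops duplicates and keeps the order terms were added.
        Set<String> terms = new LinkedHashSet<>();
        for (String word : query.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (word.isEmpty()) {
                continue; // blank query
            }
            terms.add(word);
            terms.addAll(source.synonymsOf(word));
        }
        return terms;
    }
}
```

Each expanded term can then be looked up in the existing String-keyed index, which would mean the index structure does not need to change at all.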
