USC - Viterbi School of Engineering - ISI Grad Team's Knowledge-Mining Work Wins NIST Honors

August 16, 2005

In search of smarter search technology, two computer science graduate students and their advisor recently won second-place honors in one of the competition tracks of the annual Text REtrieval Conference (TREC). Soo-Min Kim, Deepak Ravichandran and advisor Eduard Hovy, all of whom work in the Intelligent Systems Division of the Viterbi School's Information Sciences Institute, pitted their systems for information retrieval and natural language processing against those of some 34 other competitors in the competition, which is sponsored by the National Institute of Standards and Technology (NIST).

The challenge was to scan the text of news stories and identify sentences that either offered opinions or described events. The competitors were given 26 topics (such as abortion or drugs) and told to extract the target data from 50 newspaper articles on each topic.

Kim designed her program to read the texts on a given topic (e.g., "abortion") sentence by sentence and to output, for each sentence, either Yes (opinion-bearing) or No (not opinion-bearing).

Ravichandran built his program to identify sentences that contained events relevant to a given topic. In this case, the topic was an earthquake in Afghanistan, and the program read each sentence and decided whether or not it contained other events relevant to the topic.

The output of the competitors' programs was then scored on precision (should each "yes" answer have been designated as such?) and on recall (how many of the full set of correct "yes" answers were logged?). Both scores were measured against a human-scored read of each text.
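For readers unfamiliar with the two measures, the following is a minimal Python sketch of how precision and recall could be computed for yes/no sentence labels like those described above. It is an illustration only, not the scoring software NIST used; the function name `precision_recall` and the sample labels are hypothetical.

```python
def precision_recall(predicted, gold):
    """Compute precision and recall for yes/no sentence labels.

    predicted: list of booleans, the program's answer for each sentence
    gold:      list of booleans, the human-assigned answer for each sentence
    (Both lists are hypothetical inputs, invented for this illustration.)
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")

    pairs = list(zip(predicted, gold))
    true_pos = sum(1 for p, g in pairs if p and g)        # correct "yes" answers
    false_pos = sum(1 for p, g in pairs if p and not g)   # "yes" that should have been "no"
    false_neg = sum(1 for p, g in pairs if not p and g)   # correct "yes" answers that were missed

    # Precision: of all "yes" answers given, what fraction were right?
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: of all correct "yes" answers, what fraction were found?
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Made-up labels for five sentences.
program_output = [True, True, False, False, False]
human_labels = [True, False, True, True, False]
precision, recall = precision_recall(program_output, human_labels)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up example the program says "yes" twice and is right once, so precision is one half; it finds only one of the three sentences the human marked "yes", so recall is one third.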

"In both cases, the technology will form part of future QA systems, which is the direction in which Web search engines are evolving," says Hovy.

"If you ask Google today, 'Who likes Al Qaeda?' all you get is a list of documents about Al Qaeda; it has no idea about like or dislike or opinions in general," he says. "Soo-Min's system will help give you real answers. Similarly, Deepak's will help answer questions like 'What happened after the earthquake?' or 'Tell me about Einstein's life in Berne.' The QA system needs to know what constitutes an event."