August 16, 2005 —
In search of smarter search technology, two computer science graduate
students and their advisor won second-place honors recently in one of
the competition tracks of the annual Text REtrieval Conference (TREC).
Soo-Min Kim, Deepak Ravichandran and advisor Eduard Hovy, all of whom work
in the Viterbi School's Information Sciences Institute
Intelligent Systems Division,
pitted their systems for information
retrieval and natural language processing against those of some 34
other competitors in the competition sponsored by the National
Institute of Standards and Technology (NIST).
The challenge was to scan the text of news stories and identify
sentences that either offered opinions or described events. The
competitors were given 26 topics (such as abortion or drugs) and told to
extract the target data from 50 newspaper articles in each topic.
Kim designed her program to take the texts on a given topic (e.g.
"abortion"), read through each one sentence by sentence and label each
sentence either Yes (opinion-bearing) or No (not).
Ravichandran constructed his program to identify sentences that
contained events relevant to a given topic. In this case, the topic was
an earthquake in Afghanistan, and the program was set to read each
sentence and identify whether it did or did not contain other events
relevant to the topic.
The output of the competitors' programs was then scored on precision
(should each "yes" answer have been designated as such?) and recall (how
many of the full set of correct "yes" answers were logged?). Both scores
were computed against a human-annotated reading of each text.
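
For readers who want to see the arithmetic, the two measures can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the scoring software used at TREC: the function name `precision_recall` and the five sample labels are hypothetical, and each sentence is assumed to carry a single yes/no label from the system and from the human reader.

```python
def precision_recall(predicted, gold):
    """Compare system labels with human labels, one boolean per sentence.

    predicted -- True where the system answered "yes"
    gold      -- True where the human reader answered "yes"
    """
    if len(predicted) != len(gold):
        raise ValueError("both label lists must cover the same sentences")

    pairs = list(zip(predicted, gold))
    true_pos = sum(1 for p, g in pairs if p and g)       # correct "yes"
    false_pos = sum(1 for p, g in pairs if p and not g)  # wrong "yes"
    false_neg = sum(1 for p, g in pairs if not p and g)  # missed "yes"

    # Precision: what share of the system's "yes" answers were right?
    said_yes = true_pos + false_pos
    precision = true_pos / said_yes if said_yes else 0.0

    # Recall: what share of the human "yes" answers did the system find?
    should_be_yes = true_pos + false_neg
    recall = true_pos / should_be_yes if should_be_yes else 0.0

    return precision, recall


# Hypothetical labels for five sentences (made-up data, for illustration).
system_labels = [True, True, False, True, False]
human_labels = [True, False, False, True, True]

precision, recall = precision_recall(system_labels, human_labels)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

In this made-up example the system says "yes" three times and is right twice, and it finds two of the three sentences the human marked, so both measures come out to two-thirds.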
"In both cases, the technology will form part of future QA systems,
which is the direction in which Web search engines are evolving," says
Hovy. "If you ask Google today, 'Who likes Al Qaeda?' all you get is a list of
documents about Al Qaeda - it has no idea about like or dislike or
opinions in general," he says. "Soo-Min's system will help give you
real answers. Similarly, Deepak's will help answer questions like 'What
happened after the earthquake?' or 'Tell me about Einstein's life in
Berne.' The QA system needs to know what constitutes an event."