USC - Viterbi School of Engineering

CS Colloquium: Filip Ilievski (Vrije Universiteit Amsterdam) - Identity of Long-tail Entities in Text: A Knowledge Perspective
Fri, Aug 30, 2019 @ 10:00 AM - 11:00 AM
Thomas Lord Department of Computer Science
Conferences, Lectures, & Seminars

Speaker: Filip Ilievski, Vrije Universiteit (VU) Amsterdam

Talk Title: Identity of Long-tail Entities in Text: A Knowledge Perspective

Series: Computer Science Colloquium

Abstract: Entity linking systems are faced with a complex M-to-N mapping between surface forms in text and instances in a knowledge base, caused by the ambiguity of surface forms, the variance of the instances, and their frequency/popularity interplays, well-explained by pragmatic principles such as the Gricean maxims (Grice, 1975). Although current entity linkers report high accuracy scores, in this talk I will describe phenomena that capture large differences in performance between 'head' and 'tail' entities. To improve performance on the tail entities, I will argue that we need to revisit evaluation (part I) and to employ knowledge and reason over it in a more systematic way (part II).

During the first half, I will depict how the current evaluation datasets, as well as the metrics employed, obfuscate the difference between head and tail, and discourage focus on tail entities. I will propose recommended actions and examples for long-tail-focused evaluation.

In the second half of my talk, I will present our efforts to generate expectations on long-tail entities through building neural profiling machines on top of background knowledge from Wikidata. In addition to an intrinsic evaluation, these profiling techniques are evaluated extrinsically on clustering NIL entities. I will discuss how an extension of this work can be used to capture commonsense knowledge and act as an active component in future reading machines.

This lecture satisfies requirements for CSCI 591: Research Colloquium. Please note, due to limited capacity in RTH 105, seats will be first come, first served.

Biography: Filip Ilievski is a Postdoctoral Researcher in Natural Language Processing at Vrije Universiteit (VU) Amsterdam, and closely affiliated with the Knowledge Representation and Reasoning group at the same University. His research investigates how systematic and extensive use of knowledge can help machines to deal with the 'long-tail' (knowledge scarcity and ambiguity) of human communication. To do so, he combines ideas from Information Extraction, Knowledge Graphs, and Machine Learning.

He developed LOTUS (Ilievski et al., 2016a), the largest publicly available index over the Linked Data cloud at the time, which received an award at the Semantics conference in 2016. Later, he collaborated with Prof. Ed Hovy at CMU on building neural generalization models ('profiling machines') over Linked Data knowledge and applying them to cluster long-tail entities. As part of his research on measuring and mitigating biases in NLP evaluations, he co-organized a SemEval competition on 'Counting Events and Participants in the Long Tail' in 2018 (Ilievski et al., 2016b; Postma et al., 2018).

Filip Ilievski authored over 20 publications about these topics in peer-reviewed international journals and conference proceedings, including COLING, ESWC, and SWJ.

Host: Xiang Ren

Location: Ronald Tutor Hall of Engineering (RTH) - 105
Audiences: Everyone Is Invited

Contact: Computer Science Department

This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.