AI SEMINAR
Fri, Oct 09, 2015 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars
Speaker: Pedro Szekely, Research Associate Professor
Talk Title: Domain Specific Search Where No Search Has Gone Before
Series: AI Seminar
Abstract: We are investigating crawling, extraction, alignment, entity resolution, database and indexing technologies to enable rapid creation of large domain-specific knowledge graphs. We are also investigating analytics, query and visualization techniques that use these graphs to deliver sophisticated yet easy-to-use query and analysis capabilities to end users. Our goal is to build technology that can use any source of information on the Web, including Web pages and services, text, images, text-delimited files and databases, and that scales to 1 billion Web pages. The project is a collaboration of the ISI information integration and natural language processing groups, Columbia University (deep learning for image analysis), JPL (Web crawling), Inferlink (extraction from Web pages and entity resolution) and NextCentury (user interface and visualization). The MEMEX program runs at a frantic pace (interview and demos on CBS's 60 Minutes, briefings and demos in the White House Situation Room, deployment to law enforcement in February 2015). The talk will cover the goals and key challenges of the project, and describe the system we built in several domains, including the human trafficking domain with over 50 million escort ads updated hourly, and the deployment to users in law enforcement.
Biography: Dr. Pedro Szekely is a Research Team Leader at the USC Information Sciences Institute (ISI) and a Research Associate Professor at the USC Computer Science Department. Dr. Szekely joined USC in 1988 after receiving his M.S. and Ph.D. degrees in Computer Science from Carnegie Mellon University in 1982 and 1987, respectively. His research interests include Big Data, the Semantic Web and Human-Computer Interaction. His focus is on techniques and tools to extract and integrate data from a wide variety of sources (Web pages, databases, spreadsheets, etc.), and on methods to index the integrated data to support accurate querying and sophisticated analysis. The resulting software tools, Karma and DIG, released as open source, have been used in a variety of applications, including intelligence analysis, bioinformatics, environmental engineering and cultural heritage. A notable example is the work with the Smithsonian American Art Museum to publish the metadata about the museum's collection as Linked Open Data. Dr. Szekely is currently applying this work to combat human trafficking, deploying the tools to victim-support agencies and law enforcement.
Host: Craig Knoblock
Location: Information Sciences Institute (ISI) - 1135 - 11th fl Large CR
Audiences: Everyone Is Invited