AI SEMINAR
Thu, Aug 13, 2015 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars
Speaker: Paul Groth, Disruptive Tech Director, Elsevier Labs
Talk Title: Provenance for Data Munging Environments
Series: AI Seminar
Abstract: Data munging is a crucial task across domains ranging from drug discovery and policy studies to data science. Indeed, it has been reported that data munging accounts for 60% of the time spent in data analysis. Because data munging involves a wide variety of tasks using data from multiple sources, it often becomes difficult to understand how a cleaned dataset was actually produced (i.e., its provenance). In this talk, I discuss our recent work on tracking data provenance within desktop systems, which addresses problems of efficient and fine-grained capture. I also describe our work on scalable provenance tracking within a triple store/graph database that supports messy web data. Finally, I briefly touch on whether we will move from ad hoc data munging approaches to more declarative knowledge representation languages such as Probabilistic Soft Logic.
Biography: Paul Groth (pgroth.com) is Disruptive Technology Director at Elsevier Labs. He holds a Ph.D. in Computer Science from the University of Southampton (2007) and has done research at the University of Southern California (ISI!) and the VU University Amsterdam. His research focuses on dealing with large amounts of diverse contextualized knowledge, with a particular focus on web and science applications. This includes research in data provenance, data science, data integration, and knowledge sharing. Paul was co-chair of the W3C Provenance Working Group, which created a standard for provenance interchange. He is co-author of Provenance: An Introduction to PROV and The Semantic Web Primer: 3rd Edition, as well as numerous academic articles. He blogs at http://thinklinks.wordpress.com. You can find him on Twitter: @pgroth.
Host: Ashish Vaswani
Location: Information Sciences Institute (ISI) - 6th fl Large CR (689)
WebCast Link: http://webcasterms1.isi.edu/mediasite/Viewer/?peid=b46b31a4e04f4f83a6da32bf8dd040271d
Audiences: Everyone Is Invited