Conferences, Lectures, & Seminars
Events for September
-
NL Seminar- Towards automatic extraction of experimental data from scientific papers [Intern final talk]
Thu, Sep 11, 2014 @ 03:30 PM - 04:30 PM
Information Sciences Institute
Speaker: Eunsol Choi (University of Washington) and Matic Horvat (University of Cambridge)
Talk Title: Towards automatic extraction of experimental data from scientific papers [Intern final talk]
Series: Natural Language Seminar
Abstract: Many areas of science have experienced rapid growth in the amount of scientific literature published. For example, approximately 400 new papers are published each year in the area of Machine Translation (MT). Because such a volume of new material is virtually impossible for a single researcher to process, a new tool is needed to help researchers explore existing MT literature and discover new work. To address this problem, we developed an approach for automatic extraction of experimental data from scientific papers that populates a database enabling structured queries.
Biography: Eunsol Choi is a PhD student at the University of Washington, advised by Prof. Luke Zettlemoyer. Prior to UW, she studied mathematics and computer science at Cornell University.
Matic Horvat is a PhD student at the University of Cambridge researching the integration of semantics and Statistical Machine Translation. He is originally from Ljubljana, Slovenia, where he completed a BSc in Computer Science in 2012. He continued with a master's in Advanced Computer Science at the University of Cambridge, graduating in 2013.
Host: Aliya Deri and Kevin Knight
Location: Information Sciences Institute (ISI) - 11th Flr Conf Rm # 1135, Marina Del Rey
Audiences: Everyone Is Invited
Contact: Peter Zamar
This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor. -
AI SEMINAR
Fri, Sep 12, 2014 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Speaker: Pascal Van Hentenryck, National ICT Australia (NICTA)
Talk Title: Measuring and Optimizing Cultural Markets
Abstract: Social influence has been shown to create significant unpredictability in cultural markets, providing one potential explanation why experts routinely fail at predicting commercial success of cultural products. To counteract the difficulty of making accurate predictions, "measure and react" strategies have been advocated but finding a concrete strategy that scales for very large markets has remained elusive so far. Here we propose a "measure and optimize" strategy that uses product quality, appeal, and social influence to maximize expected profits in the market. Computational experiments show that our "measure and optimize" strategy can leverage social influence to produce significant performance benefits for the market. Our theoretical analysis also proves that a "measure and optimize" strategy with social influence outperforms in expectation any "measure and react" strategy not displaying social information. In other words, we show for the first time that dynamically showing consumers positive social information increases the expected performance of the seller in cultural markets, when using a "measure and optimize" strategy.
Biography: Pascal Van Hentenryck leads the Optimisation Research Group (about 75 people) at National ICT Australia (NICTA). He also holds a Vice-Chancellor Strategic Chair in data-intensive computing at the Australian National University. Van Hentenryck is the recipient of two honorary degrees and a fellow of the Association for the Advancement of Artificial Intelligence. He was awarded the 2002 INFORMS ICS Award for research excellence in operations research and computer science, the 2006 ACP Award for research excellence in constraint programming, the 2010-2011 Philip J. Bray Award for Teaching Excellence at Brown University, and was the 2013 IFORS Distinguished Speaker. Van Hentenryck is the author of five MIT Press books and has developed a number of innovative optimisation systems that are widely used in academia and industry.
Host: Kristina Lerman
More Info: TBA
Location: Information Sciences Institute (ISI) - 11th floor large conference room
WebCast Link: TBA
Audiences: Everyone Is Invited
Contact: Kary LAU
Event Link: TBA
-
AI SEMINAR - Incentivizing Exploration [joint work with Peter Frazier, Jon Kleinberg, Robert Kleinberg]
Fri, Sep 19, 2014 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Speaker: David Kempe, USC CS Dept. Associate Professor and Associate Chair for Undergraduate Programs
Talk Title: Incentivizing Exploration [joint work with Peter Frazier, Jon Kleinberg, Robert Kleinberg]
Series: AISeminar
Abstract: We study a Bayesian multi-armed bandit (MAB) setting in which a principal seeks to maximize the sum of expected time-discounted rewards obtained by pulling arms, when the arms are actually pulled by selfish and myopic individuals. Since such individuals pull the arm with highest expected posterior reward (i.e., they always exploit and never explore), the principal must incentivize them to explore by offering suitable payments. Among other applications, this setting models crowdsourced information discovery and funding agencies incentivizing scientists to perform high-risk, high-reward research.
We explore the tradeoff between the principal's total expected time-discounted incentive payments, and the total time-discounted rewards realized. Specifically, with a time-discount factor gamma in (0,1), let OPT denote the total expected time-discounted reward achievable by a principal who pulls arms directly without having to incentivize selfish agents, in a MAB problem. We call a (payment, reward) combination (b,rho) in [0,1]^2 achievable if for every MAB instance, using expected time-discounted payments of at most b*OPT, the principal can guarantee an expected time-discounted reward of at least rho*OPT. Our main result is a complete characterization of achievable (payment, reward) pairs: (b,rho) is achievable if and only if sqrt(b) + sqrt(1-rho) >= sqrt(gamma).
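As a small worked illustration of the condition just stated, the Python sketch below checks whether a given (b, rho) pair satisfies the inequality for a discount factor gamma. The function name is_achievable is hypothetical, and the comparison uses >= exactly as the abstract writes it.

```python
import math


def is_achievable(b: float, rho: float, gamma: float) -> bool:
    """Check sqrt(b) + sqrt(1 - rho) >= sqrt(gamma), as written in the abstract."""
    if not (0.0 <= b <= 1.0 and 0.0 <= rho <= 1.0):
        raise ValueError("b and rho must lie in [0, 1]")
    if not (0.0 < gamma < 1.0):
        raise ValueError("gamma must lie in (0, 1)")
    return math.sqrt(b) + math.sqrt(1.0 - rho) >= math.sqrt(gamma)


# With no payments (b = 0) the condition reduces to rho <= 1 - gamma.
print(is_achievable(0.0, 0.5, 0.25))  # sqrt(0.5) is about 0.707, sqrt(0.25) is 0.5
print(is_achievable(0.0, 0.9, 0.81))  # sqrt(0.1) is about 0.316, sqrt(0.81) is 0.9
```

Reading the inequality this way shows the tradeoff directly: a larger payment budget b allows a reward fraction rho closer to 1.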
In proving this characterization, we analyze so-called time-expanded policies, which in each step let the agents choose myopically with some probability p, and incentivize them to choose "optimally" with probability 1-p. The analysis of time-expanded policies leads to a question that may be of independent interest: If the same MAB instance (without selfish agents) is considered under two different time-discount rates gamma, eta, how small can the ratio of OPT(eta) to OPT(gamma) be? We give a complete answer to this question, showing that OPT(eta) >= ((1-gamma)^2/(1-eta)^2) OPT(gamma), and that this bound is tight.
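To make the discount-rate bound concrete, the Python sketch below evaluates the factor (1-gamma)^2/(1-eta)^2 and compares it with the actual ratio for one very simple instance: a single arm that pays a fixed reward at every step. The function names are hypothetical, and two things are assumptions of this sketch rather than statements from the abstract: that eta <= gamma, and that rewards are discounted starting from step 0.

```python
def opt_ratio_lower_bound(gamma: float, eta: float) -> float:
    """Return (1 - gamma)^2 / (1 - eta)^2, the stated lower bound on OPT(eta) / OPT(gamma)."""
    if not (0.0 < eta <= gamma < 1.0):
        raise ValueError("this sketch assumes 0 < eta <= gamma < 1")
    return (1.0 - gamma) ** 2 / (1.0 - eta) ** 2


def constant_arm_opt(discount: float, reward: float = 1.0) -> float:
    """Discounted value of one arm paying `reward` at steps 0, 1, 2, ...: reward / (1 - discount)."""
    return reward / (1.0 - discount)


gamma, eta = 0.9, 0.5  # illustrative values only
bound = opt_ratio_lower_bound(gamma, eta)
ratio = constant_arm_opt(eta) / constant_arm_opt(gamma)
print(f"bound = {bound:.4f}, ratio for the constant arm = {ratio:.4f}")
```

For this instance the ratio sits above the bound, which is consistent with the inequality; the constant arm is not claimed to be the instance that makes the bound tight.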
Biography: David Kempe received his Ph.D. from Cornell University in 2003, and has been on the faculty in the computer science department at USC since the Fall of 2004, where he is currently an Associate Professor and Associate Chair for Undergraduate Programs.
His primary research interests are in computer science theory and the design and analysis of algorithms, with a particular emphasis on social networks, algorithms for feature selection, and game-theoretic and pricing questions. He is a recipient of the NSF CAREER award, the VSoE Junior Research Award, the ONR Young Investigator Award, a Sloan Fellowship, and an Okawa Fellowship, in addition to several USC mentoring awards.
***At the speaker's request, there will be no live webcast; the talk will be recorded for internal viewing only.***
Host: Greg Ver Steeg
Location: Information Sciences Institute (ISI) - 1135
Audiences: Everyone Is Invited
Contact: Alma Nava / Information Sciences Institute
-
NL Seminar- An open-source toolkit for the representation, manipulation and optimization of weighted hypergraphs
Fri, Sep 19, 2014 @ 03:00 PM - 04:00 PM
Information Sciences Institute
Speaker: Markus Dreyer, SDL
Talk Title: An open-source toolkit for the representation, manipulation and optimization of weighted hypergraphs
Series: Natural Language Seminar
Abstract: Weighted hypergraphs arise naturally in parsing, syntax-based machine translation and other tree-based NLP models, as well as in weighted logic programming.
We present an open-source toolkit for the representation and manipulation of weighted hypergraphs. It provides hypergraph data structures and algorithms, such as the shortest path and inside-outside algorithms, composition, projection, and more. In addition, it provides functionality to optimize hypergraph feature weights from training data. We model finite-state machines as a special case. We give a tutorial on hypergraphs and the hypergraph toolkit and explain how you can use these tools in your research.
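For readers unfamiliar with the algorithms named above, here is a minimal Python sketch of the inside computation on a small acyclic weighted hypergraph. It is an independent illustration and does not use or reproduce the toolkit's API; the Hyperedge type, the inside function, and the example weights are all hypothetical. Swapping the "plus" operation from addition to max turns the same loop into a best-derivation (Viterbi) computation.

```python
import operator
from collections import namedtuple

# A hyperedge connects a tuple of tail nodes to one head node and carries a weight.
Hyperedge = namedtuple("Hyperedge", ["head", "tails", "weight"])


def inside(edges, plus, times):
    """Compute one value per node of an acyclic hypergraph.

    Assumes `edges` is listed bottom-up, so every edge entering a node appears
    before any edge that uses that node as a tail.
    """
    value = {}
    for edge in edges:
        score = edge.weight
        for tail in edge.tails:
            score = times(score, value[tail])
        # Combine alternative derivations of the same head node.
        value[edge.head] = plus(value[edge.head], score) if edge.head in value else score
    return value


# Hypothetical example: node C can be built from A and B, or directly.
EDGES = [
    Hyperedge("A", (), 0.5),
    Hyperedge("B", (), 0.4),
    Hyperedge("C", ("A", "B"), 0.5),
    Hyperedge("C", (), 0.2),
    Hyperedge("S", ("C",), 1.0),
]

total = inside(EDGES, operator.add, operator.mul)  # sum over all derivations
best = inside(EDGES, max, operator.mul)            # weight of the best derivation
print(total["S"], best["S"])
```

Passing the combining operations as arguments keeps the traversal independent of the weight type, which is one common way to support several algorithms with a single loop.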
This is joint work with Jonathan Graehl.
Biography: Markus Dreyer is a Senior Research Scientist at SDL Language Weaver. His research focuses on algorithms and machine learning techniques for large-scale machine translation and NLP. He received his PhD in Computer Science from Johns Hopkins University, advised by Jason Eisner. For more information, see http://goo.gl/d6mHUi.
Host: Aliya Deri and Kevin Knight
More Info: http://nlg.isi.edu/nl-seminar/
Location: Information Sciences Institute (ISI) - 11th Flr Conf Rm # 1135, Marina Del Rey
Audiences: Everyone Is Invited
Contact: Peter Zamar
Event Link: http://nlg.isi.edu/nl-seminar/
-
AI SEMINAR - Query-driven approach to entity resolution
Fri, Sep 26, 2014 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Speaker: Dmitri V. Kalashnikov, UCI
Talk Title: Query-driven approach to entity resolution
Series: AISeminar
Abstract: The significance of data quality research is motivated by the observation that the effectiveness of data-driven technologies such as decision support tools, data exploration, analysis, and scientific discovery tools is closely tied to the *quality of data* to which such techniques are applied. It is well recognized that the outcome of the analysis is only as good as the data on which the analysis is performed. That is why today organizations spend a substantial percentage of their budgets on cleaning tasks such as removing duplicates, correcting errors, and filling missing values, to improve data quality prior to pushing data through the analysis pipeline.
Given the critical importance of the problem, many efforts, in both industry and academia, have explored systematic approaches to addressing the cleaning challenges. This talk focuses primarily on the *entity resolution* challenge that arises because objects in the real world are referred to using references or descriptions that are not always unique identifiers of the objects, leading to ambiguity.
Traditionally, data cleaning is performed as a preprocessing step when creating a data warehouse prior to making it available to analysis -- an approach that works well under standard settings. Cleaning the entire data warehouse, however, can require a considerable amount of time and significant computing resources. Hence, this approach is often suboptimal for many modern query-driven and Big Data applications that need to analyze only small portions of the entire dataset and produce answers "on-the-fly" and in real-time.
To address these new cleaning challenges, we have developed a *Query-Driven Approach (QDA)* to data cleaning. QDA exploits the specificity and semantics of the given SQL selection query to significantly reduce the cleaning overhead by resolving only those records that may influence the answer of the query. It computes answers that are equivalent to those obtained by first using a regular cleaning algorithm, and then querying on top of the cleaned data. However, in many cases QDA can compute these answers much more efficiently.
A key concept driving the QDA approach is that of *vestigiality*. A cleaning step (i.e., a call to the resolve function for a pair of records) is called vestigial (redundant) if QDA can guarantee that it can still compute the correct final answer without knowing the outcome of this resolve. We formalize the concept of vestigiality in the context of a large class of SQL selection queries and develop techniques to identify vestigial cleaning steps. Technical challenges arise since vestigiality, as we will show, depends on several factors, including the specifics of the cleaning function (e.g., the merge function used if two objects are indeed duplicate entities), the predicate associated with the query, and the query answer semantics of what the user expects as the result of the query. We show that determining vestigiality is NP-hard and propose an effective approximate solution to test for vestigiality that performs very well in practice.
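The idea of skipping vestigial resolves can be illustrated with a deliberately simplified Python sketch. It is not the QDA algorithm from the talk; it shows one conservative sufficient test under assumptions made up for this example: the query is "value >= threshold", duplicates are merged by summing non-negative values, and only records linked by candidate duplicate pairs can ever be merged. The function name and the record values are hypothetical.

```python
from collections import defaultdict


def vestigial_components(values, candidate_pairs, threshold):
    """Return groups of record ids whose resolve calls can all be skipped.

    Under the stated assumptions, a group of possibly-duplicate records can
    never satisfy `value >= threshold` if even the sum over the whole group
    stays below the threshold, so resolving pairs inside it cannot change
    the query answer.
    """
    # Undirected graph over records, built from the candidate duplicate pairs.
    neighbors = defaultdict(set)
    for a, b in candidate_pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen, skippable = set(), []
    for start in values:
        if start in seen:
            continue
        # Collect the connected component that contains `start`.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        if sum(values[r] for r in component) < threshold:
            skippable.append(component)
    return skippable


# Hypothetical records: id -> value used by the selection predicate.
values = {"p1": 10, "p2": 20, "p3": 30, "p4": 40, "p5": 5}
pairs = [("p1", "p2"), ("p3", "p4")]
result = vestigial_components(values, pairs, threshold=45)
print(result)
```

Here the pair (p1, p2) never needs to be resolved, while (p3, p4) does, because merging them could push the sum past the threshold. A test this coarse only finds some vestigial steps, which fits the abstract's point that the general problem is hard.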
The comprehensive empirical evaluation of the proposed approach demonstrates its significant advantage in terms of efficiency over traditional techniques for query-driven applications.
Biography: http://www.ics.uci.edu/~dvk/CV/dvk_bio.txt
Dmitri V. Kalashnikov is an Associate Adjunct Professor of Computer Science at the University of California, Irvine. He received his PhD degree in Computer Science from Purdue University in 2003. He received his diploma in Applied Mathematics and Computer Science from Moscow State University, Russia in 1999, graduating summa cum laude.
His general research interests include databases and data mining. Currently, he specializes in the areas of entity resolution & data quality, and real-time situational awareness. In the past, he has also contributed to the areas of spatial, moving-object, and probabilistic databases.
He has received several scholarships, awards, and honors, including an Intel Fellowship and Intel Scholarship. His work is supported by the NSF, DH&S, and DARPA.
Host: Greg Ver Steeg
Location: Information Sciences Institute (ISI) - 1135
WebCast Link: http://webcasterms1.isi.edu/mediasite/Viewer/?peid=dd8c0e0eef1749fdb4bc581af408d8561d
Audiences: Everyone Is Invited
Contact: Alma Nava / Information Sciences Institute
-
NL Seminar- Semantic Parsing at Google
Fri, Sep 26, 2014 @ 03:00 PM - 04:00 PM
Information Sciences Institute
Speaker: Bill MacCartney (Google/Stanford)
Talk Title: Semantic Parsing at Google
Series: Natural Language Seminar
Abstract: With the shift from desktop to mobile, and the rise of voice-driven UIs, a growing proportion of the Google query stream is not well-served by conventional keyword-based information retrieval. More and more queries use natural language ("when does walgreens close"), seek answers not found on any web page ("how do i get to work from here"), or demand action rather than information ("text my wife i'm 10 minutes late"). Satisfying such queries requires semantic parsing, that is, mapping the query into a structured, machine-readable representation of meaning. In this talk, I will give an overview of the techniques Google has developed to address the problem of semantic parsing, and discuss some of the challenges that remain. I'll also highlight differences between academia and industry in how the problem is conceived.
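To show what "mapping the query into a structured, machine-readable representation" can look like, here is a toy Python sketch that handles the three example queries from the abstract with hand-written patterns. It is only an illustration and says nothing about how Google's systems work; the pattern list, the intent names, and the parse function are all hypothetical.

```python
import re

# Hypothetical patterns and intent names, chosen only for this illustration.
PATTERNS = [
    (re.compile(r"^when does (?P<place>.+) close$"), "closing_time"),
    (re.compile(r"^how do i get to (?P<destination>.+) from (?P<origin>.+)$"), "directions"),
    (re.compile(r"^text (?P<recipient>.+?) (?P<message>i'm .+)$"), "send_text"),
]


def parse(query):
    """Map a query string to a dict with an intent and its arguments, or None."""
    normalized = query.strip().lower()
    for pattern, intent in PATTERNS:
        match = pattern.match(normalized)
        if match:
            return {"intent": intent, **match.groupdict()}
    return None


for q in ["when does walgreens close",
          "how do i get to work from here",
          "text my wife i'm 10 minutes late"]:
    print(parse(q))
```

Patterns like these break as soon as the wording changes, which gives a sense of why the problem calls for more general techniques than string matching.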
Biography: Bill MacCartney is a Senior Research Scientist at Google, working primarily on semantic parsing. He is also a Consulting Assistant Professor of Computer Science at Stanford. For more info: http://nlp.stanford.edu/~wcmac/
Host: Aliya Deri and Kevin Knight
More Info: http://nlg.isi.edu/nl-seminar/
Location: Information Sciences Institute (ISI) - 11th Flr Conf Rm # 1135, Marina Del Rey
Audiences: Everyone Is Invited
Contact: Peter Zamar
Event Link: http://nlg.isi.edu/nl-seminar/