Logo: University of Southern California

Events Calendar



Select a calendar:



Filter June Events by Event Type:


SUNMONTUEWEDTHUFRISAT

Events for June 15, 2015

  • PhD Defense- George Konstantinidis

    Mon, Jun 15, 2015 @ 03:00 PM - 05:00 PM

    Thomas Lord Department of Computer Science

    University Calendar


    PhD Candidate: George Konstantinidis

    Committee:

    Jose Luis Ambite (chair)
    Craig Knoblock
    Cyrus Shahabi
    Daniel O'Leary

    Monday June 15th
    GFS 101, 3pm

    Title: Scalable Data Integration under Constraints

    In this thesis, we are dealing with the problems of data integration, data exchange, query answering and query containment with and without the presence of integrity constraints/ontologies. First, we develop a scalable algorithm for virtual data integration. The novel insight of our approach is to look at the problem from a graph perspective and design an algorithm which compactly represents overlapping graph patterns in the mappings that connect the mediating schema to the source schema. This representation speeds up tremendously the computation of homomorphisms (a core mechanism in query rewriting) and, optionally, pushes some computation offline. This, together with other optimizations, results in an experimental performance that rewrites queries in the presence of 10000 sources in under a second, which is about two orders of magnitude faster than state-of-the-art algorithms. We also present a solution for query containment under linear weakly-acyclic constraints, a set of constraints that captures the useful RDF/S language (a w3c recommendation). Subsequently, we examine the chase algorithm, which is the main tool to reason with dependencies for data exchange. We develop an optimization of the standard chase algorithm, called the frugal chase, that produces smaller target databases, still with polynomial data complexity.

    Instead of adding the entire consequent to a solution when a chase rule is applicable, the frugal chase avoids adding provably redundant atoms. In contrast to the core chase (which produces the minimal output), our algorithm does not rely on a costly minimization step; hence, it is almost three orders of magnitude faster. Compared to the standard chase, which produces a larger but equivalent output, the frugal chase is almost two orders of magnitude faster. Lastly, we develop the theoretical foundations for a combined approach to query answering under constraints. We focus on linear TGDs, which capture the essential part of practically useful languages, such as DL-Lite/OWL2-QL (also a w3c recommendation). We propose a novel and much simpler combined approach to query answering. In particular, we show how to use only a finite portion of the infinite chase and prove that one can chase the data up to a point independently of any query, in effect pruning the rest of the infinite chase, and then just pre-process any incoming query at runtime, such that it can "fit" inside our pruned chase. Unlike the traditional, or the combined UCQ rewriting of the query, which is of a large exponential size, our UCQ rewriting needed to get the certain answers, using our pruned chase result, is much smaller in size.

    Location: 101

    Audiences: Everyone Is Invited

    Contact: Lizsl De Leon

    Add to Google CalendarDownload ICS File for OutlookDownload iCal File