University of Southern California

Events Calendar

  • PhD Thesis Proposal - Minh Pham

    Tue, Aug 03, 2021 @ 12:30 PM - 02:00 PM

    Thomas Lord Department of Computer Science

    University Calendar

    Date and Time: 12:30 - 2:00 pm

    Tuesday, August 3rd

    Committee: Craig Knoblock, Bistra Dilkina, Muhao Chen, Xiang Ren, Gerard Hoberg

    Title: Robust and Proactive Error Detection and Correction in Tables

    Web tables are a rich source of knowledge that supports many knowledge-driven intelligent applications. However, like other online resources, information in Web tables is prone to errors and noise. Data cleaning is therefore an important step in table preprocessing, and any untreated errors in tables can be detrimental to applications in later phases. Existing supervised data cleaning methods depend on obtaining sufficient training data, which requires extensive human involvement, while unsupervised methods rely on fixed inductive biases, which often do not generalize. In this proposal, we articulate the challenges posed in traditional table data cleaning studies and propose a unified solution to address them. The proposed approach uses open-domain question answering to proactively mine evidence from Web text to verify the semantic content of tables. In addition, an active learning model is integrated to leverage weakly supervised detectors and correctors for robust closed-domain syntactic error detection and correction. In combination, the unified framework aims to improve accuracy and reduce human involvement in data cleaning for both syntactic and semantic errors.

    WebCast Link: https://usc.zoom.us/j/96447577773

    Audiences: Everyone Is Invited

    Contact: Lizsl De Leon

