Logo: University of Southern California

Events Calendar


  • AI SEMINAR

    Fri, Jul 17, 2015 @ 11:00 AM - 12:00 PM

    Information Sciences Institute

    Conferences, Lectures, & Seminars


    Speaker: Mesrob I. Ohannessian, Postdoctoral researcher at UC San Diego

    Talk Title: Good-Turing rare probability estimation: When it does and doesn't work.

    Series: AI Seminar

    Abstract: The "missing mass" is the total probability of all symbols that do not appear in an i.i.d. sample from a discrete distribution. It captures a fundamental notion of a rare event. Estimating this probability was critical to the wartime efforts of Alan Turing and his coworker Jack Good, who together proposed a simple estimator that remains influential to this day. In this talk, I will first give an overview of the Good-Turing estimator and its favorable properties. I will then dismantle this impeccable image. In particular, I will show that Good-Turing can fail to learn the missing mass in relative error, even for the simplest light-tailed distributions. In fact, no estimator can do this without further specifying the distribution class. I will then reconstruct a new reputation for this old estimator, as a highly effective specialized rare probability estimator for heavy-tailed distributions. This explains its success in areas where such distributions arise, such as natural language modeling. This change in perspective opens the door to streamlined estimation techniques that are inspired by extreme value theory and that extend far beyond missing mass estimation.
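
For readers unfamiliar with the estimator, here is a minimal sketch of the Good-Turing missing mass estimate in its standard form: the number of symbols seen exactly once, divided by the sample size. The function name and the toy sample are illustrative assumptions and are not taken from the talk.

```python
from collections import Counter


def good_turing_missing_mass(sample):
    """Estimate the missing mass as (# symbols seen exactly once) / n.

    `sample` is any non-empty sequence of hashable symbols, assumed i.i.d.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    counts = Counter(sample)
    # Symbols observed exactly once ("singletons") stand in for the unseen ones.
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / n


# Toy sample (hypothetical): "b", "c", and "d" each appear once out of 6 draws.
sample = ["a", "b", "a", "c", "d", "a"]
estimate = good_turing_missing_mass(sample)
```

On this toy sample three of the six draws are singletons, so the estimate is 0.5. If every symbol appears at least twice, the estimate is 0.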




    Biography: Mesrob I. Ohannessian is a postdoctoral researcher at UC San Diego. Previously, he spent two years in France, one at the Microsoft Research - Inria joint centre as a postdoc, and another at Université Paris-Sud as a Marie Curie Fellow under an ERCIM Alain Bensoussan Fellowship. He received his PhD in Electrical Engineering and Computer Science from MIT. His research interests are broadly in statistics, information theory, machine learning, and their applications, particularly to problems marked by data scarcity.


    Host: Aram Galstyan

    Webcast: http://webcasterms1.isi.edu/mediasite/Viewer/?peid=55d2344730a54d739928f6a760f319511d

    Location: Information Sciences Institute (ISI) - 1135 - 11th fl Large CR

    Audiences: Everyone Is Invited

    Contact: Alma Nava / Information Sciences Institute
