BEGIN:VCALENDAR
VERSION:2.0
METHOD:PUBLISH
PRODID:-//Apple Computer\, Inc//iCal 1.0//EN
X-WR-CALNAME;VALUE=TEXT:USC
BEGIN:VEVENT
UID:EC9439B1-FF65-11D6-9973-003065F99D04
SEQUENCE:5
DTSTAMP:20200520T120000
DTSTART:20200520T120000
DTEND:20200520T140000
SUMMARY:PhD Defense - Zhiyun Lu
LOCATION:
DESCRIPTION:Ph.D. Candidate: Zhiyun Lu\nDate: Wednesday\, May 20\, 2020\nTime: 12:00 PM - 2:00 PM\nCommittee: Fei Sha (Chair)\, Haipeng Luo\, C.-C. Jay Kuo\n\nTitle: Leveraging Training Information for Efficient and Robust Deep Learning\n\nAbstract: Deep neural nets have achieved great success on a wide range of machine learning problems across domains such as speech\, image\, and text. Despite strong prediction performance\, there are rising concerns about widely deploying these 'in-the-lab' machine learning models in the wild. In this thesis\, we study two of the main challenges in deep learning: efficiency\, both computational and statistical\, and robustness. We describe a set of techniques that address these challenges by intelligently utilizing information from the training process. The solutions go beyond the common recipe of a single point estimate of the optimal model.\n\nThe first part of the thesis studies the efficiency challenge. We propose a budgeted hyper-parameter tuning algorithm to improve the computational efficiency of hyper-parameter tuning in deep learning. It estimates the trend of training curves and uses it to adaptively allocate tuning resources\, demonstrating improved efficiency over state-of-the-art tuning algorithms. We then study statistical efficiency on tasks with limited labeled data\, focusing on speech sentiment analysis. We pre-train on automatic speech recognition data and solve sentiment analysis as a downstream task\, which greatly improves the data efficiency of sentiment labels.\n\nThe second part of the thesis studies the robustness challenge. Motivated by resampling methods in statistics\, we study the uncertainty estimates of neural networks via local perturbative approximations. We propose to sample replicas of the model parameters from a Gaussian distribution to form a pseudo-ensemble. The ensemble predictions are used to estimate the uncertainty of the original model\, which improves its robustness against invalid inputs.\n\nMeeting links:\n\nZoom: https://usc.zoom.us/j/96089712182 (Meeting ID: 960 8971 2182)\n\nGoogle Meet (backup): meet.google.com/nxz-eybf-urw
END:VEVENT
END:VCALENDAR