CS Colloquium
Tue, Nov 23, 2010 @ 03:30 PM - 05:00 PM
Thomas Lord Department of Computer Science
Conferences, Lectures, & Seminars
Speaker: Dr. Chun-Nan Hsu, Information Sciences Institute (ISI)
Talk Title: Accelerating Machine Learning by Aggressive Extrapolation
Abstract: This talk presents how to accelerate statistical machine learning algorithms for large-scale applications by aggressive extrapolation. Extrapolation methods, such as Aitken's acceleration, have the advantage that they can achieve quadratic convergence with an overhead linear in the dimension of the training data. However, they can be numerically unstable, and their convergence is guaranteed only locally. We show that these problems can be fixed by a double extrapolation method. There are two options for the extrapolation: global or component-wise. Previously, it was not clear which option is more effective. We give a general condition for determining which option will be more effective and show how to apply the condition to the training of Bayesian networks and conditional random fields (CRF). Then we show that extrapolation can accelerate on-line learning with a method called Periodic Step-size Adaptation (PSA). We show that PSA is an approximation of a theoretical "single-pass" on-line learning method, which can converge to an empirical optimum in a single pass through the training examples. With a single-pass on-line learning method, disk I/O can be minimized when a training set is too large to fit in memory. Experimental results for a wide variety of models, including CRF, linear SVM, and convolutional neural networks, show that the single-pass performance of PSA is always very close to the empirical optimum. Finally, an application to gene mention tagging for biological text mining will be presented, which achieved the top score in the BioCreative 2 challenge in 2007 and again in the BioCreative 3 challenge in 2010.
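Illustration: For readers unfamiliar with the extrapolation idea mentioned in the abstract, the following is a minimal sketch of the textbook Aitken delta-squared formula applied to a scalar fixed-point iteration. It is not the speaker's double extrapolation method or PSA; the function names, the test function cos(x), and the guard threshold are assumptions chosen only for illustration.

```python
import math

def aitken_step(x0, x1, x2):
    """One Aitken delta-squared extrapolation from three successive iterates."""
    denom = x2 - 2.0 * x1 + x0
    # Near convergence the denominator approaches zero, which is the
    # numerical instability the abstract alludes to; fall back to x2.
    if abs(denom) < 1e-15:
        return x2
    return x0 - (x1 - x0) ** 2 / denom

def fixed_point(g, x, iters):
    """Plain fixed-point iteration x <- g(x) (linear convergence)."""
    for _ in range(iters):
        x = g(x)
    return x

def accelerated_fixed_point(g, x, rounds):
    """Two plain updates followed by one extrapolation per round."""
    for _ in range(rounds):
        x1 = g(x)
        x2 = g(x1)
        x = aitken_step(x, x1, x2)
    return x

# Hypothetical example problem: solve x = cos(x).
TARGET = 0.7390851332151607
plain = fixed_point(math.cos, 1.0, 10)            # 10 evaluations of g
fast = accelerated_fixed_point(math.cos, 1.0, 5)  # also 10 evaluations of g
print(abs(plain - TARGET), abs(fast - TARGET))
```

With the same number of function evaluations, the extrapolated iteration ends much closer to the fixed point than the plain one in this toy case. The talk concerns the vector-valued, large-scale setting, where the choice between global and component-wise extrapolation arises.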
Biography: Dr. Chun-Nan Hsu is a computer scientist at the Information Sciences Institute (ISI). Prior to joining ISI, he was a Research Fellow and Leader of the Adaptive Internet Intelligent Agents (AIIA) Lab at the Institute of Information Science, Academia Sinica, Taipei, Taiwan. His research interests include machine learning, data mining, databases, and bioinformatics. He earned his M.S. and Ph.D. degrees in Computer Science from the University of Southern California, Los Angeles, CA, in 1992 and 1996, respectively. In 1996, before he passed his doctoral oral exam, he was offered a position as Assistant Professor at the Department of Computer Science and Engineering, Arizona State University, Tempe, AZ. He taught there for two years before returning to Taiwan in 1998. Since 2005, he has been the principal investigator of the Advanced Bioinformatics Core, National Research Program in Genomic Medicine, Taiwan, leading one of the largest research efforts in computerized drug design and discovery in Taiwan. In 2006, the first drug candidate resulting from the use of the software his team developed was commercialized. In 2007, his teams achieved the best scores in the BioCreative 2 text mining challenge. Dr. Hsu has published about 90 scientific articles since 1993. He is currently working on applying artificial intelligence to computational biology and bioinformatics.
Host: Dr. Dennis McLeod
Location: Seaver Science Library (SSL) - 150
Audiences: Everyone Is Invited
Contact: Kanak Agrawal