NL Seminar: Fighting COVID-19 using Linear-Time Algorithms from Computational Linguistics
Thu, May 21, 2020 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars
Speaker: Liang Huang, Baidu Research and Oregon State University
Talk Title: Fighting COVID-19 using Linear-Time Algorithms from Computational Linguistics
Abstract: To defeat the current COVID-19 pandemic, which has already claimed 250,000 lives as of early May, a messenger RNA (mRNA) vaccine has emerged as a promising approach thanks to its rapid and scalable production and its non-infectious and non-integrating properties. However, designing an mRNA sequence to achieve high stability and protein yield remains a challenging problem due to the exponentially large search space (e.g., there are 10^632 possible mRNA sequence candidates for the spike protein of SARS-CoV-2).
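The exponential search space arises because most amino acids can be encoded by several synonymous codons, so the number of mRNA candidates is the product of the per-residue codon counts. The following is a minimal sketch of that arithmetic, assuming the standard genetic code; the function names and the toy peptide are hypothetical illustrations, not part of the speaker's software, and the peptide is not the spike protein.

```python
import math

# Number of synonymous codons per amino acid in the standard genetic code
# (one-letter amino acid codes; stop codons are not included).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_mrna_candidates(protein: str) -> int:
    """Return the number of distinct mRNA coding sequences for a protein."""
    return math.prod(CODON_COUNTS[aa] for aa in protein)


def log10_mrna_candidates(protein: str) -> float:
    """Return log10 of the candidate count, convenient for long proteins."""
    return sum(math.log10(CODON_COUNTS[aa]) for aa in protein)


if __name__ == "__main__":
    toy_peptide = "MLSRW"  # hypothetical 5-residue example
    print(count_mrna_candidates(toy_peptide))
    print(round(log10_mrna_candidates(toy_peptide), 2))
```

Because every additional residue multiplies the count by between 1 and 6, the total grows exponentially with protein length, which is why exhaustive enumeration is infeasible for a protein as long as the spike protein.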
We describe two ongoing efforts at solving this problem, both using linear-time algorithms from my group inspired by my earlier work in parsing. On one hand, the Eterna OpenVaccine project from Stanford Medical School takes a crowdsourcing approach to let game players all over the world design stable sequences. To evaluate sequence stability in terms of free energy, they use LinearFold from my group (2019), since it is the only linear-time RNA folding algorithm available, which makes it the only one fast enough for COVID-scale genomes. On the other hand, we take a computational approach to directly search for the optimal sequence in this exponentially large space via dynamic programming. It turns out this problem can be reduced to a classical problem in formal language theory and computational linguistics (intersection between a CFG and a DFA), which can be solved in O(n^3) time, just like lattice parsing for speech. In the end, we can design the optimal mRNA vaccine candidate for the SARS-CoV-2 spike protein in 1 hour with exact search, or in just 11 minutes with a beam of 1000 at the cost of only a 0.6% loss in energy.
Biography: Liang Huang is currently an Assistant Professor of EECS at Oregon State University and a Distinguished Scientist (part-time) at Baidu Research USA. Before that he was an Assistant Professor for three years at the City University of New York (CUNY) and a part-time Research Scientist with IBM's Watson Group. He graduated in 2008 from Penn and has worked as a Research Scientist at Google and a Research Assistant Professor at USC ISI. Most of his work develops fast algorithms and provable theory to speed up large-scale natural language processing, structured machine learning, and computational structural biology. He has received a Best Paper Award at ACL 2008 (sole author), a Best Paper Honorable Mention at EMNLP 2016, several best paper nominations (ACL 2007, EMNLP 2008, and ACL 2010), two Google Faculty Research Awards (2010 and 2013), a Yahoo! Faculty Research Award (2015), and a University Teaching Prize at Penn (2005). He was a keynote speaker at ACL 2019. His recent interest is in applying computational linguistics to computational biology, where he works on RNA folding and design using his earlier work on incremental parsing.
Host: Emily Sheng
More Info: https://nlg.isi.edu/nl-seminar/
Location: Virtual Only
WebCast Link: https://usc.zoom.us/j/94526753732
Audiences: Everyone Is Invited
Contact: Peter Zamar
Event Link: https://nlg.isi.edu/nl-seminar/