NL Seminar-Translating faster than a keystroke and dumpster diving for training data
Thu, Jan 13, 2022 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars
Speaker: Kenneth Heafield, University of Edinburgh
Talk Title: Translating faster than a keystroke and dumpster diving for training data
Abstract: REMINDER: Meeting hosts only admit guests they know to the Zoom meeting, so you are strongly encouraged to sign into Zoom with your USC account.
If you are an outside visitor, please inform us beforehand at nlg DASH seminar DASH host AT isi DOT edu so that we are aware of your attendance and can let you in.
Machine translation has a deserved reputation for computational cost. But by burning even more GPU time upfront, we can make inference fast enough to translate thousands of words per second on a desktop or a sentence in under 10 ms on one CPU core. I will talk about optimizations from chopping off transformer heads to writing assembly that make this possible. Software is available at translatelocally.com and coming soon as a Firefox extension. Fast translation was also useful for the ParaCrawl project, where we went dumpster diving on the web for translations and found a few COMET/BLEU points.
Biography: Kenneth Heafield is a Reader (Associate Professor in en-US) at the University of Edinburgh working on fast and often good machine translation. He coordinates the Bergamot project adding local translation to Firefox, ran the ParaCrawl project, and was friendly competition with ISI in MATERIAL. He wrote kenlm to do large language models before they were cool.
Host: Jon May and Thamme Gowda
More Info: https://nlg.isi.edu/nl-seminar/
Webcast: https://usc.zoom.us/j/93223825200
Location: Information Sciences Institute (ISI) - Virtual Only
Audiences: Everyone Is Invited
Contact: Pete Zamar
Event Link: https://nlg.isi.edu/nl-seminar/