CS Colloquium: Ibrahim Sabek (MIT) - Building Better Data-Intensive Systems Using Machine Learning
Thu, Apr 13, 2023 @ 11:00 AM - 12:00 PM
Thomas Lord Department of Computer Science
Conferences, Lectures, & Seminars
Speaker: Ibrahim Sabek, MIT
Talk Title: Building Better Data-Intensive Systems Using Machine Learning
Series: CS Colloquium
Abstract: Database systems have traditionally relied on handcrafted approaches and rules to store large-scale data and process user queries over them. These well-tuned approaches and rules work well for the general-purpose case, but are seldom optimal for any actual application because they are not tailored for the specific application properties (e.g., user workload patterns). One possible solution is to build a specialized system from scratch, tailored for each use case. Although such a specialized system is able to get orders-of-magnitude better performance, building it is time-consuming and requires a huge manual effort. This pushes the need for automated solutions that abstract system-building complexities while getting as close as possible to the performance of specialized systems. In this talk, I will show how we leverage machine learning to instance-optimize the performance of query scheduling and execution operations in database systems. In particular, I will show how deep reinforcement learning can fully replace a traditional query scheduler. I will also show that, in certain situations, even simpler learned models, such as piece-wise linear models approximating the cumulative distribution function (CDF) of data, can help improve the performance of fundamental data structures and execution operations, such as hash tables and in-memory join algorithms.
This lecture satisfies requirements for CSCI 591: Research Colloquium
Biography: Ibrahim Sabek is a postdoc at MIT and an NSF/CRA Computing Innovation Fellow. He is interested in building the next generation of machine learning-empowered data management, processing, and analysis systems. Before MIT, he received his Ph.D. from the University of Minnesota, Twin Cities, where he studied machine learning techniques for spatial data management and analysis. His Ph.D. work received the University-wide Best Doctoral Dissertation Honorable Mention from the University of Minnesota in 2021. He was also awarded first place in the graduate student research competition (SRC) at ACM SIGSPATIAL 2019 and was the best paper runner-up at ACM SIGSPATIAL 2018.
Host: Cyrus Shahabi
Location: Olin Hall of Engineering (OHE) - 132
Audiences: Everyone Is Invited
Contact: Assistant to CS chair