CS Colloquium: Rashmi K. Vinayak (UC Berkeley) - Smart redundancy for big-data systems: Theory and Practice
Thu, Feb 16, 2017 @ 11:00 AM - 12:20 PM
Thomas Lord Department of Computer Science
Conferences, Lectures, & Seminars
Speaker: Rashmi K. Vinayak, UC Berkeley
Talk Title: Smart redundancy for big-data systems: Theory and Practice
Series: CS Colloquium
Abstract: This lecture satisfies requirements for CSCI 591: Computer Science Research Colloquium.
Large-scale distributed storage and caching systems form the foundation of big-data systems. A key scalability challenge in distributed storage systems is achieving fault tolerance in a resource-efficient manner. To address this challenge, erasure codes provide a storage-efficient alternative to the traditional approach of data replication. However, classical erasure codes come with critical drawbacks: while optimal in their use of storage space, they significantly increase the usage of other important cluster resources such as network and I/O. In the first part of the talk, I present new erasure codes and theoretical optimality guarantees. The proposed codes reduce network and I/O usage by 35-70% for typical parameters while retaining the storage efficiency of classical codes. I then present an erasure-coded storage system that employs the proposed codes, and demonstrate significant benefits over the state of the art in evaluations in a production setting at Facebook. Our codes have been incorporated into Apache Hadoop 3.0.
The second part of the talk focuses on achieving high performance in distributed caching systems. These systems routinely face skew in data popularity, background traffic imbalance, and server failures, which result in load imbalance across servers and degraded read latencies. I present EC-Cache, a cluster cache that employs erasure coding to achieve a 3-5x improvement over the state of the art.
Biography: Rashmi K. Vinayak is a postdoctoral researcher in the EECS department at UC Berkeley, where she received her PhD in 2016. Her dissertation received the Eli Jury Award 2016 from the EECS department at UC Berkeley for outstanding achievement in the area of systems, communications, control, or signal processing. Rashmi is also a recipient of the Facebook Fellowship 2012-13, the Microsoft Research PhD Fellowship 2013-15, and the Google Anita Borg Memorial Scholarship 2015-16. She is also the recipient of the IEEE Data Storage Best Paper and Best Student Paper Awards for the years 2011/2012. Her research interests lie in the theoretical and system challenges that arise in storage and analysis of big data, with a current focus on erasure coding for big-data systems.
Host: CS Department
Audiences: Everyone Is Invited
Contact: Assistant to CS chair