University of Southern California

Events Calendar


  • PhD Defense - Haoyu Huang

    Thu, Jul 16, 2020 @ 02:00 PM - 04:00 PM

    Thomas Lord Department of Computer Science

    University Calendar


    Ph.D. Defense - Haoyu Huang 7/16 2:00 pm "Nova-LSM: A Distributed, Component-based LSM-tree Data Store"

    Ph.D. Candidate: Haoyu Huang
    Date: Thursday, July 16, 2020
    Time: 2:00 pm - 4:00 pm
    Committee: Shahram Ghandeharizadeh (chair), Murali Annavaram, Jyotirmoy V. Deshmukh
    Title: Nova-LSM: A Distributed, Component-based LSM-tree Data Store
    Zoom: https://usc.zoom.us/j/99943500149
    Google Meet (only if there are issues with Zoom): meet.google.com/ruu-jjiu-fbk

    Abstract:
    The cloud challenges many fundamental assumptions of today's monolithic data stores. It offers a diverse choice of servers with alternative forms of processing capability, storage, memory sizes, and networking hardware. It also offers fast networking between servers and racks, such as RDMA. This motivates a component-based architecture that separates storage from processing for a data store. This architecture complements the classical shared-nothing architecture by allowing nodes to share each other's disk bandwidth. This is particularly useful with a skewed pattern of access to data, because a large file can be scattered across many disks instead of being stored on one disk.

    This emerging component-based software architecture constitutes the focus of this dissertation. We present the design, implementation, and evaluation of Nova-LSM as an example of this architecture. Nova-LSM is a component-based design of an LSM-tree using RDMA. Its components implement the following novel concepts. First, they use RDMA to enable nodes of a shared-nothing architecture to share their disk bandwidth and storage. Second, they construct ranges dynamically at runtime to parallelize compaction and boost performance. Third, they scatter blocks of a file across an arbitrary number of disks and use power-of-d to scale. Fourth, the logging component separates the availability of log records from their durability. These design decisions provide for an elastic system with well-defined knobs that control its performance and scalability characteristics. We present an implementation of these designs using LevelDB as a starting point. Our evaluation shows that Nova-LSM scales and outperforms its monolithic counterpart by several orders of magnitude. This is especially true with workloads that exhibit a skewed pattern of access to data.
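
For readers unfamiliar with the power-of-d technique named in the abstract, the following is a minimal, generic sketch of the idea: for each block, sample d candidate disks at random and write to the least loaded one. It is not taken from Nova-LSM. The function names (`place_block`, `scatter_file`) are hypothetical, and measuring load as the number of blocks already placed on a disk is an assumption made for illustration.

```python
import random


def place_block(loads, d, rng):
    """Sample d distinct disks at random and return the index of the least loaded one.

    `loads` is a hypothetical per-disk load measure (assumed here to be the
    number of blocks already placed on each disk).
    """
    candidates = rng.sample(range(len(loads)), d)
    return min(candidates, key=lambda i: loads[i])


def scatter_file(num_blocks, num_disks, d, seed=0):
    """Scatter the blocks of one file across disks using power-of-d choices."""
    rng = random.Random(seed)
    loads = [0] * num_disks
    placement = []
    for _ in range(num_blocks):
        disk = place_block(loads, d, rng)
        loads[disk] += 1          # the chosen disk now holds one more block
        placement.append(disk)
    return placement, loads


if __name__ == "__main__":
    # d=1 is plain random placement; d=2 inspects two disks per block.
    for d in (1, 2):
        _, loads = scatter_file(num_blocks=1000, num_disks=10, d=d)
        print(f"d={d}: min load={min(loads)}, max load={max(loads)}")
```

In this sketch, d acts as a knob: d=1 is purely random placement, while larger d inspects more disks per block and tends to even out the load at the cost of more probing.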

    WebCast Link: https://usc.zoom.us/j/99943500149

    Audiences: Everyone Is Invited

    Contact: Lizsl De Leon

