BEGIN:VCALENDAR
METHOD:PUBLISH
PRODID:-//Apple Computer\, Inc//iCal 1.0//EN
X-WR-CALNAME;VALUE=TEXT:USC
VERSION:2.0
BEGIN:VEVENT
DESCRIPTION:Ph.D. Defense - Haoyu Huang 7/16 2:00 pm "Nova-LSM: A Distributed, Component-based LSM-tree Data Store"\n \n Ph.D. Candidate: Haoyu Huang\n Date: Thursday, July 16, 2020\n Time: 2:00 pm - 4:00 pm\n Committee: Shahram Ghandeharizadeh (chair), Murali Annavaram, Jyotirmoy V. Deshmukh\n Title: Nova-LSM: A Distributed, Component-based LSM-tree Data Store\n Zoom: https://usc.zoom.us/j/99943500149\n Google Meet (only if there are issues with Zoom): meet.google.com/ruu-jjiu-fbk\n \n Abstract:\n The cloud challenges many fundamental assumptions of today's monolithic data stores. It offers a diverse choice of servers with alternative forms of processing capability, storage, memory sizes, and networking hardware. It also offers fast networking between servers and racks such as RDMA. This motivates a component-based architecture that separates storage from processing for a data store. This architecture complements the classical shared-nothing architecture by allowing nodes to share each other's disk bandwidth. This is particularly useful with a skewed pattern of access to data because a large file can be scattered across many disks instead of being stored on one disk.\n \n This emerging component-based software architecture constitutes the focus of this dissertation. We present the design, implementation, and evaluation of Nova-LSM as an example of this architecture. Nova-LSM is a component-based design of an LSM-tree using RDMA. Its components implement the following novel concepts. First, they use RDMA to enable nodes of a shared-nothing architecture to share their disk bandwidth and storage. Second, they construct ranges dynamically at runtime to parallelize compaction and boost performance. Third, they scatter blocks of a file across an arbitrary number of disks and use power-of-d to scale. Fourth, the logging component separates the availability of log records from their durability. These design decisions provide for an elastic system with well-defined knobs that control its performance and scalability characteristics. We present an implementation of these designs using LevelDB as a starting point. Our evaluation shows Nova-LSM scales and outperforms its monolithic counterpart by several orders of magnitude. This is especially true with workloads that exhibit a skewed pattern of access to data.
SEQUENCE:5
DTSTART:20200716T140000
LOCATION:
DTSTAMP:20200716T140000
SUMMARY:PhD Defense - Haoyu Huang
UID:EC9439B1-FF65-11D6-9973-003065F99D04
DTEND:20200716T160000
END:VEVENT
END:VCALENDAR