University of Southern California

Events Calendar


  • EE Seminar: Robust System Design

    Thu, Apr 03, 2014 @ 10:30 AM - 12:00 PM

    Ming Hsieh Department of Electrical and Computer Engineering

    Conferences, Lectures, & Seminars


    Speaker: Dr. Yanjing Li, Research Scientist, Intel Labs

    Talk Title: Robust System Design

    Abstract: Malfunctions in electronic systems can have major consequences, ranging from loss of data and services to financial
    and productivity losses, or even loss of human life. Such impacts continue to increase as systems become more
    complex, interconnected, and pervasive. Hardware failures are a particularly fast-growing concern because:
    1. Existing test and validation methods barely cope with today’s complexity. New techniques will be essential to
    minimize the effects of defects and design flaws.
    2. For coming generations of silicon technologies, several failure mechanisms that were largely benign in the
    past are now becoming visible at the system level. A large class of future systems will require tolerance of hardware
    errors during their operation.
    Robust system design is required to ensure that future electronic systems, from supercomputers all the way to
    embedded systems, perform correctly despite rising levels of complexity and disturbances. Traditional fault‐tolerant
    computing techniques are generally very expensive, and often inadequate, for this purpose. I will present two
    techniques that are essential for robust system design:
    1. A new online self‐test and diagnostics technique, called CASP, which enables a system to test itself thoroughly
    during normal operation to quickly detect and localize hardware failures. CASP achieves high coverage (96‐99.5%)
    across a wide variety of test coverage metrics while incurring only 1% area and power costs and a 3% performance
    cost. In contrast, existing techniques suffer from low coverage (e.g., 70%), high area costs (e.g., 20%), or significant
    performance penalties (e.g., 30%) including possible system unresponsiveness.
    2. A new self‐repair technique to keep the system functioning correctly even in the presence of hardware failures.
    Unlike naïve redundancy with very high (20%) area costs, this technique enables thorough self‐repair with only 7.5%
    area impact, 3% power impact, and 0‐5% performance impact.
    A key aspect of both techniques is orchestration across multiple abstraction layers: physical
    design, architecture, and system software. I will demonstrate the effectiveness and practicality of these techniques
    using results from the industrial OpenSPARC T2 multi‐core design and the Intel Core i7 hardware platform. I will also
    share recent experiences in implementing these techniques in the latest Intel designs.

    Biography: Yanjing Li is a research scientist at Intel Labs and a visiting scholar at Stanford University. She received her Ph.D. in
    Electrical Engineering from Stanford University. Her research interests include robust system design, energy‐efficient
    systems, system validation and test, computer architecture, and system software. Dr. Li received the European Design
    and Automation Association Outstanding Dissertation Award, the IEEE International Test Conference Best Student
    Paper Award, and the IEEE VLSI Test Symposium Best Paper Award for novel research on robust system design, and
    two Intel Divisional Recognition Awards for mobile processor designs that are being adopted by product groups at
    Intel.

    Host: Professor Murali Annavaram

    Location: Hughes Aircraft Electrical Engineering Center (EEB) - 248

    Audiences: Everyone Is Invited

    Contact: Janice Thompson
