Events for May 14, 2014
-
PhD Defense - Joshua Garcia
Wed, May 14, 2014 @ 01:00 PM - 03:00 PM
Thomas Lord Department of Computer Science
University Calendar
Title:
A Unified Framework for Identifying and Studying Architectural Decay of Software Systems
Ph.D. Candidate: Joshua Garcia
Time: 1:00pm
Date: May 14, 2014
Location: PHE 223
Committee:
Nenad Medvidovic (Chair)
William G.J. Halfond
Stan Settles (Outside Member)
Abstract:
The effort and cost of software maintenance tend to dominate other activities in a software system's lifecycle. A critical aspect of maintenance is understanding and updating a software system's architecture. However, maintaining a system's architecture is complicated by the related phenomena of architectural drift and erosion---collectively called architectural decay---which are caused by careless, unintended addition, removal, and/or modification of architectural design decisions. These phenomena make the architecture more difficult to understand and maintain and, in more severe cases, can lead to errors that waste effort, time, or money. To deal with architectural decay, an engineer must be able to obtain (1) the current architecture of her system and understand (2) the possible types of decay that may occur in a software system and (3) the manner in which architectures tend to change and the decay such change often causes.
The high-level contribution of this dissertation is a unified framework for addressing different aspects of architectural decay in software systems. This framework includes a catalog comprising an expansive list of architectural smells (i.e., architectural-decay instances) and a means of identifying such smells in software architectures; a framework for constructing ground-truth architectures to aid the evaluation of automated recovery techniques; ARC, a novel recovery approach that is accurate and extracts rich architectural abstractions; and ARCADE, a framework for the study of architectural change and decay. Together, these aspects of the unified framework are a comprehensive means of addressing the different problems that arise due to architectural decay.
This dissertation provides several evaluations of its different contributions: it presents case studies of architectural smells, describes lessons learned from applying the ground-truth recovery framework, compares architecture-recovery techniques along multiple accuracy measures, and contributes the most extensive empirical study of architectural change and decay to date. This dissertation's comparative analysis of architecture-recovery techniques addresses several shortcomings of previous analyses, including the quality of ground truth utilized, the selection of recovery techniques to be analyzed, and the limited number of perspectives from which the techniques are evaluated. The empirical study of architectural change and decay in this dissertation is the largest empirical study to date of its kind in long-lived software systems; the study comprises over 112 million source-lines-of-code and 460 system versions.
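As a toy illustration of the kind of analysis the smell catalog enables (hypothetical names and structure, not the dissertation's actual ARC/ARCADE tooling), one well-known architectural smell, a dependency cycle between components, can be flagged from a recovered component graph:

```python
# Toy sketch: detecting one architectural smell (a dependency cycle)
# in a recovered component graph. Hypothetical illustration only.

def find_cycle_smells(deps):
    """Return the components that participate in a dependency cycle.

    deps maps each component name to the set of components it depends on.
    """
    in_cycle = set()

    def reachable(start):
        # Collect everything transitively reachable from `start`.
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in deps.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    for comp in deps:
        if comp in reachable(comp):  # comp depends (transitively) on itself
            in_cycle.add(comp)
    return in_cycle

deps = {
    "ui": {"core"},
    "core": {"storage"},
    "storage": {"core"},  # cycle: core <-> storage
    "util": set(),
}
print(sorted(find_cycle_smells(deps)))  # ['core', 'storage']
```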
Location: Charles Lee Powell Hall (PHE) - 223
Audiences: Everyone Is Invited
Contact: Lizsl De Leon
-
PhD Defense - Qunzhi Zhou
Wed, May 14, 2014 @ 02:00 PM - 04:00 PM
Thomas Lord Department of Computer Science
University Calendar
Title: A Complex Event Processing Framework for Holistic Fast Data Management
Ph.D. Candidate: Qunzhi Zhou
Defense Committee:
Viktor Prasanna (Co-Chair)
Yogesh Simmhan (Co-Chair)
Ellis Horowitz
Petros Ioannou
Time: 2:00 PM - 4:00 PM
Date: Wednesday, May 14, 2014
Location: Hughes Aircraft Electrical Engineering Center (EEB) - 248
Abstract:
Emerging applications in domains like Smart Grid, e-commerce, and financial services have been motivating Fast Data, which emphasizes the Velocity aspect of Big Data. Utility companies, social media, and financial institutions often face scenarios where they need to process data arriving continuously at a high rate for business innovation and analytics. Existing Big Data management systems, however, have mostly focused on the Volume aspect of Big Data. Systems including Hadoop and NoSQL databases provide programming and query primitives that allow scalable storage and querying of very large data sets. These systems are best suited for applications that perform write-once-read-many operations on slow-changing data volumes, because of their focus on data availability and read performance.
Complex Event Processing (CEP), on the other hand, is a promising paradigm for managing Fast Data. CEP is recognized for online analytics of data that arrive continuously from ubiquitous, always-on sensors and digital event streams. It allows event patterns composed with correlation constraints, also called complex events, to be detected by examining event streams in real time for situation awareness. Specifically, CEP adopts high-throughput temporal pattern-matching algorithms to handle data Velocity. As a result, CEP has grown popular for operational intelligence, where online pattern detection drives real-time response.
Fast Data management motivates certain distinctive capabilities from CEP systems to deal with concurrent data Variety, Volume, and Velocity. In this dissertation, we present a Complex Event Processing framework for holistic Fast Data management that considers all three V's. In particular, we extend state-of-the-art CEP systems and make the following contributions: 1) Semantic Complex Event Processing for on-the-fly query processing over diverse data streams, shielding data and domain Varieties; 2) Stateful Complex Event Processing, which provides a hierarchical query paradigm for dynamic stream Volume management and on-demand query evaluation; 3) Resilient Complex Event Processing, which supports integrated querying across low-Velocity data archives and real-time data streams. We perform quantitative evaluations using real-world applications from the Smart Grid domain to verify the efficacy of the proposed framework and demonstrate the performance benefits of the optimization techniques.
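As a rough sketch of the temporal pattern matching that CEP performs (the event schema, window, and threshold here are hypothetical illustrations, not the dissertation's framework): emit a complex event when a voltage spike is followed by an outage on the same feeder within a 60-second window.

```python
# Sketch of CEP-style temporal pattern matching over an event stream
# (hypothetical Smart Grid event schema, for illustration only).

WINDOW = 60  # seconds

def detect(events):
    """events: iterable of (timestamp, kind, feeder), assumed time-ordered.

    Returns (feeder, spike_time, outage_time) for each matched pattern.
    """
    pending = {}   # feeder -> timestamp of most recent spike
    matches = []
    for ts, kind, feeder in events:
        if kind == "spike":
            pending[feeder] = ts
        elif kind == "outage":
            t0 = pending.get(feeder)
            if t0 is not None and ts - t0 <= WINDOW:
                matches.append((feeder, t0, ts))
                del pending[feeder]  # consume the spike once matched
    return matches

stream = [
    (0,   "spike",  "F1"),
    (30,  "outage", "F1"),   # within the window -> complex event
    (100, "spike",  "F2"),
    (200, "outage", "F2"),   # too late -> no match
]
print(detect(stream))  # [('F1', 0, 30)]
```

A production CEP engine compiles such patterns from a query language and evaluates them at high throughput; the single-pass, constant-state loop above is the essential idea behind handling data Velocity.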
Bio:
Qunzhi Zhou is currently a Ph.D. candidate in the Computer Science Department at the University of Southern California. His research interests are in information integration, stream processing, and distributed computing systems. He holds an M.S. in Computer Science from the University of Southern California and received his B.S. in Automation from Tsinghua University, China.
Location: Hughes Aircraft Electrical Engineering Center (EEB) - 248
Audiences: Everyone Is Invited
Contact: Lizsl De Leon
-
NL Seminar - Qualification Practice Talk / Beyond Parallel Data
Wed, May 14, 2014 @ 03:00 PM - 04:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars
Speaker: Qing Dou, USC/ISI
Talk Title: Beyond Parallel Data
Series: Natural Language Seminar
Abstract: Thanks to the availability of parallel data and advances in machine learning techniques, we have seen tremendous improvement in the field of machine translation over the past 20 years. However, due to the lack of parallel data, the quality of machine translation is still far from satisfactory for many language pairs and domains. In general, it is easier to obtain non-parallel data, and much work has tried to learn translations from non-parallel data. Nonetheless, improvements to machine translation have been limited. In this work, I follow a decipherment approach to learn translations from non-parallel data and achieve significant gains in machine translation.
I apply slice sampling to Bayesian decipherment. Compared with the state-of-the-art algorithm, the new approach is highly scalable and accurate, making it possible to decipher billions of tokens with hundreds of thousands of word types at high accuracy for the first time. Furthermore, I introduce dependency relations to address the problems of word reordering, insertion, and deletion when deciphering foreign languages, and show that dependency relations improve deciphering accuracy by over 5-fold. I decipher large amounts of monolingual data to learn translations for out-of-vocabulary words and observe significant gains of up to 3.8 BLEU points in domain adaptation. Moreover, I show that a translation lexicon learned from large amounts of non-parallel data with decipherment can improve a phrase-based machine translation system trained with limited parallel data. In experiments, I observe BLEU gains of 1.2 to 1.8 across three different test sets.
Given the above success, I propose to work on advancing machine translation of real-world, low-density languages, and to explore using non-parallel data to improve word alignment and the discovery of phrase translations.
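A heavily simplified sketch of the decipherment idea, aligning cipher word types to plaintext word types by frequency rank over a 1:1 substitution (a toy stand-in, not the Bayesian slice-sampling approach of the talk; the example sentences are invented):

```python
# Toy decipherment by frequency matching: map each cipher word type to the
# plaintext word type of the same frequency rank. A drastic simplification
# of Bayesian decipherment, for illustration only.
from collections import Counter

def rank_by_freq(tokens):
    counts = Counter(tokens)
    # Most frequent first; break frequency ties by the token itself.
    return [w for w, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]

def decipher_table(cipher_tokens, plain_tokens):
    """Pair off cipher and plaintext types rank-for-rank."""
    return dict(zip(rank_by_freq(cipher_tokens), rank_by_freq(plain_tokens)))

plain  = "the cat saw the dog and the dog saw the cat".split()
cipher = "ze gato vio ze perro y ze perro vio ze gato".split()
table = decipher_table(cipher, plain)
print(table["ze"])  # 'the'
```

Real decipherment must search over mappings probabilistically (frequency rank alone breaks down on realistic data), which is where the scalable slice-sampling inference described above comes in.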
Biography: Qing Dou is a fourth-year Ph.D. student at USC/ISI, advised by Professor Kevin Knight.
Home Page: http://www.isi.edu/~qdou/
Host: Aliya Deri and Kevin Knight
More Info: http://nlg.isi.edu/nl-seminar/
Location: Information Sciences Institute (ISI) - 11th Flr Conf Rm # 1135, Marina Del Rey
Audiences: Everyone Is Invited
Contact: Peter Zamar
Event Link: http://nlg.isi.edu/nl-seminar/