University Calendar
Events for June
-
PhD Defense - Weiwei Chen
Thu, Jun 05, 2014 @ 08:30 AM - 10:30 AM
Thomas Lord Department of Computer Science
Title: Workflow Restructuring Techniques for Improving the Performance of Scientific Workflows Executing in Distributed Environments
PhD Candidate: Weiwei Chen
Committee: Ewa Deelman (Chair), Viktor K. Prasanna (External member), Aiichiro Nakano
Time: Jun 5, 8:30am-10:30am
Location: RTH 306
Abstract: Scientific workflows are a means of defining and orchestrating large, complex, multi-stage computations that perform data analysis, simulation, visualization, etc. Scientific workflows often involve large amounts of data transfer and computation and require efficient optimization techniques to reduce the overall workflow runtime.
Today, with the emergence of large-scale scientific workflows executing in modern distributed environments such as grids and clouds, the optimization of workflow runtime has introduced new challenges that existing optimization methods do not tackle. Traditionally, runtime optimization methods have been confined to the task scheduling problem; they do not consider the refinement of workflow structures, system overheads, the occurrence of failures, etc. Refining workflow structures using techniques such as workflow partitioning and task clustering represents a new trend in runtime optimization and can result in significant performance improvement. The runtime improvement of these workflow restructuring methods depends on the ratio of application computation time to system overheads. Since system overheads in modern distributed systems can be high, the potential benefit of workflow restructuring can be significant.
This thesis argues that workflow restructuring techniques can significantly improve the runtime of scientific workflows executing in modern distributed environments. In particular, we innovate in the area of workflow partitioning and task clustering techniques. Several previous studies also utilize workflow partitioning and task clustering to improve the performance of scientific workflows. However, existing methods are based on a trial-and-error approach and require users' knowledge to tune workflow performance. For example, many workflow partitioning techniques do not consider constraints on the resources used by the workflows, such as data storage. Also, many task clustering methods optimize task granularity at the workflow level without considering data dependencies between tasks. We distinguish our work from other research by modeling a realistic distributed system with overheads and failures, and we use real-world applications that exhibit load imbalance in their structure and computations. We investigate the key concern of refining workflow structures and propose a series of innovative workflow partitioning and task clustering methods to improve runtime performance. Simulation-based and real-system-based evaluation verifies the effectiveness of our methods.
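As a rough illustration of why restructuring can pay off when system overheads are high, the following Python sketch compares an unclustered and a clustered execution of one workflow level; the runtimes, overheads, and the simple makespan model are hypothetical and are not taken from the thesis or its tools.

    # Illustrative sketch (not the thesis's algorithm): horizontal task clustering.
    # Many short tasks at one workflow level are merged into fewer clustered jobs
    # so that per-job system overhead (scheduling and queue delay) is paid once
    # per clustered job instead of once per task.
    import math

    def makespan(num_tasks, task_runtime, jobs, workers, overhead_per_job):
        """Rough makespan model: tasks are packed evenly into `jobs` clustered jobs,
        jobs run round-robin on `workers` slots, and every job pays a fixed overhead."""
        tasks_per_job = math.ceil(num_tasks / jobs)
        job_runtime = tasks_per_job * task_runtime + overhead_per_job
        rounds = math.ceil(jobs / workers)
        return rounds * job_runtime

    # 100 ten-second tasks, 10 worker slots, 60 s of overhead per submitted job:
    print(makespan(100, 10.0, jobs=100, workers=10, overhead_per_job=60.0))  # unclustered: 700 s
    print(makespan(100, 10.0, jobs=10,  workers=10, overhead_per_job=60.0))  # clustered:   160 s

In this toy model the benefit comes entirely from amortizing the per-job overhead, which is exactly the ratio the abstract identifies as the driver of the improvement.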
Location: Ronald Tutor Hall of Engineering (RTH) - 306
Audiences: Everyone Is Invited
Contact: Lizsl De Leon
-
PhD Defense - Ashish Vaswani
Thu, Jun 12, 2014 @ 01:00 PM - 03:00 PM
Thomas Lord Department of Computer Science
PhD Candidate: Ashish Vaswani
Date: 12th June, 2014
Location: GFS 111
Time: 1pm
Committee:
Dr. David Chiang (Chair)
Dr. Liang Huang (Co-chair)
Dr. Kevin Knight
Dr. Jinchi Lv (Outside member)
Title: Smaller, Faster, and Accurate Models for Statistical Machine Translation
The goal of machine translation is to translate from one natural language into another using computers. The current dominant approach to machine translation, statistical machine translation (SMT), uses large amounts of training data to automatically learn to translate from the source language to the target language. SMT systems typically contain three primary components: word alignment models, translation rules, and language models. These are some of the largest models in all of natural language processing, containing up to a billion parameters. Learning and employing these components pose difficult challenges of scale and generalization: using large models in statistical machine translation can slow down the translation process, and learning models with so many parameters can cause them to explain the training data too well, degrading their performance at test time. In this thesis, we improve SMT by addressing these issues of scale and generalization for word alignment, learning translation grammars, and language modeling.
Word alignments, which are correspondences between pairs of source and target words, are used to derive translation grammars. Good word alignment can result in good translation rules, improving downstream translation quality. We will present an algorithm for training unsupervised word alignment models by using a prior that encourages learning smaller models, which improves both alignment and translation quality on large scale SMT experiments.
SMT systems typically model the translation process as a sequence of translation steps, each of which uses a translation rule. Most statistical machine translation systems use composed rules (rules that can be formed out of smaller rules in the grammar) to capture more context, improving translation quality. However, composition creates many more rules and large grammars, making both training and decoding inefficient. We will describe an approach that uses Markov models to capture dependencies between a minimal set of translation rules, which leads to a slimmer model, a faster decoder, yet the same translation quality as composed rules.
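The core idea can be sketched as follows; the toy rules and probabilities below are hypothetical and are not the thesis's model. A derivation is scored as a Markov chain over its sequence of minimal rules, so context is captured by transition probabilities rather than by storing composed rules.

    # Minimal sketch: score a derivation with a bigram model over minimal rules.
    import math

    rule_transition = {                 # P(next rule | previous rule), toy values
        ("<s>", "r_np"): 0.4,
        ("r_np", "r_vp"): 0.5,
        ("r_vp", "r_pp"): 0.3,
    }

    def derivation_log_prob(minimal_rules):
        """Log-probability of a derivation under a bigram model over minimal rules."""
        score, prev = 0.0, "<s>"
        for rule in minimal_rules:
            score += math.log(rule_transition.get((prev, rule), 1e-6))  # smoothed backoff
            prev = rule
        return score

    print(derivation_log_prob(["r_np", "r_vp", "r_pp"]))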
Good language models are important for ensuring the fluency of translated sentences. Because language models are trained on very large amounts of data, the number of parameters in standard n-gram language models can grow very quickly, making parameter learning difficult. Neural network language models (NNLMs) can capture distributions over sentences with many fewer parameters. We will present recent work on efficiently learning large-scale, large-vocabulary NNLMs. Integrating these NNLMs into a hierarchical phrase-based MT decoder significantly improves translation quality.
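As a back-of-the-envelope illustration of the scale argument (the sizes below are hypothetical, not figures from the thesis), compare how parameter counts grow for an n-gram model and a simple feed-forward NNLM:

    # Hypothetical sizes: vocabulary, n-gram order, embedding dim, hidden dim.
    V, n, d, h = 100_000, 5, 150, 750

    # Worst case, an n-gram model has one parameter per possible n-gram; real models
    # store only observed n-grams, but the count still grows quickly with data size.
    ngram_worst_case = V ** n
    # A feed-forward NNLM is dominated by its embedding and output matrices.
    nnlm_params = V * d + (n - 1) * d * h + h * V
    print(f"{ngram_worst_case:.2e}")   # 1.00e+25
    print(f"{nnlm_params:.2e}")        # ~9.05e+07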
Location: Grace Ford Salvatori Hall Of Letters, Arts & Sciences (GFS) - 111
Audiences: Everyone Is Invited
Contact: Lizsl De Leon
-
PhD Defense - Anand Kumar Narayanan
Mon, Jun 16, 2014 @ 01:00 PM - 03:00 PM
Thomas Lord Department of Computer Science
Title: Computation of Class Groups and Residue Class Rings of Function Fields over Finite Fields.
PhD Candidate: Anand Kumar Narayanan
Committee: Ming-Deh Huang (Chair), Leonard Adleman, Sheldon Kamienny (External member)
Time: Monday, Jun 16, 1:00pm-3:00pm
Location: EEB 248
Abstract: We study the computation of the structure of two naturally occurring finite abelian groups associated with function fields over finite fields: the degree zero divisor class group and the multiplicative group of the finite field itself.
Let k be the rational function field over a finite field and let F/k be a finite geometric abelian extension with a rational place that completely splits. We prove that for all primes p neither dividing the characteristic of k nor the degree of F/k, the structure of the p-part of the divisor class group of F is determined by Kolyvagin derivative classes that are constructed out of Euler systems associated with Stark units. Further, we describe an algorithm to compute the structure of the p-part of the divisor class group given a certain Galois module generator of the Stark units. Unlike index calculus methods, our algorithm for computing divisor class groups is deterministic. Other applications of our technique include a fast algorithm for computing the divisor class number of narrow ray class extensions.
The multiplicative group of a finite field is cyclic and generators (primitive elements) are abundant. However, finding one efficiently remains an unsolved problem. We describe a deterministic algorithm for finding a generating element of the multiplicative group of the finite field with p^n elements where p is a prime. In time polynomial in p and n, the algorithm either outputs an element that is provably a generator or declares that it has failed in finding one. Under a heuristic assumption, the algorithm does succeed in finding a generator.
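For illustration only, the textbook way to verify a candidate generator is to check that g^((q-1)/p) != 1 for every prime divisor p of q-1; the sketch below applies this to a small prime field. The thesis's algorithm is different: it is deterministic, runs in time polynomial in p and n, and does not rely on this brute-force search or on factoring.

    # Textbook generator check over a small prime field GF(q), for illustration only.
    def prime_divisors(m):
        divs, d = set(), 2
        while d * d <= m:
            while m % d == 0:
                divs.add(d)
                m //= d
            d += 1
        if m > 1:
            divs.add(m)
        return divs

    def is_generator(g, q):
        # g generates GF(q)* iff g^((q-1)/p) != 1 for every prime p dividing q-1.
        return all(pow(g, (q - 1) // p, q) != 1 for p in prime_divisors(q - 1))

    q = 101  # a prime field, kept small for simplicity
    print([g for g in range(2, 20) if is_generator(g, q)])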
In addition, we present a novel algorithm to factor polynomials over finite fields using Carlitz modules from the arithmetic theory of function fields.
Location: Hughes Aircraft Electrical Engineering Center (EEB) - 248
Audiences: Everyone Is Invited
Contact: Lizsl De Leon
-
PhD Defense - Nupul Kukreja
Mon, Jun 23, 2014 @ 01:00 PM - 03:00 PM
Thomas Lord Department of Computer Science
PhD Candidate: Nupul Kukreja
Title: Social-Networking Based Collaborative Requirements Elicitation, Negotiation and Prioritization
Date: Monday, June 23rd, 2014
Time: 1 p.m.
Location: GFS 114
Committee:
Barry Boehm (Chair)
William Halfond
Ann Majchrzak (outside member)
Abstract:
Finding methods of system definition and evolution that are friendlier to non-technical users, and thereby avoiding a major source of system and software project failures, has been a significant challenge. Another challenging problem, an outcome of the system definition process, is selecting the system and software requirements to implement in a particular product or release. Business stakeholders strive to maximize return on investment by selecting the most valuable requirements for implementation. Deciding which requirements to select entails a great deal of communication and coordination among the stakeholders to ascertain the priorities of the individual requirements.
With the advent of social networking and the popularity of Facebook and Gmail, we have developed a radically different way to carry out collaborative requirements management, negotiation, and prioritization based on the WinWin negotiation model. The new avatar of the WinWin framework, called 'Winbook', is built on the social-networking paradigm (similar to Facebook), organizes content using color-coded labels (similar to Gmail), and provides an Excel-like ability to prioritize requirements collaboratively.
The prioritized requirements aid in planning and sequencing the implementation activities associated with the software system and provide the basis for a prioritized backlog from which requirements can be 'pulled' for development. Changing business priorities may require a complete reprioritization of the backlog, leading to wasted effort. Individual change requests and new requirements need to be prioritized and inserted at the correct location in the backlog, which incurs high communication overhead.
In my thesis I present a social-networking-inspired negotiation and two-step prioritization approach, using a decision-theoretic model, to negotiate and prioritize system and software requirements. I show how social networking helps increase participation in online negotiation sessions and how a rigorous prioritization model can help channel the output of the negotiation and drive progress toward the implementation of the requirements, all in a value-centric manner.
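One simple decision-theoretic scoring scheme is sketched below purely as an illustration; the stakeholders, weights, ratings, and costs are hypothetical, and the thesis's actual two-step prioritization model is not reproduced here.

    # Toy prioritization: weight each stakeholder's value rating, divide by cost,
    # and rank requirements to form a backlog. All data below is hypothetical.
    stakeholder_weights = {"product_owner": 0.5, "dev_lead": 0.3, "qa_lead": 0.2}

    ratings = {            # requirement -> stakeholder -> value rating (1-9)
        "single_sign_on": {"product_owner": 9, "dev_lead": 6, "qa_lead": 5},
        "audit_logging":  {"product_owner": 5, "dev_lead": 8, "qa_lead": 9},
        "dark_mode":      {"product_owner": 3, "dev_lead": 2, "qa_lead": 2},
    }
    costs = {"single_sign_on": 8.0, "audit_logging": 5.0, "dark_mode": 1.0}

    def prioritize(ratings, costs, weights):
        scored = []
        for req, votes in ratings.items():
            value = sum(weights[s] * v for s, v in votes.items())  # weighted stakeholder value
            scored.append((value / costs[req], req))                # value per unit cost
        return [req for _, req in sorted(scored, reverse=True)]

    print(prioritize(ratings, costs, stakeholder_weights))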
Location: Grace Ford Salvatori Hall Of Letters, Arts & Sciences (GFS) - 114
Audiences: Everyone Is Invited
Contact: Lizsl De Leon