Rafael Ferreira da Silva (USC ISI) - Task Resource Consumption Prediction for Scientific Applications and Workflows
Mon, Jan 11, 2016 @ 11:00 AM - 12:00 PM
Thomas Lord Department of Computer Science
Conferences, Lectures, & Seminars
Speaker: Rafael Ferreira da Silva, USC ISI
Talk Title: Task Resource Consumption Prediction for Scientific Applications and Workflows
Series: CS Colloquium
Abstract: This lecture satisfies requirements for CSCI 591: Computer Science Research Colloquium
Estimates of task runtime, disk space usage, and memory consumption are commonly used by scheduling and resource provisioning algorithms to support efficient and reliable execution of scientific applications. Such algorithms often assume that accurate estimates are available, but these are difficult to generate in practice. In this work, we first profile real scientific applications and workflows, collecting fine-grained information such as process I/O, runtime, memory usage, and CPU utilization. We then propose a method to automatically characterize task requirements based on these profiles. Our method estimates task runtime, disk space, and peak memory consumption. It looks for correlations between the parameters of a dataset; if no correlation is found, the dataset is divided into smaller subsets using recursive partitioning and conditional inference trees to identify patterns that characterize particular behaviors of the workload. We then propose an estimation process to predict task characteristics of scientific applications based on the collected data. For scientific workflows, we propose an online estimation process based on the MAPE-K loop, in which task executions are monitored and estimates are updated as more information becomes available. Experimental results show that our online estimation process yields much more accurate predictions than an offline approach, in which all task requirements are estimated prior to workflow execution.
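The correlate-then-estimate idea and the online update described in the abstract can be illustrated with a minimal Python sketch. This is not the speaker's implementation: the function names (`pearson`, `estimate`, `online_estimates`), the 0.8 correlation threshold, and the use of input size as the only predictor are all assumptions made for illustration. In particular, when no correlation is found, the sketch simply falls back to the mean of past observations, standing in for the recursive partitioning and conditional inference trees mentioned in the abstract.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def estimate(history, input_size, threshold=0.8):
    """Estimate a task metric (e.g. runtime) from past (input_size, value) pairs.

    Hypothetical rule: if input size and the metric are strongly correlated,
    use a least-squares line; otherwise fall back to the mean of past values
    (a simplification of the partitioning step described in the abstract).
    """
    if not history:
        raise ValueError("no observations available yet")
    xs = [size for size, _ in history]
    ys = [value for _, value in history]
    if len(history) >= 2 and abs(pearson(xs, ys)) >= threshold:
        mx, my = mean(xs), mean(ys)
        slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
                 / sum((x - mx) ** 2 for x in xs))
        return my + slope * (input_size - mx)
    return mean(ys)


def online_estimates(tasks, threshold=0.8):
    """Online loop in the spirit of MAPE-K: predict, observe, update knowledge.

    `tasks` is a sequence of (input_size, actual_value) pairs in execution
    order. Each prediction uses only the tasks that finished before it;
    the first task has no prior knowledge, so its prediction is None.
    """
    knowledge = []      # shared knowledge base of completed task executions
    predictions = []
    for size, actual in tasks:
        predictions.append(estimate(knowledge, size, threshold) if knowledge else None)
        knowledge.append((size, actual))  # monitor the execution and record it
    return predictions
```

For example, `online_estimates([(1, 10.0), (2, 20.0), (3, 30.0)])` makes no prediction for the first task, uses the single prior observation for the second, and switches to the fitted line once two correlated observations are available.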
Biography: Rafael Ferreira da Silva is a Computer Scientist in the Collaborative Computing Group at the USC Information Sciences Institute. He received his PhD in Computer Science from INSA-Lyon, France, in 2013. In 2010, he received his Master's degree in Computer Science from Universidade Federal de Campina Grande, Brazil, and his BS degree in Computer Science from Universidade Federal da Paraiba, in 2007. His research focuses on the execution of scientific workflows on heterogeneous distributed systems such as clouds and grids. See http://www.rafaelsilva.com for further information.
Host: Computer Science Department
Location: Olin Hall of Engineering (OHE) - 136
Audiences: Everyone Is Invited
Contact: Assistant to CS chair