
ISI To Play $1.5 Million Role in NIH Genetic Epidemiology Initiative

IT specialists will structure data from four research centers to allow consistent queries

July 17, 2008 - Jose-Luis Ambite and Ewa Deelman, both research faculty members in the Viterbi School's Department of Computer Science working at the School's Information Sciences Institute, will be part of a four-year, $31 million initiative aimed at understanding how genetic variation influences the risk of diabetes, heart disease, cancer and other common diseases.
 
Ewa Deelman, left, and Jose-Luis Ambite will structure data gathered by the new genetic epidemiological study to interface with a query function researchers can use to quickly and consistently test hypotheses.

The National Human Genome Research Institute, one of the National Institutes of Health, announced the study July 17. Study centers in Hawaii, North Carolina, Tennessee and Washington State will carry out the research, with their work coordinated and facilitated by a separate Center at Rutgers University in New Jersey.

Ambite and Deelman will provide IT expertise to the Rutgers Center, which will also receive funding from the National Institute of Mental Health. Their combined funding will total some $1.5 million over the study period.

Deelman, a project leader in the ISI Center for Grid Technologies, will lead the development of the project's cyberinfrastructure and provide researchers with web-based tools to easily query data managed by the Center. She will also create software tools to analyze the data accessible through the Center.

The Center will use computational resources at both Rutgers and USC / ISI. Deelman plans to draw on her previous work developing the NSF-funded Pegasus Workflow Management System to guide the new Center's data ingestion process, the computations necessary for data integration, and any analysis performed on the data.

Pegasus-WMS, developed at ISI in collaboration with the Condor team at the University of Wisconsin, Madison, has already been shown to provide reliable and efficient workflow management for a number of complex applications in domains ranging from astronomy and earthquake science to gravitational-wave physics.

Ambite, a senior research scientist at ISI, will lead the Center's data integration efforts, which aim to structure the data so that responses to queries are consistent and apples always wind up being compared to apples. This is not an easy task in a hugely heterogeneous database, he says. "Access to raw data is not enough. Data from different studies must be integrated, so that concepts and values coming from different studies are faithfully represented and harmonized, both semantically and syntactically."

Ambite says that the resulting "integrated view" of the data facilitates new discoveries by allowing investigators to reach and combine data in novel, previously infeasible ways. The Center will leverage the significant progress made in data integration over the last decade and will build upon the expertise of the Information Integration Group at ISI.

More information about the study is available from the National Human Genome Project press announcement.