Ewa Deelman, left, and Jose-Luis Ambite will structure data gathered by the new genetic epidemiological study to interface with a query function researchers can use to quickly and consistently test hypotheses.
Ambite and Deelman will provide IT expertise to the Rutgers Center, which will also receive funding from the National Institute of Mental Health. Their combined funding will total some $1.5 million over the study period.
Deelman, a project leader in the ISI Center for Grid Technologies, will lead development of the project's cyberinfrastructure and provide researchers with web-based tools to easily query data managed by the Center. Additionally, she will create software tools to analyze the data accessible through the Center.
The Center will use computational resources at both Rutgers and USC/ISI. Deelman plans to use her previous work developing the NSF-funded Pegasus Workflow Management System to guide the new Center's data ingestion process, the computations necessary for data integration, and any analysis performed on the data.
Pegasus-WMS, developed at ISI in collaboration with the Condor team at the University of Wisconsin-Madison, has already been shown to provide reliable and efficient workflow management for a number of complex applications in domains ranging from astronomy and earthquake science to gravitational-wave physics.
Ambite, a senior research scientist at ISI, will lead the data integration efforts of the Center, aimed at making sure that the data is structured in such a way that responses to queries are consistent, so that apples always wind up being compared to apples. This is not an easy task in a hugely heterogeneous database, he says. "Access to raw data is not enough. Data from different studies must be integrated, so that concepts and values coming from different studies are faithfully represented and harmonized, both semantically and syntactically."
Ambite says that the resulting "integrated view" of the data facilitates new discoveries by allowing investigators to reach and combine data in novel, previously infeasible ways. The Center will leverage the significant progress made in data integration over the last decade and will build upon the expertise of the Information Integration Group at ISI.
More information about the study is available from the National Human Genome Project press announcement.