AI Seminar-Fabio Rinaldi:
Fri, Feb 14, 2014 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars
Speaker: Fabio Rinaldi, Senior Researcher, Lecturer, and PI at the University of Zurich, Switzerland
Talk Title: OntoGene & SASEBio: biomedical text mining research at UZH
Series: AISeminar
Abstract: There are vast amounts of knowledge encoded in the scientific literature which could be made more easily accessible and useful to a broader range of users through the application of more effective software tools. Text mining is a new discipline which seeks to provide ways to find, extract and manipulate the knowledge which still remains to a large extent hidden in the literature.
Text mining tools can already provide a very effective way to extract some specific types of information, but are not yet so advanced that their results can be used without human verification by domain experts. Therefore one very promising area of application of text mining technologies is within the process of database curation.
The need to efficiently retrieve key information derived from experimental results and published in the scientific literature is of fundamental importance in biology. To help biologists, and in some cases medical practitioners, find such information efficiently in the enormous quantity of published articles, several public and private institutions fund the construction and maintenance of specialized databases, whose role is to collect specific knowledge items and provide them in an easily accessible format. There are several dozen such databases, each specializing in a particular domain of the life sciences [1].
In this talk I will describe text mining activities conducted by my research group at the University of Zurich (OntoGene: www.ontogene.org). The OntoGene group is supported by the Swiss National Science Foundation (project SASEBio: Semi-Automated Semantic Enrichment of the Biomedical Literature) and by Roche Pharmaceuticals. The SASEBio project focuses in particular on applications of text mining technologies to the process of biomedical database curation.
The OntoGene team has participated in several competitive evaluations of biomedical text mining technologies, obtaining competitive results in all of them. Some of these results will be discussed in the talk. Additionally, I will present ODIN (OntoGene Document Inspector), an interactive tool that allows database curators to leverage the results of the OntoGene text mining system in their curation tasks.
---
[1] Xose M. Fernandez-Suarez, Daniel J. Rigden, and Michael Y. Galperin. The 2014 nucleic acids research database issue and an updated NAR online molecular biology database collection. Nucleic Acids Research, 42(D1):D1-D6, 2014
The OntoGene text mining system is based on a scalable entity recognition component with a semi-automated organism-based disambiguation module, an in-house dependency parser, and a flexible relation mining approach. The OntoGene team has participated in several biomedical text mining challenges (BioCreative, BioNLP, CALBC), obtaining competitive results in all of them. Some of these results will be discussed in the talk.
The OntoGene Document Inspector (ODIN) is an interactive tool that allows database curators to leverage the results of the OntoGene text mining system in their curation tasks. One recent version of the system has been tested in the curation process of the Pharmacogenomics Knowledge Base (PharmGKB), and another version was adapted for the Comparative Toxicogenomics Database in the context of a BioCreative challenge.
Biography: Fabio Rinaldi is the leader of the OntoGene research group at the University of Zurich and the principal investigator of the SASEBio project. He holds an MSc in Computer Science (University of Udine, Italy) and a PhD in Computational Linguistics (University of Zurich, Switzerland). He is the author of more than 100 scientific publications (including 19 journal papers) dealing with topics such as Ontologies, Text Mining, Text Classification, Document and Knowledge Management, Language Resources and Terminology.
Host: David Chiang
Webcast: http://webcasterms1.isi.edu/mediasite/Viewer/?peid=7bf4d5a5d7404d249254a2b96006ea6e1d
Location: Information Sciences Institute (ISI) - 11th fl Large CR
Audiences: Everyone Is Invited