Proceedings of the 12th International Workshop on Data Mining in Bioinformatics (BIOKDD'13)


Overview

Bioinformatics is the science of managing, mining, and interpreting information from biological data. Various genome projects have contributed to an exponential growth in DNA and protein sequence databases. Rapid advances in high-throughput technologies, such as microarrays, mass spectrometry and new/next-generation sequencing, make it possible to quantitatively monitor the presence or activity of thousands of genes, RNAs, proteins, metabolites, and compounds in a given biological state. The ongoing influx of these data, the pressing need to address complex biomedical challenges, and the gap between the two have collectively created exciting opportunities for data mining researchers.

 

While tremendous progress has been made over the years, many of the fundamental problems in bioinformatics, such as protein structure prediction, gene-environment interaction, and regulatory network mapping, have not been convincingly addressed. In addition, new technologies such as next-generation sequencing are producing massive amounts of sequence data; managing, mining and compressing these data raises challenging issues. Finally, there is a pressing need to use these data and computational techniques to build network models of complex biological processes and disease phenotypes. Data mining will play an essential role in addressing these fundamental problems and in developing novel therapeutic and diagnostic solutions in the post-genomics era of medicine.

 

The goal of this workshop is to encourage KDD researchers to take on the numerous challenges that bioinformatics offers. This year, the workshop will feature the theme "Building network and predictive models of biological processes and diseases using complex data". This field combines computational approaches, especially from data mining and machine learning, with the large amount and variety of biological data being generated. The goal is to build accurate predictive or descriptive network models of biological processes and diseases. These approaches have revolutionized modern biology by enabling novel discoveries in basic biology and in diseases like cancer and diabetes, as well as the development of therapeutics.



Schedule


Workshop Schedule at a Glance

August 11, 2013 Sunday

9:00-9:10

Opening remarks

9:10-10:25

10:25-11:00

Coffee break

11:00-Noon   

Keynote Speech: Building predictive models of disease
Eric Schadt, Icahn School of Medicine at Mount Sinai

Noon-1:30

Lunch

1:30-3:30

3:30-4:00

Coffee break

4:00-5:15


Keynote Speaker


Eric Schadt, Icahn School of Medicine at Mount Sinai, New York

Title: Building predictive models of disease
Abstract:
The causal chain of events that leads to the development of complex diseases such as schizophrenia remains elusive. Such diseases result from the interplay of potentially hundreds (or thousands) of genetic loci and environmental factors. Genetic and environmental perturbations induce changes in the molecular interactions of cellular pathways whose collective effect may become clear through the organized structure of multiscale biological networks. We have developed a novel systems approach to study psychiatric disorders such as schizophrenia that models the global molecular, functional, and structural changes in the affected brain that in turn can lead us to the root causes of the disease. To characterize the molecular, cellular, and physiological systems associated with common human diseases, we constructed gene regulatory networks, functional and structural MRI-based networks, and high-content phenotypic networks, and then integrated these network models across all of the data modalities generated across multiple human cohorts comprising several thousand individuals. Because DNA variation was systematically assessed across all cohorts, it provides a common set of perturbations that can be leveraged not only to infer causal relationships among different molecular and higher-order traits, but also to help link networks at different scales (e.g., molecular and imaging) across cohorts. Through this integrative network-based approach, we rank-order the resulting network structures for relevance to different diseases, highlighting both known and novel biological pathways involved in disease pathogenesis and progression. We demonstrate that the causal network structures we construct from this big data integration exercise are useful predictors of response to gene perturbations and present a novel framework for testing models of the mechanisms underlying disease. We further demonstrate that our approach can offer novel insights for drug discovery programs aimed at treating disease by screening our disease-associated networks against molecular signatures induced by marketed and novel compounds across a number of cell-based systems, including those derived from stem cells isolated from patients with disease.

Bio: Eric Schadt, PhD, is Director of the Institute for Genomics and Multiscale Biology, Chair of the Department of Genetics and Genomic Sciences, and the Jean C. and James W. Crystal Professor of Genomics.
 
Dr. Schadt is an expert on the generation and integration of very large-scale sequence variation, molecular profiling and clinical data in disease populations for constructing molecular networks that define disease states and link molecular biology to physiology. His research has provided novel insights into what is needed to master diverse, large-scale data collected on normal and disease populations in order to elucidate the complexity of disease and make more informed decisions in the drug discovery arena.  He has contributed to a number of discoveries relating to the genetic basis of common human diseases such as diabetes and obesity, which have been widely published in leading scientific journals. 
 
Dr. Schadt is also a founding member of Sage Bionetworks, an open-access genomics initiative designed to build and support databases and an accessible platform for creating innovative dynamic disease models. Prior to joining Pacific Biosciences in 2009, he was Executive Scientific Director of Genetics at Rosetta Inpharmatics, a subsidiary of Merck & Co., Inc. in Seattle, and before Rosetta, Dr. Schadt was a Senior Research Scientist at Roche Bioscience. He received his B.A. in applied mathematics and computer science from California Polytechnic State University, and his M.A. in pure mathematics and his Ph.D. in bio-mathematics from the University of California, Los Angeles.



Invited Talks


Invited Talk 1: State-of-the-art in protein function prediction
Speaker: Predrag Radivojac, Indiana University

Abstract: In this talk I will first describe the significance of protein function prediction and its formulation as a computational problem. I will then present details of the first Critical Assessment of Functional Annotation (CAFA) experiment, in which we evaluated the state of the art in the field. We provided evidence that modern methods significantly outperform simple BLAST alignments but that there is significant need and room for improvement. I will lay out possible avenues, proposed by my research group, for improving function prediction and its accuracy assessment. Finally, I will briefly discuss the CAFA 2013-2014 challenge, whose start is anticipated for Summer 2013.


Invited Talk 2: Systems Biology of Cellular Aging and Age-Related Degeneracies 
Speaker: Ananth Grama, Purdue University

Abstract: Cellular aging is a complex, multi-factorial phenotype, characterized by the accumulation of damaged cellular components over the organism's life-span. The progression of aging depends on both the increasing rate of damage to DNA, RNA, proteins, and cellular organelles, and the gradual decline of the cellular defense mechanisms against stress. This can ultimately lead to a dysfunctional cell, with a higher risk for a number of diseases, including cancers, cardiovascular disease, and multiple neurodegenerative disorders. With a view to uncovering the pathways associated with aging, and their role in age-related degeneracies, we have developed a number of algorithms and statistical models that integrate and analyze disparate data over human and yeast interactomes. In this talk, we present two recent results: (i) we demonstrate the use of directed random walks in uncovering the downstream effectors of Target of Rapamycin (TOR), a highly conserved protein kinase that plays a key role in the aging process of various organisms; and (ii) we build tissue-specific networks for human cells and develop a complete framework for projecting these tissue-specific networks onto the yeast interactome. The goals of this effort are manifold: strong alignments indicate tissues for which yeast is a good model organism (in terms of underlying biochemistry), alignments reveal specific pathways that are well conserved, and they serve as a first step in understanding the etiology of age-related degeneracies.



Table of Contents




Organizers


Organizing Committee
   
General Chairs

  • Mohammed Zaki (Rensselaer Polytechnic Institute)
  • Jake Chen (Indiana University-Purdue University at Indianapolis)

Program Chairs

Program Committee


  • William S Noble (University of Washington)
  • Ambuj Singh (University of California, Santa Barbara)
  • Jinbo Xu (Toyota Technological Institute at Chicago)
  • Andrea Tagarelli (University of Calabria, Italy)
  • Asa Ben-Hur (Colorado State University)
  • Bojan Losic (Icahn School of Medicine at Mount Sinai)
  • Chad Myers (University of Minnesota)
  • Chandan K. Reddy (Wayne State University)
  • T.M. Murali (Virginia Tech)
  • Francis Chin (University of Hong Kong)
  • Gang Fang (Icahn School of Medicine at Mount Sinai)
  • Jieping Ye (Arizona State University)
  • Mohammad Al Hasan (IUPUI)
  • Jun (Luke) Huan (University of Kansas)
  • Jinze Liu (University of Kentucky)
  • Tae Hyun Hwang (University of Texas Southwestern Medical Center)
  • Vipin Kumar (University of Minnesota)
  • Mehmet Koyuturk (Case Western Reserve University)
  • Minghua Deng (Peking University, China)
  • Jie Zheng (Nanyang Technological University, Singapore)
  • Naren Ramakrishnan (Virginia Tech)
  • Rui Chang (Icahn School of Medicine at Mount Sinai)
  • Rui Kuang (University of Minnesota)
  • Saeed Salem (North Dakota State University)
  • Tamer Kahveci (University of Florida)
  • Xia Ning (NEC Labs)
  • Xiaohua (Tony) Hu (Drexel University)
  • Sanghamitra Bandyopadhyay (Indian Statistical Institute, Kolkata, India)
  • Ying Ding (Indiana University)
  • Predrag Radivojac (Indiana University)
  • Min Song (New Jersey Institute of Technology)
  • Stefan Kramer (Johannes Gutenberg University Mainz)
  • Vladimir Pavlovic (North Dakota State University)
  • Isidore Rigoutsos (Jefferson University)


Sponsors