Proceedings of the ACM SIGKDD 2013 Workshop on Outlier Detection and Description

The goal of the workshop on Outlier Detection and Description (ODD) is to address outlier mining as the twofold task of outlier detection and outlier description, that is, the quantitative and qualitative analysis of anomalies in data. These topics are rarely considered in unison, and the literature on them is spread over different research communities. The main goal of ODD is to bridge this gap and provide a venue for knowledge exchange between these research areas, toward a corroborative union of quantitative and qualitative analyses in the study of outlier mining.

For detailed information, please refer to the ODD@KDD13 website.


Schedule


Workshop Schedule at a Glance

Sunday, August 11, 2013

9:00-9:30

Opening remarks

Keynote Speech #1: Outlier Detection in Personalized Medicine
Raymond Ng, Professor of Computer Science, University of British Columbia

9:30-10:00

Session 1:

10:00-10:30

Coffee break

10:30-11:00

Keynote Speech #2: Outlier Ensembles
Charu Aggarwal, Research Scientist, IBM T.J. Watson New York

11:00-12:00

Session 2:

12:00-12:05

Concluding remarks


Invited Speakers


Raymond Ng, Professor of Computer Science, University of British Columbia

Title: Outlier Detection in Personalized Medicine

Abstract: Personalized medicine has been hailed as one of the main directions for medical research in this century. In the first half of the talk, we give an overview of our personalized medicine projects that use gene expression, proteomics, DNA and clinical features. In the second half, we present two applications where outlier detection is valuable for the success of our work. The first focuses on identifying mislabeled patients, and the second deals with quality control of microarrays.

Bio: Dr. Raymond Ng is a professor of Computer Science at the University of British Columbia. His main research area for the past two decades has been data mining, with a specific focus on health informatics and text mining. He has over 150 peer-reviewed publications on data clustering, outlier detection, OLAP processing, health informatics and text mining. He is the recipient of two best paper awards, from the 2001 ACM SIGKDD conference and the 2005 ACM SIGMOD conference. He was one of the program co-chairs of the 2009 International Conference on Data Engineering, and one of the program co-chairs of the 2002 ACM SIGKDD conference. He was also one of the general co-chairs of the 2008 ACM SIGMOD conference. He was an editorial board member of the Very Large Data Bases Journal and the IEEE Transactions on Knowledge and Data Engineering until 2008. For the past decade, Dr. Ng has co-led several large-scale genomic projects funded by Genome Canada, Genome BC and NSERC. The total funding for those projects well exceeded 40 million Canadian dollars. He now holds the Chief Informatics Officer position of the PROOF Centre of Excellence, which focuses on biomarker development for end-stage organ failures.


Charu Aggarwal, Research Scientist, IBM T.J. Watson New York

Title: Outlier Ensembles

Abstract: Ensemble analysis is a widely used meta-algorithm for many data mining problems such as classification and clustering. Numerous ensemble-based algorithms have been proposed in the literature for these problems. Compared to the clustering and classification problems, ensemble analysis has been studied in a limited way in the outlier detection literature. In some cases, ensemble analysis techniques have been implicitly used by many outlier analysis algorithms, but the approach is often buried deep in the algorithm and not formally recognized as a general-purpose meta-algorithm. This is in spite of the fact that this problem is rather important in the context of outlier analysis. This talk discusses the various methods used in the literature for outlier ensembles and the general principles by which such analysis can be made more effective. A discussion is also provided on how outlier ensembles relate to the ensemble techniques commonly used for other data mining problems.

Bio: Charu Aggarwal is a Research Scientist at the IBM T. J. Watson Research Center in Yorktown Heights, New York. He received his B.S. from IIT Kanpur in 1993 and his Ph.D. from Massachusetts Institute of Technology in 1996. His research interest during his Ph.D. years was in combinatorial optimization (network flow algorithms), and his thesis advisor was Professor James B. Orlin. He has since worked in the fields of performance analysis, databases, and data mining. He has published over 200 papers in refereed conferences and journals, and has applied for or been granted over 80 patents. Because of the commercial value of the above-mentioned patents, he has received several invention achievement awards and has thrice been designated a Master Inventor at IBM. He is a recipient of an IBM Corporate Award (2003) for his work on bio-terrorist threat detection in data streams, a recipient of the IBM Outstanding Innovation Award (2008) for his scientific contributions to privacy technology, and a recipient of an IBM Research Division Award (2008) for his scientific contributions to data stream research. He has served on the program committees of most major database/data mining conferences, and served as program vice-chair of the SIAM Conference on Data Mining, 2007, the IEEE ICDM Conference, 2007, the WWW Conference, 2009, and the IEEE ICDM Conference, 2009. He served as an associate editor of the IEEE Transactions on Knowledge and Data Engineering Journal from 2004 to 2008. He is an associate editor of the ACM TKDD Journal, an action editor of the Data Mining and Knowledge Discovery Journal, an associate editor of the ACM SIGKDD Explorations, and an associate editor of the Knowledge and Information Systems Journal. He is a fellow of the IEEE for "contributions to knowledge discovery and data mining techniques", and a life member of the ACM.


Accepted Papers


Enhancing One-class Support Vector Machines for Unsupervised Anomaly Detection
Mennatallah Amer (German University in Cairo, Egypt)
Markus Goldstein (German Research Center for Artificial Intelligence)
Slim Abdennadher (German University in Cairo, Egypt)

Systematic Construction of Anomaly Detection Benchmarks from Real Data
Andrew Emmott (Oregon State University)
Shubhomoy Das (Oregon State University)
Thomas Dietterich (Oregon State University)
Alan Fern (Oregon State University)
Weng-Keen Wong (Oregon State University)

Anomaly Detection on ITS Data via View Association
Junaidillah Fadlil (National Taiwan University of Science and Technology)
Hsing-Kuo Pao (National Taiwan University of Science and Technology)
Yuh-Jye Lee (National Taiwan University of Science and Technology)

On-line relevant anomaly detection in the Twitter stream: An Efficient Bursty Keyword Detection Model
Jheser Guzman (University of Chile)
Barbara Poblete (University of Chile, Yahoo! Labs Santiago)

Distinguishing the Unexplainable from the Merely Unusual: Adding Explanations to Outliers to Discover and Detect Significant Complex Rare Events
Ted Senator (SAIC)
Henry Goldberg (SAIC)
Alex Memory (SAIC)

Latent Outlier Detection and the Low Precision Problem
Fei Wang (University of Sydney)
Sanjay Chawla (University of Sydney and NICTA)
Didi Surian (University of Sydney and NICTA)


Organizers


Organizing Committee

  • Leman Akoglu (Stony Brook University)
  • Emmanuel Müller (Karlsruhe Institute of Technology)
  • Jilles Vreeken (Universiteit Antwerpen)

Program Committee

  • Fabrizio Angiulli, University of Calabria
  • Ira Assent, Aarhus University
  • James Bailey, University of Melbourne
  • Arindam Banerjee, University of Minnesota
  • Albert Bifet, Yahoo! Labs Barcelona
  • Christian Böhm, LMU Munich
  • Rajmonda Caceres, MIT
  • Varun Chandola, Oak Ridge National Lab
  • Polo Chau, Georgia Tech
  • Sanjay Chawla, University of Sydney
  • Tijl De Bie, University of Bristol
  • Christos Faloutsos, Carnegie Mellon University
  • Jing Gao, University at Buffalo
  • Manish Gupta, Microsoft, India
  • Jaakko Hollmén, Aalto University
  • Eamonn Keogh, University of California – Riverside
  • Matthijs van Leeuwen, KU Leuven
  • Daniel B. Neill, Carnegie Mellon University
  • Naren Ramakrishnan, Virginia Tech
  • Spiros Papadimitriou, Rutgers University
  • Koen Smets, University of Antwerp
  • Hanghang Tong, CUNY
  • Ye Wang, The Ohio State University
  • Arthur Zimek, LMU Munich