The Seventh International Workshop on Data Mining for Online Advertising (ADKDD'13) Proceedings

Workshop Description

Online advertising is a key component of the internet ecosystem and is growing rapidly, with constantly evolving business models and practices. Examples include sponsored search, display advertising, rich media ads, interstitial ads, online classified advertising, and e-mail marketing. With the rapid growth of social networks, social network advertising (as exemplified by Facebook and Groupon) is taking off and playing an important role in the overall landscape. In addition, more and more offline ads for both big brands and local businesses are moving online.

The online advertising industry faces many challenges. For example: how to understand end users’ needs and advertisers’ goals; what the right strategy is for connecting users and ads; and which ads should be delivered through which type of advertising. These challenges create great opportunities for researchers and data miners to develop new technologies. A forum where researchers and industry practitioners can exchange the latest research results and build collaborations will therefore be of great service to the data mining community and generate value for the industry.

Six ADKDD workshops have been organized in conjunction with SIGKDD’07/08/09/10/11/12, and they were very well received by both the academic community and the advertising industry. All of the workshops were well attended and were often standing room only. The most recent one, in 2012, ranked No. 2 in attendance among the 20 workshops.

We look forward to your contribution and attendance! See you in Chicago!

Schedule

Workshop Schedule at a Glance

August 11, 2013 Sunday


Opening ceremony

Session 1:


Coffee break


Keynote Speech: Advertising - Why Human Intuition Still Exceeds Our Best Technology
Brian Burdick


Session 2:




Keynote Speech: Machine Learning Challenges in Targeted Advertising: Transfer Learning in Action
Claudia Perlich


Session 3:


Coffee break


Panel discussions



Invited Speakers

Brian Burdick, Burdick Media and Technology Consulting

Title: Advertising - Why Human Intuition Still Exceeds Our Best Technology

Abstract: Advertising has a history of over 150 years in the United States. With all of our technology and personalization, how much more effective is it now? What did creative agencies recognize in the most iconic ads of all time, and why are we as technologists focused on things that are not what matters most for creating advertiser results? How will media convergence affect advertising, and how much better are we at measuring results anyway?

The talk first looks back at the history of media and advertising, including some of the most iconic ads. What made these ads iconic, and what was implicitly understood about the target audience and the message? Why can’t the ability to produce this kind of result be automatically replicated? We will also go through the rise and fall of media including print, radio, TV, direct mail, and online media. How has media shaped advertising, and where is it likely to lead? In particular, what are the implications of large amounts of consumer TV viewing shifting to video delivered on a one-to-one basis via the Internet across devices?

For each potential consumer there is a buy funnel going from awareness to consideration and finally to the purchase of products. Social networks, search engines, communication carriers, and behind-the-scenes data aggregators have persistent notions of consumer context in terms of social relationships, content consumption, advertising exposures, purchase history, and detailed physical locality. Human beings at agencies implicitly understand some of these changes in consumer context and use different messages for different contexts. Why isn’t more being done to orchestrate across all of these contexts and media to move sales through the buy funnel, instead of optimizing narrowly within a single medium?

All advertising is an investment made by an advertiser for a future return. Today there are many systems that use response modeling to target consumers and heuristics to measure online conversions. These are narrow systems, primarily focused within a single medium, and they often attribute the same sale multiple times without measuring what would have happened without the ads. To truly measure results, cross-medium lift measurement must supplant response modeling; only then can we get the most important metric in advertising: the net present value of advertiser sales.
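As a rough illustration of the contrast drawn in the abstract, lift measurement compares people who could see the ads against a holdout group that could not, rather than crediting every observed conversion to the ads. The sketch below is not taken from the talk: the randomized holdout design, the `Group` class, the `incremental_lift` function, and all numbers are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Counts for one arm of a hypothetical randomized holdout test."""
    users: int
    conversions: int

    @property
    def rate(self) -> float:
        # Conversion rate of this arm.
        return self.conversions / self.users


def incremental_lift(exposed: Group, holdout: Group) -> dict:
    """Compare the exposed arm against the holdout (no-ads) baseline."""
    absolute = exposed.rate - holdout.rate
    # Relative lift is undefined when the baseline rate is zero.
    relative = absolute / holdout.rate if holdout.rate > 0 else float("inf")
    # Conversions in the exposed arm beyond what the baseline rate predicts.
    incremental = absolute * exposed.users
    return {
        "absolute": absolute,
        "relative": relative,
        "incremental_conversions": incremental,
    }


# Illustrative numbers only; they do not come from the talk.
result = incremental_lift(
    Group(users=100_000, conversions=1_200),
    Group(users=100_000, conversions=1_000),
)
print(result)
```

With these made-up counts, only the conversions above the holdout baseline count as incremental, whereas a purely attribution-based view would credit all 1,200 conversions in the exposed group to the ads.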

Bio: Brian Burdick is the Principal at Burdick Media and Technology Consulting. He has a fifteen-year history in advertising technology across email, paid search, display advertising, and targeted television. Burdick is the inventor of over 35 pending or granted patents and has served on the advisory boards of Comscore, Technorati, and OVP Ventures. He was one of the original founders of Microsoft AdCenter (Bing Ads), where he was responsible for the implementation of the auction design, ranking systems, and matching systems. After that, he was the CTO of AdECN, the first real-time bidded online display exchange. Burdick has also worked as head of product, technology, and analytics for an online ad network and a targeted television-advertising firm. He holds a degree in MIS from the University of Texas at Austin.

Claudia Perlich, Media6Degrees

Title: Machine Learning Challenges in Targeted Advertising: Transfer Learning in Action

Abstract: The most interesting paradox in predictive modeling is that in situations where a predictive model would be most useful, it is often the hardest to obtain adequate training data to build it. In no application domain is this more true than display advertising. To mention just two of the prevalent challenges: optimization for a campaign should preferably happen prior to the start of the campaign, at which point nobody has seen the ad yet. To make things even harder, the outcomes towards which we are optimizing (e.g., purchases) are exceedingly rare, if they are observable at all to the entity charged with running the campaign.

So while advertising has become a great playing field for machine learning applications, almost none of the problems fit the conventional IID training scenario. Acquiring sufficient training data from the ideal sampling distribution is almost always prohibitively expensive or simply infeasible. Instead, we have to make do with data drawn from surrogate distributions and surrogate learning tasks (labels), and then transfer what is learned to the actual problem we are trying to optimize. To make matters worse, we need to handle highly heterogeneous sets of features, some with massive dimensionality.

In this talk we will highlight three instances of transfer learning deployed in the M6D targeting engine:

1) Learning from proxy labels, where the model is estimated on surrogate events instead of using the rare purchase event as the outcome.
2) Learning from alternative sampling distributions to increase the number of positives in the data.
3) Reducing the dimensionality of the feature space by clustering features in the parameter space of models across multiple campaigns.
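The abstract does not say how a model trained on an alternative sampling distribution is mapped back to the real population. As a generic illustration only, and not a description of the M6D engine, one common setup keeps every positive but only a fraction of the negatives during training, then rescales the predicted probabilities afterwards. The function name `correct_for_negative_downsampling` and the `keep_rate` parameter are hypothetical, and the numbers are made up.

```python
def correct_for_negative_downsampling(p_sampled: float, keep_rate: float) -> float:
    """Map a probability predicted on downsampled data back to the full population.

    Assumes all positives were kept and each negative was kept independently
    with probability `keep_rate` (a hypothetical parameter for this sketch).
    """
    if not 0.0 < keep_rate <= 1.0:
        raise ValueError("keep_rate must be in (0, 1]")
    if not 0.0 <= p_sampled <= 1.0:
        raise ValueError("p_sampled must be in [0, 1]")
    # Odds on the sampled data are inflated by a factor of 1 / keep_rate,
    # so the negative mass is scaled back up before renormalizing.
    return p_sampled / (p_sampled + (1.0 - p_sampled) / keep_rate)


# Illustrative round trip: a true rate of 1% with 10% of negatives kept.
true_rate = 0.01
keep_rate = 0.1
rate_on_sample = true_rate / (true_rate + (1.0 - true_rate) * keep_rate)
recovered = correct_for_negative_downsampling(rate_on_sample, keep_rate)
print(rate_on_sample, recovered)
```

Downsampling negatives raises the share of positives the learner sees, and the correction step keeps the final scores interpretable as probabilities on the original population.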

The system needs to be robust enough to learn models for hundreds of different campaigns for a wide variety of different brands and products automatically. We present production results across a variety of advertising clients from a variety of industries, illustrating the performance of the system in use.

Bio: Claudia Perlich serves as Chief Scientist at m6d, where she designs, develops, analyzes, and optimizes the machine learning that drives digital advertising to prospective customers of brands. An active industry speaker and frequent contributor to industry publications, Claudia enjoys serving as a guide in the world of data. She was recently named winner of the Advertising Research Foundation’s (ARF) Grand Innovation Award and was selected as a member of the Crain’s NY annual 40 Under 40 list. She has published numerous scientific articles, holds multiple patents in machine learning, and has won many data mining competitions. Prior to joining m6d in February 2010, Claudia worked in Data Analytics Research at IBM’s Watson Research Center, concentrating on data analytics and machine learning for complex real-world domains and applications. Claudia has a PhD in Information Systems from NYU and an MA in Computer Science from Colorado University. She takes an active interest in training the next generation of data scientists and teaches “Data Mining for Business Intelligence” in the NYU Stern MBA program.

Panel Discussions

Panel Discussion: Status Discussions for Online Advertising Research Problems and Collaboration Enabling
Moderator: Ying Li, Concurix Corporation

Description: Every year at ADKDD, the audience chooses the topics to be discussed. We talk about where online advertising is, what we have accomplished so far, what else needs to be done, the challenges that lie ahead, and promising new research directions. The panel also aims to increase collaboration amongst different research groups, both from industry and academia.

Table of Contents

Full Papers

Real Time Bid Optimization with Smooth Budget Delivery in Online Advertising
Kuang-Chih Lee (Turn Inc.)
Ali Jalali (Turn Inc.)
Ali Dasdan (Turn Inc.)

TSum: Fast, Principled Table Summarization
Jieying Chen (Google Inc.)
Jia-Yu Pan (Google Inc.)
Spiros Papadimitriou (Rutgers University)
Christos Faloutsos (Carnegie Mellon University)

Real-time Bidding for Online Advertising: Measurement and Analysis
Shuai Yuan (University College London)
Jun Wang (University College London)
Xiaoxue Zhao (University College London)

CTR Prediction for Contextual Advertising: Learning-to-Rank Approach
Yukihiro Tagami (Yahoo Japan Corporation)
Shingo Ono (Yahoo Japan Corporation)
Koji Yamamoto (Yahoo Japan Corporation)
Koji Tsukamoto (Yahoo Japan Corporation)
Akira Tajima (Yahoo Japan Corporation)

Audience Segment Expansion Using Distributed In-Database K-Means Clustering
Archana Ramesh (nPario Inc.)
Ankur Teredesai (University of Washington)
Ashish Bindra (nPario Inc.)
Sreenivasulu Pokuri (nPario Inc.)
Krishna Uppala (nPario Inc.)

Organizers

Organizing Committee

  • Esin Saka, Microsoft Corporation
  • Dou Shen, Baidu Inc.
  • Bin Gao, Microsoft Research Asia
  • Jun Yan, Microsoft Research Asia
  • Ying Li, Concurix Corporation

Sponsors