KDD Cup Competition over Real World Data Sets from Chinese Social Network Leader Tencent Fuels Data Mining Innovation

SIGKDD Annual Big Data Challenge Gives Scientists Unprecedented Access to Complex Big Data Sets Bridging the Gap between Science and Business

KDD Cup winners will be announced at KDD 2012, in Beijing, China, August 12-16, 2012

The Association for Computing Machinery Special Interest Group on Knowledge Discovery and Data Mining (ACM SIGKDD) announced today that it has attracted a record number of teams for its KDD Cup competition, one of the most prestigious data science competitions in the world. Presenting data scientists with a real world data set and two different problems, the annual competition attracted participation from more than 900 experts. This year’s competition is sponsored by Chinese Internet giant Tencent and features the largest and most complex data set ever made available in a KDD Cup contest. Winners of the KDD Cup will be announced at KDD 2012, the premier international conference on data science, Big Data and data mining, taking place in Beijing, China, August 12-16, 2012.

Using complex, multi-category data from Tencent’s 600 million users, KDD Cup 2012 offers scientists two tracks, each designed to address a real world situation in which data mining and knowledge discovery can improve social network consumer engagement and advertising effectiveness. In Track 1, scientists are tasked with predicting whether or not a user will engage with a microblog item (a user, organization, or group) that has been recommended to them, using more than 300 million recommendation records from two million anonymous user IDs with over 50 million social network connections. Track 2 challenges data scientists to predict the effectiveness of search advertising and improve ad relevance (as measured by click-through rate) based on data from 267 million impressions, 23 million users and 670,000 ads.
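For readers unfamiliar with the Track 2 metric, click-through rate is simply the number of clicks divided by the number of impressions. The short Python sketch below illustrates that arithmetic only. The record layout, the `click_through_rates` function name and the sample values are hypothetical and are not taken from the competition data or its official evaluation procedure.

```python
from collections import defaultdict


def click_through_rates(records):
    """Aggregate (ad_id, clicks, impressions) rows into a per-ad CTR.

    The row layout is an assumption made for illustration; it is not
    the format of the KDD Cup 2012 data files.
    """
    clicks = defaultdict(int)
    impressions = defaultdict(int)
    for ad_id, n_clicks, n_impressions in records:
        clicks[ad_id] += n_clicks
        impressions[ad_id] += n_impressions
    # Skip ads with zero impressions to avoid dividing by zero.
    return {
        ad_id: clicks[ad_id] / impressions[ad_id]
        for ad_id in impressions
        if impressions[ad_id] > 0
    }


# Made-up sample rows: (ad_id, clicks, impressions).
sample = [("ad_a", 3, 100), ("ad_b", 0, 50), ("ad_a", 1, 100)]
rates = click_through_rates(sample)
print(rates)
```

In this made-up sample, `ad_a` receives 4 clicks over 200 impressions, so a model predicting its click-through rate would aim for a value near that ratio.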

“The KDD Cup is a unique opportunity for scientists and real world businesses to join forces and deliver the next wave of innovation,” said Gordon Sun (Ph.D.), Chair of the KDD Cup Organizing Committee and Chief Scientist of Tencent. “One of the biggest challenges for the research community is the lack of real world large data sets that allow them to develop and test new algorithms and solutions. By giving the KDD community access to Tencent’s large amount of complex data, we not only hope to improve the experience for our customers and advertisers, but we bridge the gap between the fields of research and business.”

Since its inception 16 years ago, the KDD Cup has provided data scientists with challenges and data sets from different industries, facilitating data science advancement on hot issues such as early detection of breast cancer from X-ray images, predicting student performance, developing accurate retail marketing and recommendations, identifying pulmonary embolisms from 3D image data and more. It is one of the few data competitions that give data scientists the opportunity to work with real world Big Data sets that present them with tangible, complex challenges. In addition, since participants have access to the same data sets, results and algorithms can easily be tested and evaluated, creating a model for collaboration between industry and research institutions and building a scientific foundation for future research in a responsible and privacy-sensitive manner.

“Every year, the KDD Cup opens a new treasure chest for data scientists, giving them the opportunity to explore new applications of algorithms for Big Data, providing not just a technical challenge but an opportunity to learn and innovate in Data Mining,” said Usama Fayyad, SIGKDD Executive Committee Chairman and chairman & CTO of ChoozOn Corp. “In today’s data-rich world, virtually every aspect of our lives can be improved by data science. The opportunity to work with Tencent’s unique complex data sets and develop new solutions that the entire scientific and business communities can take advantage of is unprecedented and will move us all forward in our quest for better data science and practical Big Data solutions.”

With over 2000 members from leading research institutions, universities and business organizations in more than 80 countries, the ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD) is the premier forum for advancement and adoption of KDD, data science, and data mining over Big Data. SIGKDD’s mission is to provide the growing community of big data and analytics experts with tools and resources to promote the value of knowledge discovery and data mining in today’s data-centric economy. The current SIGKDD Executive Council Chairman is leading data mining expert Usama Fayyad, Ph.D. For additional information, please visit http://www.kdd.org or follow us on Twitter (@kdd_news).

About ACM
ACM, the Association for Computing Machinery http://www.acm.org, is the world’s largest educational and scientific computing society, uniting computing educators, researchers and professionals to inspire dialogue, share resources and address the field’s challenges. ACM strengthens the computing profession’s collective voice through strong leadership, promotion of the highest standards, and recognition of technical excellence. ACM supports the professional growth of its members by providing opportunities for life-long learning, career development, and professional networking.

Contact Author

Emilia Palaveeva

(206) 890-8973