DMSIG Meeting – Growing Trees in Clouds with PLANET. On August 12, 2009.

Posted August 12th, 2009 by SteveLazarus and filed in DM SIG Meeting

Presented by: Josh Herbach
Date: Wednesday, 12 August 2009, 6:30 PM

    Location:
    NASA Exploration Center
    NASA Ames Research Center
    Moffett Field, CA

Cost: Free and open to all who wish to attend; membership is only $20/year.

Topic

Classification and regression tree learning on massive datasets is a common data mining task at Google, yet many state-of-the-art tree learning algorithms require training data to reside in memory on a single machine. While more scalable implementations of tree learning have been proposed, they typically require specialized parallel computing architectures. In contrast, the majority of Google’s computing infrastructure is based on commodity hardware.

In this presentation, we describe PLANET: a scalable distributed framework for learning tree models over large datasets. PLANET defines tree learning as a series of distributed computations, and implements each one using the MapReduce model of distributed computation. We show how this framework supports scalable construction of classification and regression trees, as well as ensembles of such models. We discuss the benefits and challenges of using a MapReduce compute cluster for tree learning, and demonstrate the scalability of this approach by applying it to a real-world learning task from the domain of computational advertising.
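As a rough illustration of how one tree-learning step can be phrased as a MapReduce computation, the sketch below picks a split threshold for a single numeric feature of a regression tree. It is not PLANET's actual implementation: the function names (map_shard, reduce_stats, best_split), the in-process "cluster", and the variance-reduction criterion are all assumptions made for this example. Each mapper sees only its own shard and emits small sufficient statistics, so no machine needs the full dataset in memory.

```python
from collections import defaultdict

def map_shard(shard, thresholds):
    """Hypothetical mapper: for each (x, y) record, emit partial statistics
    (count, sum of y, sum of y^2) for the whole node and for the left side
    of every candidate threshold the record falls under."""
    out = []
    for x, y in shard:
        out.append(("total", (1, y, y * y)))
        for t in thresholds:
            if x < t:
                out.append((t, (1, y, y * y)))
    return out

def reduce_stats(pairs):
    """Hypothetical reducer: sum the partial statistics for each key."""
    agg = defaultdict(lambda: [0, 0.0, 0.0])
    for key, (n, s, s2) in pairs:
        acc = agg[key]
        acc[0] += n
        acc[1] += s
        acc[2] += s2
    return {key: tuple(acc) for key, acc in agg.items()}

def sse(n, s, s2):
    """Sum of squared errors around the mean, from sufficient statistics."""
    return s2 - s * s / n if n else 0.0

def best_split(shards, thresholds):
    """Controller: run the map and reduce phases (simulated in-process here),
    then return the (threshold, gain) pair with the largest reduction in
    squared error, or None if no threshold separates the data."""
    pairs = [p for shard in shards for p in map_shard(shard, thresholds)]
    stats = reduce_stats(pairs)
    n, s, s2 = stats["total"]
    best = None
    for t in thresholds:
        ln, ls, ls2 = stats.get(t, (0, 0.0, 0.0))
        rn, rs, rs2 = n - ln, s - ls, s2 - ls2
        if ln == 0 or rn == 0:
            continue  # a split must leave records on both sides
        gain = sse(n, s, s2) - sse(ln, ls, ls2) - sse(rn, rs, rs2)
        if best is None or gain > best[1]:
            best = (t, gain)
    return best

# Two shards stand in for data held on two different machines.
shards = [[(1.0, 1.0), (2.0, 1.0)], [(8.0, 5.0), (9.0, 5.0)]]
print(best_split(shards, [1.5, 5.0, 8.5]))
```

In a real cluster the map and reduce phases would run on separate machines, and a controller would repeat this step for each node that still needs expanding.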

About the Speaker
Josh Herbach is an engineer at Google, where he works on ads quality. Prior to joining Google in June 2008, he received his bachelor's degree in computer science from Princeton University, where he did research in clustering evaluation, electronic voting systems, and autonomous vehicles. When he isn’t busy making self-driving cars that can hack elections and run k-means, he occasionally spends his time puzzling, backpacking, or hunting for good dim sum restaurants.

2 Responses to “DMSIG Meeting – Growing Trees in Clouds with PLANET. On August 12, 2009.”

  1. GregMakowski says:

    For the corresponding paper, see
    http://www.bayardo.org/ps/vldb2009.pdf

  2. GregMakowski says:

    For the video for this talk, see http://fora.tv/2009/08/12/Josh_Herbach_PLANET_MapReduce_and_Tree_Learning

    The SF Bay ACM would like to thank our video sponsor for this talk, Odyssey Capital Management
    http://www.ody.com/
