GraphLab: A Distributed Abstraction for Machine Learning in the Cloud

Date

Monday, January 28, 2013 - 6:30pm - 8:00pm

Venue

eBay Whitman Campus

2065 Hamilton Ave
San Jose, CA

Speaker

Carlos Guestrin

Event Details

Today, machine learning (ML) methods play a central role in industry and science. The growth of the Web and improvements in sensor data collection technology have rapidly increased the magnitude and complexity of the ML tasks we must solve. This growth is driving the need for scalable, parallel ML algorithms that can handle "Big Data." Unfortunately, designing and implementing efficient parallel ML algorithms is challenging. Existing high-level parallel abstractions such as MapReduce and Pregel are insufficiently expressive to achieve the desired performance, while low-level tools such as MPI are difficult to use, leaving ML experts repeatedly solving the same design challenges.

This talk describes the GraphLab framework, which naturally expresses the asynchronous, dynamic graph computations that are key to state-of-the-art ML algorithms. When these algorithms are expressed in our higher-level abstraction, GraphLab effectively addresses many of the underlying parallelism challenges, including data distribution, optimized communication, and guaranteed sequential consistency, a property that is surprisingly important for many ML algorithms. On a variety of large-scale tasks, GraphLab provides 20-100x performance improvements over Hadoop. In recent months, GraphLab has received thousands of downloads and is being actively used by a number of startups, companies, research labs, and universities.

This talk represents joint work with Yucheng Low, Joey Gonzalez, Aapo Kyrola, Jay Gu, and Danny Bickson.

Speaker Bio

Carlos Guestrin is the Amazon Professor of Machine Learning at the Computer Science & Engineering Department of the University of Washington. His previous positions include Associate Professor at Carnegie Mellon University and senior researcher at the Intel Research Lab in Berkeley. He is also the co-founder of Flashgroup, a startup focused on addressing information and social overload on the web. Carlos received his PhD and Master's from Stanford University, and a Mechatronics Engineer degree from the University of Sao Paulo, Brazil. Carlos' work has been recognized by awards at a number of conferences and two journals: KDD 2007 and 2010, IPSN 2005 and 2006, VLDB 2004, NIPS 2003 and 2007, UAI 2005, ICML 2005, AISTATS 2010, JAIR in 2007 and 2012, and JWRPM in 2009. He is also a recipient of the ONR Young Investigator Award, NSF Career Award, Alfred P. Sloan Fellowship, IBM Faculty Fellowship, the Siebel Scholarship, and the Stanford Centennial Teaching Assistant Award. Carlos was named one of the 2008 "Brilliant 10" by Popular Science Magazine, and received the IJCAI Computers and Thought Award and the Presidential Early Career Award for Scientists and Engineers (PECASE). He is a former member of the Information Sciences and Technology (ISAT) advisory group for DARPA.

Attached files

Attachment: Slides from Talk
Size: 20.87 MB

Event page provided by ACM