Mahout 0.8 - New and Improved with Super-fast Clustering (TM)

Date

Monday, February 25, 2013 - 6:30pm - 8:00pm

Venue

eBay Whitman Campus
2065 Hamilton Ave
San Jose, CA

Speaker

Ted Dunning, Chief Application Architect, MapR Technologies

Event Details

The recent Mahout 0.7 release was a housekeeping release that featured code cleanups and the deletion of unused or unmaintained code. The upcoming Mahout 0.8 release, however, is a functional release that will add major new capabilities, including a new k-nearest neighbor (k-nn) modeling framework. At the heart of this framework is a new super-fast clustering algorithm.

I will provide an overview of the recent changes in 0.7 and the planned changes in the upcoming 0.8 release, and then do a deep dive into the new k-nn algorithms and code, with special attention paid to the new clustering code.

Speaker Bio

Ted has held Chief Scientist positions at Veoh Networks, ID Analytics, and MusicMatch (now Yahoo Music). Ted is responsible for building the most advanced identity theft detection system on the planet, as well as one of the largest peer-assisted video distribution systems and ground-breaking music and video recommendation systems. Ted has 15 issued and 15 pending patents and contributes to several Apache open source projects, including Hadoop, ZooKeeper, and HBase™. He is also a committer for Apache Mahout. Ted earned a BS degree in electrical engineering from the University of Colorado, an MS degree in computer science from New Mexico State University, and a Ph.D. in computing science from Sheffield University in the United Kingdom. Ted also bought the drinks at one of the very first Hadoop User Group meetings.

Event page provided by ACM