Power of Declarative Languages: From Information Extraction to Machine Learning
Monday, September 24, 2012 – 6:30pm – 9:00pm
2025 Stierlin Ct.
Mountain View, CA 94043
Speaker: Shivakumar ‘Shiv’ Vaithyanathan, IBM Research – Almaden
Modern enterprises are performing complex analyses on increasingly large datasets to drive business decisions. Tasks such as root cause analysis from system logs and social media analytics for customer retention, new customer acquisition, and digital marketing are rapidly gaining importance. These tasks consist of three major analytic phases: text analytics and entity resolution, structured data processing (joins, group-by, aggregation), and predictive modeling. Traditionally, these phases have been handled by a combination of custom code and separate systems such as ETL engines, relational databases, and statistical packages. However, the size of the datasets involved in these modern applications makes it prohibitively expensive to shuttle data across different specialized systems for analysis. This has resulted in the need for a single infrastructure that is sufficiently flexible to handle all of these workloads. At IBM we are building tools and technologies to support each of these analytic phases: SystemT, HIL, and SystemML are declarative languages for text analytics, entity resolution, and predictive modeling, respectively. While the declarative nature of these languages removes the need for manual optimization by the programmer, the syntax of each language is designed to appeal to its corresponding community. For instance, SystemML exposes a high-level language with a syntax similar to R, a very popular statistical processing language. Each of these languages compiles down to a common runtime infrastructure. With examples from varied domains such as marketing, finance, retail, and media, I will describe how these technologies are enabling modern enterprises to deal with big data challenges.
Shivakumar Vaithyanathan is the IBM Chief Scientist for Text Analytics and the Department Manager of the Intelligent Information Systems Group at the IBM Almaden Research Center. Since joining IBM in 1998, he has been involved in multiple research areas, including the development of learning algorithms, especially for extremely high-dimensional sparse data. His department is currently building systems for scalable unstructured analytics, enterprise search, and large-scale machine learning and statistical modeling. Multiple technologies developed in his department currently ship with several IBM products, including IBM’s Big Data products. Prior to IBM, Shivakumar was part of the newly formed AltaVista Group at Digital. Shivakumar was an invited keynote speaker at the 2011 German Database Conference and the 2011 ACM SIGIR Industrial Track. He is also an Associate Editor of the Journal of Statistical Analysis and Data Mining.
Event page provided by ACM