A Smarter Process for Sensing the Information Space, Scott Spangler, IBM Almaden

Posted September 3rd, 2010 by Paul O'Rorke and filed in Announcement, DM SIG Meeting

LOCATION: LinkedIn, 2025 Stierlin Ct, Mountain View, CA 94043

Date: Monday, November 22, 2010; 6:30 – 9:00 pm (6:30 – 7:00 networking & snacks; 7:00 – 7:10 announcements; 7:10+ presentation, Q&A)

Cost: Free and open to all who wish to attend, but membership is only $20/year. Anyone may join our mailing list at no charge, and receive announcements of upcoming events.

Speaker: W. Scott Spangler, IBM Almaden Research Center

Title: “A Smarter Process for Sensing the Information Space”

Abstract:

With the growth of the internet, the size of the information space is increasing exponentially. But more information is not always better. Furthermore, as the complexity of business relationships increases, there is a natural tendency towards less structured interaction. This highlights the growing relevance of unstructured information in documenting the interactions of organizations and individuals. Analyzing and making sense of this unstructured information space requires more than text mining algorithms; it requires a strategic approach.

While every information analysis situation is somewhat unique, we propose a unified approach that addresses a wide variety of information space analytics problems. Our method for making sense out of unstructured data is described by six steps that are analogous to the algebraic order of operations, PEMDAS. These basic text mining operations can be combined in many interesting ways to handle a diverse set of problems, and just as in algebra, it is critical that these operations be performed in the correct order to guarantee a meaningful result. In this talk, I describe how PEMDAS has been implemented within smart organizations to enable decisions that produced substantial and quantifiable business value.

Bio:
W. Scott Spangler, IBM Research Division, Almaden Research Center, 650 Harry Road, San Jose, CA 95120 (electronic mail: spangles@almaden.ibm.com)

Scott Spangler is a Senior Technical Staff Member and Master Inventor at the IBM Almaden Research Center. He has been doing knowledge base and data mining research for the past 20 years. Since coming to IBM in 1996, Scott has developed software components for data visualization and text mining, which are available through the eClassifier, Business Insights Workbench, COBRA, and SIMPLE service offerings. Scott holds a bachelor's degree in math from MIT and a master's degree in computer science from the University of Texas. He holds 22 patents and has authored 24 conference and journal publications, as well as a book entitled Mining the Talk: Unlocking the Business Value in Unstructured Information.

Learning when Concepts Abound – Omid Madani, SRI AI Center

Posted September 3rd, 2010 by Paul O'Rorke and filed in Announcement, DM SIG Meeting

LOCATION: LinkedIn, 2025 Stierlin Ct, Mountain View, CA 94043

Date: Monday, September 27, 2010; 6:30 – 9:00 pm (6:30 – 7:00 networking & snacks; 7:00 – 7:10 announcements; 7:10+ presentation, Q&A)

Cost: Free and open to all who wish to attend, but membership is only $20/year. Anyone may join our mailing list at no charge, and receive announcements of upcoming events.

Title: Learning when Concepts Abound

Abstract:

Categorization is fundamental to intelligence. Without categories
(concepts or classes), every experience would be new, and we couldn’t
make sense of our world. We humans also require numerous concepts for
our increasingly sophisticated intelligence. From a practical perspective,
in some of today’s applications, such as text categorization, image
tagging, and word prediction, the number of classes can easily exceed
tens of thousands. A number of applications can benefit from
scalable learning under a huge number of classes.

In this talk, I will briefly go over supervised learning, in
particular multiclass learning. I will then present the approach of
learning a sparse feature-to-class mapping, or index learning. The
crucial property in efficient index learning is constraining each
feature to connect to (predict) a relatively small number of classes.
Online updating and classification take time that is almost linear in
the number of features of a given instance. I will touch on a number
of update techniques and related approaches. While our primary driver
has been scalability and simplicity, we have observed that
classification accuracies remain competitive or better when compared
to a number of other approaches, while we obtain speedups of orders of
magnitude. I will discuss applications to several tasks.
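
As a rough illustration of the idea in the abstract above (not the speaker's actual algorithm), the following minimal Python sketch keeps a sparse feature-to-class index in which each feature may point to only a few classes. The class name SparseIndex, the parameters max_classes and rate, the mistake-driven update, and the "drop the weakest edge" pruning rule are all assumptions made for this example; the talk's own update techniques are not described in this announcement.

```python
from collections import defaultdict


class SparseIndex:
    """Toy feature-to-class index (hypothetical, for illustration only).

    Each feature keeps at most `max_classes` outgoing edges, so scoring an
    instance touches only a small, bounded number of classes per feature.
    """

    def __init__(self, max_classes=3, rate=0.5):
        self.max_classes = max_classes  # assumed cap on classes per feature
        self.rate = rate                # assumed step size for updates
        self.edges = defaultdict(dict)  # feature -> {class: weight}

    def scores(self, features):
        """Sum edge weights over the active features of one instance."""
        totals = defaultdict(float)
        for feature, value in features.items():
            for label, weight in self.edges.get(feature, {}).items():
                totals[label] += value * weight
        return totals

    def predict(self, features):
        """Return the highest-scoring class, or None if no edge fires."""
        totals = self.scores(features)
        return max(totals, key=totals.get) if totals else None

    def update(self, features, label):
        """Online, mistake-driven update (a simplification assumed here)."""
        if self.predict(features) == label:
            return
        for feature, value in features.items():
            row = self.edges[feature]
            row[label] = row.get(label, 0.0) + self.rate * value
            # Keep the index sparse: drop the weakest edge when over the cap.
            if len(row) > self.max_classes:
                del row[min(row, key=row.get)]


# Usage with made-up data: three tiny "documents" and their classes.
training = [
    ({"goal": 1.0, "match": 1.0}, "sports"),
    ({"vote": 1.0, "election": 1.0}, "politics"),
    ({"stock": 1.0, "market": 1.0}, "finance"),
]
index = SparseIndex(max_classes=2)
for _ in range(3):
    for features, label in training:
        index.update(features, label)
print(index.predict({"goal": 1.0}))
```

Because each feature row is capped at max_classes entries, the cost of scoring or updating one instance grows with the number of its active features rather than with the total number of classes, which is the property the abstract highlights.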

Bio:

Omid Madani is a senior computer scientist at the Artificial
Intelligence Center of SRI International. He is interested in all
aspects of intelligence and mind, as well as algorithm design and
analysis. His current research revolves around the themes of
large-scale learning and data mining, including learning in the
presence of myriad concepts, online learning, and unsupervised
learning, in particular exploring and engineering systems that learn
their own many concepts (computational development). In the 2009
European PASCAL Challenge on Large-Scale Hierarchical Text
Classification, with just over 12k classes, his team’s approach
obtained top rankings from among 18 participants. He has
successfully applied learning techniques to a number of information
retrieval applications.

Omid obtained a PhD in computer science from the University of
Washington in 2000 (thesis topic: Computational Complexity of Markov
Decision Processes). After a brief period in industry, he went
back to academia as a postdoc at the University of Alberta, and then
back to industry, as a senior research scientist at Overture and
then Yahoo! Research, before joining SRI. He was awarded the Alberta
Ingenuity Associateship while in Alberta. He is a lifetime member of
the Association for the Advancement of Artificial Intelligence (AAAI), and
a member of the Association for Computing Machinery (ACM), and the
Cognitive Science Society.

web: http://www.omadani.net

Software Package Development Processes and R on September 15, 2010

Posted August 30th, 2010 by MatthewBascom and filed in ACM Meeting

Date: Wednesday, September 15, 2010; 6:30 – 9:00 pm (6:30 – 7:00 networking & snacks; 7:00 – 7:10 announcements; 7:10+ presentation, Q&A)

Location: LinkedIn, 2025 Stierlin Ct, Mountain View, CA 94043

Cost: Free and open to all who wish to attend, but membership is only $20/year. Anyone may join our mailing list at no charge, and receive announcements of upcoming events.

Speakers: Spencer Graves, PhD, Productive Systems Engineering, and Sundar Dorai-Raj, Google Inc.

Title: “Software Package Development Processes and R”

Abstract:

This presentation will outline major elements of a good software package development process, illustrated primarily with the standard package development process used with R. R is an object-oriented programming language for statistics and an open-source alternative to S-Plus. The Comprehensive R Archive Network (CRAN) repository of contributed packages has grown roughly exponentially since its founding thirteen years ago, with over 2400 contributed packages available as of June 2010. CRAN and the standard package development process have helped make R the language of choice for an increasing portion of people involved in new statistical algorithm development.

Partner Announcement: SDForum

Posted August 24th, 2010 by Martin Stein and filed in ACM Meeting

Two very interesting events by SDForum:

Title: SDForum’s Clean Tech Breakfast: “Revamping the Smart Grid”

Title: SDForum’s Quarterly Venture Breakfast: “Clean Technology”

DMSIG – Charting SearchLand: Search Quality for Beginners August 23, 2010

Posted May 8, 2010 by Patricia Hoffman, PhD

LOCATION: LinkedIn, 2025 Stierlin Ct, Mountain View, CA 94043  

Date: Monday, August 23, 2010; 6:30 – 9:00 pm (6:30 – 7:00 networking & snacks; 7:00 – 7:10 announcements; 7:10+ presentation, Q&A)

Cost: Free and open to all who wish to attend, but membership is only $20/year. Anyone may join our mailing list at no charge, and receive announcements of upcoming events.

Speaker: Valeria de Paiva, PhD, Cuil, Inc.

Title: “Charting SearchLand: Search Quality for Beginners”

Partner announcement: IEEE Cloud Forum (October 13, 2010) & Multicore Programming (September 14, 2010)

Posted August 20th, 2010 by Martin Stein and filed in ACM Meeting

Two events by the IEEE – another interesting organisation of computing professionals:

  • The IEEE Cloud Forum for Practitioners. The Cloud in 2013 – October 13 in Monterey
  • Multicore Programming. Pitfalls and Solutions. September 14 at Microsoft in Mountain View
