Data Science Reinvents Learning
August 24, 2015 @ 6:30 pm
Paco Nathan, O’Reilly Media
*** Bring ID (e.g. Driver’s License) for eBay Security ***
6:30 Doors Open, Food & Networking
*** Please arrive by 7 PM due to Security ***
Project Jupyter https://jupyter.org/ evolved from IPython notebooks, and now supports a wide variety of programming language back-ends. Notebooks have proven to be effective tools in Data Science, providing a convenient package for what Don Knuth called “literate programming” in the 1980s: code plus exposition in markdown. Results of running the code appear in-line as interactive graphics, all packaged as collaborative, web-based documents. Some have said that the introduction of cloud-based notebooks is nearly as fundamental a change in software practice as the introduction of spreadsheets.
O’Reilly Media has been considering the question, “What comes after books and video?” Or, as one might ask more pointedly, what comes after Kindle? To that end we have collaborated with Project Jupyter to integrate notebooks into our content management process, allowing authors to generate articles, tutorials, reports, and other media products as notebooks that also incorporate video segments. Code dependencies are containerized using Docker, and all of the content is managed in Git repositories. We have added another layer, an open source project called Thebe, that provides a kind of “media player” for embedding the containerized notebooks into web pages.
Some examples include:
An early POC, working with Nature magazine, showed how peer-reviewed scientific articles could be provided such that readers could interact with the code+data:
The overall goal is to support repeatable science, or, as one might say, “open science”. The tools of Computer Science and Software Engineering are being leveraged to make this possible. Data Science provided the initial examples; however, this tooling is now being adopted rapidly by genomics and other areas of life science. Overall at O’Reilly Media, we see these frameworks working together as first steps toward a retooling of learning platforms in general.
Experiences teaching with MOOC platforms such as edX have shown that instrumentation and analysis can be bottlenecks for effective pedagogy. Meanwhile, programs at Caltech and other institutions have been advancing notions of “inverted classrooms” as an alternative way to leverage online platforms. The use of cloud-based containerized notebooks allows for much better instrumentation and measurement of student interactions, which helps model the pedagogical aspects of this work. This talk will also consider how and where Data Science practices can benefit Education through an evolution of software platforms.
Paco Nathan is known as a “player/coach” Data Scientist who has led innovative Data teams building large-scale apps for several years. Expertise in machine learning, distributed systems, functional programming, and cloud computing. Advisor for Amplify Partners and GalvanizeU. 30+ years of tech industry experience, ranging from Bell Labs to early-stage start-ups. Cited in 2015 as one of the Top 30 People in Big Data and Analytics by Innovation Enterprise.