Data Science Camp 2017

(Subsections here: Overview, Schedule, Address, Maps, Past Sessions)

Register: on Eventbrite

Date: Saturday, October 14, 2017

Overview:  Data Science Camp is SF Bay ACM’s annual event combining sessions, a keynote, and an optional tutorial (extra fee). It’s an excellent opportunity to build your Data Science experience and connect with others.  Now in its eighth year, the camp is kept near-free: a $10 charge that includes lunch and coffee.  You can also sign up as part of a group of 2-6 people for $7.50 per person.  The morning class is $60 and includes the afternoon camp. If you register the day before the event or walk in, the price of the morning class goes up from $60 to $75.

Attendees:  We have 281 people registered, including 135 for the morning class.  Attendees’ companies include:  Apple, Box, Cannon, Cisco, City of San Francisco, Columbia, eBay, GE Digital, Infosys, Intel, Kaiser Permanente, Knight Capital Funding, Macys, Mavo Institute, Microsoft, Palo Alto Medical Foundation, PayPal, Pfizer, Qualcomm, Symantec, Teradata, Thermo Fisher Scientific, US Dept of Treasury, View Dynamic Glass, Visa, Yahoo, YouTube, Zingbox.  Universities include: San Jose State University, San Francisco State University, UC Davis, University of Lucerne, UC Merced, UC Santa Cruz.


8:00 am – 8:40   Arrive, register for class, network, coffee

8:40 am – 10:40  Tutorial: AI and Deep Learning with Python and Keras  ($60, includes full day)

10:30 am – 11:00  People coming for just the Camp ($10) arrive, register and network

11:00 am   Camp Kickoff

Major Sponsor 5 min presentations



Keynote Presentation, 50 min.   Data Science:  Let’s Cut the Hype & Measure its Value to Prevent a Y2K Like Fiasco by Alo Ghosh, A professor → advisor → startup → PE → VC guy’s perspective









12:25             Session Proposals (30 sec description,   count audience hands,  assign to a room for that sized audience).  In the main auditorium, people line up and give a 30 second pitch for their session.  We ask people to vote, raising their hands for the 4-5 sessions they are most likely to attend.  A count of hands determines the room size for the talk (for example, for 30, 60 or 120 seats).  The advance session proposals and web votes educate the audience in advance and give a guideline if the session leaders want to fine tune their content.  You must be present to propose a session.
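The room assignment described above can be sketched in a few lines of Python. This is only an illustration, not the organizers' actual procedure: the room sizes come from the example in the text (30, 60 or 120 seats), while the function name, the session titles and the hand counts are made up.

```python
# Room sizes taken from the example in the text; treated here as an assumption.
ROOM_SIZES = [30, 60, 120]


def pick_room_size(hand_count, room_sizes=ROOM_SIZES):
    """Return the smallest room size that fits the hand count.

    Falls back to the largest room when the count exceeds every size.
    """
    for size in sorted(room_sizes):
        if hand_count <= size:
            return size
    return max(room_sizes)


# Hypothetical hand counts for three imaginary session pitches.
votes = {"Intro to Keras": 95, "Kafka pipelines": 28, "Model explainability": 45}
assignments = {title: pick_room_size(count) for title, count in votes.items()}
print(assignments)
```

A real schedule also has to balance rooms across the four time slots, which this sketch leaves out.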

1:15                Lunch, post Session Matrix  (4 time slot rows by 4-7 room columns).  The organizers put together the session matrix, which will be posted under this Session Proposal tab.  We will try to finish it as early as possible before 2 pm.  The organizers ask for minimal interruptions while they finish it.

2:00 – 2:50   Session 1  (across all the rooms in use; we will likely subdivide the main room, which seats 410).  In general, we can open and close dividers in Town Hall (see maps below) or Fireside to make more small rooms or fewer large rooms.  We invite “note takers” to add any notes or links to the discussion thread for a given session, or create a thread if one does not yet exist.

3:00 – 3:50   Session 2

afternoon coffee and snacks

4:00 – 4:50    Session 3

5:00 – 5:50     Session 4

6:00 – 6:30    Session Summary, in the largest part of the main room.  We go through the sessions, asking people who attended each session to share any lessons learned or notes.  We also invite any conference feedback.

6:45                 All attendees should be out of the building


Location Address:

PayPal Town Hall

2161 North 1st Street

San Jose, CA 95131


Location Maps:
PayPal TownHall, downstairs
PayPal TownHall, 2nd floor


Past Years Sessions (as example content):
DS Camp 2016 Sessions
DS Camp 2015 Sessions


Time and Cost:

The class runs from 8:40 to 10:40 am, with a $60 charge that includes the afternoon camp.  The Camp alone, from 10:40 am through the rest of the day, is $10 or less.  Registration is on Eventbrite.


Title:  AI and Deep Learning with Python and Keras



Speakers:  Bhairav Mehta and Ravi Ilango are founders of DataInquest, which gives AI / Deep Learning and Data Science training in the SF Bay Area.  The Teaching Assistants include (alphabetically) Alok Tongaonkar of Redlock, Greg Makowski of Ligadata, Harvind Rai of Apple, and Yash Shroff of Intel.


What Will I Learn?

  • To describe what Deep Learning is in a simple yet accurate way
  • To explain how deep learning can be used to build predictive models
  • To distinguish which practical applications can benefit from deep learning
  • To install and use Python and Keras to build deep learning models
  • To apply deep learning to solve supervised and unsupervised learning problems involving images, text, sound, time series and tabular data.
  • To build, train and use fully connected, convolutional and recurrent neural networks
  • To look at the internals of a deep learning model without intimidation and with the ability to tweak its parameters
  • To train and run models in the cloud using a GPU
  • To estimate training costs for large models
  • To re-use pre-trained models to shortcut training time and cost (transfer learning)



Prerequisites:

  • Knowledge of Python, familiarity with control flow (if/else, for loops) and pythonic constructs (functions, classes, iterables, generators)
  • Use of bash shell (or equivalent command prompt) and basic commands to copy and move files
  • Basic knowledge of linear algebra (what is a vector, what is a matrix, how to calculate dot product)
  • Use of ssh to connect to a cloud computer
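As a quick self-check for the linear algebra item above, here is a minimal pure-Python sketch of a dot product and a matrix-vector product. The helper names and sample values are illustrative only and are not part of the class materials.

```python
def dot(u, v):
    """Dot product of two equal-length vectors given as lists of numbers."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def mat_vec(matrix, v):
    """Multiply a matrix (a list of rows) by a vector: one dot product per row."""
    return [dot(row, v) for row in matrix]


# Made-up numbers, chosen only to keep the arithmetic easy to follow.
weights = [[1.0, 2.0], [0.0, -1.0]]
inputs = [3.0, 4.0]

print(dot([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
print(mat_vec(weights, inputs))   # [1*3 + 2*4, 0*3 - 1*4] = [11.0, -4.0]
```

A fully connected neural network layer computes this kind of matrix-vector product, followed by a nonlinearity.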



This training is designed to provide an introduction to Deep Learning using Keras. It is aimed at beginner and intermediate programmers and data scientists who are familiar with Python and want to understand and apply Deep Learning techniques to a variety of problems.

We start with a review of Deep Learning applications and a recap of Machine Learning tools and techniques. Then we introduce Artificial Neural Networks and explain how they are trained to solve Regression and Classification problems.

We introduce several architectures, including Fully Connected, Convolutional and Recurrent Neural Networks, and for each of these we explain the theory and give plenty of example applications.

The goal is to provide students with a strong foundation, not just theory, not just scripting, but both. At the end of the course you’ll be able to recognize which problems can be solved with Deep Learning, you’ll be able to design and train a variety of Neural Network models and you’ll be able to use cloud computing to speed up training and improve your model’s performance.


Who is the target audience?
  • Software engineers who are curious about data science and about the Deep Learning buzz and want to get a better understanding of it
  • Data scientists who are familiar with Machine Learning and want to develop a strong foundational knowledge of deep learning


Course outline:

  • Machine Learning and Deep Learning Introduction
  • Gradient Descent
  • Neural Networks
  • Convolutional Neural Networks
  • Recurrent Neural Networks
  • Keras intro, along with other libraries such as Theano, TensorFlow, DL4J, MXNet, etc.
  • Set up your environment and an AWS-powered, GPU-based Jupyter Notebook
  • Verify that Keras and other Deep Learning libraries are installed
  • Load image data from MNIST and CIFAR.
  • Preprocess input data for Keras.
  • Preprocess class labels for Keras.
  • Define model architecture.
  • Compile model.
  • Fit model on training data.
  • Evaluate model on test data.
  • Load Time Series data from Airline Passenger Data
  • Define RNN model Architecture
  • Fit RNN model
  • Evaluate model on test data
  • Improve Performance
  • Track performance
  • Some demonstrations of AI based technologies
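To make the Gradient Descent item in the list above concrete, here is a minimal sketch in plain Python, with no Keras involved. It fits a straight line to synthetic data by repeatedly stepping against the gradient of the mean squared error. The function name, learning rate, epoch count and data are all assumptions chosen for illustration; they are not taken from the class materials.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every sample with the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Synthetic data generated from y = 2x + 1, so the true answer is known.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))  # values close to 2.0 and 1.0
```

Deep learning libraries such as Keras run the same kind of update loop, but compute the gradients automatically across many layers and usually on mini-batches rather than the full data set.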


Speaker Bio:

Bhairav Mehta is a Senior Data Scientist with extensive professional experience and academic background. Bhairav works for Apple Inc. as a Sr. Data Scientist.

Bhairav Mehta is an experienced engineer, business professional and seasoned statistician / programmer with 19 years of combined progressive experience: data science in the consumer electronics industry (7 years at Apple Inc.), yield engineering in semiconductor manufacturing (6 years at Qualcomm and an MIT startup) and quality engineering in the automotive industry (3 years; OEM, Tier 2 suppliers, Ford Motor Company). In 2014 Bhairav founded the startup DataInquest Inc., which specializes in training and consulting in Artificial Intelligence, Machine Learning, Blockchain and Data Science.

Bhairav Mehta has an MBA from the Johnson School of Management at Cornell University, a Master’s in Computer Science from Georgia Tech (expected 2018), a Master’s in Statistics from Cornell University, a Master’s in Industrial Systems Engineering from Rochester Institute of Technology and a BS in Production Engineering from Mumbai University.



Keynote Title:  

Data Science:  Let’s Cut the Hype & Measure its Value to Prevent a Y2K Like Fiasco

Alo Ghosh, A professor → advisor → startup → PE → VC guy’s perspective











Talk Description:

From ‘data deluge’ to ‘deep learning’, data science has been hyped to the point that businesses across the spectrum have spent billions, spurred on mostly by ‘FOMO’, only to find very little to show for this spend. Today there is widespread CXO frustration with data science. To prevent a Y2K-like fiasco, the data science community must show value, not through trumped-up analyst reports and pithy product demos, but by addressing businesses’ key pain points:

  1. Data prep/munging/wrangling solutions, particularly regarding unstructured data,
  2. REAL data science talent and its ubiquitous unavailability, despite its myriad credentials,
  3. Expose machine learning (pattern/anomaly detection) and its celebrity child – deep learning (“the only real success of deep learning so far has been the ability to map space X to space Y using a continuous geometric transform, given large amounts of human-annotated data”) for what their minuses and pluses really are. True learning must combine data with rules.
  4. Black boxes created by ML/DL algorithms deter their use in financial and medical worlds

The community must then quickly move on to demonstrate how data science can add real, measurable economic value to businesses:

  1. Strategy – Harness established tools like ‘DCF, real option and game theories’ to draw valuations of strategic alternatives and then monitor performance of the chosen strategy
  2. Finance – Marshal internal and market data to track company valuations and risk premiums
  3. Marketing – Track brand valuation and use dynamic pricing to capture economic surplus
  4. Supply Chains – Emulate what Amazon does so well

Greed rather than fear will establish data science as businesses’ primary value adding spend.


Background Reading (optional):

“The Dark Secret at the Heart of AI”   MIT Technology Review,  April 11, 2017

“How CEOs Can Keep Their Analytics Programs from Being a Waste of Time” Harvard Business Review, July 21, 2016

“The Age of Analytics: Competing in a Data-Driven World” McKinsey & Company, 2016


Alo Ghosh’s Bio:

Based on my expertise (PhD Wharton + 4 Master’s) and 35-yr. experience (Wharton professor, McKinsey NYC finance expert, PE & VC partner in NY & ASEAN, $4B country fund head for 3 yrs, bootstrapped Silicon Valley fintech startup to $100M in the 1990’s), I will provide an overview of the real-life footprint of data science and a proven way to measure its impact first developed at McKinsey.

MBA-PhD (Finance-Wharton).

Hands-on Silicon Valley co-founder, co-funder, interim CXO and risk manager of FinTech startups in Insurance & Investments, such as AI Labs

  • Co-founder, interim CXO, seed funder to Silicon Valley FinTech & EdTech startups.
  • Advisor to decision makers in hands-on creation of sustainable shareholder value.
  • Trained data science expert in statistics-econometrics, O.R., quantitative finance.
  • Created business plans for projects and ventures securing $’000 million in funding.
  • Sourced large PE deals, headed country wealth fund, consulted widely in S/E Asia.
  • Taught Wharton MBA, co-led McKinsey finance, forged global fintech consultancy.

Thirty-five years of learning & practice at the cutting edges of finance, strategy, technology in some of the world’s most storied institutions as well as with the most diverse of startups (including my own), universities, governments, private equity funds, hedge funds, investment banks & sustainable development non-profits/NGOs in several different parts of the world.

  • Helped start McKinsey’s Corporate Finance practice and its bestseller ‘Valuation’ now in 5th ed: million in print
  • Initiated PE startup of billion-dollar petrochemical complex in India with ecosystem now employing ~200,000
  • Led teams in initiating, structuring, financing, starting sea/air ports, wireless network, sustainable plantations for carbon credit, hydro-power ventures at PNG. Structured financing for Air Niugini, Lihir Gold, microfinance
  • Led software modernization at 6 of the world’s largest financial institutions using Techna’s patented software
  • Applied state-of-the-art quantitative models in economics-finance to unlock huge value in infrastructure PPPs
  • Was among first charter members at TiE, the world’s largest entrepreneurial network of 20,000 in 60+ regions
  • Created & led one of South Asia’s first institutes teaching degrees & diplomas from LSE, Oxford & Cambridge.

To Register go to Eventbrite

To Vote or Submit on sessions, authenticate via [Login with Google] or [Login with Facebook].  We don’t retain authentication info and will not spam you.  See the session proposal guidelines below before submitting a session.


Session proposal:

  • Topic Description & Bio box
    • Describe your session in a paragraph or two.  Provide any links to related background reading, prior presentations or other material.
    • Provide your bio in a paragraph.
    • We invite links to your LinkedIn profile, Slideshare presentations, past meetup or conference presentations, github or source code contributions, Kaggle competition participation.
  • Topic Tags (use as relevant or invent others)
    • Experience level of the talk for the audience:  beginner, intermediate, advanced (please use at least one of these tags)
    • Talk type:  talk, tutorial, panel (other formats are allowed)
    • Language:  Python, R, Java, Scala, …
    • Library:  TensorFlow, Keras, BigML, StanfordNLP, SciKit-Learn, …
    • Algorithm or family: neural nets, deep learning, RNN, CNN, LSTM, decision tree, random forest, XGboost, SVM, clustering, co-clustering, SVD, NLP, recommender systems…
    • Software tool:  Hadoop, Spark, Zookeeper, Kafka, MQ, HBase, Cassandra, MongoDB, Cloudera, Hortonworks, MapR,..
    • Vertical or application:  web behavior, social networks, finance, retail, advertising, healthcare, autonomous vehicles, security, fraud detection
    • Misc: remote session, hiring, consulting, VC, model description, data visualization, continuous learning, blockchain…


Session Schedule

  • 12:25 – 1:15 Session proposals:  People give a 30 second pitch for their session.  The audience votes for their preferred sessions.
  • 1:15 – 2:00:  Lunch.  The Session Matrix with room locations will be posted under this Session Proposal tab by 1:45.
  • 4 time-slots (2-2:50pm, 3-3:50pm, 4-4:50pm, 5-5:50pm) over multiple room tracks
  • 6:00 – 6:30 Session summary, wrap up and feedback
  • 6:45:  Finished 

Session Guidelines:

  • Aim to have 50 min of material for a session. Talks must be technical, educational, no sales demos.
    • If you have only 15 or 20 minutes, tell us your time estimate and we will try to combine you with another short session.  If we can combine similar topics, we will.
  • We expect you to be 5 min early for your session to connect your laptop and complete any other technical setup. You must yield the room on time to the next speaker.  You can recruit a time keeper and a note taker. Please give us feedback on speakers who don’t show up to their sessions.
  • We suggest you bring your session materials (incl. videos) on a USB memory stick as a backup in case of connectivity, wifi or projection resolution issues. Bring a variety of dongles for video hookup.  Have your laptop charged and bring a power cord.
  • Remote sessions can be proposed, using presenters from another country or time zone. Please use the “remote session” tag.  You must let us know in advance, in the session proposal, if you would like to try this.  This requires a local volunteer to a) propose the session, b) go to the session and set up a computer with remote screen sharing software (e.g. Skype, WebEx, Zoom,…), c) possibly hold the microphone up to the computer speaker, d) facilitate any audience questions.
  • Please let us know if you can bring a slide projector, and possibly a screen. The venue has a few rooms without AV.


    The San Francisco Bay Area ACM (SFbayACM) is a local professional chapter of the ACM.  We are a 501(c)(3) non-profit, run by unpaid volunteers.  We were founded in the Bay Area in 1957.  We hold about 20-25 events per year, and about 6k people are active in our Meetup group, which hosts two talks a month, on Data Science (data mining, big data) and General Computing.  We post about 120 of our more recent talks on our YouTube channel.  Let us know if you have a regular audience for live streaming of our talks.

    The Association for Computing Machinery (ACM) is the largest professional computing society, publishing journals and hosting conferences among other activities.  The ACM was founded in 1947.

    If you would like to discuss being a sponsor, fill out the form below and make sure to include available times and contact details.


    Venue Sponsor

    PayPal – Who we are

    Fueled by a fundamental belief that having access to financial services creates opportunity, PayPal (Nasdaq: PYPL) is committed to democratizing financial services and empowering people and businesses to join and thrive in the global economy. Our open digital payments platform gives PayPal’s 203 million active account holders the confidence to connect and transact in new and powerful ways, whether they are online, on a mobile device, in an app, or in person.  Available in more than 200 markets around the world, the PayPal platform, including Braintree, Venmo and Xoom, enables consumers and merchants to receive money in more than 100 currencies, withdraw funds in 56 currencies and hold balances in their PayPal accounts in 25 currencies.

    Platinum Sponsor

    UCSC Extension

    We offer an accredited, convenient, and attractively priced alternative to degree programs, serving the advanced professional education needs of Silicon Valley and beyond. Each year, more than 10,000 adults who live and work in the greater South Bay area study here to earn University of California certified credentials that are widely recognized in a range of industries. We are the region’s leading educator of professionals in more than 40 areas of expertise that are in high demand among Silicon Valley employers.


    KDD provides the premier forum for advancement and adoption of the “science” of knowledge discovery and data mining. KDD encourages:

    • Research in KDD (through annual research conferences, newsletter and other related activities)
    • Adoption of “standards” in the market in terms of terminology, evaluation, methodology
    • Interdisciplinary education among KDD researchers, practitioners, and users
    • KDD activities include the annual conference on Knowledge Discovery and Data Mining  and the  SIGKDD Explorations Newsletter
    • The KDD 2017 conference will be in Halifax, Canada on August 14-17, 2017