Tuesday, March 6, 2018

2 min summary: Notes from HBS AI/ML Investor Panel - Unique Access to Clean Data Essential

Last Saturday, an HBS student club (CODE Club) hosted a Machine Learning Workshop.   Over 100 attendees heard speakers from MIT, large companies (Google, etc.) and several startups.  Below are my notes from the AI/ML investor panel.  The emphasis on unique access to clean data was refreshing to hear and is consistent with my experience.

Notes from the AI/ML Investor Panel

Rick Grinnell (Glasswing Ventures)
  • His investment focus is on vertical opportunities.  Verticals are more attractive because of vertical-specific data and a clearer line to customers' budgets
  • He encourages startup teams to have an ML-knowledgeable person from the beginning.  This is not a market where several business-oriented MBA students can start a company with an MVP using offshore development.

Habib Haddad (MIT Media Lab E14 Fund)
  • He focuses on investment opportunities with unique and large clean data sets (often vertical plays in agriculture, fintech, etc.) 
  • CLEAN DATA are critical
  • Product managers today must know the new language to talk with data scientists

Mackey Craven (OpenView)
  • Attractive investments must have a clear path to a clean data "moat."  Moats can be created via unique access and contractual terms
  • Computing power and algorithms are now a commodity and thus are not the basis of competitive advantage.  Defensibility centers on business knowledge, access to clean data that grows uniquely over time, and getting to market first, which together create a flywheel for clean data

Sunil Nagaraj (Ubiquity Ventures) - moderator

Thursday, January 25, 2018

10 Takeaways from AI World Conference in Boston in Dec

Last month I attended the 3-day AI World Conference in Boston with 2,300 others.  The event was cross-industry (digital advertising, health care, autonomous cars, security, etc.) with speakers from leading organizations such as MIT, Goldman Sachs, Forrester, Philips Healthcare, Toyota, iRobot, etc.   Below are my key takeaways from my 15 pages of notes and conversations with dozens of people.

AI: artificial intelligence
ML: machine learning
IoT: internet of things

Key Takeaways
  • The category labels (AI/ML/Big Data/IoT) continue to blur and merge, but “AI” is now commonly used as the umbrella term
    • There are different definitions of AI and there is no trend to consolidate 
    • Two speakers said, “Just substitute AI for ML in all places in my presentation”
    • Many “AI” case studies (health care, etc.) did not involve mechanical or robotic hardware
    • Speakers quoted stats about a large number of “AI” startups, but these startups would traditionally have been classified as software firms that use ML
    • Everything is called AI now: “AI for the Enterprise”  
    • “As long as the startup has one machine learning model, we call it an ‘AI’ startup”
  • The pace of technology change is relentless and increasing 
    • Many felt the pace of technology change continues to accelerate.  People within the AI/ML community stated that “it is hard to keep up with AI/ML” and most feel this acceleration will continue.
    • Companies and consumers will never be able to keep up, which provides more opportunities for vendors to package all these capabilities.
  • This increase in technology remains explosive and drives automation
    • Goldman Sachs example
      • 2000: 600 cash equity traders, no software engineers
      • 2017: 2 cash equity traders and 200 software engineers
    • Health insurance open enrollment example  
      • call center => web => AI
  • While AI and ML are hyped right now, it remains very, very early.  Internet firms (Google, Facebook, Twitter, Amazon) dominate current hiring and case studies
    • 80% [my estimate] of activity, advancement, and energy around AI/ML is from internet firms (Google, Facebook, Twitter, Amazon) which have massive data sets and massive teams and have hired the vast majority of ML talent at high wages
    • The rest of the companies are WAY behind these
  • The successful early applications and ones with the most promise in the next 3 years are those with a NARROW feature set that supplements an existing process
    • Examples include
      • Uber: human driver 95% and Uber app 5%.  More automation: multiple pickups (a human couldn’t do this easily)
      • Social media: scan millions of images for NSFW but have human QA involved for exceptions
      • Internet advertising:  use ML to suggest banner ads but add “guardrail” rules for edge cases (e.g. don’t show ads for coffins when you believe someone has a death in the family)
      • Healthcare: analyze hundreds of MRI scans as second opinion for doctors
    • There were multiple names for this task augmentation with software and ML
      • “Augmented intelligence”
      • “Computer enabled service” 
      • “Co-bots”
  • Fully autonomous AI is decades or more away
    • Yes, the computer can beat the human in a chess game, but the computer couldn’t find the table to sit down ;)
    • None of the speakers at the conference believe fully unassisted AI will be rolled out in the next 3 years (autonomous car panel thought it would be 3-10 years before autonomous cars appear in the market beyond current pilots)
    • The media hype is way ahead of realistic adoption
  • Data (and lots of it) are KING. Models are critical but of zero value without data
    • This was a recurring theme, especially from people who have been through any kind of real AI/ML project
    • The data collection challenge and cost are material in many industries.  For example, Electronic Health Records (EHR) data is burdened by abbreviations, missing data, inconsistent usage, IT data silos and other challenges
  • Models can be imprecise and fooled (several examples)
    • Tape on stop sign fools autonomous car
    • Poodle or ostrich? -  Neural network fooled by removal of several pixels (but human eye can still tell the difference)
    • Is it a chihuahua or a muffin?  A labradoodle or fried chicken?
  • The data modeling processes are immature and rudimentary when compared to software engineering and the integration of these two teams is non-trivial
    • No methodology to develop, test, deploy, and document models exists
    • Better tools and processes are the second priority at MIT CSAIL, where they are advocating ModelDB
    • Many firms mentioned difficulty in integrating software engineers and data scientists
    • “Even at MIT, the software engineering department is still too stovepiped from the data scientists”
  • Neural networks are powerful but many shy away because there is no explainability
    • Neural networks remain very problematic because they give no reasons for their predictions
    • Examples: medical diagnostics (why does the model say you have cancer?) and investing (why buy this stock?)
    • Finding ways to better explain output from neural networks is a top research agenda item at MIT, per Sam Madden, co-chair of MIT CSAIL
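
To make the explainability point concrete, here is a minimal sketch of what a "traceable" model looks like. It is my own toy illustration, not something shown at the conference: a hand-written decision tree that returns its prediction together with the rules it followed. The function name, features and thresholds are all hypothetical and have no clinical meaning. A neural network has no comparable rule path to report.

```python
from typing import List, Tuple


def classify_with_trace(age: int, lesion_size_mm: float) -> Tuple[str, List[str]]:
    """Toy decision tree that returns a label plus the rules it followed.

    All features and thresholds are invented for illustration only.
    """
    trace: List[str] = []
    if lesion_size_mm >= 20.0:
        trace.append("lesion_size_mm >= 20.0")
        if age >= 50:
            trace.append("age >= 50")
            return "flag for specialist review", trace
        trace.append("age < 50")
        return "repeat scan later", trace
    trace.append("lesion_size_mm < 20.0")
    return "routine follow-up", trace


if __name__ == "__main__":
    label, reasons = classify_with_trace(age=62, lesion_size_mm=25.0)
    print(label)
    print(" AND ".join(reasons))
```

The returned list of rules is the "why" that a doctor or an investor could inspect and challenge.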
Overall, speakers and attendees are very bullish on the value and competitive differentiation AI/ML bring to the market, but the AI/ML market is still in its infancy and the press hype is ahead of realism.  Successful firms that automate more processes in the next 5 years will incorporate AI/ML as the cost of data acquisition and tools continues to decline.

Tuesday, December 12, 2017

Notes from "AI: Beyond the Hype" panel discussion Oct 26, 2017 - Boston

Notes:  "AI: Beyond the Hype" meeting
Oct 26, 2017.  Sponsored by Lewis, a PR firm, and held at The District in Boston


  • Forrester VP: Mike Gualtieri
  • Editor, XCO Mag/Northeastern: Mike Farrell
  • Sam Whitmore Media surve
  • AI prof/Tech writer at Boston U: Joelle Renstrom
  • Canvas Ventures partner - Paul Hsiao. First capital in Siri, multiple investments in ML 

Meeting Summary

  • Fully people-like robots are years away, but narrow, pragmatic AI
  • “Augmented AI” is most promising
  • DATA IS NEEDED to get a good model
  • So many examples of applications where there is insufficient data to train the model and a feedback loop
  • Speakers/crowd generally agreed that adoption will increase in the next 5 years and jobs will be displaced, especially low-wage jobs
  • Automated Intelligence most promising near term
  • Truck driving autopilot
  • Robots in burning buildings

Notes by Speaker

Forrester - Mike Gualtieri

  • Pure AI: sci fi; human like
  • Pragmatic AI: very narrow in scope but beats technical human
  • Google: developed app to beat Go champion
  • Watson beat chess champion
  • AI comprises 9 building blocks
  • ML, 
  • Okay to use one ML and call it AI
  • “Automated intelligence” per Forrester, low skilled workers working with a robot
  • Why is the ONLY home robot the Roomba for sweeping; but no robot for folding laundry
  • Lots of discussion and credit are given to algorithms, but it’s all about the data.
  • There is NO magic in the algorithm
  • Laundry automated with perception learning. We are NOT close. We are many, many years away from this
  • Deep learning breakthrough in 2012
  • Nvidia
  • Deep learning uses neural networks
  • It’s very difficult to test the model to know where it worked
  • All these models are based on probabilities
  • Decision tree can be traced, but neural networks cannot be traced
  • Guardrails or circuit breakers
  • Google has guardrails. When the model says “I think someone died, let’s show an ad for caskets,” the guardrails stop it
  • NO company TRUSTS their ML solely by itself without guardrails
  • CIA and others ask “can we use AI to prevent cyberattacks?”  Forrester says “you need A LOT more breaches” because there is not enough data to look at
  • Anomaly detection is typically now used instead for security detection, but the challenge is that there are false positives
  • Alexa is providing Amazon SO MUCH data because so many people are using it
  • HLMI - high level machine intelligence
  • Survey of 200
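
The guardrail idea from the notes above can be sketched in a few lines. This is my own minimal illustration, not Google's or Forrester's implementation: a model proposes ads with a relevance score, and a hand-written rule layer vetoes sensitive categories and low-confidence picks before anything is shown. The names `AdCandidate`, `BLOCKED_CATEGORIES`, `MIN_SCORE` and `apply_guardrails` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AdCandidate:
    category: str
    score: float  # model's predicted relevance, between 0 and 1


# Hypothetical hand-written rules that sit on top of the model.
BLOCKED_CATEGORIES = {"caskets", "funeral services"}
MIN_SCORE = 0.6  # below this, show nothing rather than trust the model


def apply_guardrails(candidates: List[AdCandidate]) -> Optional[AdCandidate]:
    """Return the best-scoring ad that passes every rule, or None."""
    allowed = [
        c for c in candidates
        if c.category not in BLOCKED_CATEGORIES and c.score >= MIN_SCORE
    ]
    if not allowed:
        return None  # circuit breaker: no ad is better than a bad ad
    return max(allowed, key=lambda c: c.score)


if __name__ == "__main__":
    proposals = [
        AdCandidate("caskets", 0.95),  # model's top pick, vetoed by the rules
        AdCandidate("coffee", 0.70),
        AdCandidate("shoes", 0.40),
    ]
    print(apply_guardrails(proposals))
```

The design choice is that the rules only remove options and never add any, so the model still does the ranking while the edge cases stay under human control.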

VC landscape - Paul Hsiao

  • Must include AI in slide deck to get funding these days
  • commoditization
  • VCs focus on proprietary data since so many vendors are giving away tools
  • The biggest challenge is lack of engineers coming into this space.  Multiple acquires
  • Very few engineers actually know the AI space
  • Most things have become possible in the last 5 years because of CPU, 
  • We are VERY far from a person-like AI that destroys jobs
  • Robo-advisors have been around for 10 years in financial services
  • On board of Elance, paid $2B to contractors
  • Automation of trucking
  • Level 1 and level 2: trucks turn on autopilot once on the highway, but off highway is then human.  Like autopilot for plane
  • Uber IA on panel said, 
  • Google puts its own cell phones in trucks to track them. 99.9% of 
  • Video: 4 Berkeley PhD students trying to figure out how to program a robot to fold a t-shirt, but still not successful.  Automation is hard
  • We’ve seen several startups automating medical records
  • Traction for upcoding; reading the records to figure how to charge the government MORE
  • Maybe we need to start re-evaluating teaching. Some subjects taught by humans and other subjects AI-based [I think he means online courses] as part of “Augmented Intelligence” rather than Artificial
  • Amazon (closed system) vs Google Home (more open)
  • Rapid ecosystem; innovation happening at a RAPID pace
  • “What is my bank balance?” (683 different ways to ask for a bank balance).  Machine learning
  • AI today is at the stage where AOL was with the 56k dial-up modem
  • We are at the beginning of a 20-25 year run

Wednesday, November 30, 2016

Use Shared Note Taking and Never Miss an Insight on a Customer

Team note taking is now easy and indispensable.  

We use Word in Box as well as Google Documents.

One particularly useful scenario is calls or meetings with customers.  At the beginning of the meeting, our team of, say, 5 (often a mix of in-person and remote) starts a single shared document and jointly edits it as the customer talks through her needs, introduces attendees with titles, etc.

During the meeting, we keep an ongoing Action Item list at the top of the shared meeting notes.

When the meeting ends, we have a meeting summary, detailed notes and an action item list that all 5 attendees have reviewed and agreed on.   Moreover, the set of meeting notes is "in the cloud" and easily accessible by everyone and not buried on someone's individual laptop.

I've become a fan of Google Hangouts

Google Hangouts are great!  

Google Hangouts are really a great tool for multi-site collaboration.  The automatic link in Google Calendar makes it easy to join and the audio quality has improved significantly in the last 12 months.   Desktop users can call people on their cell phones as well.    We've used this for many internal meetings that span offices in the US and India.    Our GoToMeeting use has declined significantly.

Tuesday, June 2, 2015

New Blog and Why I Find a Blog Helpful

For several years, I authored a blog, Practical Sustainability, that focused on Sustainability and Energy Management.

This new blog is focused on SaaS/Big Data/Enterprise Software. It also includes a section for personal items.

While it can be time-consuming, I find writing blog posts useful as it forces me to crystallize my thinking on paper.