“Opportunity is missed by most people because it comes dressed in overalls looking like hard work.”
— Thomas Edison
Imagine you’re a banker in 1896 trying to get investors excited about buying
shares of 12 companies that are included in the new Dow Jones Industrial
Average. Most of the companies produce valuable commodities such as
cotton, sugar, tobacco, gas, iron, coal, and rubber.
Many investors are skeptical about one company in the index: Thomas
Edison’s company, General Electric. These investors have heard the hype
about electricity and seen a few examples of electric motors and lights. But
they can’t understand why a company focused on electricity deserves to be
listed next to cotton and sugar, commodities which dominate worldwide life
and commerce. They ask, “How is the average business or family going to
use electricity?”
Of course you can imagine how. As the world approaches a new century,
you’ve been following the work of Nikola Tesla and Thomas Edison for years.
You can see an inevitable new world economy emerging as companies learn
how to transmit and leverage electrical power. But you can’t point to power
lines, radios, appliances, or factories, so how can you explain the potential of
electricity to skeptical investors? How can you get them to understand the
business opportunity in a technology which can change everything?
I often feel like this fictitious banker. Every time I speak to a company about
AI, I am asked the same question: “What can our company do with AI?” Most
people recognize that a transformational shift is coming, but they don’t see
a clear connection between AI capabilities and today’s problems. They’re
looking for simple answers to complex questions.
Unfortunately, I don’t have simple answers for them. You can pick up some (mostly technical) insights from company blogs, podcasts, and conference presentations, but easy answers simply don’t exist.
Sometimes people give up when I can’t provide easy answers about AI. But not you. You’re reading this book because you want to be a leader in the AI revolution. You realize that fortunes and careers will be made in the next five years by those who put in the effort to connect AI technology with business needs. In Part 2 you will learn how to make this connection. I’m going to share specific tools we use every day to bridge the gap between AI capabilities and your hardest business problems.
Each AI breakthrough creates thousands of opportunities for new
products and services. Unfortunately the connection between fundamental
breakthroughs and specific solutions isn’t always obvious.
For example, consider a few fundamental breakthroughs of the Internet circa 1996, technologies such as TCP/IP and HTML.
Put all of these breakthroughs together, and what do you get? In 1996 that question was difficult to answer. Today we know that these were the building blocks for Internet services like e-commerce, content marketing, and online publishing.
But how did we get from HTML to an application like e-commerce? TCP/IP to
online publishing? What was the unifying concept of these technologies, and
why did it emerge?
These technologies emerged because, whether they knew it or not, businesses that would survive past 1996 needed a way to communicate with customers interactively, efficiently, and in real time. That need was met by a unifying concept: the web site.
Web sites, email, and instant messaging are examples of original Internet
product patterns. Product patterns are practical applications of technology
which solve recurring problems.
A product pattern enables businesses to identify workable solutions that
are based on breakthrough technologies. By considering a broad product
pattern, businesses of the early Internet age could have avoided the complexity of “What can I do with HTML?” and instead asked, “What can I do with a web site?”
We need similar tools to connect fundamental AI breakthroughs with workable
solutions. I call these tools AI product patterns. AI product patterns are practical
applications of AI technology that solve recurring business problems.
At present, businesses should consider four basic AI product patterns. But
as researchers continue to produce breakthroughs, I expect to add more
patterns to this list. Currently the AI product patterns include computer
vision, natural-language processing, next-in-sequence predictions, and
collaborative filters. These address problems that arise in a variety of
business contexts.
Product pattern 1: Computer vision
Computer vision applications use software to generate a high-level
understanding of digital images and videos. “Describe what you see” is a
programming challenge that has vexed computer scientists for decades.
Governments, corporations, and private investors have spent billions trying
to advance the state of the art because the potential payoff is so large.
Interpreting images—to accomplish tasks such as driving a vehicle, selecting
the best photo for a story, and scanning the horizon for an enemy vessel—is
one of the most manually intensive and expensive business processes.
Historically, computer vision solutions haven’t worked very well, and few
made it out of the lab.
In 2014 researchers began making breakthroughs in computer vision by
using deep learning models called convolutional neural networks (CNNs).
Not only did CNN researchers achieve far better results in annual computer
vision competitions, they also published CNN models which could generalize
to many other computer vision problems.
Two years later, Apple released a facial recognition solution in iOS 10 using the same approach. AI researchers are now achieving better-than-human
results on computer vision tasks.
“Dogs vs cats” is a classic computer vision challenge where developers
attempt to build an algorithm which can automatically classify an image
as containing a cat or containing a dog. As recently as five years ago, the
best researchers could achieve no better than 80% accuracy. Today novice
machine learning engineers can achieve better-than-human results using AI.
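To make the task concrete, here is a toy sketch of a dogs-vs-cats classifier built with TensorFlow’s Keras API. This is my illustration rather than anything from a specific product, and every layer size is an arbitrary placeholder:

```python
# A toy convolutional network that maps 128x128 RGB photos to a single
# probability that the image contains a dog (vs. a cat).
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128, 128, 3)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # P(dog)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(images, labels) would train it on labeled photos.
```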
As advances continue, computer vision now makes significant contributions to a wide range of applications.
In the following example from Google researchers, a computer vision model identified specific objects in an image and assigned confidence scores to its predictions.
Take a closer look at this example. Some of the kites or people are pretty
easy to identify, but how about those on the far right? Why can the computer
predict with 96% certainty that the one small black blob is a kite and with
87% certainty that the other black blob is a person? The deep learning
algorithm learned that kites don’t usually float on water and that people
don’t hover in the sky.
Now imagine a future application which uses video, multiple cameras from
different angles, sound, and multispectral (e.g., ultraviolet) images to power
deep learning algorithms. The image detection capability would make
Superman jealous.
Computer vision is the most mature AI product pattern. Your AI strategy
should start by looking for business processes where you can apply the
computer vision product pattern. First identify business processes in which
images are already used, and then consider processes that could benefit
from the use of images.
If you have business processes which depend on people looking at images,
you are probably already falling behind your competitors. Are analysts
reviewing satellite images? Do customers send you pictures? Do you check
user identification before providing access? Do your employees take pictures
while inspecting equipment?
Consider any business process where images or videos are already used. In
most cases you will be able to identify the goal of the business process. This
goal should align with an output. For example, suppose customers send you
images of damaged products as part of a warranty claim. Claim-processing
specialists review these images and approve or reject the application. The
inputs to this business process are the images (and other claim data), and
the output is the approve/reject decision.
The proliferation of inexpensive drones, satellites, smartphones, and cameras
has led to an explosion in available images. Computer vision techniques
allow you to leverage these new data sources to improve existing business
processes. For example, suppose you are designing the Intelligent Vacation
Planner described in the introduction. You could use computer vision to
ingest images in a customer’s Instagram feed and predict their hobbies.
Your biggest operational challenge will always be building training data for
your AI algorithms. You’ll want to know what to look for in your existing data
and how to prepare it for use in an AI algorithm.
Inputs are the images themselves, along with any associated metadata
(customer ID, timestamp, location, etc.). If your existing business process
uses images, you may already be generating outputs—assigning tags, putting
images into folders, circling objects, etc. If not, you probably don’t have
data that’s organized well enough to train computer vision models. To use
your images for this purpose, you’ll need to label them. The costs of image
labeling vary depending on the domain. You can outsource simple image
labeling to crowdsourcing services like Amazon Mechanical Turk for pennies per
image. More complex image labeling will require the time of expert analysts.
For example, this chest x-ray has been labeled by a professional radiologist who identified a region containing pneumonia. Paying skilled radiologists to label x-ray images is obviously more expensive than paying someone to determine whether a picture contains a cat.
Unfortunately there is no easy way to know how much data you need. But data scientists have developed techniques for getting good results from ever smaller quantities of training data. The most useful technique in computer vision is transfer learning, a process that starts with a model already trained on another set of images. Other techniques include image generation, semi-supervised learning, and pseudo-labeling.
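As a rough illustration of transfer learning, here is a hedged Keras sketch: a backbone pretrained on ImageNet (MobileNetV2 is an arbitrary choice) stays frozen while a small new output layer learns your specific task, such as the approve/reject decision for warranty images:

```python
# Transfer learning sketch: reuse frozen, pretrained visual features and
# train only a small task-specific "head" on your own labeled images.
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # keep the pretrained features frozen

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # your new output
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(your_images, your_labels) now needs far less training data
# than training the whole network from scratch would.
```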
You will need less training data if you can start with an easy task. For instance, separating images into two categories (e.g., interesting and not interesting) takes less training data than categorizing images into 1000 categories.
Product pattern 2: Natural-language processing
Natural-language processing (NLP) applications process and interact with human-generated (i.e., natural) language data. In 2017 you probably noticed your smartphone getting better at recognizing your speech and turning it into text. Speech recognition is an example of NLP.
NLP is another classically hard computer science problem which has vexed researchers for decades. Although AI advances in NLP lag behind those of computer vision by two or three years, researchers are making rapid progress with deep learning.
NLP research continues to yield results across a wide range of applications.
To date, deep learning has made the biggest impact on traditional NLP
problems like machine translation and speech recognition. The large tech
companies are investing tremendous resources building training data for
mass-market services. But you don’t need to invest in solutions the way they
do, since you can use whatever solutions they create.
For example, in 2016 Google released a new version of Google Translate built with deep learning. Users instantly recognized a breakthrough, and many observers consider it to be the moment when practical AI arrived.6
Now, instead of building your own machine translation solution from scratch, you can just call the Google Translate APIs.
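For instance, here is a minimal sketch using Google’s Python client library. It assumes you have installed the google-cloud-translate package and configured credentials; details vary across client versions:

```python
# Translate a string with the Google Cloud Translation API (v2 client).
from google.cloud import translate_v2 as translate

client = translate.Client()
result = client.translate("Hola, ¿cómo estás?", target_language="en")
print(result["translatedText"])  # e.g., "Hello, how are you?"
```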
Before 2018 there were few practical applications for NLP in most enterprises, but the tools and research are advancing rapidly. Now is a great time to explore the many ways NLP will impact your company—because the potential is massive.
How many professionals spend most of their days interacting with natural-language data? Reading, summarizing, categorizing, or generating documents? Writing reports? Answering routine email? A lot.
Consider the US federal government. Seventy percent of the US federal workforce performs professional or administrative tasks, and less than 50% of these employees have a college degree. Many of these tasks will be automated with AI-powered NLP, and you can probably find many similar applications for the technology in your company.
NLP will also increase human productivity when we offload routine activities to AI. Imagine how much time you can save when a computer can read,
categorize, and summarize your email or attend a meeting and send you a
summary that includes the 10 most important sentences.
The opportunities for NLP are endless, but you’re not reading this book
because you care about distant-future functionality. You want to know
how NLP will impact your company in the next 18 months. Document
classification is the most likely answer.
Most companies spend significant resources reading and classifying documents such as forms, warranty claims, and routine email.
Practical, inexpensive NLP techniques for classifying documents are evolving rapidly, and you can expect many to reach the enterprise in 2018.6 Begin your AI strategy by looking for places where people are currently reading and classifying documents.
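To show how accessible a baseline has become, here is a tiny, self-contained document classifier using scikit-learn. The texts and labels are made up for illustration; a real system would train on thousands of your own labeled documents:

```python
# A baseline document classifier: TF-IDF features + logistic regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = [
    "Please cancel my subscription immediately.",
    "I love the new features in the latest release.",
    "My invoice shows a charge I don't recognize.",
    "Great support experience, thank you!",
]
labels = ["complaint", "praise", "complaint", "praise"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(docs, labels)
print(model.predict(["Please cancel my account immediately."]))  # complaint
```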
Generating training data can be more challenging in NLP than in computer vision. As a result, you will probably want to start looking for NLP applications by considering existing business processes where outputs are already being generated. For example, if you want to automatically process forms, start by identifying the outputs generated by the people who currently read and process these forms. The inputs are the text and other data in the forms.
Product pattern 3: Next-in-sequence predictions
Our third AI product pattern, next-in-sequence prediction, does not fit into the domain of traditional computer science challenges the way computer vision and NLP do. You probably won’t hear about it at conferences. Few
AI researchers will recognize the importance of the problems it solves, and
even fewer will write papers about it. Nevertheless, it is probably the most
practical, actionable product pattern because even modest results can
generate clear ROI.
Next-in-sequence methods address the common business problem of
predicting “next” results based on previous results in a structured dataset.
Traditional machine learning has been used in next-in-sequence problems
like detecting credit-card fraud for decades. Recently developers have
achieved better results using deep learning techniques which can uncover
complex relationships in data.
Examples of next-in-sequence applications include detecting credit-card fraud, forecasting sales for a store or product, and predicting equipment failures from sensor events.
All of these applications require a lot of structured data, and you might be
surprised how much of this sort of data you already have access to.
The term structured data can have many different meanings. In this context
structured data refers to the kind of data that is found in a database table or
a parsable format like a comma-separated values (CSV) file. This data can
be either continuous or categorical.
Continuous data has (theoretically) infinite possible values. Examples are
prices, task completion time, and temperature.
Categorical (or discrete) data has one of a finite set of values. Country, state,
and blood type are categorical values.
For our purposes the theoretical definition of structured data isn’t important.
If the data looks like it belongs in a database, it is a good candidate for
next-in-sequence methods. Conversely, data like text documents and
user-generated string variables (e.g., tweets, web comments) is a better
candidate for NLP. Images are not normally found in databases and are
better candidates for computer vision methods.
Since most business applications already have tons of data in databases, you
will encounter many opportunities for next-in-sequence approaches.
Sensor readings, Internet of Things (IoT) events, log files, sales events, and online user-behavior events are all good candidates for next-in-sequence applications.
A few techniques will help you identify how to apply next-in-sequence AI to
your business processes. These include reviewing your data dictionaries and
key performance indicators (KPIs) and considering new data sources that
could enhance predictions for your existing business processes.
Your product teams may already have documents which describe the fields
in your databases. These documents are usually called data dictionaries or
metadata repositories. They are designed to be read by a human being who
wants to know what is in the database and what the fields mean.
You can identify opportunities for next-in-sequence predictions by simply
reading the data dictionaries or having someone brief you on every field.
Start with the databases currently used for reporting or metrics. You know
the team that uses Tableau to produce pretty reports? Start with the
database they use.
For example, consider the data dictionary for sales at Corporación Favorita, Ecuador’s largest grocery store:

id            unique identifier for the row
date          the date of the sale
store_nbr     identifies the store where the sale occurred
item_nbr      identifies the product sold
unit_sales    the number of units sold
onpromotion   whether the item was on promotion on that date
In this table, can you identify inputs and outputs for a next-in-sequence
model? To identify an output, consider what result you might want to predict.
The best candidate for helping you plan your inventory is unit_sales.
What plays into that prediction? The inputs here are date, store_nbr, item_
nbr, and onpromotion since a model may be able to use each to predict
unit_sales.
How about id? It isn’t useful as an input or output.
Do your managers already use key performance indicators (KPIs) to track
and manage their operations? Do they track same-store sales, the number
of documents processed, or cases closed per month? These sorts of KPIs are
often derived from high-volume business events, so they can make good AI
outputs to help you predict future performance or operational needs.
Consider new data sources
New data sources are constantly emerging. Look for unobvious data sources
from third parties which may predict future events. Examples of these sorts
of data sources include weather, commodities pricing, and consumer credit
scores. These data sources could be inputs for your next-in-sequence AI models.
Training data for next-in-sequence models
Finding enough training data generally isn’t a problem for next-in-sequence
applications. You won’t need people to hand-label or generate new data,
because outputs can be derived from the data. For instance, Corporación
Favorita’s data dictionary contains all necessary training data; in this case
you would just train your model to predict past unit_sales.
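A hedged sketch of that setup in Python, using pandas and scikit-learn: train on earlier dates, then check predictions against later dates. The file name, the cutoff date, and the naive numeric treatment of the store and item IDs are all illustrative assumptions:

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical export of the Favorita-style sales table described above.
df = pd.read_csv("sales.csv", parse_dates=["date"])
df["onpromotion"] = df["onpromotion"].fillna(False).astype(int)
df["day_of_week"] = df["date"].dt.dayofweek  # a simple derived input

features = ["store_nbr", "item_nbr", "day_of_week", "onpromotion"]
train = df[df["date"] < "2017-08-01"]   # the past: used for training
valid = df[df["date"] >= "2017-08-01"]  # the "next" period to predict

model = GradientBoostingRegressor()
model.fit(train[features], train["unit_sales"])
print(model.score(valid[features], valid["unit_sales"]))  # R^2 on later dates
```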
Your challenge is picking the right data and organizing it to allow for efficient
model training. Overcoming this challenge requires data analysis and
engineering work.
Many output options
Next-in-sequence systems often have many potential outputs. You must identify the best candidates among them. For example, you could predict the sales volume for a particular item in a store. Or you could predict sales for the whole store, sales turnover for a region, or the time until the next sale.
Identifying the right output requires detailed exploration of the data and
knowledge of the business processes. Ideally a cross-functional team
including analysts, data scientists, programmers, and operations managers
will collaboratively select the best outputs.
In practice? For me it is usually faster to build a dozen models which predict
every potentially useful output and then begin testing to identify which
output creates the most value.
Input feature engineering
Once you pick candidate outputs, your data scientists or machine learning
engineers will have to invest time exploring candidate inputs. For example,
you may have to derive new inputs such as “average time between
anomalous events” for every sensor.
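For instance, a pandas sketch of that derivation might look like the following; the event log here is made up for illustration:

```python
import pandas as pd

# Hypothetical event log: one row per anomalous sensor event.
events = pd.DataFrame({
    "sensor_id": ["A", "A", "A", "B", "B"],
    "timestamp": pd.to_datetime([
        "2018-01-01 00:00", "2018-01-01 06:00", "2018-01-02 00:00",
        "2018-01-01 00:00", "2018-01-03 00:00",
    ]),
})

# Derive the new input: average time between anomalous events, per sensor.
events = events.sort_values("timestamp")
gaps = events.groupby("sensor_id")["timestamp"].diff()
avg_gap = gaps.groupby(events["sensor_id"]).mean()
print(avg_gap)  # sensor A: 12 hours, sensor B: 2 days
```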
The process of creating new inputs is called feature engineering. (Recall
from Part 1 that inputs are also called features.) Your data scientists will be
familiar with these techniques. Feature engineering can require substantial
engineering resources and could be the most expensive part of your next-in-sequence
application.
Feature engineering is often unnecessary in deep learning systems that use computer vision or NLP. As a result many deep learning advocates claim that feature engineering is no longer necessary for any deep learning application. But for next-in-sequence applications this currently isn’t true.
For example, consider the feature engineering required for a convenience store’s sales database. Why do beer sales vary so much by date? The answer isn’t obvious from the raw dates alone until we perform some simple feature engineering. Let’s try deriving the day of the week for each date.
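Here is what that derivation looks like in pandas; the sales figures are made-up placeholders for illustration:

```python
import pandas as pd

# Hypothetical convenience-store sales rows.
sales = pd.DataFrame({
    "date": pd.to_datetime(["2018-03-01", "2018-03-02", "2018-03-03",
                            "2018-03-04", "2018-03-05"]),
    "beer_units": [40, 115, 130, 55, 45],
})

# The derived input: the day of the week for each date.
sales["day_of_week"] = sales["date"].dt.day_name()
print(sales)  # the high-sales rows land on Friday and Saturday
```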
Convenience store beer sales are higher on Fridays and Saturdays—not
exactly an earth-shattering revelation for anyone who has visited a 7-Eleven
on a Saturday afternoon.
You can imagine that other factors such as weather, holiday schedules, and
paydays would all be potentially useful inputs. If your database doesn’t include
these inputs, you will have to buy or derive them through feature engineering.
Product pattern 4: Collaborative filtering
Our final AI product pattern currently applies to a smaller niche of applications than the previous three patterns. Collaborative filters make predictions about user behavior by collecting behavioral events for many users. Online recommendation systems are the most common example of collaborative filter applications.

Consider a service which sells digital products like pictures, music, movies, or e-books online. The service wants to boost sales by recommending the most relevant product to a user. Collaborative filters make these recommendations by comparing purchases among users who have similar interests.
My wife likes watching romantic comedies (aka romcoms) on Netflix. I hate romcoms. Whenever she turns on a romcom I throw a temper tantrum like a five-year-old (unless it stars Vince Vaughn). A collaborative filter can identify preferences like ours as data relationships and make better recommendations so we can both enjoy watching a movie together.
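To show the mechanics, here is a deliberately tiny user-based collaborative filter in plain NumPy. The viewing matrix is made up, and a real system would use far larger, sparser data and more sophisticated models:

```python
import numpy as np

# Hypothetical user-item matrix: rows are users, columns are movies.
# 1 means the user watched and liked the title. Items 0-1 are romcoms,
# items 2-3 are action films.
ratings = np.array([
    [1, 1, 0, 0],   # user 0: romcom fan
    [1, 1, 0, 1],   # user 1: romcom fan who also liked one action film
    [0, 0, 1, 1],   # user 2: action fan
])

def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

# Recommend for user 0 by borrowing from the most similar users.
target = ratings[0]
sims = np.array([cosine_sim(target, row) for row in ratings])
sims[0] = 0.0                   # ignore the user's similarity to themselves
scores = sims @ ratings         # weighted sum of other users' preferences
scores[target == 1] = -np.inf   # ignore titles already watched
print("recommend item:", int(np.argmax(scores)))  # item 3
```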
Recommendation engines are just the most visible example of collaborative-filter applications.
Like next-in-sequence predictions, collaborative filters primarily use information that’s already available in your databases. You will have to use similar techniques for building your training data. Traditional approaches to collaborative filtering suffer from challenges of sparse data, scalability, and
adaptability. Deep learning models overcome many of these challenges.
When you’re considering how collaborative filters can improve your business processes, look for processes which give users (or any entity) many competing choices and which record their decisions.
Training data is not normally a big challenge for collaborative filters. A more challenging problem is dealing with new users who have no history of past choices in your system (the classic cold-start problem).
In practice, collaborative filters and next-in-sequence predictions require similar engineering efforts. The two approaches have a lot in common. In fact, either collaborative filters or next-in-sequence predictions can be used to build solutions like recommendation engines.
The distinction between the two model approaches is time: next-in-sequence prediction models make predictions based on a time series of events. For example, a content recommendation engine built with next-in-sequence models would track timestamped events of online user behavior (pages visited, past purchases, mouse movements, etc.) and make content recommendations based on that behavior. Collaborative filters make recommendations based on the behavior of other users, without regard to time.
In practice many deep learning applications are a hybrid of collaborative filters and next-in-sequence predictions.
Neural networks are incredibly flexible, and researchers are always
discovering new ways to design them. Current deep learning development
tools can handle multi-input and multi-output models. Thus you can combine different AI product patterns in a single system: for example, one that includes the computer vision, NLP, and next-in-sequence product patterns.
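Here is a hedged sketch of how such a combined model might be wired together using the Keras functional API; every shape and layer size is an arbitrary placeholder:

```python
# One model, three input branches: image (computer vision), text (NLP),
# and structured fields (next-in-sequence), merged into a single output.
import tensorflow as tf

image_in = tf.keras.Input(shape=(128, 128, 3), name="image")
x1 = tf.keras.layers.Conv2D(16, 3, activation="relu")(image_in)
x1 = tf.keras.layers.GlobalAveragePooling2D()(x1)

text_in = tf.keras.Input(shape=(100,), dtype="int32", name="token_ids")
x2 = tf.keras.layers.Embedding(10000, 32)(text_in)
x2 = tf.keras.layers.LSTM(32)(x2)

tabular_in = tf.keras.Input(shape=(8,), name="structured_fields")
x3 = tf.keras.layers.Dense(16, activation="relu")(tabular_in)

merged = tf.keras.layers.concatenate([x1, x2, x3])
output = tf.keras.layers.Dense(1, activation="sigmoid")(merged)

model = tf.keras.Model([image_in, text_in, tabular_in], output)
model.compile(optimizer="adam", loss="binary_crossentropy")
```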
Google has recently demonstrated how to combine computer vision and NLP
product patterns.10 Do a quick YouTube search for “Looking to Listen” to see
an example.
In the following image of Google’s Looking to Listen model, you can see that the convolutional networks used for computer vision are connected to an LSTM, a network architecture popular in NLP.
Using the computer vision and NLP product patterns, Google demonstrates
how AI can mimic the “cocktail party effect”: the human capacity to isolate a
voice in a noisy room by looking directly at the speaker.
The realm of AI research covers far more than the product patterns we’ve
discussed here. AI researchers make daily breakthroughs, and some will lead
to new practical applications. But—as is common with primary research—
most discoveries will never make it out of the lab.
Colleagues may discuss speculative AI applications that are still on the horizon.
Some of these discoveries get enormous coverage in the business press. For
example, Google’s AlphaGo stunned the world by beating world-champion
Go player Lee Sedol in 2016. Two years later, I know of not a single commercial application built on AlphaGo technology. Don’t waste resources on these
bleeding-edge AI innovations unless you have a compelling, specific business
case. If you decide to pursue them anyway, you’ll need to build your own AI
research team to help advance the state of the art.
The four product patterns are a great starting point for identifying AI
opportunities in your organization. I use them as a first step in sorting
through a client’s data and business processes. After laying this foundation I
try to learn how other companies are applying AI to business problems that
are similar to my client’s problems.
The usual resources like conferences, PR, blog posts, and job descriptions
can provide valuable insights about your competitors’ current AI initiatives. In
addition to these, I also rely on a few online resources for inspiration. These
resources include Kaggle, AngelList, and arXiv.
Kaggle (now owned by Google) hosts competitions in data science and
machine learning. Many of the competitions are based on real data
submitted by companies that are looking to crowdsource their innovation.
Kaggle is my go-to resource for preparing an AI strategy. I can learn more about
a new industry in three hours on Kaggle than in three days at a conference.
Kaggle gives you plenty to explore: competitions, real-world datasets, shared code, and discussions among competitors.
All of this information can inform your AI strategy, but Kaggle can be a
bit intimidating if you don’t have a data science background. To cut your
teeth here, start by exploring Kaggle competitions and data, and read the
discussions among competitors.
Start by visiting Kaggle.com/Competitions to look for problems similar to those at your company.
You can take a look at the competition data after you register and accept
the terms of the competition. Most data is in common file formats like CSV,
TXT, or JPG. Just download and open it in applications like Microsoft Excel or
Word. Ask yourself whether you have similar data in your company.
Kaggle competitors talk about their approaches, share code, and publish exploratory data analysis (EDA). Many share their complete solutions. In just a few hours of reading you can get a very good sense of what is possible.
If you are just setting up a new AI engineering team, you can prime their
creativity and infrastructure by re-creating the Kaggle contest scenario and
challenging them to achieve the same results. Look for a completed contest
where the winning team published their results. The exercise will accelerate
your team’s progress by forcing them to get their infrastructure working and
will familiarize them with best practices.
AngelList (Angel.co) is a web site where startups can connect with talent
and investors. Most startups create an AngelList account, so it’s a good
clearinghouse for emerging technology that investors are funding. It’s also a
great source for learning about emerging AI ideas.
You’ll find an online repository of scientific papers at arXiv.org. While most of
the information on arXiv is too speculative or untested for practical solutions,
you can get an understanding of what problems researchers are exploring.
Sometimes they reference specific datasets or results of their studies, and
you might find these helpful. For example, Google’s “Looking to Listen”
research results are available on arXiv.