Machine Learning: A Primer for the Technically Challenged, Part 3

Stephen DeAngelis

July 8, 2015

This article is a continuation of a discussion on machine learning that I began in two other articles with the same title (Part 1 and Part 2). “As data is transformed into insight,” writes Narendra Mulani (@npmulani), senior managing director at Accenture Analytics, “it becomes a strategic asset for a business. To cash-in on this currency of the new economy, organizations are taking steps to make analytics a core competency across the enterprise.”[1] Jason Hannula (@jason_hannula) agrees with Mulani that big data analytics can be a boon for any business, but he reports that many midsize businesses don’t have access to analytics.[2] He explains:

“With the current focus on acquiring data scientists to lead business analytics programs across all industries, midsize businesses are having difficulty competing against larger organizations with deeper pockets in the limited talent pool. As Fredric Paul notes on NetworkWorld, anyone with ‘data science’ in their job title and a few years of experience is getting 100 recruiter emails per day. With both more money and perhaps more interesting problems to tackle, top data talent is regularly poached by top firms.”

Interestingly, the focus of both Mulani’s and Hannula’s articles is the same: machine learning. Mulani explains that swimming through an ocean of data searching for insights is best left to machines that don’t tire and can work 24/7. “Machine learning,” he writes, “is a topic that clients are bringing up more frequently when discussing how to pursue this data-driven advantage. Forty-one percent of organizations are using machine learning, 36 percent are experimenting with it, and 16 percent are considering using it, according to a recent Accenture survey.” Hannula, for his part, suggests that machine learning can be an affordable alternative to hiring and trying to retain sought-after data scientists. “Even if a midsize business is able to recruit data science talent with both data and business domain knowledge,” he explains, “retention is an ongoing concern and significant business risk. In this competitive human resources climate, data analysis automation through machine learning becomes more attractive.”

In the following video, Demis Hassabis, cofounder of DeepMind (which was acquired by Google in 2014), explains that a computer can be programmed to help solve problems in two ways: first, by hard-wiring solutions into the computer and, second, by programming the computer with algorithms that allow it to discover solutions on its own through machine learning.

Peter Fingar (@PeterFingar) notes that machine learning involves both the right kinds of algorithms and the capability to ingest data that is buried in all sorts of formats.[3] He explains:

“Machine reading capabilities have a lot to do with unlocking ‘dark’ data. Dark data is data that is found in log files and data archives stored within large enterprise class data storage locations. It includes all data objects and types that have yet to be analyzed for any business or competitive intelligence or aid in business decision making. Typically, dark data is complex to analyze and stored in locations where analysis is difficult. The overall process can be costly. It also can include data objects that have not been seized by the enterprise or data that are external to the organization, such as data stored by partners or customers. IDC, a research firm, stated that up to 90 percent of big data is dark.”

Lars Hård (@larshard) asks the million-dollar question, “What exactly is machine learning, how is it being applied within organizations today, and what does it mean for the future of business?”[4] He adds, “It is becoming ever more crucial for enterprise leaders to understand machine learning, particularly the benefits that it can provide for companies today. Machine learning today is already allowing many businesses to achieve higher productivity and efficiency, innovating their business, and those that do not begin to explore this new tool ultimately are at risk for falling behind their competition.”

Mike Matchett (@smworldbigdata), a senior analyst and consultant at Taneja Group, agrees. He writes, “Machine learning is a key part of how big data brings operational intelligence into our organizations. But while machine learning algorithms are fascinating, the science gets complex very quickly. We can’t all be data scientists, but IT professionals need to learn about how our machines are learning.”[5] He adds, “The basic premise of machine learning is to train an algorithm to predict an output value within some probabilistic bounds when it is given specific input data.”

To help people unfamiliar with machine learning understand it better, Dr. Jason Brownlee (@TeachTheMachine) explains, “There are different ways an algorithm can model a problem based on its interaction with the experience or environment or whatever we want to call the input data. It is popular in machine learning and artificial intelligence text books to first consider the learning styles that an algorithm can adopt.”[6] I mentioned these learning styles in Part 1 of this series, but didn’t elaborate on them. Brownlee goes into a little more detail about them. He writes:

  • Supervised Learning: Input data is called training data and has a known label or result such as spam/not-spam or a stock price at a time. A model is prepared through a training process where it is required to make predictions and is corrected when those predictions are wrong. The training process continues until the model achieves a desired level of accuracy on the training data. Example problems are classification and regression. Example algorithms are Logistic Regression and the Back Propagation Neural Network.
  • Unsupervised Learning: Input data is not labelled and does not have a known result. A model is prepared by deducing structures present in the input data. Example problems are association rule learning and clustering. Example algorithms are the Apriori algorithm and k-means.
  • Semi-Supervised Learning: Input data is a mixture of labeled and unlabeled examples. There is a desired prediction problem but the model must learn the structures to organize the data as well as make predictions. Example problems are classification and regression. Example algorithms are extensions to other flexible methods that make assumptions about how to model the unlabeled data.
  • Reinforcement Learning: Input data is provided as stimulus to a model from an environment to which the model must respond and react. Feedback is provided not from a teaching process as in supervised learning, but as punishments and rewards in the environment. Example problems are systems and robot control. Example algorithms are Q-learning and Temporal difference learning.
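
For readers who would like to see the first two styles in miniature, here is a rough Python sketch of my own; it is not taken from Brownlee's article. The supervised half fits a one-feature logistic regression to a handful of labeled examples, and the unsupervised half runs k-means on unlabeled numbers. The data, the function names (train_logistic, predict_spam, kmeans_1d), the learning rate, and the starting centers are all illustrative assumptions, and a real project would more likely rely on an established library.

```python
import math


def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Supervised learning: fit p(label=1 | x) = sigmoid(w*x + b) to labeled data."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # current prediction
            error = p - y                              # how wrong the prediction was
            w -= lr * error * x                        # correct the model slightly
            b -= lr * error
    return w, b


def predict_spam(w, b, x):
    """Return True when the model's estimated spam probability is at least 0.5."""
    return 1.0 / (1.0 + math.exp(-(w * x + b))) >= 0.5


def kmeans_1d(points, centers, iterations=10):
    """Unsupervised learning: group unlabeled numbers around the given centers."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers


# Hypothetical labeled data: suspicious-word counts per email, 1 = spam.
word_counts = [0, 1, 1, 2, 4, 5, 5, 6]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train_logistic(word_counts, labels)
print("0 suspicious words -> spam?", predict_spam(w, b, 0))
print("6 suspicious words -> spam?", predict_spam(w, b, 6))

# Hypothetical unlabeled data: order values with no categories attached.
order_values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers = kmeans_1d(order_values, centers=[0.0, 5.0])
print("cluster centers:", centers)
```

The contrast to notice is in the inputs: the first routine needs the labels in order to be corrected when its predictions are wrong, while the second is handed only raw numbers and has to find the grouping by itself.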

I noted in my previous article that I would add a fifth learning style — a hybrid approach that involves semantic reasoning to bridge the gap between pure mathematical techniques and semantic understanding.

Hannula concludes, “The application of machine learning to business data is showing promise in generating actionable business insights — the end goal for predictive data analysis. In application, AI can become adept at working with real-time data from the Internet of Things or customer social engagement to guide future business decisions. Intelligent machines also have the ability to scale effectively as data volumes increase in these business areas.” Hård explains what some of the current business applications for machine learning involve. He writes:

“By incorporating the data that an organization already has at hand, and applying predictive algorithms, organizations are beginning to be able to create adaptive pricing models depending on customer behavior. In turn, they also can predict what future customer demand may be and adapt their inventory of product as such. Machine learning allows companies to take their data to the next level and develop even more intelligent insights by gathering, processing and analyzing the data and improving and learning over time.”
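
As a very rough illustration of the demand prediction Hård describes, the following Python sketch fits a straight-line trend to past weekly sales using ordinary least squares and extends it one week ahead. The sales figures and the fit_line helper are hypothetical, and a simple trend line is my own stand-in for whatever predictive algorithms a given company actually uses.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: units sold in each of the last five weeks.
weeks = [1, 2, 3, 4, 5]
units_sold = [100, 112, 119, 131, 140]

slope, intercept = fit_line(weeks, units_sold)
forecast_week_6 = slope * 6 + intercept  # extend the learned trend one week ahead
print("forecast for week 6:", round(forecast_week_6, 1))
```

Refitting the line each week as new sales arrive is a crude version of the “improving and learning over time” that Hård mentions.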

In Part 4 of this series, I will discuss how machine learning lies at the heart of cognitive computing and also discuss some of the popular machine learning algorithms that are currently being used to generate insights.

Footnotes

[1] Narendra Mulani, “Demystifying and Adopting Machine Learning,” Information Management, 10 June 2015.
[2] Jason Hannula, “Machine Learning: Faster Than a Data Scientist,” PivotPoint, 14 August 2014.
[3] Peter Fingar, “Peter Fingar: The Cognitive Computing Era is Upon Us,” PSFK, 20 May 2015.
[4] Lars Hård, “Artificial intelligence in the enterprise — what you need to know,” BetaNews, September 2014.
[5] Mike Matchett, “Intro to machine learning algorithms for IT professionals,” TechTarget, June 2015.
[6] Jason Brownlee, “A Tour of Machine Learning Algorithms,” Machine Learning Mastery, 25 November 2013.