Gaining Big Insights from Big Data

Stephen DeAngelis

June 17, 2019

Data is the lifeblood running through the veins of the digital economy. Nevertheless, pooled data, like pooled blood, is of little value unless it’s put to use. The question is: How does a company extract the greatest value from its data? Extracting insights from large data sets isn’t easy. According to Scott E. Page (@Scott_E_Page), a professor of complex systems, political science and economics at the University of Michigan, many companies view large data sets as a “hairball of data.”[1] He adds, “The problems and challenges that we confront are complex. And by that, I mean high-dimensional, lots of interdependencies, difficult to understand. So, what do we do? How do we use that data to confront the complexity?” Good questions all. Page’s short answer to those questions is this: “You have to arrange that data on some sort of model. You want to think of a model as Charlie Munger, the famous investor, describes it — a latticework of understanding on which you can array the data.”

Identifying the best models to obtain the best insights

Business leaders often lack an understanding of how analytics works. They hear a lot about artificial intelligence (AI), machine learning, and cognitive computing and simply assume you put data in one end and get insights out the other, much like putting pay dirt in the hopper of a mining trommel and getting gold in the sluice box. They don’t appreciate everything that goes on in the middle (i.e., the models). I like the gold mining analogy. Good miners don’t just throw any old dirt into the trommel; they find the right dirt, the dirt with the most potential for containing gold. Likewise, companies need to look for the data with the greatest potential for containing the insights they hope to find. William Schmarzo (@schmarzo), CTO of IoT and Analytics at Hitachi Vantara, explains that he likes the definition of data science found in the book Moneyball: “Data Science is about identifying those variables and metrics that might be better predictors of performance.”[2] Schmarzo writes:

“This straightforward definition sets the stage for defining the roles and responsibilities of the business stakeholders and the data science team:

  • Business stakeholders are responsible for identifying (brainstorming) those variables and metrics that might be better predictors of performance, and
  • The Data Science team is responsible for quantifying which variables and metrics actually are better predictors of performance

This approach takes advantage of what the business stakeholders know best — which is the business. And this approach takes advantage of what the data science team knows best — which is data transformation, data enrichment, data exploration and analytic modeling. The perfect data science team!”
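To make that division of labor concrete, here is a minimal sketch of the "quantifying" half. It assumes the business side has brainstormed a couple of candidate variables and that the data science team scores each one by its Pearson correlation with a performance measure. The variable names and numbers are invented purely for illustration, and correlation is only one of many ways a team might score predictors.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_predictors(candidates, performance):
    """Rank candidate variables by the strength (absolute value) of their correlation."""
    scores = {name: pearson(values, performance) for name, values in candidates.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data: one performance measure and two brainstormed variables.
performance = [10, 12, 15, 19, 24]
candidates = {
    "on_base_rate": [1, 2, 3, 4, 5],    # hypothetical variable that tracks performance
    "jersey_number": [7, 3, 9, 3, 7],   # hypothetical variable that should not matter
}

ranking = rank_predictors(candidates, performance)
for name, score in ranking:
    print(f"{name}: correlation {score:+.2f}")
```

The point of the sketch is the workflow, not the statistic: stakeholders are free to suggest any variable that might matter, and the ranking, rather than anyone's opinion, decides which ones actually do.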

Notice that both Page and Schmarzo emphasize the importance of variables, metrics, and models. Identifying those things often isn’t as easy as it might seem. Referring back to the Moneyball definition of data science, Schmarzo writes, “Note: the word ‘might’ is probably the most important word in the data science definition. Business stakeholders must feel comfortable brainstorming different variables and metrics that might be better predictors of performance without feeling like their ideas will be judged.” To help business leaders identify the right variables, metrics, and models, digital futurist Chunka Mui (@ChunkaMui) draws five lessons from Vince Barabba’s book Wise Decision Making: Through the Systemic Use of Knowledge and Imagination.[3] The first two lessons deal with identifying the right metrics and variables. They are:

1. Get behind the curtain. Mui writes, “Behind every enterprise are people who actually design and make things, and who ultimately translate market intelligence into products. These are the people who create value — everyone else just shuffles it around. … Who are the wizards behind the curtain in your organization?” The “wizards” can help you determine the best metrics and variables.

2. Understand the operating design of your enterprise. Understanding your business and understanding your operating design are a bit different. According to Mui, “There are a range of possible operating designs. Does the enterprise operate in an environment characterized by relatively slow and evolutionary change, for example? That puts it in a ‘make and sell’ operating model where success depends on economies of scale and correctly predicting demand. … Or, to illustrate the other end of the spectrum, does the enterprise operate in a very complex and uncertain environment with imminent disruptive opportunities and challenges due to technology, regulation, demographics, etc.? Success in these circumstances requires an ‘anticipate and lead’ operating design where the enterprise needs to reinvent itself, based on both customer’s articulated and unarticulated needs, and lead customers there.”

The third lesson deals with selecting and using the right model.

3. Never say “The Model says”. Once the right metrics and variables are selected, the right model still needs to be applied. Schmarzo explains, “The challenge for the data science team is to not settle on the first model that ‘works.’ The data science team needs to constantly push the envelope and as a result, fail enough in their testing of different combinations of variables to feel personally confident in the results of the final model.” Business challenges may require different algorithms to gain desired insights. To ensure the right model is used, my company, Enterra Solutions®, leverages the Representational Learning Machine™ (RLM) created by Massive Dynamics™. The RLM helps determine what type of analysis is best-suited for the data involved in a high-dimensional environment. Mui writes, “Remember that no model can accurately capture the complexity of a real situation and, therefore, can never provide definitive answers. Wise decision making depends on analyzing and interpreting model results into valid conclusions.”
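Schmarzo's advice about not settling on the first model that "works" can be illustrated with a small sketch. It assumes a simple setup in which each candidate variable gets its own straight-line fit on the earlier observations and is then judged by its error on later observations the fit never saw. This is a generic holdout comparison with invented numbers and hypothetical variable names; it is not a description of how the RLM or any particular platform selects models.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y - slope * mean_x, slope


def holdout_mse(xs, ys, split):
    """Fit on the first `split` points, then score on the remaining points."""
    intercept, slope = fit_line(xs[:split], ys[:split])
    errors = [(intercept + slope * x - y) ** 2 for x, y in zip(xs[split:], ys[split:])]
    return sum(errors) / len(errors)


def compare_variables(candidates, target, split):
    """Return (name, held-out error) pairs, best (lowest error) first."""
    results = {name: holdout_mse(values, target, split) for name, values in candidates.items()}
    return sorted(results.items(), key=lambda item: item[1])


# Hypothetical toy data: six observations of a target and two candidate variables.
target = [3, 5, 7, 9, 11, 13]
candidates = {
    "price_index": [1, 2, 3, 4, 5, 6],   # hypothetical variable
    "store_count": [2, 1, 4, 3, 6, 5],   # hypothetical variable
}

results = compare_variables(candidates, target, split=4)
for name, mse in results:
    print(f"{name}: held-out mean squared error {mse:.2f}")
```

Both variables produce a line that fits the first four observations reasonably well, so either could pass as a model that "works." Only the held-out comparison shows which one keeps predicting well, and even then, as Mui cautions, the result is an input to judgment rather than a definitive answer.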

The final two lessons deal with insight implementation.

4. Make sure those who will implement the plan are involved in developing the strategy. Mui writes, “How many times have you seen talented strategy teams work hard to develop good plans, and then present them to a management team that was too experienced and comfortable in the old ways of doing things? … Strategy teams should not make plans; instead, they should guide the process that engages those responsible for reaching a strategy, allocating resources and implementing the associated course of action. Otherwise, it will never be their strategy — no matter how good. The strategy team’s job is to provide the relevant information, track underlying assumptions and ensure progress.”

5. Don’t just close the loop; do double-loop learning. Mui writes, “Numerous strategic failures might have been averted if the principals had followed the simple rule of making core assumptions explicit and knowing when those assumptions had changed.” By “closing the loop” (i.e., continually reevaluating the situation), strategies can pivot as circumstances change. Mui continues, “Closing the loop is not enough. … Closing the loop is critical but only helps to detect and correct errors for the strategy in question. It doesn’t help transfer learning across the organization. … Double-loop learning encourages and helps the entire enterprise learn to actively question and modify existing values, norms, processes, policies and objectives.”

Concluding thoughts

I hope I’ve made it clear that getting the metrics, variables, and model right is essential to gaining big insights from big data. Page suggests companies should consider a multi-model strategy. He explains, “If you really unpack what’s going on in those sophisticated algorithms, they really are ensembles of little algorithms and little rules. The idea is, any one model is going to be wrong, but many models are going to be not only a lot of coverage, but also a collection of coherent understandings of a complex phenomenon.” Cognitive technologies are not magic; nevertheless, they can do things previous analytic platforms couldn’t, such as providing insights in ambiguous situations.
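Page's many-models point can be shown with a few lines of arithmetic. The sketch below assumes three models forecasting the same quantity and combines them with a plain average, which is the simplest possible ensemble. The model names and forecasts are invented for illustration; real ensembles typically weight or stack their members rather than averaging them equally.

```python
def ensemble_prediction(predictions):
    """Combine individual model forecasts with a simple unweighted average."""
    return sum(predictions) / len(predictions)


def squared_error(prediction, actual):
    """Squared difference between a forecast and what actually happened."""
    return (prediction - actual) ** 2


# Hypothetical forecasts of the same quantity from three different models.
actual = 100.0
model_predictions = {
    "trend_model": 112.0,     # hypothetical model that overshoots
    "survey_model": 91.0,     # hypothetical model that undershoots
    "seasonal_model": 103.0,  # hypothetical model that is close
}

ensemble = ensemble_prediction(list(model_predictions.values()))
ensemble_error = squared_error(ensemble, actual)
average_individual_error = sum(
    squared_error(p, actual) for p in model_predictions.values()
) / len(model_predictions)

print(f"ensemble forecast: {ensemble}")
print(f"ensemble squared error: {ensemble_error}")
print(f"average single-model squared error: {average_individual_error}")
```

Every model here is wrong, yet the averaged forecast has a smaller squared error than the models have on average, because their mistakes partly cancel. For squared error, a simple average can never do worse than the average of its members, although a single good model can still beat the ensemble, which is why the choice of models remains a matter of judgment.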

Footnotes
[1] Scott E. Page, “How to Get the Best Results from Big Data Analysis,” Knowledge@Wharton, 20 February 2019.
[2] William Schmarzo, “Identifying Variables That Might Be Better Predictors,” KDnuggets, February 2017.
[3] Chunka Mui, “5 Lessons On Making The Leap From Big Data To Wise Decision Making (Part 2 Of 2),” Forbes, 9 October 2017.