The Right Questions for Big Data Yield the Best Answers

Stephen DeAngelis

October 19, 2015

“Some organizations are driving more value out of big data than others,” writes Lisa Morgan (@lisamorgan). “They’re the ones redefining how businesses interact with their customers. They’re the ones using data to transform their business models and to innovate.”[1] She goes on to point out that just as important as the data are the questions that are asked of that data. She explains:

“Asking better questions of data is both an art and a science, and it’s an iterative process. The most sophisticated and competitive companies are constantly striving to improve their understanding of what data can tell them, and what they can ask of the data.”

Every analyst and researcher knows that the better the question, the better the answer. Bernard Marr (@BernardMarr) agrees that asking the right question is the key to unlocking better answers.[2] He discusses how one of the story lines in Douglas Adams’ The Hitchhiker’s Guide to the Galaxy is a good parable for the dilemma in which many companies find themselves today. In the book, Arthur Dent and his friend Ford Prefect manage to escape as the Earth is destroyed, and they eventually find themselves on the planet Magrathea. While there, they learn that a supercomputer named Deep Thought had determined the ultimate answer to life. The answer was the number 42. And, like a galactic version of IBM’s Watson playing the game show Jeopardy!, Earth had been created as an even greater computer to calculate the question to which 42 is the answer; unfortunately, Earth was destroyed before the question was formulated. Marr writes:

“This is a wonderful parable for big data because it illustrates one quintessential fact: data on its own is meaningless. Remember the value of data is not the data itself — it’s what you do with the data. For data to be useful you first need to know what data you need, otherwise you just get tempted to know everything and that’s not a strategy, it’s an act of desperation that is doomed to end in failure. Why go to all the time and trouble collecting data that you won’t or can’t use to deliver business insights? You must focus on the things that matter the most otherwise you’ll drown in data. Data is a strategic asset but it’s only valuable if it’s used constructively and appropriately to deliver results. This is why it’s so important to start with the right questions. If you are clear about what you are trying to achieve then you can think about the questions to which you need answers.”

Of course, companies aren’t looking for a single answer; they are looking for lots of answers to lots of different questions. As Fabio Luzzi, Viacom’s Vice President of advanced analytics and data science, told Morgan, “Most things don’t start and end with a single question. The quality of your questions gets better along the way.” Or at least it should. Morgan offers six steps to help you ask better questions. They are:

1. Start with a goal. “The company with the most data doesn’t necessarily win,” Morgan writes. “It’s the one that understands how to use data, and for what purpose. Having a clearly stated objective helps narrow the universe of possibilities into a set of relevant choices that can be explored and refined.” Like Marr, she stresses that you need to understand what data sources are going to help you achieve your goal.

2. Understand the forest and the trees. Morgan points out that today companies are able to do analyses that were not possible just a few years ago. One of the technologies that has emerged to make this possible is cognitive computing (i.e., systems like IBM’s Watson and the Enterra Solutions® Enterprise Cognitive System™). Olly Downs (@mathandporsches), chief scientist at Globys, told Morgan, “There’s a push to get to the nano level of targeting, because the more specific you can get targeting audiences and the way your interactions address those audiences, the more effective they are.” One of the benefits of a cognitive computing system like Enterra’s ECS, which uses a hypotheses engine, is that it can ask questions on its own, including questions that decision makers might not have realized would yield valuable insights. Shahram Rahimi (@SharamRahimi), a professor at Southern Illinois University Carbondale, is “working on algorithms to help computers apply artificial intelligence to figure out what important answers are buried in big data — even though the users may not know what questions to ask.”[3] In Rahimi’s case, he is concentrating on analyzing medical records to discover insights that could improve care and satisfaction rates among hospital patients.

3. Collaborate with goals in mind. “Data teams and business units need to work together to meet business goals,” writes Morgan. A cognitive computing system can ingest, integrate, and analyze all of the data that a company uses and make it available (as a single version of the truth) to the entire enterprise. That is exactly what business units need to collaborate successfully.

4. Take advantage of machine learning. Machine learning, of course, is one of the core capabilities of all cognitive computing systems. Morgan explains, “Machine learning allows companies to discover patterns, develop new and better models, and improve their predictive capabilities, among other things. The massive scale and speed allow organizations to explore problems in ways that would not otherwise be feasible, which sometimes leads to intriguing new questions.”

5. Align questions with data. I would turn that recommendation on its head — align data with questions. Once you know what questions need answering, you should look for data sets that are most likely to contain the data you need. “Sometimes it’s not possible to answer certain questions,” Morgan writes, “because the data is not available. Even when the data is available, companies aren’t always sure they’re asking the right questions of it.”

6. Don’t skew the results. Morgan observes, “Human frailties and non-representative data sets tend to skew results and lead to faulty conclusions.” In the PBS Masterpiece Mystery drama Arthur & George, Sir Arthur Conan Doyle tries to convince a judge that a man found guilty of a crime and imprisoned for three years is actually innocent. Sir Arthur presents the magistrate with evidence that he believes leads logically to that conclusion. The judge, however, tells Sir Arthur that he understands that, when writing his famous Sherlock Holmes stories, Sir Arthur always started with the conclusion and then wove the story backwards so that the facts of the case fit the conclusion he wanted. Sir Arthur admits that this was his methodology. The judge then asks whether he might be doing the same thing with the evidence he is presenting about the innocence of the party in question (i.e., selectively providing evidence that leads to a previously desired conclusion). Morgan explains, “Confirmation bias, a form of cognitive bias, influences the approach to problem solving, as well as the way individuals view data and results. When the purpose of an analysis is to prove a hypothesis, the bias influences the data sets, tests, and outcomes.” We can all be guilty of wanting the facts to fit our conclusions; but, in a business, that can be a tragic mistake.

Marr concludes:

“Really successful companies today are making decisions based on facts and data-driven insights. Whether you have access to tons of data or not, if you start with strategy and identify the questions you need answers to in order to deliver your outcomes then you will be on track to improve performance and harness the primary power of data. Every manager now has the opportunity to use data to support their decision-making with actual facts. But without the right questions, all those ‘facts’ can conceal the truth. A lot of data can generate lots of answers to things that don’t really matter; instead companies should be focusing on the big unanswered questions in their business and tackling them with big data.”

Business executives should be excited about the opportunities that big data analyses create for them. Because of the emergence of cognitive computing technologies, they no longer need to have a computer science degree to access the power of big data. Cognitive computing systems use natural language processing to make engagement easy and insights clearly understood.

Footnotes
[1] Lisa Morgan, “6 Ways To Ask Smarter Questions Of Big Data,” InformationWeek, 9 September 2015.
[2] Bernard Marr, “Big Data: Too Many Answers, Not Enough Questions,” Forbes, 25 August 2015.
[3] Dian Schaffhauser, “Big Data Research Project Looks for Answers to Questions Nobody Knows to Ask,” Campus Technology, 8 September 2015.