Big Data is Not About Size

Stephen DeAngelis

August 6, 2015

“The future of Big Data depends on Smart Data,” writes Jelani Harper.[1] The term “smart data” is a bit of a misnomer because data lying fallow in a data set is not smart at all. Data only becomes smart when it is analyzed for actionable insights. Harper takes that notion one step further by insisting that semantics must be added into the analytical mix in order to achieve truly “smart” results. “The power of Semantics,” she writes, “is inexorably transforming the notion of Big Data into Smart Data.” She claims that when semantics is added to other traditional forms of analytics the combination facilitates:

  • Unstructured and structured data aggregation and analytics: Smart Data supports rapid integration of either unstructured or semi-structured data (as most Big Data is), enabling organizations to expedite analytics and derive composite value from all of their data — even recently acquired Big Data.
  • Simplified and accelerated Data Modeling: The complexity and foresight of most Data Modeling jobs are significantly reduced by Smart Data, decreasing time to insight and time to value for Big Data applications.
  • Access and Data Governance: Smart Data provides valuable access control aligned with principles of Data Governance for integrated data sources, preserving the order and security that are vital to integration and data access in the long term.

Harper reports that Sean Martin, the Chief Technology Officer at Cambridge Semantics, explained to participants at the Enterprise Data World 2015 Conference that semantics is essential to provide context for data integration. Harper paraphrased his argument this way:

“Before the current prevalence of Smart Data, ‘dumb’ or non-Semantic data derived its meaning and contextual relevance to the enterprise via specific applications and the utilization of schema, programs, databases, etc. Outside of those particular applications the data inherently lost its meaning, which made data integration a tiresome chore and greatly accounts for the culture of silos.”

Data silos are seldom a good thing. They present a hurdle to corporate alignment and allow various departments to maintain their own version of the truth about what’s happening within an enterprise. Martin asserted the opposite is true for Smart Data. He told his audience:

“The big change when you move to a set of standards and smarter data is the data starts to contain what’s needed to identify it and explain what it means. And that is independent of any application, which makes it very powerful … in a world where we’ve got an enormous amount of silo information and integration is very expensive, difficult and time consuming. Having information that self identifies and can really carry with it everything you need to do integration and can be used with software that understands those standards but has no preconception of the underlying data model — it’s a huge difference.”
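Martin's point about data that "self identifies" can be illustrated with a minimal sketch. It assumes each source publishes its facts as (subject, property, value) statements whose identifiers come from a shared vocabulary. The `example.org` identifiers, the sample records, and the `integrate` and `describe` helpers are all hypothetical, invented here for illustration; they are not taken from Harper's article or from any Cambridge Semantics product.

```python
# Hypothetical shared vocabulary: every source uses the same identifiers
# for the same concepts, so the meaning travels with the data.
NAME = "http://example.org/vocab/name"
EMAIL = "http://example.org/vocab/email"
ORDER_TOTAL = "http://example.org/vocab/orderTotal"

CUSTOMER_42 = "http://example.org/customer/42"

# Two made-up silos, each expressed as (subject, property, value) statements.
crm_statements = {
    (CUSTOMER_42, NAME, "Ada Lovelace"),
    (CUSTOMER_42, EMAIL, "ada@example.org"),
}
billing_statements = {
    (CUSTOMER_42, EMAIL, "ada@example.org"),
    (CUSTOMER_42, ORDER_TOTAL, 129.5),
}


def integrate(*sources):
    """Merge any number of sources without knowing their data models."""
    merged = set()
    for statements in sources:
        merged |= statements  # set union also drops exact duplicates
    return merged


def describe(statements, subject):
    """Collect everything known about one subject, grouped by property."""
    facts = {}
    for s, prop, value in statements:
        if s == subject:
            facts.setdefault(prop, set()).add(value)
    return facts


merged = integrate(crm_statements, billing_statements)
profile = describe(merged, CUSTOMER_42)
```

Note that `integrate` contains no mapping logic for either source; it works only because both sources already agree on the identifiers. That agreement is the role the standards play in Martin's argument.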

According to Harper, “The prominence of Big Data in Data Management lies in the ability to implement action from real-time analytical insight and consolidate all of one’s data in the process.” Not all insights need to be generated in real time, but the crux of her argument is valid. The importance of big data lies not in its size but in the insights that can be derived from it using the right combination of analytics. With data sets predicted to become exponentially larger in the years ahead, the emphasis will inevitably shift from the size of the data set to the analytics that can be applied to it.

Jef Cozza (@jefcozza) reports, “Big data is often heralded as a transformative force that will usher in a new era of data-driven decision making for executives and business managers. However, many enterprises are finding themselves drowning in data, but with no better insight into the issues confronting their businesses, according to new report from Forrester Inc.”[2] The report Cozza refers to is titled “Digital Insights are the New Currency of Business,” and it was written by Forrester analysts Ted Schadler (@TedSchadler) and Brian Hopkins (@practicingEA). Drawing from the Forrester report, Cozza explains, “The firms that are most successful in leveraging the mountains of data gathered by big data applications use a combination of people, processes and technologies to systematically analyze their data sets.” Schadler writes:

“To harness the power of all your data to attract and serve customers — to be a digital business — you also need a new way of consistently harnessing insights that matter: insights teams using an insights-to-execution process anchored by a new digital insights architecture.”

Cozza adds, “In particular, the report’s authors found that successful firms go beyond big data and business intelligence practices to build the business discipline and technology to harness insights and convert their data sets into action. The approach works by linking business actions back to data and discovering and testing insights, before taking action.” Accenture’s latest technology vision, entitled “From Digitally Disrupted to Digital Disrupter,” insists that the technology businesses are looking for to solve their big data challenges is cognitive computing. The report states:

“What if … machines could be taught to leverage data, learn from it, and, with a little guidance, figure out what to do with it? That’s the power of machine learning — which is a major building block of the ultimate long-term solution: cognitive computing. Rather than being programmed for specific tasks, machine learning systems gain knowledge from data as ‘experience’ and then generalize what they’ve learned in upcoming situations. Cognitive computing technology builds on that by incorporating components of artificial intelligence to convey insights in seamless, natural ways to help humans or machines accomplish what they could not on their own. At its most advanced, cognitive computing will be the truly intelligent data supply chain — one that masks complexity by harnessing the power of data to help business users ask and answer strategic questions in a data-driven way.”

Dr. Bob Hayes (@bobehayes), Chief Research Officer of AnalyticsWeek and president of Business Over Broadway, puts it simply: “Big Data is less about the data itself and more about what you do with the data.”[3] He adds, “To get value from the data, you need to make sense of it, do something with it.” Mary Shacklett (@MaryShacklett), president of Transworld Data, writes, “The big data surge has fueled the adoption of Hadoop and other big data batch processing engines, but it is also moving beyond batch and into a real-time big data analytics approach.”[4] As I noted above, not all big data analysis needs to be real-time, but when real-time or near-real-time analysis is required, analytic latency can be a problem. Shacklett continues:

“Organizations want real-time big data and analytics capability because of an emerging need for big data that can be immediately actionable in business decisions. An example is the use of big data in online advertising, which immediately personalizes ads for viewers when they visit websites based on their customer profiles that big data analytics have captured.”

The emergence of the Internet of Things (IoT) will only heighten the need for real-time analytics. Jeff Kelley, a big data analytics analyst from Wikibon, told Shacklett, “The Internet of Things will enable sensor tracking of consumer type products in businesses and homes. You will be [able to] collect and analyze data from various pieces of equipment and appliances and optimize performance.” Having a system capable of integrating all of that IoT data, putting it in context, and analyzing it for actionable insights will provide a competitive edge for businesses in the years ahead. Although the data itself will remain a valuable resource, its real value comes from the nuggets of information that lie within it.

Footnotes
[1] Jelani Harper, “The Evolution of Big Data to Smart Data,” Dataversity, 26 May 2015.
[2] Jef Cozza, “Without Big Insight, Big Data Is Useless,” Top Tech News, 8 May 2015.
[3] Bob Hayes, “Statistics: Is This Big Data’s Biggest Hurdle?” Business 2 Community (B2C), 29 April 2015.
[4] Mary Shacklett, “Surge in real-time big data and IoT analytics is changing corporate thinking,” TechRepublic, 17 April 2015.