Big Data, Hadoop, and Cognitive Computing

Stephen DeAngelis

January 27, 2015

Anyone involved in business, technology, and/or analytics is well aware of the hype surrounding the term Big Data. Eugene Kim (@eugenekim222) writes, “For years, big data has been one of the hottest buzzwords across all industries. … But despite its hype, big data is still considered a relatively obscure concept.” [“‘Big Data’ Is One Of The Biggest Buzzwords In Tech That No One Has Figured Out Yet,” Business Insider, 20 August 2014] Because there still seems to be confusion over just what big data entails, some pundits recommend doing away with the term altogether. I’m sympathetic with that position, but, like lots of buzzwords, Big Data is a term that is likely to have legs no matter how much we would like to find another term to describe analytic challenges associated with the mountains of data that are being generated each and every day. Certainly no one has come up with a better term.

So let’s stop obsessing about the term and discuss more important issues, like how to get the most out of analytical processes normally associated with Big Data. Elizabeth Dwoskin (@lizzadwoskin) writes, “Underpinning the big-data craze is Hadoop, a software suite named for a toy elephant belonging to the son of a Yahoo programmer who helped develop the software in the mid-2000s.” [“The Joys and Hype of Software Called Hadoop,” The Wall Street Journal, 16 December 2014] The reason that Hadoop has been such a big hit is that it can help deal with data stored in non-traditional formats (i.e., data that is not organized into rows and columns like a typical spreadsheet). This is extremely important since most of the data being generated today is this kind of unstructured data. “Hadoop can spread uncategorized data across a network of thousands of cheap computers,” Dwoskin explains, “making it a less costly, more scalable way to catalog multiplying streams of input. The software, distributed under an open-source license, is free to use, share and modify, and many vendors, from database stalwarts like Microsoft Corp. to analytics services like Splunk Corp., have embraced it to push big data beyond its Silicon Valley stronghold.”

This all sounds great; but, as Dwoskin admits, “Companies that have tried to use Hadoop have met with frustration. Bank of New York Mellon used it to locate glitches in a trading system. It worked well enough on a small scale, but it slowed to a crawl when many employees tried to access it at once, and few of the company’s 13,000 information-technology workers had the expertise to troubleshoot it.” Michael Walker, a partner at Rose Business Technologies, told Dwoskin that scalability and speed are not the only problems with Hadoop and the current state of Big Data analytics. “The dirty secret,” he told her, “is that a significant majority of big-data projects aren’t producing any valuable, actionable results.” Dwoskin concludes, “It turns out that faith in Hadoop has outpaced the technology’s ability to bring big data into the mainstream. Demand for Hadoop is on the rise, yet customers have found that a technology built to index the Web may not be sufficient for corporate big-data tasks.” So what’s the solution? My answer is: Cognitive Computing. Of course, as president and CEO of a cognitive computing company, I would be expected to say that. Fortunately, I’m not alone in that assessment.

In Accenture’s latest technology vision entitled “From Digitally Disrupted to Digital Disrupter,” the consulting firm provides an insightful tour d’horizon of trends occurring in the digital world and how they are going to affect businesses. The study’s cover asserts, “Every Business is a Digital Business.” Although that may sound a bit hyperbolic, the fact is that every business is somehow affected by data. So getting the most out of that data should be a priority, if not an imperative, for every business. The Accenture study notes, “Data is the lifeblood of every digital organization, but businesses are struggling to access, share, and analyze much of the data they already have. Through 2015, 85 percent of Fortune 500 organizations will be unable to exploit big data for competitive advantage.” Given the challenges companies face, they should welcome the fact that cognitive computing is coming to the rescue. Accenture indicates that artificial intelligence and machine learning will play a significant role in solving the data analytics problem for most companies. The study concludes, however, that the “ultimate long-term solution” is “cognitive computing.” The study explains:

“Rather than being programmed for specific tasks, machine learning systems gain knowledge from data as ‘experience’ and then generalize what they’ve learned in upcoming situations. Cognitive computing technology builds on that by incorporating components of artificial intelligence to convey insights in seamless, natural ways to help humans or machines accomplish what they could not on their own. At its most advanced, cognitive computing will be the truly intelligent data supply chain — one that masks complexity by harnessing the power of data to help business users ask and answer strategic questions in a data-driven way.”

In other words, cognitive computing will help eliminate the “dirty secret” revealed by Walker by ensuring that decision makers actually receive actionable insights from the analysis conducted on their data. During a 2013 interview with James Kobielus, Senior Program Director of Product Marketing and Big Data Analytics Solutions at IBM, Niaz Uddin (@niazudin), founder of eTalks, asked, “What are some of the most interesting uses of big data out there today?” [“James Kobielus: Big Data, Cognitive Computing and the Future of Product,” eTalks, 12 December 2013] Kobielus responded:

“Where do I start? There are interesting uses of Big Data in most industries and in most business functions. I think cognitive computing applications of Big Data are among the most transformative tools in modern business. … One way I like to describe cognitive computing is as the engine behind ‘conversational optimization.’ In this context, the ‘cognition’ that drives the ‘conversation’ is powered by big data, advanced analytics, machine learning and agile systems of engagement. Rather than rely on programs that predetermine every answer or action needed to perform a function or set of tasks, cognitive computing leverages artificial intelligence and machine learning algorithms that sense, predict, infer and, if they drive machine-to-human dialogues, converse. Cognitive computing performance improves over time as systems build knowledge and learn a domain’s language and terminology, its processes and its preferred methods of interacting. This is why it’s such a powerful conversation optimizer. The best conversations are deep in give and take, questioning and answering, tackling topics of keenest interest to the conversants. When one or more parties has deep knowledge and can retrieve it instantaneously within the stream of the moment, the conversation quickly blossoms into a more perfect melding of minds. That’s why it has been deployed into applications in healthcare, banking, education and retail that build domain expertise and require human-friendly interaction models.”

The best cognitive systems combine advanced computations and semantic reasoning to create a system that can Sense, Think, Act and Learn™ much like humans. The Enterra Solutions© Cognitive Reasoning Platform™ (CRP) does that; but, unlike humans, it performs at machine speed, employing vast amounts of data across a broad spectrum of time scales. Additionally, cognitive computing systems are able to scale linearly in proportion to the geometric explosion of data. As a result, cognitive computing systems can dynamically integrate and analyze data involving people, processes, and technologies and interpret the analysis for insights and causality.

The International Data Corporation (IDC) predicts that we are on the cusp of an era that will see an explosion of innovation and growth. IDC calls the launch pad for this new era the “3rd Platform.” The 1st Platform is the mainframe computer system that has in one form or another been around since the 1950s. The 2nd Platform is the client/server system that was introduced in the 1980s with PCs tapping into mainframe databases and applications. The 3rd Platform, according to IDC, is “built on the technology pillars of mobile computing, cloud services, big data and analytics, and social networking.” [“IDC Predicts the 3rd Platform Will Bring Innovation, Growth, and Disruption Across All Industries in 2015,” IDC Press Release, 2 December 2014] Those pillars set the stage for Cognitive Computing, which could well become one of the pillars for a future 4th platform.