The Symbiotic Relationship between Big Data and Artificial Intelligence

Stephen DeAngelis

July 2, 2018

Business metaphors often contain biological references. For example, we refer to “product families” and talk about the “next generation.” We talk about businesses “evolving” and “product lifecycles.” We find some companies “on the bleeding edge” of new technologies. In the Digital Age, we find data running through the veins of companies and the Internet of Things providing the nervous system of the digital enterprise. Metaphors can be taken too far, but I don’t think it’s a stretch to say Big Data and Artificial Intelligence (AI) enjoy a symbiotic relationship. In biology, symbiosis is the interaction between two different organisms living in close physical association, typically to the advantage of both. Without Big Data, AI would have nothing to train on, and without AI, insights locked inside large datasets would remain undiscovered.

Big Data and Artificial Intelligence

The symbiotic relationship between Big Data and AI is so close some people have difficulty separating the two. For example, journalist Saikumar Talari (@SaikumarTrn) defines Big Data as “a combination of technology and data that integrates, reports and accesses all available data filtering, reporting and correlating insights achievable with previous data technologies.”[1] Such definitions force some people, like Andy Patrizio (@apatrizio), to ask, “What are the similarities and differences between artificial intelligence and Big Data? Do they have anything in common? Are they similar? Can a valid comparison even be made?”[2] He adds, “Those are two buzzwords you are hearing an awful lot lately, perhaps to the point of confusion.” Confusing or not, Patrizio admits interest in Big Data and AI is keen. He reports, “A survey about Big Data and AI by NewVantage Partners of c-level executives found 97.2% of executives stated that their companies are investing in, building, or launching Big Data and AI initiatives. More significantly, 76.5% of executives feel AI and Big Data are becoming closely interconnected and that the greater availability of data is empowering AI and cognitive initiatives within their organizations.”

Launching Big Data and AI initiatives without a plan isn’t a good idea. Guy Powell (@GuyPowell), Founder and Managing Partner at ProRelevant, observes, “If your data and analytics aren’t clearly aligned with an important business question whose answer leads to greater profit, brand recognition, or market share, then the efforts and investments will be for naught.”[3] Data capture and management are becoming more critical as regulators crack down on questionable use or careless protection of personal data. Get it wrong and your company can suffer significant monetary and reputational setbacks. Powell notes, “Identifying, capturing and tracking the right data is the first step in building a credible data model. … Once data has been captured, the right analytic methods need to be applied.”

Many of today’s cognitive platforms can help with applying the right analytic methods. My company’s entry in the field is the Artificial Intelligence Learning Agent™ (AILA®), a system that can Sense, Think, Act and Learn®. For many of our solutions, we complement AILA’s capabilities with the Massive Dynamics Representational Learning Machine™ (RLM). The RLM can help determine what type of analysis is best suited for data found in a high-dimensional environment. And, unlike traditional machine learning, in which predictions are generated in “black box” fashion, the RLM explains to users the drivers and patterns underlying its analytic insights.

Benefiting from the Big Data/AI Symbiotic Relationship

Companies desiring to benefit from Big Data and AI initiatives need to understand the technology ecosystem involved. Talari explains the environment for the Big Data ecosystem developed as a result of “rapidly increasing the amount of data available”; “accelerating data storage capacity and computing power at low cost”; and, “evolution in [the] Machine Learning approach to analyze convoluted datasets.” Anand Venugopal (@AnandRealTime), Head of Product at StreamAnalytix and a data evangelist at Impetus Technologies, insists the emergence of cloud computing was another important development. He explains, “Enterprises are steadily moving their on-premise IT and data processing to the public cloud. This trend is expected to accelerate through this year, driven by the growing availability of pre-built, reliable, scalable platforms-as-a-service (PaaS) for every possible application development and deployment need across the organization. Developers and everyday business users will use these cloud application platforms to design and operate applications, easier and faster, with minimal coding, while focusing on the core business logic.”[4]

With all the pieces in place, you’d think you were set to conquer the business world. Unfortunately, it’s not that easy. “Getting it right” is a never-ending pursuit. Dinesh Nirmal (@DineshNirmalIBM), a Vice President for Analytics Development at IBM, reminds us, “Data changes as the world changes. Building an AI or machine learning model means building a way of looking at the world. But as the world and the data change, the models need to adapt.”[5] In other words, to gain the maximum benefit from the Big Data/AI symbiotic relationship, you must continue to work the process. Nirmal suggests five traits found in adaptive AI systems. They are:

1. Managed. “For AI and machine learning to do real and lasting work, they need thoughtful, durable, and transparent infrastructure. That starts with identifying the data pipelines and correcting issues with bad or missing data. It also means integrated data governance and version control for models. The version of each model — and you might use thousands of them concurrently — indicates its inputs. You’ll want to know, and so will regulators.”

2. Resilient. “Being fluid means accepting from the outset that AI models fall out of sync. That ‘drift’ can happen quickly or slowly depending on what’s changing in the real world. Do the data science equivalent of regression testing, and do the testing frequently, but without burning up your time. That takes a system that allows you to set accuracy thresholds and automatic alerts to let you know when models need attention.”

3. Performant. “Most AI is computationally intense — both during training and after deployment. And most models need to score transactions in milliseconds, not minutes, to prevent fraud or leverage some fleeting opportunity. Ideally, you can train models on GPUs and then deploy them on high-performance CPUs, along with enough memory for real-time scoring. And of course you want everything to run fast and error-free regardless of where you deploy: on-prem, cloud, or multicloud.”

4. Measurable. “Think from the outset about how you’ll quantify and visualize what you’re learning and how it changes: improvements in data access and data volume, improvements in model accuracy, and ultimately improvements to the bottom line. Don’t just think about what you need to measure now but also about what you’ll want to measure in the future as your data science work matures.”

5. Continuous. “Data doesn’t sit still. The fifth and final aspect of fluid AI is about continuous learning as the world changes. Make sure to use tools like Jupyter and Zeppelin notebooks that can plug into processes for scheduling evaluations and retrain models.”
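
Nirmal’s second trait, setting accuracy thresholds and automatic alerts, is the easiest of the five to picture in code. What follows is only a minimal sketch of the idea in Python, not anything taken from IBM’s tooling. The names `DriftCheck` and `check_model_drift`, the version label, and the 0.8 threshold are all hypothetical, and plain accuracy on recently labeled records stands in for whatever metric a real team would choose.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DriftCheck:
    """Result of one scheduled evaluation (hypothetical structure)."""
    model_version: str
    accuracy: float
    threshold: float
    needs_attention: bool


def check_model_drift(
    model_version: str,
    predictions: Sequence[int],
    actuals: Sequence[int],
    threshold: float,
) -> DriftCheck:
    """Compare recent predictions with observed outcomes.

    Flags the model when accuracy on the recent sample falls below
    the threshold. Accuracy is an assumed stand-in metric.
    """
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must be the same length")
    if not predictions:
        raise ValueError("at least one labeled record is required")

    correct = sum(1 for p, a in zip(predictions, actuals) if p == a)
    accuracy = correct / len(predictions)
    return DriftCheck(
        model_version=model_version,
        accuracy=accuracy,
        threshold=threshold,
        needs_attention=accuracy < threshold,
    )


if __name__ == "__main__":
    # Hypothetical recent sample: the model got 2 of 4 records right.
    result = check_model_drift("fraud-v3", [1, 0, 1, 1], [1, 0, 0, 0], 0.8)
    if result.needs_attention:
        print(f"{result.model_version}: accuracy {result.accuracy:.2f} "
              f"is below {result.threshold:.2f}; review or retrain")
```

Run on a schedule, a check like this covers the “regression testing” Nirmal describes, and recording the model version alongside each result touches on his first trait as well.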

With a continuous process in place, Matt Asay (@mjasay), Head of Developer Ecosystem for Adobe, asserts companies will find “more incremental ways to improve how enterprises do business.”[6] As a result, he states, there will be “Less hype, more real. All good.”

Summary

Asay observes, “The hype cycle for big data, it would seem, has played itself out. As often happens, however, just as big data has lost its hype, it’s actually accelerating in terms of real enterprise adoption.” Venugopal adds, “Data is a powerful corporate asset that executives wish to fully harness. … With the many investments in cloud migration, data lakes, in-memory computing, modern business intelligence and data science technologies, 2018 will be the year when a large number of enterprises will derive breakthrough value from these technologies and go through a transformation into future-ready data-driven real-time enterprises.” It’s amazing what a symbiotic relationship can do.

Footnotes
[1] Saikumar Talari, “3 Trends Enabling The Big Data Revolution,” TechNative, 19 March 2018.
[2] Andy Patrizio, “Big Data vs. Artificial Intelligence,” Datamation, 30 May 2018.
[3] Guy Powell, “Harnessing The Power Of Big Data For Your Business,” Forbes, 24 April 2018.
[4] Anand Venugopal, “Cloud + Streaming Analytics + Data Science = Five Big Data Trends Now,” RTInsights, 24 March 2018.
[5] Dinesh Nirmal, “IBM outlines the 5 attributes of useful AI,” VentureBeat, 21 April 2018.
[6] Matt Asay, “Why the reality of big data is finally catching up to its hype,” TechRepublic, 10 January 2018.