Cognitive Computing: Beyond Deep Learning

Stephen DeAngelis

March 5, 2015

The media has been full of articles recently about the rise of artificial intelligence (AI). Some pundits predict that AI will destroy humankind. Other pundits insist AI-powered machines will take over all of our jobs. More optimistic pundits believe that AI will enhance human abilities and open up a brighter future for all humankind. No one can be sure which future trajectory will actually be followed, but we do know that AI is going to be found in every one of them. One of the things that AI systems do well is recognize patterns. And if an AI system is built to learn, the more data it ingests the better it gets at recognizing those patterns. Businesses are interested in machine learning because they are now gathering oceans of data about their operations and their customers and they want to know what they can learn from analyzing that data. Lars Hård reports, “AI for businesses is today mostly made up of machine learning, wherein algorithms are applied in order to teach systems to learn from data to automate and optimize processes and predict outcomes and gain insights.” [“Artificial intelligence in the enterprise - what you need to know,” BetaNews, 13 August 2014] Hård goes on to note:

“[Using an AI system] simplifies, scales and even introduces new important processes and solutions for complex business problems as machine learning applications learn and improve over time. From medical diagnostics systems, search and recommendation engines, robotics, risk management systems, to security systems, in the future nearly everything connected to the internet will use a form of a machine learning algorithm in order to bring value.”

The type of machine learning that has perhaps received the most attention is called deep learning (sometimes called cognitive computing). Gary Marcus, a professor of cognitive science at New York University, writes, “There is good reason to be excited about deep learning, a sophisticated ‘machine learning’ algorithm that far exceeds many of its predecessors in its abilities to recognize syllables and images.” [“Is ‘Deep Learning’ a Revolution in Artificial Intelligence?” The New Yorker, 25 November 2012] He adds, “But there’s also good reason to be skeptical.” Zachary Chase Lipton, a PhD student in the Computer Science and Engineering department at the University of California, San Diego, agrees with Marcus that there are good reasons to be skeptical. “A few well-publicized recent papers have tempered the hype surrounding deep learning,” Lipton writes. “The papers identify both that images can be subtly altered to induce misclassification and that seemingly random garbage images can easily be generated which receive high confidence classifications.” [“(Deep Learning’s Deep Flaws)’s Deep Flaws,” KDnuggets, 25 January 2015] He continues:

“It’s worth noting that nearly all machine learning algorithms are susceptible to adversarially chosen examples. Consider a logistic regression classifier with thousands of features, many of which have non-zero weight. If one such feature is numerical, the value of that feature could be set very high or very low to induce misclassification, without altering any of the other thousands of features. To a human who could only perceive a small subset of the features at a time, this alteration might not be perceptible. Alternatively, if any features had very high weight in the model, they might only need to have their values shifted a small amount to induce misclassification. Similarly, for decision trees, a single binary feature might be switched to direct an example into the wrong partition at the final layer. … Deep learning’s great successes have attracted a wave of hype. The recent wave of negative publicity illustrates that this hype cuts both ways. The success of deep learning has rightfully tempted many to examine its shortcomings. However, it’s worth keeping in mind that many of the problems are ubiquitous in most machine learning contexts.”
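Lipton’s logistic regression point can be made concrete with a small sketch. The code below is an illustration only, not anything taken from Lipton’s article: the model, its 2,000 random weights, the input values, and every name (`score`, `predict_proba`, `margin`, and so on) are hypothetical, chosen simply to show that changing one numerical feature can flip a linear classifier’s decision while all other features stay untouched.

```python
import math
import random


def score(weights, bias, features):
    """Linear score z = bias + w . x that feeds the logistic function."""
    return bias + sum(w * x for w, x in zip(weights, features))


def predict_proba(weights, bias, features):
    """Probability of the positive class under logistic regression."""
    return 1.0 / (1.0 + math.exp(-score(weights, bias, features)))


# Hypothetical model: 2,000 features with small random weights.
rng = random.Random(0)
n_features = 2000
weights = [rng.uniform(-0.05, 0.05) for _ in range(n_features)]
bias = 0.0

# Hypothetical input whose feature values all lie in [0, 1).
original = [rng.random() for _ in range(n_features)]
z = score(weights, bias, original)

# Pick the single feature with the largest absolute weight.
k = max(range(n_features), key=lambda i: abs(weights[i]))

# Shift only that feature far enough to push the score across zero,
# landing `margin` units on the other side of the decision boundary.
margin = 2.0
direction = 1.0 if z >= 0 else -1.0
adversarial = list(original)
adversarial[k] = original[k] - (z + direction * margin) / weights[k]

p_original = predict_proba(weights, bias, original)
p_adversarial = predict_proba(weights, bias, adversarial)
n_changed = sum(1 for a, b in zip(original, adversarial) if a != b)

print(f"original probability:    {p_original:.3f}")
print(f"adversarial probability: {p_adversarial:.3f}")
print(f"features changed:        {n_changed} of {n_features}")
```

In this toy setup the altered feature ends up far outside the [0, 1) range of the others, which matches the “set very high or very low” case in the quote. Someone inspecting only a handful of the 2,000 features would be unlikely to notice the change.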

What Lipton is basically saying is: Don’t throw the baby out with the bathwater. If you are interested in learning more about deep learning, Ivan Vasilev provides an excellent tutorial. When deciding to apply deep learning technology, it is important to consider three things. First, most deep learning systems are “black box” systems, meaning users have no ability to disentangle the rules from the network of interconnections. So while artificial neural networks may self-learn some “rules” embedded within the interconnections of the network, those rules are never explicitly stated, so you will not know what they are. David Karger, a computer science professor at MIT, puts it this way: “With these incredibly powerful algorithms, you can solve really hard problems, but while the computer knows the answer it just works like magic. You don’t really know *why* that’s the answer.” [“What Is The Future Of Machine Learning?” Forbes, 12 February 2015] In contrast, solutions developed by my company, Enterra Solutions®, involve rules that are tuned over time in an observable and understandable way, which is a substantially more business-friendly solution.

The second thing you need to consider about deep learning technologies is that, although they are great at many classification and pattern recognition problems, they have not proven effective at simulating thinking and reasoning or at making logical inferences. So although they may be able to classify a cat within a picture, they cannot make inferences such as: if a cat is drinking something white, it is probably milk. They will be able to determine that they see wheels in a picture of a car, but will have no idea what wheels do or how they relate to the function of the car. While deep learning systems, like IBM’s Watson, have many excellent use cases, there are other common use cases they can’t solve. Finally, deep learning systems require massive amounts of data and/or documents to train the classifiers well. As a result, data scientists train on hundreds of thousands of images. This scale of data is not easy for most companies to achieve.

Enterra’s Cognitive Reasoning Platform™ (CRP) uses various techniques to overcome the challenges associated with most deep learning (or cognitive computing) systems. Like deep learning systems, the CRP gets smarter over time and self-tunes by automatically detecting correct and incorrect decision patterns. But the CRP also bridges the gap between a pure mathematical technique (like deep learning) and semantic understanding. The CRP has the ability to do math, but it also understands and reasons about what was discovered. Marrying advanced mathematics with a semantic understanding is critical; we call this “Cognitive Reasoning.” The Enterra CRP cognitive computing approach, which utilizes the best (and multiple) solvers based on the challenge to be solved and can interpret those results semantically, is a superior approach for many of the challenges to which deep learning is now being applied.