Jennifer Ouellette reports that systems like CERN’s Large Hadron Collider (LHC) generate such enormous amounts of data in such a short period of time that traditional processing methods are simply inadequate to analyze it. “The LHC captures 5 trillion bits of data,” she reports, “more information than all of the world’s libraries combined — every second.” [“The Future Fabric of Data Analysis,” Quanta Magazine, 9 October 2013] She notes that, even after applying “filtering algorithms, more than 99 percent of those data are discarded.” But even the remaining amount of data is so large that “LHC scientists rely on a vast computing grid of 160 data centers around the world, a distributed network that is capable of transferring as much as 10 gigabytes per second at peak performance.” Although the distributed network used by LHC scientists is state of the art, even better and faster processing is required. That’s why, Ouellette reports, new processing frameworks are being developed all the time. She explains:
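For a rough sense of scale, here is a back-of-the-envelope check in Python using only the figures Ouellette cites (decimal units are assumed, and the results are approximations, not measurements):

```python
# Rough arithmetic on the figures quoted above (decimal gigabytes assumed).
raw_bits_per_sec = 5e12                      # "5 trillion bits ... every second"
raw_bytes_per_sec = raw_bits_per_sec / 8     # roughly 625 GB of raw data per second
kept_fraction = 0.01                         # "more than 99 percent ... are discarded"
kept_bytes_per_sec = raw_bytes_per_sec * kept_fraction

print(f"Raw rate:  {raw_bytes_per_sec / 1e9:.0f} GB/s")   # ~625 GB/s
print(f"Kept rate: {kept_bytes_per_sec / 1e9:.2f} GB/s")  # ~6 GB/s, the same order as the grid's 10 GB/s peak
```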
“Since 2005, many of the gains in computing power have come from adding more parallelism via multiple cores, with multiple levels of memory. The preferred architecture no longer features a single central processing unit (CPU) augmented with random access memory (RAM) and a hard drive for long-term storage. Even the big, centralized parallel supercomputers that dominated the 1980s and 1990s are giving way to distributed data centers and cloud computing, often networked across many organizations and vast geographical distances. These days, ‘People talk about a computing fabric,’ said Stanford University electrical engineer Stephen Boyd. These changes in computer architecture translate into the need for a different computational approach when it comes to handling big data, which is not only grander in scope than the large data sets of yore but also intrinsically different from them.”
Ouellette reports that processor speed, which used to be the Holy Grail of computer design, is no longer the main focus. One reason, according to MIT Professor Scott Aaronson, is that “Moore’s law has basically crapped out; the transistors have gotten as small as people know how to make them economically with existing technologies.” As a result, Boyd told Ouellette, “Processing speed has been completely irrelevant for five years. The challenge is not how to solve problems with a single, ultra-fast processor, but how to solve them with 100,000 slower processors.” That is the “computing fabric” he mentioned above. Aaronson, however, believes that even such a computing fabric may be reaching the limits of usability in some circumstances. He told Ouellette “that many problems in big data can’t be adequately addressed by simply adding more parallel processing. These problems are more sequential, where each step depends on the outcome of the preceding step.” California Institute of Technology physicist Harvey Newman told Ouellette “that if current trends hold, the computational needs of big data analysis will place considerable strain on the computing fabric. ‘It requires us to think about a different kind of system,’ he said.” That raises the question: Where do we go from here? Ouellette continues:
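Aaronson’s distinction between parallel-friendly and inherently sequential work can be illustrated with a toy Python comparison (the update rule in the loop is invented purely for illustration):

```python
# Embarrassingly parallel: every term is independent, so 100,000 slow
# processors could each take a slice of the range and the partial sums
# would simply be added together at the end.
total = sum(x * x for x in range(1_000_000))

# Inherently sequential: step n+1 needs the result of step n, so adding
# more processors does nothing to shorten the chain of dependent steps.
x = 0.5
for _ in range(1_000_000):
    x = 3.9 * x * (1.0 - x)   # hypothetical update rule, chosen only to create a dependency
```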
“One possible solution to this dilemma is to embrace the new paradigm. In addition to distributed storage, why not analyze the data in a distributed way as well, with each unit (or node) in a network of computers performing a small piece of a computation? Each partial solution is then integrated to find the full result.”
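A minimal sketch of that split-and-integrate pattern in Python, with worker processes standing in for the network nodes; the statistic being computed (a global mean) is just an illustrative choice:

```python
from concurrent.futures import ProcessPoolExecutor

def node_partial(chunk):
    """Each 'node' analyzes only its own slice of the data and returns a partial result."""
    return sum(chunk), len(chunk)

def integrate(partials):
    """A coordinator combines the partial solutions into the full result."""
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count

if __name__ == "__main__":
    data = list(range(1_000_000))               # stand-in for a large data set
    chunks = [data[i::8] for i in range(8)]     # one slice per node
    with ProcessPoolExecutor(max_workers=8) as pool:
        partials = list(pool.map(node_partial, chunks))
    print("global mean:", integrate(partials))  # matches what a single machine would compute
```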
That, according to Boyd, is the next step. Ouellette explains:
“Boyd’s system is based on so-called consensus algorithms. ‘It’s a mathematical optimization problem,’ he said of the algorithms. ‘You are using past data to train the model in hopes that it will work on future data.’ Such algorithms are useful for creating an effective SPAM filter, for example, or for detecting fraudulent bank transactions. This can be done on a single computer, with all the data in one place. Machine learning typically uses many processors, each handling a little bit of the problem. But when the problem becomes too large for a single machine, a consensus optimization approach might work better, in which the data set is chopped into bits and distributed across 1,000 ‘agents’ that analyze their bit of data and each produce a model based on the data they have processed. The key is to require a critical condition to be met: although each agent’s model can be different, all the models must agree in the end — hence the term ‘consensus algorithms.’”
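To make the idea concrete, here is a simplified consensus-style sketch in Python (NumPy): the data are chopped into shards, each “agent” computes only with its own shard, and repeated averaging forces the local models to agree. This is one elementary flavor of consensus optimization, not necessarily the specific algorithms Boyd’s group uses, and the synthetic data and step size are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data, chopped into ten shards, one per "agent".
true_w = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(3000, 3))
y = X @ true_w + 0.1 * rng.normal(size=3000)
shards = [(X[i::10], y[i::10]) for i in range(10)]

models = [np.zeros(3) for _ in shards]   # each agent keeps its own model
lr = 1e-4                                # illustrative step size

for _ in range(500):
    consensus = np.mean(models, axis=0)                      # agents agree on an average model
    models = [consensus - lr * Xi.T @ (Xi @ consensus - yi)  # then each steps using only
              for Xi, yi in shards]                          # its own shard of data

print("disagreement between agents:", float(np.max(np.std(models, axis=0))))  # close to zero
print("consensus model:", np.mean(models, axis=0))                            # close to true_w
```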
Although learning machines (i.e., cognitive computers) are likely to enjoy long and useful lives, even when used in parallel they are likely to be insufficient to solve many optimization problems. That’s where quantum computing enters the picture. As I’ve noted in previous posts, Lockheed Martin and Google/NASA have both purchased D-Wave quantum computers this past year in order to tackle some of those optimization problems. The following video is a good introduction to that subject.
Ouellette mentions the ongoing debate about whether the D-Wave is a true quantum computer. She then notes, “A true quantum computer could encode information in so-called qubits that can be 0 and 1 at the same time. Doing so could reduce the time required to solve a difficult problem that would otherwise take several years of computation to mere seconds. But that is easier said than done, not least because such a device would be highly sensitive to outside interference: The slightest perturbation would be equivalent to looking to see if the coin landed heads or tails, and thus undo the superposition.”
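The coin analogy can be made concrete with a toy, purely classical simulation of a single qubit’s state vector in Python; it mimics the bookkeeping of superposition and measurement, it is not a quantum computation:

```python
import numpy as np

rng = np.random.default_rng()

# A qubit in equal superposition: one amplitude for |0> and one for |1>.
state = np.array([1.0, 1.0]) / np.sqrt(2.0)

# Measurement is the "looking to see how the coin landed" step: the outcome
# probabilities are the squared amplitudes, and observing the result forces
# the state into plain |0> or |1>, destroying the superposition.
probabilities = np.abs(state) ** 2               # [0.5, 0.5]
outcome = rng.choice([0, 1], p=probabilities)
state = np.eye(2)[outcome]                       # collapsed state: [1, 0] or [0, 1]

print("measured:", outcome, "| post-measurement state:", state)
```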
The challenges that remain in the quantum computing world are significant, but breakthroughs are being made all the time. I’ll discuss some of the breakthroughs in future posts. Google scientist Alon Halevy told Ouellette that he believes “the real breakthroughs in big data analysis are likely to come from integration — specifically, integrating across very different data sets.” He told her:
“You must understand the relationship between the schemas before the data in all those tables can be integrated. That, in turn, requires breakthroughs in techniques to analyze the semantics of natural language. It is one of the toughest problems in artificial intelligence — if your machine-learning algorithm aspires to perfect understanding of nearly every word. But what if your algorithm needs to understand only enough of the surrounding text to determine whether, for example, a table includes data on coffee production in various countries so that it can then integrate the table with other, similar tables into one common data set? According to Halevy, a researcher could first use a coarse-grained algorithm to parse the underlying semantics of the data as best it could and then adopt a crowd-sourcing approach like a Mechanical Turk to refine the model further through human input. ‘The humans are training the system without realizing it, and then the system can answer many more questions based on what it has learned,’ he said.”
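Halevy’s two-stage idea, a coarse automated guess that is then refined by human input, might look something like the following Python sketch; the name-overlap score, the review threshold, and the helper functions are all invented for illustration and are not from the article:

```python
def coarse_match_score(columns_a, columns_b):
    """Crude first-pass 'semantics': fraction of lower-cased column names the tables share."""
    a, b = {c.lower() for c in columns_a}, {c.lower() for c in columns_b}
    return len(a & b) / max(len(a | b), 1)

def integrate_tables(tables, reference_columns, threshold=0.75, ask_human=None):
    """Auto-accept confident matches; route borderline tables to a human (e.g., a Mechanical Turk task)."""
    accepted = []
    for name, columns in tables.items():
        score = coarse_match_score(columns, reference_columns)
        if score >= threshold:
            accepted.append(name)
        elif ask_human is not None and ask_human(name, columns):
            accepted.append(name)   # the human answer doubles as training signal for the model
    return accepted

# Hypothetical tables, loosely echoing the coffee-production example in the quote.
tables = {
    "coffee_production": ["country", "year", "tonnes"],
    "coffee_exports":    ["Country", "Year", "export_value"],
    "rainfall":          ["station_id", "mm", "month"],
}
print(integrate_tables(tables, ["country", "year", "tonnes"],
                       ask_human=lambda name, cols: "coffee" in name))
# -> ['coffee_production', 'coffee_exports']; the second table is accepted via the human check.
```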
The Enterra Solutions® Cognitive Reasoning Platform™ uses this approach on a non-quantum basis by complementing artificial intelligence with a large, common-sense ontology. That’s why I’m convinced that the approach discussed by Halevy holds great promise for quantum computing as well. The greatest difference is that a quantum computer could search much larger datasets for insights and relationships than traditional computers can, and do so much more quickly.