This is the final segment of a 10-part series on big data. That does not mean, however, that I won’t discuss big data in future posts. The series was intended to provide a good background on the subject and a basis for future discussions. The first nine parts of the series can be accessed by clicking on the following links: Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, Part 7, Part 8, Part 9.
Quentin Hardy rhetorically asks, “Is Big Data a Bubble?” His answer: “In case you’re in a hurry: Of course it is. And that is good.” [“The Big Business of ‘Big Data’,” New York Times, 24 October 2011] Hardy’s brief answer raises a lot of questions. Fortunately, Hardy goes on to provide a longer version of his answer. The question, it seems, arose in his mind while he was attending a conference at which big data was one of the topics. He admits in his longer answer that big data isn’t a bubble in the sense that it will grow so big that it will eventually burst. After all, he writes, it is “now so easy to digitize and put on the Internet all kinds of information — things as diverse as the measurements of passive sensors, most or all the world’s books, 200 million tweets a day and most of the world’s significant financial transactions — that the data is growing enormously.” If data is the air filling the balloon, its supply is only going to grow.
The problem, of course, is that the term bubble has been pejoratively used to describe unsustainable trends (e.g., the dot.com bubble in the early 2000s and the housing bubble more recently). Hardy admits that “Big Data is really about … the benefits we will gain by cleverly sifting through it to find and exploit new patterns and relationships.” Obviously, therefore, he needs to explain what he means when he claims that big data is a bubble. He writes:
“[At] … the Web 2.0 conference, … sometimes there were overreaching conclusions. In a memorable 10 minutes, Alex Rampell, the chief executive of TrialPay, made a case that credit card companies should not charge their 2 percent fees on a transaction, since ‘the value of the transaction isn’t in the fees, it’s in the data that is generated.’ When you know what someone has purchased, you can make a case of what ad to put in front of them next. Citing Amazon.com’s relentless upselling approach (‘people who bought X also bought Y’), Mr. Rampell said, ‘There’s an Amazon.com for everything, it’s called Visa, it’s called American Express.’ Mr. Rampell may be right, but there was no proof in his admittedly brief talk that this is actually true.”
That is where Hardy’s “big data bubble” is really to be found — in the myriad of companies that will lay claim to being able to make sense of the growing mountains of data. There will be promises made that simply can’t be kept and claims asserted that can’t be proven. He continues:
“Is it really easier and better to move a 2 percent business, with relatively fixed costs of technology and insurance, over to a much more variable ad-based business? If all advertising heads toward this model, and we don’t purchase particularly more stuff, doesn’t the value of the technology start to diminish, and simply turn from a competitive edge into a must-have?”
The dot.com bubble that burst early this century was caused by “irrational exuberance.” It was the irrational part, rather than the exuberant part, that caused the bubble to burst. Hardy is basically saying the same thing about big data, namely: Don’t get irrational when you discuss its potential benefits. Good business principles have not left the building. Hardy continues:
“Often people won’t know exactly what hidden pattern they are looking for, or what the value they extract may be, and therefore it will be impossible to know how much to invest in the technology. Odds are that the initial benefits, as it was with Google’s Adwords algorithm, will lead to a frenzy of investments and marketing pitches, until we find the logical limits of the technology. It will be the place just before everybody lost their shirts. This is a common characteristic of technology that its champions do not like to talk about, but it is why we have so many bubbles in this industry. Technologists build or discover something great, like railroads or radio or the Internet. The change is so important, often world-changing, that it is hard to value, so people overshoot toward the infinite. When it turns out to be merely huge, there is a crash, in railroad bonds, or RCA stock, or Pets.com. Perhaps Big Data is next, on its way to changing the world.”
Using good business sense and a little common sense, companies can avoid getting caught up in the hype. When it comes to the manipulation of big data, companies need to ensure that a good business case can be made for adopting proffered technological solutions. Hardy, for example, talks about Josh James, another speaker at the Web 2.0 conference, who takes a pragmatic approach to delivering big data information to executives. Hardy explains:
“Mr. James … has started a company called Domo. Rather than search for new patterns in the big piles of data, Domo will focus on delivering to a top executive simple existing data, like how large a bank’s deposits are on a given day, or how many employees a company has, that are still hard to locate. ‘Everyone is saying that the team with the best data analysts will win,’ he said. ‘We have all the data we need. The focus ought to be on good design, and telling the vendors the simple things you really need to see.’”
James is obviously trying to sell the products his company is offering, but he’s wrong in claiming that “we have all the data we need.” That’s like claiming, “We have all the history we need” or “We have all the knowledge we need.” It simply isn’t so. The past is not necessarily prelude to the future. Although James’ company may provide a worthwhile service, it is not the only big data service that companies need. I do agree with him that “good design” and visualization are important. The best analysis in the world is virtually worthless if it’s not presented in an easily understood way at the right time.
In the end, Hardy admits, “Big Data is clearly big business, adding a new level of certainty to business decisions, and promoting new discoveries about nature and society. That is why over the past two years I.B.M., E.M.C. and Hewlett-Packard have collectively invested billions of dollars in the field.” Even so, he acknowledges that these are early days in the era of big data. He concludes, “Expect to see a lot more before it all gets sorted out.”
One of the areas that is ripe for experimentation with big data, according to C. N. Sajit Kumar, is supply chain management. [“Supply Chain Management- The Ideal Breeding Ground for Cloud?” Supply Chain Management, 19 August 2011] He writes:
“The market for cloud-based services is expected to reach nearly $150bn by 2014. Gartner expects that one-fifth of all businesses will own absolutely no IT assets by 2012. Manufacturing companies around the world, with their inherent penchant for low IT budget, are paying much closer attention to cloud computing and its potential value to supply chain processes – from sourcing to after-sale service. [The] Supply Chain Management space is … the ideal breeding ground for cloud computing.”
Kumar indicates that, along with the benefits that can be derived from cloud-based big data systems, there are some “pitfalls that should be avoided.” First, however, he describes the benefits for various supply chain sectors, beginning with procurement. He writes:
“Enterprise applications in [the] supply chain space, and especially in procurement domain, are mostly about B2B or inter-company coordination and collaboration among hundreds of supplier companies on a global scale. This geographical spread and need for collaboration makes it an ideal candidate for Cloud computing. One of the main value-proposition from Cloud computing is said to be reduction in ‘total cost of ownership’ and it is also the most commonly cited success metric in sourcing and procurement.”
Another supply chain area in which Kumar believes that cloud-based big data systems can pay big benefits is planning. He writes:
“Production planning and forecasting are not normally the core components of companies’ ERP systems. Clients therefore can run one vendor’s ERP application and can leverage another’s best-of-breed planning/forecasting application via the Internet. (Of course, need for a proper coupling or integration between the two systems can never be undermined! …).”
I think he means “overstated” rather than “undermined.” Data integration is potentially a challenge whenever legacy systems are involved. Large ERP vendors are concerned about the challenge presented by cloud computing. “SAP AG’s $3.4 billion agreement to acquire SuccessFactors Inc. shows just how big a threat online products are becoming to the kings of conventional software, and points to the possibility of more such acquisitions.” [“SAP Deal Shows Rise Of Online Software,” by Ben Worthen and Christopher Lawton, Wall Street Journal, 5 December 2011] For more on the subject of online services and integration, see my post entitled SaaS: Speeding Up or Delaying the Future?
The next area that Kumar believes is ideally suited for cloud-based computing is overall supply chain management (so-called Control Tower Systems). He writes:
“Now coming to Supply Chain Execution and visibility, Control Tower Systems, the most recent addition to supply chain visibility tools is now available in cloud. Control Tower technology for supply chain, in simple terms, is a Single-Version of Supply Chain Truth. A platform that makes that truth available across the value chain in an agile manner and that connect trading partners and service providers to create a vibrant, ‘always on’ electronic community. When you want something to be always on, where else other than internet to host the application?”
Kumar does eventually get to some of the pitfalls that need to be avoided (in addition to the integration issue mentioned earlier). He recommends moving “non-core functions to [the] cloud initially and if they are OK with it, move the rest of the stuff also into [the] cloud.” He also points out that there could be some “firewall and security issues – which at times can get very complex or can go wrong.” For more on those topics, read my posts entitled The “Big Data” Dialogues, Part 8: Cybersecurity and The “Big Data” Dialogues, Part 9: Physical Security. Kumar’s final caution is that, even though “there is [a] very low entry barrier to get into cloud – no hardware, capital investment,” it’s not as easy to get out as it was to get in. He concludes:
“Switching is not that easy. At least, not yet. The cloud vendors are not showing the same enthusiasm in creating a global standard for cloud as much as they showed in moving applications in cloud. The reason is obvious- 100% portability is not in the interest of the vendors – that each one of them wants to protect the investments they have made. Once such a standard (like Open Cloud Manifesto) comes into being, we can see a further shift in the bargaining power towards the buyer. … This inevitable and imminent adoption of [the] cloud is going to bring about a paradigm shift in the way enterprise applications are dealt within manufacturing industry. The supply chain application vendors who are quick to internalize this change are going to hugely benefit from this.”
The bottom line is that big data is here to stay. Because the big data era is young, vendors and clients have yet to sort out exactly how cloud-based big data systems can best be leveraged to improve business processes. However, the learning curve is steep and companies that hesitate to act could lose ground to their competition.