Big Data and Big Brother

Stephen DeAngelis

May 2, 2013

“In our haste to study larger and larger amounts of data and find information,” writes Christopher Taylor, “there’s a point getting lost in the excitement ... those are people who often haven’t given their permission for their data to be used for just any purpose. This isn’t a small problem or isolated problem. The use of consumer data to understand the marketplace is one thing, but we’ve gone beyond the question of what’s selling well.” [“Privacy Rules are the First Casualty of Big Data,” DZone, 8 March 2013] In numerous previous posts, I’ve noted that privacy issues are the 600-pound gorilla in the big data room. Taylor insists, “The data about the purchase has become as important as the purchase itself and less likely to be scrutinized for privacy than ever before.” Because so much analysis is being done behind corporate firewalls, protecting privacy rests on the shoulders of corporate data scientists and statisticians performing the analysis. Because self-regulation is the norm, Taylor asserts that “new levels of responsibility” have been created “for those who collect, hold and use all of that data.” Unfortunately, not all of these individuals (or the bosses who direct them) are scrupulous. Taylor concludes, “The biggest challenge of Big Data isn’t the storage or processing. The single biggest challenge, a challenge many don’t see or aren’t talking about, is the need to govern data in ways that protect the individual from manipulation, fraud or worse.”

Quentin Hardy writes, “Even without knowing your name, increasingly, everything about you is out there. Whether and how you guard your privacy in an online world we are building up every day has become increasingly urgent.” [“Rethinking Privacy in an Era of Big Data,” New York Times, 4 June 2012] Data collection strategies have become so sophisticated I’m not sure that individuals are now able to guard their online privacy with any assurance. Hardy reports that Danah Boyd, a senior researcher at Microsoft Research, last year told participants attending a conference at Berkeley, “Privacy is a source of tremendous tension and anxiety in Big Data. It’s a general anxiety that you can’t pinpoint, this odd moment of creepiness.” The “creepiness” factor is primarily attached to the notion of big brother watching your every online move. Hardy continues:

“Privacy is not a universal or timeless quality. It is redefined by who one is talking to, or by the expectations of the larger society. In some countries, a woman’s ankle is a private matter; in some times and places, sexual orientations away from the norm are deeply private, or publicly celebrated. Privacy, Ms. Boyd notes, is not the same as security or anonymity. It is an ability to have control over one’s definition within an environment that is fully understood. Something, arguably, no one has anymore.”

That’s the rub. I appreciate the fact that Hardy points out that there are differences between anonymity, security, and privacy. The first two are more easily dealt with than the last. We all understand that companies want to collect and analyze data to help them sell the right product to the right person at the right time. And most of us know that, like ants, we leave a scent of activity that follows us through our online surfing. Companies are legally required to tell us how they intend to use the data they collect, and most of us click the button indicating that we have read the privacy notice without really having done so. We rely on geeks, nerds, and lawyers to read these notices for us, and we trust they are forcing companies to be fair and reasonable. That could be a mistake. Mary E. Shacklett writes, “In many cases, the data collecting is far more comprehensive than people imagine it to be, thanks to new ways to plumb the data, such as reality mining, which gathers and analyzes information from machine-generated data that is capable of predicting individuals’ social behavior.” To learn more about reality mining, watch the attached video, which features MIT’s Professor Alex “Sandy” Pentland.

Despite his work on reality mining, Professor Pentland believes that companies can establish mutually beneficial relationships with consumers so that they willingly provide access to important marketing data. For more on his views, see my post entitled Big Data Dilemmas. Even Shacklett agrees that there are times when reality tracking can provide valuable benefits. She explains:

“If you’re an online retailer, you’ve got a matter of seconds (while the customer is on your website) to consummate a sale. The better you are at predictive selling, the better your sales numbers will be. If you’re a healthcare services provider monitoring Alzheimer’s patients in their homes, the more real-time data you can collect based on their whereabouts, activity sensors placed in their homes, and on their cellphones, the easier it will be to intervene if patients need help and are unable to request it. If you’re a metropolitan area monitoring city traffic flows, the more up-to-the-minute data you can collect about developing traffic jams, the better you can preempt complications by sending out messages to roadside e-billboards that inform motorists of upcoming congestion and advise them on alternate routes. You may be able to control traffic lights more effectively to control the flow of vehicles in the congested area. If you’re a financial institution, the more you know about a person’s buying patterns (and potentially, even the people they associate with in social media channels) the likelier you will be able to reach out to more customers with the right products. The balance enterprises need to strike in using this data is how far they should go to exploit this vast morass of data. This is the balance organizations must strike between helping the customer (and improving profits) and maintaining trust (which also builds profits).”

That balance between use and abuse of data is not easy to achieve. Shacklett concludes, “The data is out there and tools are available to pull it all together, but organizations must walk a fine line so they don’t frighten prospects or existing customers away. It is a balancing act, and corporate reputations depend on getting it right.” Professor Viktor Mayer-Schönberger, from Oxford University, and Kenneth Neil Cukier, from The Economist, agree that the risk of abuse is increasing. They call it the “dark side” of big data. [“Big Data – and its Dark Side,” Berkman Center for Internet & Society, 6 March 2013] They believe, “The power of big data — analyzing huge swaths of information to uncover insights and make predictions that were largely impossible in the past — is poised to transform business and society. Fueling it is the realization that data has a value beyond the primary purpose for which it was collected. Yet there is a dark side. Privacy is eroded like never before. And a new harm emerges: predictions about human behavior that may result in penalties prior to the actual infraction being committed.” That proposition should sound very familiar to anyone who watched the movie thriller “Minority Report.”

Concerns about privacy continue to rise and will keep rising as the so-called “Internet of Things” becomes a reality. Richard Nieva comments, “As the landscape around connected devices begins to take shape, … the copious amounts of data that devices will be collecting on consumers ad nauseam … present a sticky situation for those concerned with privacy.” [“Privacy and the Internet of things,” PandoDaily, 10 April 2013] Nieva believes “there is still a way to shape the discussion.” Vrunda Rathod, from Engine.is, an organization that works with startups and government to create public policy, told Nieva, “Consumers are going to give up privacy if they know what they are giving it up for.” That pretty much mirrors Pentland’s views.

Apparently a lot of people are coming to the same conclusion. That may be one reason that Fred Guterl reports that more and more people now agree that “users CAN and SHOULD have some measurable control over either collection or use of their data.” [“How Much Control Will We Have over Our Personal Data?” Scientific American, 25 January 2013] He explains:

“We just saw the world move. Huge companies – and in particular, the large number of massive global companies that are not hide-bound by their attachment to traditional Internet advertising business lines – see clearly the vast revenue opportunities in the data stores they have been accumulating for decades about people, places, and things. And now – for reasons of both regulation and principle – they are highly motivated to start monetizing those data sets in ways that are clearly user-centric and pro-user. There’s a tremendous amount of valuable data that companies can’t presently access or deploy for corporate or user gain – untapped gold, a resource of tremendous revenue generation. With the right model — a ‘good guy’ paradigm that is user-centric — progressive companies could open the door to a huge new source of wealth creation. With a Field of Dreams-like focus on incentives (‘if you offer them, they will come’), consumers will embrace the shift to owning and benefiting from their data.”

If Guterl’s observations are correct (and people like Pentland seem to believe they are), then individuals may find themselves having much more control over their personal data than they do today. There is no putting the genie of big data back in the bottle, so providing people with more control over how their data is used is probably the best compromise that can be reached. It should be encouraging that large organizations are starting to support this concept.
