Saturday, April 11, 2015

Big Data

Everything we do in the digital realm - from surfing the Web to sending an e-mail to conducting a credit card transaction to, yes, making a phone call - creates a data trail. And if that trail exists, chances are someone is using it - or will be soon.

                                                                   Douglas Rushkoff

A revolution that compares with the impact of the Internet is changing the way that business, politics, health, education – almost everything – is being conducted. It is pervasive to the extent that everyone knows that it’s there, but no one can do anything to stop its encroachment.

Data Revolution

The term "Big Data" was coined in 2008 and caught on quickly as a blanket term for any collection of large and complex data sets that are difficult to handle using traditional data processing. Everything that surrounds everybody at all times generates data. Every digital process and social media exchange produces it: messages, updates, images posted to social networks; readings from sensors; GPS signals from cell phones. Enormous streams of data are tied to people, activities, and locations. (1)

Data arrives from multiple sources at high speed and in huge volume and variety, often unstructured and unwieldy. But there’s a huge amount of signal in the noise, simply waiting to be used. Big-data analytics brings decision-making that is at once simpler and more powerful. (2)

It is not the sheer quantity of data that is revolutionary; the revolution is that something can now be done with the data. It does not require more storage or computational capacity, but rather improved statistical methods that can be used to solve problems thousands of times faster than conventional computer methods. New techniques of data analysis add astonishing new insights and value. (3)

Old-style Data Processing Obsolete

Today there is a burgeoning ability to crunch vast collections of information, analyze them almost instantly, and draw conclusions that are often very surprising. Most commercial transactions and events are transformed into searchable formats to find correlations that could never have been known before.

The structured databases that stored most corporate information until recently are not suited to storing and processing big data – the results are woefully inadequate. So, large computer banks, and large data-processing staffs, are quickly becoming obsolete; processing power is shifting to the Cloud, and new data-intensive approaches are quickly becoming much more economical.

Big Data Applications

Familiar applications of big data include “recommendation engines” such as those used by Netflix and Amazon to offer purchase suggestions based on the prior interests of a specific customer compared to millions of others. (4)

Consider the emergence and growth of Amazon. Once shopping moved online, the understanding of customers increased dramatically. Online retailers could track not only what customers bought, but also what else they looked at; how they navigated through the site; how much they were influenced by promotions, reviews, and page layouts; and similarities across individuals and groups.

Soon Amazon developed algorithms to predict what products individual customers would like – algorithms that performed better every time the customer responded to or ignored a recommendation. Traditional retailers simply couldn’t access this kind of information, let alone act on it in a timely manner. It’s not surprising that Amazon keeps putting so many brick-and-mortar retailers out of business.
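The idea of comparing one customer's interests with those of many others can be illustrated with a toy example. This is only a minimal sketch of overlap-based recommendation, not Amazon's or Netflix's actual method; the function name `recommend`, the customer names, and the purchase histories are all invented for illustration.

```python
from collections import Counter


def recommend(target, histories, top_n=3):
    """Suggest items the target has not bought, weighted by overlap with other customers.

    Hypothetical helper: histories maps a customer name to a set of items.
    """
    seen = histories[target]
    scores = Counter()
    for customer, items in histories.items():
        if customer == target:
            continue
        overlap = len(seen & items)  # number of interests shared with this customer
        if overlap == 0:
            continue  # customers with nothing in common contribute nothing
        for item in items - seen:
            scores[item] += overlap
    # Highest score first; ties broken alphabetically so the result is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Made-up purchase histories for four imaginary customers.
histories = {
    "ann": {"camera", "tripod", "lens"},
    "bob": {"camera", "tripod", "memory card"},
    "cat": {"camera", "lens", "camera bag"},
    "dan": {"novel"},
}

print(recommend("bob", histories))  # ['lens', 'camera bag']
```

Every new purchase changes the overlap counts, which is one simple way a system of this kind could sharpen its suggestions as more data arrives.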

Machine learning

The super-abundance of new data, in turn, accelerates advances. Machine-learning algorithms learn from data and the more data, the more the machines learn. (5)

Take Siri, the talking, question-answering application in iPhones. Apple bought Siri in 2010, and kept feeding it more data. Now, with people supplying millions of questions, Siri is an increasingly adept personal assistant, offering reminders, weather reports, restaurant suggestions and answers to an expanding universe of questions.

Just recently, Amazon Web Services (AWS) unveiled its first product for machine learning - simply called Amazon Machine Learning - to make it easier for AWS developers to extract value from the troves of transactional and operational data their hosted systems collect.

The big data revolution is far more powerful than the analytics that were used in the past. Management can be more precise than ever before, with better predictions and smarter decisions. Areas that have been dominated by intuition can now utilize rigorous data insights.

Research & Government Applications

In the public realm, there are all kinds of applications: finding associations between air quality and health; using genomic analysis to speed the breeding of crops like rice for drought resistance; or allocating police resources by predicting where and when crimes are most likely to occur. Do you remember the futuristic movie “Minority Report,” where a special police unit is able to arrest murderers before they commit their crimes?

At the 2012 World Economic Forum in Davos, Switzerland, Big Data was a major topic, and data was declared a new class of economic asset, like currency or gold. (6) The forum highlighted the potential for channeling huge amounts of data into actionable information that can be used to identify needs and provide services for the benefit of low-income populations. There was a call for concerted action to ensure that big data helps the individuals and communities who create it.

Big Data in Political Campaigns

The goal of political campaigns is to maximize the probability of victory. Every activity in a campaign is evaluated by how many votes it can generate and at what cost. To make this cost–benefit analysis, campaigns need accurate predictions about the preferences of voters, their expected behaviors, and their responses to campaign outreach. For instance, efforts to increase voter turnout are counterproductive if the campaign mobilizes people who support the opponent.
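The cost–benefit reasoning above can be made concrete with a little arithmetic. The sketch below is only an illustration under simplifying assumptions, not a description of any real campaign's model: it assumes each contact raises a voter's chance of turning out by a known amount, that every contact costs the same, and that the campaign has an estimate of each voter's probability of support. The function names and all numbers are hypothetical.

```python
def net_votes_per_contact(p_support, turnout_lift):
    """Expected change in vote margin from contacting one voter.

    A mobilized voter adds a vote for us with probability p_support and a vote
    for the opponent otherwise, so the expected margin change is
    turnout_lift * (p_support - (1 - p_support)).
    """
    return turnout_lift * (2 * p_support - 1)


def choose_targets(voters, cost_per_contact, budget):
    """Pick the voters with the highest positive expected gain that the budget allows.

    voters is a list of (name, p_support, turnout_lift) tuples (made-up format).
    """
    max_contacts = int(budget // cost_per_contact)
    scored = [(net_votes_per_contact(p, lift), name) for name, p, lift in voters]
    # Contacting likely opponents loses votes on average, so drop them.
    worthwhile = sorted((s for s in scored if s[0] > 0), reverse=True)
    return [name for _, name in worthwhile[:max_contacts]]


# Invented example: voter B leans toward the opponent.
voters = [
    ("A", 0.9, 0.05),
    ("B", 0.4, 0.08),
    ("C", 0.7, 0.03),
]

print(choose_targets(voters, cost_per_contact=1.0, budget=2.0))  # ['A', 'C']
```

Voter B has the largest turnout lift but a negative expected gain, which is exactly the counterproductive case described above.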

Over the past six years, campaigns have become increasingly reliant on analyzing large and detailed datasets to create the necessary predictions. While the adoption of these new analytic methods has not radically transformed how campaigns operate, the improved efficiency gives data-savvy campaigns a competitive advantage. This has led the political parties to engage in a race to leverage ever-growing volumes of data to create votes. The techniques used as recently as a decade or two ago by political campaigns to predict the tendencies of citizens appear extremely rudimentary by current standards. (7)

Competitive Advantage

Analyzing “big data” is becoming a key competitive advantage, generating waves of productivity growth, innovation and consumer surplus. Every business will have to grapple with the implications. The increasing amount and detail of information captured by enterprises, the rise of multimedia and social media and the Internet of Things will fuel exponential growth. McKinsey Research reports that Big Data is now an important factor of production, along with labor and capital. (8)

The use of big data will become a key basis of competition and growth. Every company must take big data seriously. Most industries will leverage data-driven strategies to innovate, compete, and capture value from wide-ranging, deep and real-time information. (8)

Big Data Problems

By combining the power of modern computing with the plentiful data of the digital era, Big Data promises to solve virtually any problem just by crunching the numbers. But, precisely because of its popularity and growing use, we need to be levelheaded about what big data can, and cannot, do. A NY Times Op-ed points out several fallacies and trends that tend to produce significant inaccuracies. (9)

Several issues will need to be addressed to capture the full potential of big data. Policies related to privacy, security, intellectual property, and even liability will need to be re-evaluated in the big data world.

According to Wired magazine, science has a problem: it does not do nearly enough to encourage and enable the sharing, analysis and interpretation of the vast swaths of data that individual researchers are collecting. If more credit were given to open sharing of research data, scientific progress would accelerate. (10)

Talent Shortage

To exploit the data flood, the McKinsey Global Institute projects that the United States needs 140,000 to 190,000 more workers with “deep analytical” expertise and 1.5 million more data-literate managers, whether retrained or hired. Clearly there will be a shortage of talent necessary for organizations to take advantage of big data.

Organizations need not only to put the right talent and technology in place but also to structure workflows and incentives to optimize the use of big data.

Good Book

In this excellent book on Big Data, two leading experts explain what big data is, how it will change our lives, and what we can do to protect ourselves from its hazards. Big Data is the first big book about the next big thing. (11) Read it.


  1. Wikipedia – Big Data:
  2. NY Times – The Age of Big Data:
  3. Why Big Data Is a Big Deal:
  4. Big Data: The Management Revolution:
  5. Here comes Amazon Machine Learning:
  6. WEF Davos – Big Data, Big Impact:
  7. Changed Political Campaign With Big Data:
  8. The next frontier for Innovation:
  9. Problems With Big Data:
  10. Science’s Big Data Problem:
  11. Book – Revolution That Will Transform:

Jim Pinto
Carlsbad, CA.
12 April 2015

1 comment:

  1. This has been going on for about 40 years; the difference today is that most people allow their activities to be traced. Previously we used credit cards, and after the data was collected, usually over a period of a month, it was possible to trace where the card holder had been. With more modern toys such as Apple Pay and cloud computing, the data is available in minutes. If you don't want to be traced, use cash....
    This whole data mining happens because we allow it; it is our choice. I personally don't like it, but like most people, I am too lazy to do anything about it....