Watson Teaches 'Big Analytics'
This vendor-written tech primer has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter's approach.
IBM Watson's impressive "Jeopardy!" win demonstrated awesome strides in computing power and ingenuity, but just as impressive was the way Watson's creators attacked an avalanche of information to come out victorious. Notably, Watson wasn't concerned with big data alone.
"Big data" is often cited as the core problem holding companies back from gaining a competitive advantage in this age of information overflow. Most organizations are fairly adept at capturing that information, but what ultimately matters is what they do with it and how quickly they use it to glean value. This is "big analytics." And though Watson is clearly a different animal from database analytics solutions for business, fundamentally, Watson is big analytics.
Working from just a single terabyte of data, Watson performed complex analyses at incredibly high speeds to come up with correct answers. For those of us in the business of data storage and analytics -- and, in fact, for most companies -- this illustrated the power and challenge of big analytics, not just big data.
A Combination Problem
For years, big data was considered a critical problem for businesses trying to capture information and then deliver new products or solutions to customers based on that knowledge. Initially, storage costs alone could quickly get out of hand, and, admittedly, the numbers associated with data collection look and sound daunting.
Retailers regularly collect massive amounts of information about customers from online, in-store and even social media sources. Financial institutions gather millions of daily credit card and bank transactions, and rely on multiple terabytes of historical data to create new business insights. A recent IDC report predicts data will grow some 44-fold over the course of the next decade.
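To put the 44-fold figure in annual terms, the arithmetic can be sketched as follows. This is a back-of-the-envelope illustration that assumes steady compound growth over exactly 10 years, which the IDC prediction as quoted here does not specify; the function name is hypothetical.

```python
def annual_growth_rate(total_multiple: float, years: int) -> float:
    """Constant yearly growth rate that compounds to `total_multiple` over `years`.

    Assumes smooth compound growth, which is a simplification.
    """
    return total_multiple ** (1 / years) - 1


# 44x over a decade, the figure cited above
rate = annual_growth_rate(44, 10)
print(f"Implied annual growth: {rate:.0%}")
```

Under that assumption, the data volume grows by roughly 46% every year, or more than doubles every two years.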
Too often, the industry focuses its attention primarily on this piece of the data problem. But today, those are simply big numbers. The second piece, often ignored or pushed aside, is the problem of big analytics, and even 100 terabytes of data is useless to a company that hasn't solved it.
The big analytics problem of course includes the aforementioned problems of scale. But modern analytic platforms must also be extremely fast in answering creative, often difficult questions that draw on multiple sources and are written in a variety of programming languages. That is, these platforms require velocity, agility and the capacity to deal with complexity.
Velocity, first and foremost, is about brute speed and power. Watson was able not only to come up with answers at a required level of confidence but also to physically buzz in before its human competitors. In business, vast stores of data -- customer information, social media feeds, financial records -- have diminishing returns as time goes by. If the information is not acted on immediately, its value plummets. For instance, financial institutions attempt to identify trades just 30 seconds ahead of the competition to maximize returns, or attempt to identify fraudulent patterns as they occur. They can't predict an event, however, if they must wait for big analytics to come back with an answer. Critically, businesses must now get from problem to question to answer in a drastically reduced timeframe.
Agility is the capacity to have a "conversation" with data. Watson was in a sense having a conversation with "Jeopardy!" host Alex Trebek. But the computer was also having a conversation with its data store, creating a series of answers with varying degrees of certainty.
Businesses successfully utilizing big analytics can take this process of knowledge discovery even further, identifying questions, exploring the answers and asking new questions based on those answers. This iterative quality of data analysis, rather than incremental exploration, can lead to a deeper understanding of business and markets, and begin to answer questions never before considered.
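The way Watson weighed answers of varying certainty, and buzzed in only at a required level of confidence, can be sketched in a few lines of Python. This is a minimal illustration, not IBM's implementation: the `Candidate` type, the `choose_answer` function, the sample candidates and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    answer: str
    confidence: float  # estimated probability of being correct, 0.0 to 1.0


def choose_answer(candidates: List[Candidate],
                  threshold: float = 0.5) -> Optional[Candidate]:
    """Return the most confident candidate, or None if it falls below the threshold."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.confidence)
    # Only "buzz in" when the best answer clears the required confidence level.
    return best if best.confidence >= threshold else None


# Made-up candidates for a single clue
candidates = [
    Candidate("Chicago", 0.82),
    Candidate("Toronto", 0.14),
    Candidate("Boston", 0.04),
]

choice = choose_answer(candidates)
print(choice.answer if choice else "No answer: confidence too low")
```

In an iterative analysis, a below-threshold result is a cue to refine the question or pull in more data and then ask again.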
Then there is complexity. Watson was able to understand the intricacies of human language, in many cases even the semantics of puns and wordplay. While database analytics solutions of course can't understand language, their ability to handle complex questions and explore gargantuan data stores is just as critical.
Enterprise risk management (ERM), now a central focus for companies, is one example of where these big analytics traits come together. By exploring risk across the organization, companies can map the "risk web" that shows causality, not simply correlation, among various risky actions. Analyzing this risk web in order to make sound decisions, often in a short timeframe, requires an analytic platform that delivers on the promise of big analytics.