'Big Data' Prep: 5 Things IT Should Do Now
Let business needs drive data dives
It sounds like a broken record, but the concept of IT/business alignment is absolutely critical to an initiative as big and varied as big data, IT analysts say.
Many of the initial big-data opportunities have been seeded in areas outside of IT, they say -- marketing, for example, has been early to tap into social media streams to gain better insights into customer requirements and buying trends.
While the business side may understand the opportunities, it is IT's responsibility to take charge of the data sharing and data federation concepts that are part and parcel of a big-data strategy.
"This is not something IT can go out and do on its own," says Dave Patton, principal of information management industries at PricewaterhouseCoopers LLP. "It will be hard to turn this into a story of success if [the initiative] is not aligned to business objectives."
Early in Catalina Marketing's big-data initiative, Williams brought business managers together with the company's financial planning and analysis (FPA) group in a team effort to make a business case for information architecture investments.
The business side identified areas where new insights could deliver value -- for example, in determining subsequent purchases based on shopping cart items or through a next-buy analysis based on product offers -- and the FPA team ran the numbers to quantify what the results would mean in terms of enhanced productivity or increased sales.
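To make the "next-buy" idea concrete, here is a minimal sketch of one simple way such an analysis could work: count how often items appear together in a basket, then suggest the most frequent companions. This is not Catalina's method, which the article does not describe. The basket data and the function names `co_purchase_counts` and `next_buy` are invented for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical shopping baskets, invented purely for illustration.
baskets = [
    {"diapers", "wipes", "formula"},
    {"diapers", "wipes"},
    {"coffee", "filters"},
    {"diapers", "formula"},
    {"coffee", "filters", "sugar"},
]


def co_purchase_counts(baskets):
    """For each item, count how often every other item shares a basket with it."""
    counts = defaultdict(Counter)
    for basket in baskets:
        # permutations yields both (a, b) and (b, a), so counts are symmetric.
        for a, b in permutations(basket, 2):
            counts[a][b] += 1
    return counts


def next_buy(item, counts, top_n=2):
    """Return up to top_n items most often bought alongside `item`."""
    return [other for other, _ in counts[item].most_common(top_n)]


counts = co_purchase_counts(baskets)
print(next_buy("coffee", counts, top_n=1))
```

A production analysis would work from far larger transaction histories and would likely weigh factors such as timing and offer response, but the counting step above is the kind of output an FPA team could then attach sales figures to.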
Re-evaluate infrastructure and data architecture
Big data will require major changes in both server and storage infrastructure and information management architecture at most companies, Gartner's Beyer and other experts contend. IT managers need to be prepared to expand the IT platform to deal with the ever-expanding stores of both structured and unstructured data, they say.
That requires figuring out the best approach to making the platform both extensible and scalable and developing a roadmap for integrating all of the disparate systems that will be the feeders for the big-data analysis effort.
"Today, most enterprises have disparate, siloed systems for payroll, for customer management, for marketing," says Anjul Bhambhri, IBM's vice president of big-data products. "CIOs really need to have a strategy in place for bringing these disparate, siloed systems together and building a system of systems. You want to be asking questions that flow across all these systems to get answers."
To be sure, not every system will need to be integrated; approaches will vary depending on the size of the company, the scope of the business problem, and the data requirements. But Bhambhri and others say the overarching goal should be to create an information management architecture that ensures data flow between systems. To create this foundation, companies will leverage technologies like middleware, service-oriented architecture, and business process integration, among others.
In the meantime, traditional data warehouse architectures are also under pressure. Gartner's Beyer says that 85% of currently deployed data warehouses will, in some respect, fail to address the new issues around extreme data management by 2015.
Even so, he says, "we don't want to give the idea that rip-and-replace is even on the table." Instead, existing repositories can be expanded and adapted to encompass built-in data processing capabilities.
"The warehouses of the past have been focused on determining what kind of data repository you have and where you have it. The new mindset is that data warehouses will be a combination of new and existing repositories plus data processing and delivery services," Beyer explains.
Bone up on the technology
The big-data world comes with a big list of new acronyms and technologies that have likely never graced a CIO's radar screen.
Open-source technology is getting most of the attention, with Hadoop, MapReduce, and NoSQL taking credit for helping Web-based giants like Google and Facebook churn through their reservoirs of big data. Many of these technologies, while starting to be offered in more commercial forms, are still fairly immature and require people with very specific skill sets.
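For readers new to the term, the MapReduce programming model can be sketched in a few lines of plain Python. This is a single-machine illustration of the concept only. It does not use Hadoop or any real framework API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (key, value) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


print(run_job(["big data", "Big deal"]))
```

In a real cluster, the map and reduce steps run in parallel across many machines and the framework handles the shuffle. That distribution is where the specialized skills come in.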
Beyond the new open-source options, IT groups will also have to ensure they are up to speed on other technologies important to the big-data world, such as in-database analytics, columnar databases and data warehouse appliances.
IT managers and their staffs need to dive in and at least familiarize themselves with these new tools in order to be properly situated to make big-data decisions going forward.
Prepare to hire or retrain staff
Whether it's a Hadoop expert or a data scientist, most IT organizations are sorely lacking the right talent to take the next steps with big data. The analytic skill sets are perhaps the most crucial, and they represent the area where the gap is currently largest.
McKinsey projects that in the U.S. alone, there will be a need by 2018 for between 140,000 and 190,000 additional experts in statistical methods and data-analysis technologies, including the widely hyped emerging role of "data scientist."
In addition, McKinsey anticipates the need for another 1.5 million data-literate managers, on either the business or tech side of the house, who have formal training in predictive analytics and statistics.
Under the IT department's jurisdiction, traditional data warehouse and BI professionals will require some retraining.
And in addition to traditional skills in information management, governance and database structure, the new big-data professionals need an understanding of semantics and mathematical disciplines -- not to mention expertise in the new predictive analytics tools and data management platforms that make up big data.
"The people who built the databases of the past are not necessarily the people who will be building the databases of the future," says Catalina's Williams. "Don't underestimate the complexity in trying to produce something like this."
For some companies, especially those in less populated areas, staffing will likely complicate the challenge. "[Big data] definitely requires a different mindset and skills in a host of areas," says Rick Cowan, CIO at True Textiles, in Guilford, Maine, a contract manufacturer of interior fabrics for the commercial market.
"As a medium-sized business, it's been a challenge to be able to get staff and keep them up to speed with the ever-changing environment." To address the need, Cowan has begun formally retraining programmers and database analysts in advanced analytics.
IT department heads will have to do some transforming of their own to excel in this brave new world. While the best tech leaders of the past have been partly information librarian and partly infrastructure engineer, the IT managers of the future will be a combination of data scientist and business process engineer, says Gartner's Beyer.
"CIOs have been used to managing infrastructure based on a given instruction set from the business, as opposed to a CIO that is able to identify the opportunity and therefore push towards innovative use of information," he explains. "That's the transformation that needs to happen."
Stackpole, a frequent Computerworld contributor, has reported on business and technology for more than 20 years.