Now that many enterprises are seeing value in big data analysis, it may be time for their database administrators and data warehouse managers to get involved.
Oracle has released a new extension for its Oracle Data Integrator middleware that allows DBAs and data warehouse experts to treat big data repositories as just another data source, alongside their structured databases and data warehouses.
The Oracle Data Integrator for Big Data “makes a non-Hadoop developer instantly productive on Hadoop,” said Jeff Pollock, Oracle vice president of product management.
Big data platforms such as Hadoop and Spark were initially geared more toward programmers than DBAs, using languages such as Java and Python, Pollock said. Yet traditional enterprise data analysis has largely been managed by DBAs and experts in ETL (extract, transform and load) tools, using SQL and drag-and-drop visual interfaces.
The Data Integrator for Big Data extends Oracle’s ODI product to handle big data sources.
ODI lets organizations pull together data from multiple sources and formats, such as relational data hosted in IBM or Microsoft databases, and material residing in Teradata data warehouses. So it was a natural step to connect big data repositories to ODI as well.
With the extension, “you don’t have to retrain a database administrator on Hive for Hadoop. We can now give them a toolkit that they will be naturally familiar with,” Pollock said. The administrator can work with familiar concepts such as entities and relations, and 4GL data flow mapping. The software “automatically generates the code in the different underlying languages” needed to complete the job, Pollock said.
The software can work with any Hadoop or Spark deployment, and doesn’t require software installation on any of the data nodes. Drawing on distributed computing, Data Integrator for Big Data carries out all the needed computations on the nodes where the data is stored.
A retail organization could use the software to analyze its customers’ purchasing histories. Real-time data capture systems such as Oracle GoldenGate 12c could move transactional data into a Hadoop cluster, where it then can be prepared for analysis by ODI.
Oracle is not alone in attempting to bridge the new big data tools with traditional data analysis software. Last week, Hewlett-Packard released a software package that allows customers to integrate HP’s Vertica analysis database with HP Autonomy’s IDOL (Intelligent Data Operating Layer) platform, providing a way for organizations to speedily analyze large amounts of unstructured data.