Oracle has partnered with Cloudera to bring Apache Hadoop to its Oracle Big Data Appliance, which the company officially released Tuesday.
The newly released appliance comes with Cloudera's Distribution Including Apache Hadoop (CDH), along with the Cloudera Manager software. The rack also comes with a copy of the Oracle NoSQL Database. Oracle announced the Big Data Appliance, along with the Oracle NoSQL Database, at OpenWorld last September.
"A lot of organizations have become very interested in big data. There is tremendous business value in analyzing new types of business data," said George Lumpkin, Oracle's vice president of data warehousing product management.
Oracle is positioning the appliance for managing and analyzing data sets that may be too large for, or otherwise unsuited to, a database, such as telemetry data, click-stream data or other log data. "You may not want to keep the data in a database, but you do want to store it and analyze it," Lumpkin said. The appliance is intended for organizations that want to undertake big data-style analysis but may not have the in-house expertise to assemble large Hadoop or NoSQL-based systems.
Alongside the appliance, Oracle also released Oracle Big Data Connectors, a set of drivers for exchanging data between the Big Data Appliance and other Oracle products, such as the Oracle Database 11g, the Oracle Exadata Database Machine, Oracle Exalogic Elastic Cloud and Oracle Exalytics In-Memory Machine.
"We are positioning this as something that runs alongside" other Oracle-based systems, Lumpkin said. "Big data is more than just a cluster of hardware running Hadoop. It is an overall information architecture for enabling companies to analyze data and make decisions."
Oracle will provide initial customer support for the appliance, though Cloudera engineers will handle tougher Hadoop-based challenges, Lumpkin said.
The market for commercial Hadoop has grown competitive of late, as Cloudera has been joined by Yahoo spinoff Hortonworks and MapR in offering commercial support for the open-source data processing platform. Cloudera Chief Operating Officer Kirk Dunn declined to say whether Oracle and Cloudera would extend their cooperation to additional offerings, though he expressed optimism that the partnership would be a long and fruitful one.
The appliance consists of 18 Oracle x86 Sun servers, all running Oracle Linux, featuring 216 processor cores, 864GB of working memory and 648TB of raw disk storage.
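For a sense of what those rack-wide totals mean per server, the arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes the resources are spread evenly across the 18 nodes, which Oracle did not state, and it assumes HDFS runs at its stock default of three copies per block, which may not match how the appliance is actually configured. The names used here are made up for the example.

```python
# Totals reported for one rack of the appliance.
NODES = 18
TOTAL_CORES = 216
TOTAL_MEMORY_GB = 864
TOTAL_RAW_DISK_TB = 648

# Assumption: HDFS keeps three copies of each block (its stock default).
# The appliance's real setting is not given in the article.
ASSUMED_REPLICATION = 3


def per_node(total, nodes=NODES):
    """Split a rack-wide total evenly across the nodes (an assumption)."""
    return total / nodes


def summarize():
    """Return per-node figures and a rough post-replication capacity."""
    return {
        "cores_per_node": per_node(TOTAL_CORES),
        "memory_gb_per_node": per_node(TOTAL_MEMORY_GB),
        "raw_disk_tb_per_node": per_node(TOTAL_RAW_DISK_TB),
        # Raw disk divided by the replica count, ignoring all other overhead.
        "approx_capacity_tb_after_replication": TOTAL_RAW_DISK_TB / ASSUMED_REPLICATION,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:g}")
```

Under those assumptions, each server works out to 12 cores, 48GB of memory and 36TB of raw disk, and the rack holds roughly 216TB of unique data once every block is stored three times. Space reserved for the operating system, temporary files and the NoSQL database would reduce that figure further.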
The package includes 40Gb/s InfiniBand connectivity among the nodes, a rarity among Hadoop deployments, many of which use Ethernet to connect the nodes. Lumpkin said InfiniBand would speed data transfers within the system. Multiple racks can be tethered together in a cluster configuration. There is no theoretical limit to how many racks can be clustered together, though configurations of more than eight racks would require additional switches, Lumpkin said.
The appliance comes with the community edition of the Oracle NoSQL Database, though users can also upgrade to the enterprise edition. The appliance also comes with a copy of the Oracle Java HotSpot Virtual Machine, a wise inclusion given that Java is among the most widely used languages for writing Hadoop jobs.