Aster Data Updates 'frontline' Analytic Database

Startup Aster Data Systems released the 3.0 version of its nCluster analytic database on Tuesday, framing it as ideal for "frontline" data warehousing.

"Traditionally, we think of data warehousing as a back-office task," Aster CEO Mayank Bawa wrote in a blog post Tuesday. "The data warehouse can be loaded in separate load windows; loads can run late (the net effect is that business users will get their reports late); loads, backups, and scale-up can take data warehouses offline -- which is OK since these tasks can be done on non-business hours (nights/weekends)."

But Aster's customers, which include aCerno, an Internet advertising delivery network, "rely on data analytics for their revenue," Bawa said.

Aster's nCluster 3.0 spreads workloads over a number of servers and makes it easy to add machines for more power. The software also splits the components of a data-analysis workload into discrete pieces.

A "loader" tier deals with data loading and export to and from external sources; a "worker" layer stores data on locally attached disks for querying; and a layer of "queen" nodes performs intelligent query planning and processing.

Meanwhile, users work with the cluster as if it were a single entity.

The ability to selectively scale segments of the cluster means users can add resources in areas where they're needed most, Aster says.

To these core capabilities, the 3.0 release adds a number of functions for "always-on" use, including the ability to add capacity, rebalance data and recover data while the system is live.

Aster also worked to add parallelization throughout the system, according to a company official.

"We want to build systems that can handle 10x, 100x more data than any other system today. But this is too much data for any single commodity server," said CTO Tasso Argyros in a blog post. "So we put a lot of R&D effort into parallelizing every single function of the system -- not only querying, but also loading, data export, backup, and upgrades. Plus, we allow our users to choose how much they want to parallelize all these functions, without having to scale up the whole system."

The release also includes support for MapReduce, a programming technique originally developed by Google that makes it easier for developers to write programs for analyzing large sets of data. Aster's competitor Greenplum also recently announced MapReduce support.
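For readers unfamiliar with the technique, the general MapReduce pattern can be sketched in a few lines of Python. This is only an illustration of the map, group and reduce steps running in a single process; it is not Aster's or Google's implementation, and the function names and the sample click records are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit one (key, value) pair per click record: (url, 1)."""
    _user, url = record
    yield (url, 1)


def shuffle(pairs):
    """Group all emitted values by key, as a cluster would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle and reduce over the records in one process."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical sample data: (user, url) click events.
clicks = [("u1", "/home"), ("u2", "/home"), ("u1", "/ads")]
counts = map_reduce(clicks)
```

In a real cluster, the map and reduce calls would run in parallel on many machines, each handling a slice of the data; the developer still writes only the two small functions.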

Additional features include the ability to select data compression levels for individual tables; a "one-click" upgrade tool; and better security features, such as LDAP (Lightweight Directory Access Protocol) for authentication and the ability to manage user privileges at the cluster, database and table levels.

Many of Aster's initial customers, which also include MySpace, are Web-focused, said Curt Monash, president of Monash Research.

"A tremendous fraction of the growth and opportunity in data warehousing lies in dealing with relatively new kinds of data," he said. "There are large data warehouses dealing with traditional OLTP, transactional data, but Aster is not necessarily a leading competitor in analyzing that. The sweet spot in analyzing large amounts of data is currently Web data and associated network events."

Monash cited parallel processing administration and support for MapReduce as Aster's key strengths: "Aster is a startup with a relatively immature product, but they've put a lot of thought into how to make parallel processing easy to administer."

NCluster 3.0 runs on standard x86 servers. Pricing is based on how much data is being managed. The company previously said that costs start at US$100,000.
