Flash is rocketing into big-data analytics

Pure Storage's upcoming FlashBlade will store petabytes of data at less than $1 per gigabyte


The FlashBlade all-flash system from Pure Storage.

Credit: Pure Storage

At one time, all-flash storage arrays were used for a single mission-critical application with a need for speed, usually in big IT shops. Now they're poised to take over many more parts of IT.

Systems are being scaled down and tuned to the requirements of medium-sized enterprises, while larger, petabyte-scale flash platforms are about to take on big-data number crunching with unprecedented performance.

Falling costs are the main reason. Flash media gets cheaper as it packs more bits into the same amount of space, so its speed advantage over spinning disks is within reach for more enterprises. And at larger scale, it boosts data-center efficiency in ways that can multiply the savings.

Pure Storage has been one of the most fervent promoters of this trend. All its products have been all-flash since the company's founding in 2009. On Monday, Pure is upping its game, addressing large-scale analytics workloads and reaching out to enterprises that haven't been able to afford all-flash systems before.

At its inaugural Pure//Accelerate user conference in San Francisco, Pure is announcing the FlashBlade, a platform designed to store petabytes of unstructured data, such as images and social media posts.

The FlashBlade is aimed at emerging applications that require fast access to data for almost real-time decision-making. This is the kind of technology an athletic-wear company needs to deliver product offers related to the star players in a soccer match that's still being played, said IDC analyst Eric Burgener. It allows the manufacturer to analyze social media posts to determine whose shoes to promote.

The system is in beta testing now and should ship commercially by early next year, Pure says. Among the beta testers are car companies using it for things like airflow simulations. Another early adopter is a Web site that takes in media posted by users, rapidly transcodes it and analyzes it, then makes it available for others to view.

There are ways to do big-data analytics with current flash technology, but the FlashBlade and other emerging products, including EMC's recently announced DSSD D5 array, are purpose-built for the task and should make that work easier to manage, Burgener said.

The FlashBlade will pack 1.6PB of effective capacity in a 4U (7-inch-high) rack unit. In-line compression and deduplication help it achieve that density. Users can scale out the FlashBlade by adding more nodes, gaining both capacity and computing power along the way. The nodes connect over 40-Gigabit Ethernet.

The system uses pure flash media, not SSDs (solid-state drives), and a single software base that runs everything, including flash controller functions and software-defined networking between the nodes. That code runs on standard x86 processors.

Pure's big-data launch comes just weeks after EMC's DSSD D5 announcement. One difference between the platforms is that EMC is using NVMe (Non-Volatile Memory Express) instead of Ethernet to interconnect systems. That's likely to give Pure a price advantage, said Gartner analyst Joe Unsworth.

Pure says the FlashBlade will cost less than $1 per gigabyte of effective storage, closer than ever to the cost of hard disk drives. And at petabyte scale, flash gets even cheaper than arrays of spinning disks, all things considered, analysts say. Flash takes less energy and data-center space, a difference that starts to add up when there's a lot of data to accommodate. In addition, flash feeds data to servers so quickly that processors don't sit idle waiting for bits to arrive, so companies don't need as much computing capacity, IDC's Burgener said.
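To put the per-gigabyte figure in perspective, here is a rough back-of-the-envelope sketch in Python. It is not Pure's pricing model. It assumes decimal units (1PB = 1,000,000GB), treats $1 per gigabyte as a ceiling rather than an actual price, and assumes that ceiling applies evenly across a chassis filled to the stated 1.6PB of effective capacity. The function name is made up for illustration.

```python
# Back-of-the-envelope ceiling on what a full chassis could cost.
# Assumption: decimal units, so 1 PB = 1,000,000 GB.
GB_PER_PB = 1_000_000


def cost_ceiling_usd(effective_pb: float, max_usd_per_gb: float) -> float:
    """Upper bound on cost, given effective capacity and a per-GB price cap."""
    return effective_pb * GB_PER_PB * max_usd_per_gb


# Figures quoted in the article: 1.6PB effective, "less than $1 per gigabyte".
ceiling = cost_ceiling_usd(effective_pb=1.6, max_usd_per_gb=1.00)
print(f"Implied ceiling for a full chassis: under ${ceiling:,.0f}")
```

Under those assumptions, a fully populated 4U chassis would come in below roughly $1.6 million. The figure describes effective capacity after compression and deduplication, so the cost per gigabyte of raw flash would be higher.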

Also on Monday, Pure is introducing the FlashArray//m10, a scaled-down version of its standard array. The system will have as much as 30TB of usable capacity and cost less than $50,000. It can be upgraded to the FlashArray//m20 and larger systems as a company's needs grow. The FlashArray//m10 will also form the basis of the FlashStack Mini, a converged infrastructure system that bundles it with Cisco Systems servers and virtualization software from either VMware or Microsoft. The 9U systems will start at less than $100,000. The array and the converged systems will ship in June.
