Facebook open-source cache squeezes more from flash disks
Facebook continues to push the boundaries of storage and server technology in order to more quickly serve its billion users, and the results are being offered as open-source technology that can also benefit other companies.
Recently, Facebook updated its internally developed caching software, called Flashcache, to more efficiently use the thousands of solid-state drives (SSDs) that the social networking giant deploys to store frequently consulted data.
The newly released Flashcache 3.0 is able to make better decisions about what data to cache, while reducing the amount of wear and tear on expensive flash disks.
“With these improvements, Flashcache has become a building block in the Facebook stack,” wrote Domas Mituzas, a Facebook database engineer who authored a blog post explaining the updates to the open-source software.
The work aims to improve overall Facebook performance without unduly driving up operating costs.
“While the cost per GB for flash is coming down, it’s still not where it needs to be,” Mituzas wrote. Given the premium prices commanded for SSDs, Facebook doesn’t want to wear out these disks too quickly. “SSDs have limited write cycles, so we have to make sure that we’re not writing too much.”
Other open-source projects
Flashcache is one of several software projects that Facebook developed in-house and later released as open source. Earlier this year, for instance, the company released a virtual machine, called HipHop, that speeds the processing of PHP code.
The company hopes that other organizations will reuse programs such as HipHop and Flashcache and eventually contribute to their further development. Like other open-source caching software such as Memcached and Redis, Flashcache can be used to speed the responsiveness of a heavily visited website or popular Web application.
Facebook originally created Flashcache to boost the responsiveness of the MySQL databases that store user data. The software can be loaded into the Linux kernel as a module without making any changes to the kernel itself.
The idea behind Flashcache is to use SSDs to hold the material that users request most often. SSDs tend to be faster than traditional rotating-platter hard drives, though they are also more expensive per GB. So it would not be cost-effective for Facebook to store all of its data on SSDs, especially if the vast majority of Facebook user data is rarely consulted.
Although designed to work with MySQL and the MySQL InnoDB database storage engine, Flashcache can be used as a general caching mechanism for Linux systems.
Flashcache can also shorten the time it takes to write data to disk, from the user’s perspective, by saving newly updated data on SSD first and then writing it to the hard drives later.
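The write path described above is commonly called write-back caching. The following is a minimal Python sketch of the idea, not Flashcache's actual kernel code: the class name `WriteBackCache`, its methods, and the use of plain dictionaries to stand in for the SSD and the hard drives are all assumptions made for illustration.

```python
class WriteBackCache:
    """Toy write-back cache: writes land in a fast tier first
    and are copied to a slow tier later. Hypothetical names throughout."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # stands in for the hard drives
        self.cache = {}                     # stands in for the SSD
        self.dirty = set()                  # blocks not yet copied to the slow tier

    def write(self, block, data):
        # The write completes as soon as the data is in the fast tier.
        self.cache[block] = data
        self.dirty.add(block)

    def flush(self):
        # Later, copy every dirty block to the slow tier.
        flushed = len(self.dirty)
        for block in sorted(self.dirty):
            self.backing_store[block] = self.cache[block]
        self.dirty.clear()
        return flushed


disk = {}
cache = WriteBackCache(disk)
cache.write(7, "updated row")
on_disk_before_flush = 7 in disk   # False: only the fast tier has the data
flushed_count = cache.flush()      # now the slow tier catches up
```

In this sketch the caller sees the write finish before the slow tier is touched, which is where the perceived speedup comes from. The trade-off is that dirty blocks exist only in the fast tier until they are flushed.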
The updated Flashcache module improves performance in read-write distribution, cache eviction and write efficiency.
Analyzing Flashcache performance, Facebook found that most of its caches hold a small subset of data that is read much more frequently than the rest.
With the previous version of Flashcache, 50 percent of a cache’s contents accounted for 80 percent of disk operations. Such a concentration of frequently consulted material could cause performance bottlenecks.
To improve Flashcache’s read-write distribution, the engineers developed a number of techniques to automatically position the data so that cache reads are distributed more evenly across the SSD. Now 50 percent of the cache accounts for 50 percent of the disk operations.
To improve the process of determining which data to move off the cache, a process called cache eviction, Flashcache switched from the FIFO (first in, first out) algorithm, in which the oldest data in the cache is removed first to make room for new data, to an LRU (least recently used) algorithm, which discards the data that hasn’t been requested for the longest period of time.
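The difference between the two eviction policies can be seen in a small simulation. This is a sketch of the textbook FIFO and LRU algorithms in Python, not Flashcache's own implementation; the function name `count_hits` and the sample access pattern are made up for illustration.

```python
from collections import OrderedDict


def count_hits(policy, capacity, accesses):
    """Count cache hits for a sequence of block accesses.

    policy is "fifo" or "lru"; capacity is the number of blocks the cache holds.
    """
    if policy not in ("fifo", "lru"):
        raise ValueError("policy must be 'fifo' or 'lru'")
    if capacity < 1:
        raise ValueError("capacity must be at least 1")

    cache = OrderedDict()  # front of the dict is the next eviction candidate
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            if policy == "lru":
                # LRU refreshes a block's position every time it is used.
                cache.move_to_end(block)
            # FIFO leaves the order alone: only insertion time matters.
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict from the front
            cache[block] = True
    return hits


# Block "a" is popular; "b" and "c" are each touched once.
pattern = ["a", "b", "a", "c", "a"]
fifo_hits = count_hits("fifo", 2, pattern)  # 1
lru_hits = count_hits("lru", 2, pattern)    # 2
```

With a two-block cache, FIFO evicts the popular block "a" simply because it arrived first, while LRU keeps it because it was used recently. On workloads where a small set of data is read far more often than the rest, that difference tends to favor LRU.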
Improvements were also made in write efficiency.
Previously, the software would write to disk only when it had accumulated a certain amount of data that was ready to be written. This resulted in uneven performance across different caches, so Facebook engineers developed an approach that would write the cached data to disk whenever a copy of that data was requested by a user, which resulted in a smoother flow of write operations.
Thanks to these improvements, the updated caching mechanism has an average hit rate (the share of user requests that are served from the cache) of 80 percent, up from 60 percent in the previous version. This means more data is served more quickly.
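To put those hit rates in perspective, the short Python sketch below works out how many requests miss the cache at each rate. The function name `cache_misses` and the figure of 1,000 requests are assumptions chosen for round numbers; this is arithmetic on the two quoted percentages, not a measurement of Facebook's systems.

```python
def cache_misses(total_requests, hit_percent):
    """Number of requests that miss the cache at a given hit rate (in percent)."""
    if not 0 <= hit_percent <= 100:
        raise ValueError("hit_percent must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return total_requests * (100 - hit_percent) // 100


requests = 1000  # hypothetical sample size
misses_before = cache_misses(requests, 60)  # 400
misses_after = cache_misses(requests, 80)   # 200
```

Under these assumptions, raising the hit rate from 60 to 80 percent halves the number of requests that fall through the cache to the slower hard drives.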
Updating the software has also slashed server I/O (input/output) required to read data by 40 percent, and reduced the I/O required to write data by 75 percent. For a company that is running thousands of servers, such a reduction in traffic can help make more efficient use of servers and keep hardware costs manageable.