Facebook speeds PHP by crafting a PHP virtual machine
By Joab Jackson
Social networking giant Facebook has taken another step toward making the PHP Web programming language run more quickly. The company has developed a PHP virtual machine that it says can execute the language as much as nine times as quickly as running PHP natively on large systems.
“Our goal is to make PHP run really, really quickly,” said Joel Pobar, a Facebook engineering manager. Facebook has been using the virtual machine, called the HipHop Virtual Machine (HHVM), across all of its servers since earlier this year.
Pobar discussed the virtual machine at the O’Reilly Open Source Conference (OSCON) being held this week in Portland, Oregon.
Shares its development tools
HHVM is not Facebook’s first foray into customizing PHP for faster use. PHP is an interpreted language, meaning that the source code is translated and executed by an interpreter at run time rather than compiled in advance. Generally speaking, programs written in interpreted languages such as PHP tend not to run as quickly as those written in languages such as C or C++, which are compiled beforehand into machine code. Facebook has remained loyal to PHP because it is widely understood by many of the Web programmers who work for the company.
For several years, Facebook enjoyed considerable performance gains from the first version of HipHop, which translated PHP into C++ so that it could be compiled ahead of time. Still, the company sought other ways to speed the delivery of dynamically created Web pages to its billion or so users. “Our performance strategy for that was going to tap out,” Pobar admitted.
HHVM is the next step for Facebook. Under development for about three years, HHVM works on the same principle as the Java Virtual Machine (JVM). HHVM has a just-in-time (JIT) compiler that converts the program into native machine code at run time, when it is needed. (The previous HipHop, renamed HPHPc, has now been retired within Facebook.)
This JIT approach allows the virtual machine to “make smarter decisions at runtime,” Pobar said. For instance, if a call is made to the MySQL database to read a row of data, the HHVM can, on the fly, figure out what type of data it is, such as an integer or a string. It then can generate or call code on the fly that would be best suited for handling this particular type of data.
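The idea Pobar describes can be illustrated with a minimal sketch. It is written in Python rather than PHP and is not HHVM's actual code: the class `SpecializingCache` and the `render_*` functions are hypothetical names invented for this example. The sketch waits until a value arrives, checks its type, and then reuses a handler suited to that type on later calls.

```python
from typing import Any, Callable, Dict


# Hypothetical handlers, each suited to a single value type.
def render_int(value: int) -> str:
    return format(value, "d")


def render_str(value: str) -> str:
    return '"' + value + '"'


def render_generic(value: Any) -> str:
    # Fallback for any type that has no dedicated handler.
    return repr(value)


class SpecializingCache:
    """Picks a handler the first time a type is seen, then reuses it."""

    def __init__(self) -> None:
        self._handlers: Dict[type, Callable[[Any], str]] = {}
        self.compiles = 0  # counts how often a handler had to be chosen

    def _specialize(self, kind: type) -> Callable[[Any], str]:
        # Stands in for the work a JIT would do to generate code.
        self.compiles += 1
        if kind is int:
            return render_int
        if kind is str:
            return render_str
        return render_generic

    def render(self, value: Any) -> str:
        kind = type(value)  # the type is only known at run time
        handler = self._handlers.get(kind)
        if handler is None:
            handler = self._specialize(kind)
            self._handlers[kind] = handler
        return handler(value)


cache = SpecializingCache()
results = [cache.render(v) for v in (42, "abc", 7)]
```

In this sketch, the two integers share one handler, so only two handlers are chosen for the three values. A real JIT compiler generates machine code rather than picking among prewritten functions, but the decision is made at the same point: after the data has been seen.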
With the old HipHop, “the best it can do is analyze the entire Facebook codebase, reason about it and then specialize code based on its reasoning. But it can’t get all of the reasoning right. There are parts of the code base that you can not simply infer about or reason about,” Pobar said.
Virtual system speedier
Pobar estimated that HHVM is about twice as fast as HPHPc was, and about nine times as fast as running straight PHP.
Facebook has posted the code for HHVM on GitHub, in the hope that others will use it to speed their PHP websites as well.
HHVM is optimized for handling very large, heavily used PHP codebases. Pobar reckoned that using HHVM for standard-sized websites, such as one hosting a WordPress blog, would gain only about a fivefold performance improvement.
“If you take some PHP and run it on HipHop, the CPU execution time [may] not be the limiting factor for performance. Chances are [the system is] spending too much time talking to the database or spending too much time talking to [the] memcache” caching layer, Pobar said.
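Pobar's point can be put in rough numbers with Amdahl's law, which limits the overall gain to the share of time that actually gets faster. The sketch below is an illustration only: the function name `overall_speedup` and the 20 percent CPU share are assumptions made for this example, not figures from Facebook.

```python
def overall_speedup(cpu_fraction: float, cpu_speedup: float) -> float:
    """Amdahl's law: total speedup when only the CPU-bound share gets faster.

    cpu_fraction is the share of request time spent executing PHP
    (between 0 and 1); the rest is assumed to be waiting on the
    database or cache, which a faster runtime does not change.
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    if cpu_speedup <= 0:
        raise ValueError("cpu_speedup must be positive")
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup)


# Hypothetical site: 20 percent of each request is PHP execution.
io_bound_site = overall_speedup(cpu_fraction=0.2, cpu_speedup=9.0)

# Hypothetical site: all of each request is PHP execution.
cpu_bound_site = overall_speedup(cpu_fraction=1.0, cpu_speedup=9.0)
```

Under these assumed figures, a ninefold faster runtime makes the first site only about 1.2 times as fast overall, while the second site gets the full ninefold gain.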