It was shocking to learn yesterday that Kernel.org was hacked last month. News like that is routine in the world at large, but not in the home of the all-important heart of Linux.
Three separate explanations of why the kernel source itself should be safe have appeared since the hack was first discovered. In essence, they boil down to the fact that kernel development is done using Linux creator Linus Torvalds' own Git distributed revision control system. Here's why that makes such a big difference.
'A Cryptographically Secure Hash'
“The potential damage of cracking kernel.org is far less than typical software repositories,” reads the note on the Kernel.org website.
“For each of the nearly 40,000 files in the Linux kernel, a cryptographically secure SHA-1 hash is calculated to uniquely define the exact contents of that file,” the note explains. “Git is designed so that the name of each version of the kernel depends upon the complete development history leading up to that version. Once it is published, it is not possible to change the old versions without it being noticed.”
Furthermore, those files and their associated hashes exist in numerous places: on the kernel.org machine and its mirrors as well as on the hard drives of several thousand kernel developers, distribution maintainers and others involved with kernel.org, the site adds.
“Any tampering with any file in the kernel.org repository would immediately be noticed by each developer as they updated their personal repository, which most do daily.”
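To make the per-file hash a little more concrete, here is a minimal sketch in Python. It assumes Git's blob format, in which the file's bytes are prefixed with a short header ("blob", the size in bytes, and a NUL byte) before SHA-1 is applied. The function name git_blob_hash is made up for this illustration and is not part of Git.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 name Git would give a file with these exact bytes."""
    # Git hashes a small header plus the content, not the raw content alone.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


original = b"int main(void) { return 0; }\n"
altered = b"int main(void) { return 1; }\n"

# Changing a single character gives the file a completely different name.
print(git_blob_hash(original))
print(git_blob_hash(altered))
```

If Git is installed, the same value should come back from running git hash-object on a file containing those bytes.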
'No Need to Worry'
Jonathan Corbet, executive editor at LWN.net and a Linux kernel contributor, had similarly reassuring words.
While admitting that the breach was “disturbing and embarrassing,” Corbet wrote that “there is no need to worry about the integrity of the kernel source or of any other software hosted on the kernel.org systems.”
Git's hash function produces 160-bit numbers, Corbet noted, and any time the contents of a file change, the hash does too. “An attacker would be unable to change a file without changing its hash as well. Git checks hashes regularly, so a simplistic attempt to corrupt a file would be flagged almost immediately,” he pointed out.
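The check Corbet describes can be sketched roughly as follows. This is a toy model, not Git's actual storage code: it assumes objects sit in a plain dictionary keyed by their hash, and the names object_id and find_corrupt are hypothetical helpers invented for the example.

```python
import hashlib
from typing import Dict, List


def object_id(content: bytes) -> str:
    """SHA-1 over a Git-style blob header plus the content (40 hex digits)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def find_corrupt(store: Dict[str, bytes]) -> List[str]:
    """Return every stored name whose content no longer hashes to that name."""
    return [name for name, content in store.items() if object_id(content) != name]


# SHA-1 digests are 20 bytes long, which is the 160 bits mentioned above.
assert hashlib.sha1().digest_size * 8 == 160

original = b"int main(void) { return 0; }\n"
store = {object_id(original): original}
assert find_corrupt(store) == []  # an untouched store passes the check

# Simulate an attacker editing the stored bytes without renaming the object.
name = next(iter(store))
store[name] = b"int main(void) { return 1; }\n"
print(find_corrupt(store) == [name])  # the altered object is reported
```

The point of the sketch is that the name and the content cannot be changed independently: edit the bytes and the old name no longer matches.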
'It Would Be Immediately Apparent'
Then, too, there's the fact that “for any given state of the kernel source tree, git calculates a hash based on (1) the hashes of all the files contained within that tree, and (2) the hashes of all of the previous states of the tree,” Corbet added. “So, for example, the hash for the kernel at the 3.0 release is 02f8c6aee8df3cdc935e9bdd4f2d020306035dbe. There is no way to change any of the files within that release - or within any previous release - without changing that hash. If anybody (even the kernel.org repository) were to present a 3.0 kernel with a different hash, it would be immediately apparent that something was not right.”
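The chaining Corbet describes can be illustrated with a deliberately simplified model. Git's real tree and commit objects carry more information (file modes, authors, dates, messages) and use a different byte layout, so the hashes below will not match anything Git produces. The helper names tree_hash, commit_hash and history_hashes are assumptions made for this sketch.

```python
import hashlib
from typing import Dict, List, Optional


def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def tree_hash(files: Dict[str, bytes]) -> str:
    """Hash of one tree state: depends on every file name and file hash."""
    lines = [f"{sha1_hex(content)} {name}" for name, content in sorted(files.items())]
    return sha1_hex("\n".join(lines).encode("utf-8"))


def commit_hash(files: Dict[str, bytes], parent: Optional[str]) -> str:
    """Hash of one commit: depends on the tree hash and the parent's hash."""
    record = f"tree {tree_hash(files)}\nparent {parent or 'none'}\n"
    return sha1_hex(record.encode("utf-8"))


def history_hashes(states: List[Dict[str, bytes]]) -> List[str]:
    """Walk the states oldest to newest, chaining each commit to the last."""
    hashes: List[str] = []
    parent: Optional[str] = None
    for files in states:
        parent = commit_hash(files, parent)
        hashes.append(parent)
    return hashes


honest = [{"main.c": b"v1\n"}, {"main.c": b"v2\n"}, {"main.c": b"v3\n"}]
# Same history, except the oldest version has been quietly altered.
tampered = [{"main.c": b"v1 backdoor\n"}, {"main.c": b"v2\n"}, {"main.c": b"v3\n"}]

a = history_hashes(honest)
b = history_hashes(tampered)
# Every later commit hash changes too, because each one includes its parent.
print(all(x != y for x, y in zip(a, b)))
```

In this model, as in Corbet's description, anyone holding the honest hash for the latest state would spot the mismatch even though the latest files themselves are identical.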
For even more technical detail, there is a blog post from Git developer Junio C. Hamano, as noted on The H.
Bottom line? If the words of these experts are anything to go by--and I'm pretty sure they are--the Linux kernel is safe and sound.