If you’ve become accustomed to thinking of CPUs and GPUs as chips that fit in the palm of your hand, think again: Cerebras Systems says it has designed the largest chip ever, one that’s as large as your computer’s keyboard.
Cerebras is scheduled to unveil this whopper of a chip tonight at the Hot Chips conference at Stanford University. While the largest GPU includes 21.1 billion transistors and requires 815 square millimeters of die space, the Cerebras Systems chip includes 1.2 trillion transistors and requires 46,225 square millimeters of silicon. That’s about 71.6 square inches, or a chip roughly 8.5 inches (215mm) on a side. It’s so large that Cerebras published photos comparing it to a PC keyboard!
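For readers who want to check the size figures, here is a minimal Python sketch of the unit conversion. The only input taken from the article is the 46,225-square-millimeter area; the function and variable names are illustrative, and treating the die as a perfect square is an assumption made to get a feel for its footprint.

```python
import math

MM_PER_INCH = 25.4  # exact, by definition of the inch


def mm2_to_in2(area_mm2: float) -> float:
    """Convert an area in square millimeters to square inches."""
    return area_mm2 / MM_PER_INCH ** 2


# Die area quoted in the article.
cerebras_area_mm2 = 46_225

area_in2 = mm2_to_in2(cerebras_area_mm2)

# Assumption: a perfectly square die, used only to estimate the footprint.
side_mm = math.sqrt(cerebras_area_mm2)
side_in = side_mm / MM_PER_INCH

print(f"Area: {area_in2:.2f} square inches")
print(f"Equivalent square: {side_mm:.0f} mm ({side_in:.2f} in) per side")
```

Under that assumption, the quoted area works out to a square about 215mm per side, or roughly the width of a compact keyboard.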
No, the Cerebras chip isn’t designed for your PC. But is it real? Cerebras says it is, and that it is already running customer workloads. However, the company also held back crucial details, such as the exact number of cores and the amount of memory, as well as how expensive the chip is to manufacture. Interestingly, it’s manufactured on a relatively old 16nm process technology from TSMC.
Cerebras designed its chip for deep learning, which it calls a large and growing portion of data center workloads. Cerebras characterized its chip as a “wafer scale” implementation of a neural network, essentially clustering the logic, interconnect and memory onto a single piece of silicon. While manufacturing it may be costly, the company feels that on-chip interconnects are both faster and less expensive than building and connecting discrete chips.
It’s the eye-popping size of the chip that can’t help but raise eyebrows, however. Consider: Intel’s Itanium 2 processor, code-named McKinley, was considered absolutely massive when it debuted in 2002 with 221 million transistors. The Cerebras chip has over 5,000 times as many transistors as McKinley did, on a die more than 56 times the size of the largest GPU’s! Even IBM’s upcoming Power9 iteration, announced here at Hot Chips, has only 8 billion transistors.
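The multiples are easy to reproduce from the figures quoted in this article. The short Python sketch below does the division; the constant and function names are illustrative. Note that the die-size multiple of about 56 corresponds to the 815-square-millimeter GPU die quoted earlier. McKinley’s own die area isn’t given in this article, so it isn’t used here.

```python
# Figures quoted in the article.
CEREBRAS_TRANSISTORS = 1.2e12
GPU_TRANSISTORS = 21.1e9
MCKINLEY_TRANSISTORS = 221e6
POWER9_TRANSISTORS = 8e9

CEREBRAS_AREA_MM2 = 46_225
GPU_AREA_MM2 = 815


def times_larger(big: float, small: float) -> float:
    """Return how many times larger `big` is than `small`."""
    return big / small


comparisons = {
    "transistors vs. McKinley": times_larger(CEREBRAS_TRANSISTORS, MCKINLEY_TRANSISTORS),
    "transistors vs. largest GPU": times_larger(CEREBRAS_TRANSISTORS, GPU_TRANSISTORS),
    "transistors vs. Power9": times_larger(CEREBRAS_TRANSISTORS, POWER9_TRANSISTORS),
    "die area vs. largest GPU": times_larger(CEREBRAS_AREA_MM2, GPU_AREA_MM2),
}

for label, multiple in comparisons.items():
    print(f"{label}: about {multiple:,.0f}x")
```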
Building such an enormous chip carries with it a number of significant challenges, starting with simply fabricating and cooling it. Fabricating a chip this large without any manufacturing defects is simply impossible, Cerebras admits. Every chipmaker suffers manufacturing defects, and a certain number of “bad” chips on every wafer are simply discarded. Cerebras, for its part, designs in redundant processing cores, anticipating that defects will render some of them unusable. (How many, though, the company hasn’t said. In total, there are over 400,000 cores.) An I/O fabric connecting one core to the next can route around any defective cores.
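To make the idea of routing around bad cores concrete, here is a toy Python sketch. It is not Cerebras’ design, whose details the company hasn’t described here. It assumes a simple 2D grid of cores in which each core talks only to its four neighbors, and it uses an ordinary breadth-first search to find the shortest path that avoids cores marked defective. The function name `route` and the grid layout are hypothetical.

```python
from collections import deque


def route(rows, cols, defective, src, dst):
    """Return the shortest list of (row, col) hops from src to dst that
    avoids defective cores, or None if no such path exists.

    Hypothetical model: a rows x cols mesh with 4-neighbor links.
    """
    if src in defective or dst in defective:
        return None

    previous = {src: None}  # maps each visited core to the core before it
    queue = deque([src])

    while queue:
        current = queue.popleft()
        if current == dst:
            # Walk the chain of predecessors back to the source.
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]

        r, c = current
        for neighbor in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = neighbor
            in_bounds = 0 <= nr < rows and 0 <= nc < cols
            if in_bounds and neighbor not in defective and neighbor not in previous:
                previous[neighbor] = current
                queue.append(neighbor)

    return None  # destination is cut off by defects


# A 3x3 grid where the core between source and destination is bad.
detour = route(3, 3, defective={(0, 1)}, src=(0, 0), dst=(0, 2))
print(detour)
```

In this toy example the direct path takes two hops, while the detour around the bad core takes four. Spare cores and alternate links trade a little latency for a usable chip.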
Counterintuitively, Cerebras argues that designing a single monolithic chip actually saves manufacturing costs. “Because it’s just one chip, you get all of it at vastly lower power and less space,” said Sean Lie, a Cerebras co-founder who was formerly chief hardware architect for SeaMicro, later acquired by AMD, and for the Data Center Server Solutions business at AMD.
Cooling such a massive chip requires more than just a heat sink and a fan. Cerebras says a “cold plate” is attached above the silicon, using multiple vertically mounted water pipes to cool the chip directly. Because the chip is too large to fit within any traditional package, Cerebras has designed its own, combining a PCB, the wafer, a custom connector linking the two, and the cold plate.
Numerous chips debut at the Hot Chips conference and are never seen again. Cerebras almost certainly will fall into the same category—you’ll never see anything like it in a PC, and it may not even survive long-term in the server space. But the appeal of Hot Chips is that you never know exactly what you’ll see—and the Cerebras processor is certainly amazing.
This story was updated with additional details.