The five technical challenges Cerebras overcame in building the first trillion transistor chip
Superlatives abound at Cerebras, the until-today stealthy next-generation silicon chip company looking to make training a deep learning model as quick as buying toothpaste from Amazon. Launching after almost three years of quiet development, Cerebras introduced its new chip today — and it is a doozy. The "Wafer Scale Engine" packs 1.2 trillion transistors (the most ever) onto 46,225 square millimeters of silicon (the largest ever), and includes 18 gigabytes of on-chip memory (the most of any chip on the market today) and 400,000 processing cores (guess the superlative).
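For a sense of scale, some quick arithmetic on the numbers quoted above (the even per-core split is my own assumption; the summary doesn't say how the memory is actually distributed across cores):

    # Back-of-the-envelope figures from the specs quoted above.
    transistors = 1.2e12    # 1.2 trillion transistors
    area_mm2 = 46_225       # die area in square millimeters
    memory_bytes = 18e9     # 18 GB of on-chip memory (decimal GB assumed)
    cores = 400_000         # processing cores

    print(f"{transistors / area_mm2 / 1e6:.0f}M transistors per mm^2")
    print(f"{transistors / cores / 1e6:.1f}M transistors per core")
    print(f"{memory_bytes / cores / 1e3:.0f} KB of memory per core, if split evenly")

That works out to roughly 26 million transistors per square millimeter, 3 million transistors per core, and about 45 KB of local memory per core.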
It made a big splash at the Hot Chips conference at Stanford University, one of the silicon industry's big confabs for product introductions and roadmaps, drawing various levels of oohs and aahs from attendees. You can read more about the chip from Tiernan Ray at Fortune and read the white paper from Cerebras itself.
Also at BBC, VentureBeat, and PCWorld.
(Score: 2) by Rupert Pupnick on Wednesday August 21 2019, @01:48AM (2 children)
Presumably the cores are arrayed in some kind of neural network topology with memory distributed throughout. Would love to know more if anyone has any other relevant links.
The thermal problem is huge, as other SNers have already pointed out. It's so bad that this special "Z direction" cooling is required. They can't use fluid flow parallel to the surface of the chip, as in a traditional cooling design, because the "downstream" edge of the chip would run too hot. With silicon technology, you can't go above 150°C anywhere on the chip.
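To see why parallel flow struggles, consider a steady-state energy balance on the coolant: the inlet-to-outlet temperature rise is dT = Q / (m_dot * c_p), so silicon near the outlet sits on progressively warmer water. A minimal sketch, with the chip power and flow rates as illustrative assumptions (the summary gives no power figure):

    # Why coolant flowing parallel to the wafer runs into trouble:
    # the coolant warms as it absorbs heat, so the downstream edge
    # sits on much hotter water than the upstream edge.
    # Steady-state energy balance: dT = Q / (m_dot * c_p).
    # The power and flow numbers are illustrative assumptions,
    # not Cerebras specs.

    q_watts = 15_000    # assumed total chip power (W)
    c_p = 4186          # specific heat of water, J/(kg*K)

    for m_dot in (0.25, 0.5, 1.0, 2.0):   # coolant mass flow, kg/s
        dt = q_watts / (m_dot * c_p)
        print(f"{m_dot:4.2f} kg/s -> coolant warms {dt:4.1f} K inlet to outlet")

Feeding coolant perpendicular to the wafer (the "Z direction") gives every region of the chip its own supply of cold water, which flattens that gradient instead of stacking it up along one axis.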
(Score: 1) by NickM on Wednesday August 21 2019, @02:19AM (1 child)
I am a master of typographic, grammatical, and miscellaneous errors!
(Score: 2) by Rupert Pupnick on Wednesday August 21 2019, @10:11AM
Yeah, read that. Was asking about the ideal failure-free topology, not how easy it is to reconfigure when a piece fails.
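For anyone curious what "reconfigure when a piece fails" can look like in practice, here is a simplified sketch of the usual wafer-scale trick: fabricate a uniform 2D mesh with spare rows of cores, then remap logical rows past any physical row containing a defect. This is my own illustration of the general idea, not Cerebras's actual scheme:

    # Illustrative sketch of defect tolerance in a 2D mesh of cores:
    # fabricate extra (spare) rows, then map each logical row to the
    # next good physical row, skipping rows that contain a defect.
    # A simplified stand-in, not Cerebras's actual method.

    def build_row_map(physical_rows, defective_rows, logical_rows):
        """Map logical row -> physical row, skipping defective rows."""
        good = [r for r in range(physical_rows) if r not in defective_rows]
        if len(good) < logical_rows:
            raise ValueError("not enough spare rows to hide the defects")
        return {logical: good[logical] for logical in range(logical_rows)}

    # 12 physical rows, 10 logical rows exposed to software, 2 spares.
    row_map = build_row_map(physical_rows=12,
                            defective_rows={3, 7},
                            logical_rows=10)
    print(row_map)   # logical rows 0..9 land on physical 0,1,2,4,5,6,8,9,10,11

Software still sees a uniform, apparently failure-free mesh; the skipped rows just cost an extra hop on the links that route around them, which is the reconfiguration being distinguished from the ideal topology above.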