The Trillion-Transistor Chip That Just Left a Supercomputer in the Dust:
In a recent trial, researchers pitted the chip, housed in an all-in-one system called the CS-1 that's about the size of a dorm-room mini-fridge, against a supercomputer in a fluid dynamics simulation. Simulating the movement of fluids is a common supercomputer application, useful for complex problems like weather forecasting and airplane wing design.
The trial was described in a preprint paper written by a team led by Cerebras's Michael James and NETL's Dirk Van Essendelft and presented at the supercomputing conference SC20 this week. The team said the CS-1 completed a simulation of combustion in a power plant roughly 200 times faster than the Joule 2.0 supercomputer completed a similar task.
The CS-1 was actually faster than real time. As Cerebras wrote in a blog post, "It can tell you what is going to happen in the future faster than the laws of physics produce the same result."
The researchers said the CS-1's performance couldn't be matched by any number of CPUs and GPUs. And CEO and cofounder Andrew Feldman told VentureBeat that would be true "no matter how large the supercomputer is." Past a certain point, scaling a supercomputer like Joule no longer speeds up this kind of problem. That's why Joule's simulation speed peaked at 16,384 cores, a fraction of its total 86,400 cores.
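The scaling ceiling Feldman describes is the classic strong-scaling limit. As a rough illustration (not a model of Joule's actual workload, where inter-node communication rather than a fixed serial fraction is the bottleneck), Amdahl's law shows how even a tiny non-parallelizable fraction makes adding cores yield diminishing returns:

```python
def amdahl_speedup(cores: int, serial_fraction: float) -> float:
    """Ideal speedup over one core when serial_fraction of the work
    cannot be parallelized (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# Hypothetical 0.01% serial fraction: going from 16,384 to 86,400 cores
# (5.3x the hardware) buys well under 1.5x more speedup.
for cores in (1024, 16384, 86400):
    print(cores, round(amdahl_speedup(cores, 0.0001), 1))
```

Under these assumed numbers the speedup can never exceed 1/0.0001 = 10,000x no matter how many cores are added, which is why a machine like Joule saturates long before all 86,400 of its cores are in use.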
Previously:
Cerebras More than Doubles Core and Transistor Count with 2nd-Generation Wafer Scale Engine
Cerebras Systems' Wafer Scale Engine Deployed at Argonne National Labs
Cerebras "Wafer Scale Engine" Has 1.2 Trillion Transistors, 400,000 Cores
(Score: 3, Interesting) by BsAtHome on Monday November 23 2020, @06:35PM (1 child)
Only partially purpose-built. It's better described as a generic system whose mode of operation fits a subset of problems. It will do very well with many simulations, perhaps primarily fluid-dynamics problems, but that covers a lot of science. You can also build a nice ray-tracer ;-)
(Score: 2) by Hartree on Tuesday November 24 2020, @05:15AM
Yes, I should have read it more closely before answering. The big plus is the higher rate of communication between cores. That should help with more tightly coupled physical systems that don't break up as well into largely independent elements. I'd be interested to see how well it works on something viciously coupled and nonlinear like general relativity simulations.
On the other hand, the proof is in the profits. Gene Amdahl and Trilogy Systems crashed very hard in the mid-1980s when they tried wafer-scale integration.