posted by martyb on Wednesday August 19 2020, @01:36AM   Printer-friendly
from the Amdahl's-law? dept.

342 Transistors for Every Person In the World: Cerebras 2nd Gen Wafer Scale Engine Teased

One of the highlights of Hot Chips from 2019 was the startup Cerebras showcasing its product – a large 'wafer-scale' AI chip that was literally the size of a wafer. The chip itself was rectangular, but it was cut from a single wafer, and contained 400,000 cores, 1.2 trillion transistors, and 46,225 mm² of silicon, built on TSMC's 16 nm process.

[...] Obviously at wafer scale you can't simply add more die area, so the only ways forward are to optimize die area per core and to move to smaller process nodes. On TSMC's 7nm, that means there are now 850,000 cores and 2.6 trillion transistors. Cerebras had to develop new technologies to deal with multi-reticle designs, but it succeeded with the first generation and carried those lessons over to the new chip. We're expecting more details about this new product later this year.
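The headline figure and the generation-over-generation scaling can be sanity-checked with some quick arithmetic. The world-population estimate here is an assumption (roughly 7.6 billion in mid-2020); the transistor and core counts are from the article:

```python
# Back-of-the-envelope check of the headline figure and the gen-over-gen scaling.
# World population is an assumption; transistor/core counts are from the article.

wse1_transistors = 1.2e12   # WSE-1, TSMC 16 nm
wse1_cores = 400_000
wse2_transistors = 2.6e12   # WSE-2, TSMC 7 nm
wse2_cores = 850_000
world_population = 7.6e9    # assumed mid-2020 estimate

per_person = wse2_transistors / world_population
print(f"Transistors per person: {per_person:.0f}")                        # 342

print(f"Transistor scaling: {wse2_transistors / wse1_transistors:.2f}x")  # 2.17x
print(f"Core scaling:       {wse2_cores / wse1_cores:.2f}x")              # 2.12x
```

The cores scale almost exactly in step with the transistor count, which is consistent with the "optimize die area per core" point above.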

Previously: Cerebras "Wafer Scale Engine" Has 1.2 Trillion Transistors, 400,000 Cores
Cerebras Systems' Wafer Scale Engine Deployed at Argonne National Labs


Original Submission

 
  • (Score: 2) by takyon on Wednesday August 19 2020, @01:43AM (1 child)


    https://www.cerebras.net/wp-content/uploads/2019/08/Cerebras-Wafer-Scale-Engine-An-Introduction.pdf [cerebras.net]

    The 400,000 cores on the Cerebras WSE are connected via the Swarm communication fabric in a 2D mesh with 100 Petabits per second of bandwidth. Swarm provides a hardware routing engine to each of the compute cores and connects them with short wires optimized for latency and bandwidth. The resulting fabric supports single-word active messages that can be handled by the receiving cores without any software overhead. The fabric provides flexible, all-hardware communication.
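    A 2D mesh with per-core hardware routers typically forwards messages hop by hop between neighbors. As a toy illustration, here is dimension-ordered (XY) routing, a common scheme for such meshes — the actual Swarm routing algorithm is not disclosed in the PDF, so this is an assumption for illustration only:

    ```python
    # Sketch of dimension-ordered (XY) routing on a 2D mesh: route along X
    # first, then along Y. The real Swarm routing scheme is not public; this
    # is just a generic mesh-routing illustration.

    def xy_route(src, dst):
        """Return the list of (x, y) hops from src to dst: X first, then Y."""
        x, y = src
        dx, dy = dst
        path = [(x, y)]
        while x != dx:
            x += 1 if dx > x else -1
            path.append((x, y))
        while y != dy:
            y += 1 if dy > y else -1
            path.append((x, y))
        return path

    hops = xy_route((0, 0), (3, 2))
    print(hops)            # [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2)]
    print(len(hops) - 1)   # 5 hops = Manhattan distance
    ```

    With short neighbor-to-neighbor wires, each hop costs roughly one link traversal, which is why "typical messages traverse one hardware link" when communicating cores are placed adjacently.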

    Swarm is fully configurable. The Cerebras software configures all the cores on the WSE to support the precise communication required for training the user-specified model. For each neural network, Swarm provides a unique and optimized communication path. This is different than the approach taken by central processing units and graphics processing units that have one hard-coded on-chip communication path into which all neural networks are shoehorned.

    Swarm’s results are impressive. Typical messages traverse one hardware link with nanosecond latency. The aggregate bandwidth of the system is measured in tens of petabytes per second. Communication software such as TCP/IP and MPI is not needed, avoiding associated performance penalties. The energy cost of communication in this architecture is well below one picojoule per bit, which is nearly two orders of magnitude lower than central processing units or graphics processing units. As a result of the Swarm communication fabric, the WSE trains models faster and uses less power.
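    To put the energy claim in perspective: moving a fixed amount of data at 1 pJ/bit (an upper bound, since the PDF says "well below one picojoule") versus a CPU/GPU figure inferred from "nearly two orders of magnitude lower" — the ~100 pJ/bit number is an assumption, not a measured value:

    ```python
    # Illustrative per-bit communication energy comparison. The 100 pJ/bit
    # CPU/GPU figure is inferred from "nearly two orders of magnitude lower"
    # and is an assumption, not a measurement from the PDF.

    wse_pj_per_bit = 1.0      # upper bound ("well below one picojoule per bit")
    gpu_pj_per_bit = 100.0    # assumed: two orders of magnitude higher

    petabyte_bits = 8e15      # 1 PB = 8e15 bits
    wse_energy_kj = petabyte_bits * wse_pj_per_bit * 1e-12 / 1e3
    gpu_energy_kj = petabyte_bits * gpu_pj_per_bit * 1e-12 / 1e3
    print(f"Moving 1 PB on-fabric: {wse_energy_kj:.0f} kJ vs ~{gpu_energy_kj:.0f} kJ")
    ```

    Under those assumptions, shuffling a petabyte of activations costs on the order of 8 kJ on-wafer versus hundreds of kJ across conventional chip boundaries.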

  • (Score: 0) by Anonymous Coward on Wednesday August 19 2020, @11:56AM


    I wonder if they thought about the hypercube topology developed for The Connection Machine?
        https://en.wikipedia.org/wiki/Connection_Machine [wikipedia.org]
    Maybe Cerebras could configure the programmable hardware to emulate a really big Connection Machine...and run some of the software that was prototyped back then?
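    For reference, the Connection Machine's router network was a binary hypercube: each of the 2^n router nodes connects to the n nodes whose addresses differ in exactly one bit (the CM-1 used a 12-dimensional cube of router chips). The neighbor computation is simple; how those links would map onto Swarm's 2D mesh is left entirely hypothetical here:

    ```python
    # Hypercube topology sketch: in an n-dimensional hypercube, a node's
    # neighbors are the addresses differing in exactly one bit, as in the
    # Connection Machine's router network. Mapping this onto a 2D mesh is
    # a hypothetical exercise, not anything Cerebras has described.

    def hypercube_neighbors(node, dims):
        """Neighbors of `node` in a `dims`-dimensional hypercube (2**dims nodes)."""
        return [node ^ (1 << d) for d in range(dims)]

    print(hypercube_neighbors(0b0000, 4))   # [1, 2, 4, 8]
    print(hypercube_neighbors(0b1010, 4))   # [11, 8, 14, 2]
    print(len(hypercube_neighbors(0, 12)))  # 12 links per node, CM-1 style
    ```

    The catch with emulation is distance: hypercube links connect nodes that can be far apart on a 2D mesh, so a single "hypercube hop" becomes many mesh hops.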