SoylentNews is people

posted by martyb on Tuesday August 20 2019, @09:11PM
from the not-going-to-fit-in-a-cell-phone dept.

The five technical challenges Cerebras overcame in building the first trillion transistor chip

Superlatives abound at Cerebras, the until-today stealthy next-generation silicon chip company looking to make training a deep learning model as quick as buying toothpaste from Amazon. Launching after almost three years of quiet development, Cerebras introduced its new chip today — and it is a doozy. The "Wafer Scale Engine" is 1.2 trillion transistors (the most ever), 46,225 square millimeters (the largest ever), and includes 18 gigabytes of on-chip memory (the most of any chip on the market today) and 400,000 processing cores (guess the superlative).

It's made a big splash here at Stanford University at the Hot Chips conference, one of the silicon industry's big confabs for product introductions and roadmaps, with various levels of oohs and aahs among attendees. You can read more about the chip from Tiernan Ray at Fortune and read the white paper from Cerebras itself.

Also at BBC, VentureBeat, and PCWorld.



 
This discussion has been archived. No new comments can be posted.
  • (Score: 2) by Rupert Pupnick on Wednesday August 21 2019, @01:48AM (2 children)

    by Rupert Pupnick (7277) on Wednesday August 21 2019, @01:48AM (#882901) Journal

    Presumably the cores are arrayed in some kind of neural network topology with memory distributed throughout. Would love to know more if anyone has any other relevant links.

    The thermal problem is huge, as already pointed out by SNers. So bad that this special “Z direction” cooling is required. They can’t use fluid flow parallel to the surface of the chip as in a traditional cooling design because the “downstream” edge of the chip would run too hot. If it’s silicon technology you can’t go above 150C anywhere on the chip.

  • (Score: 1) by NickM on Wednesday August 21 2019, @02:19AM (1 child)

    by NickM (2867) on Wednesday August 21 2019, @02:19AM (#882913) Journal
    According to the Fortune article in the summary:

    Wafers incur defects when circuits are burned into them, and those areas become unusable. Nvidia, Intel, and other makers of “normal” smaller chips can get around that by cutting out the good chips in a wafer and scrapping the rest. You can’t do that if the entire wafer is the chip. So Cerebras had to build in redundant circuits, to route around defects in order to still deliver 400,000 working cores, like a miniature internet that keeps going when individual server computers go down. The wafers were produced in partnership with Taiwan Semiconductor Manufacturing, the world’s largest chip manufacturer, but Cerebras has exclusive rights to the intellectual property that makes the process possible.
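    The "route around defects" idea in the quote can be illustrated with a toy model. This is only a sketch under assumptions, not a description of how Cerebras actually does it: the cores are assumed to sit in a simple 2D mesh with links to their four neighbors, and the `route` function and `mesh` layout are hypothetical names made up for the illustration. A breadth-first search finds the shortest path between two cores that touches only working ones.

```python
from collections import deque

def route(grid, start, goal):
    """Shortest path between two cores in a 2D mesh, avoiding defects.

    grid[r][c] is True for a working core, False for a defective one.
    Returns a list of (row, col) hops from start to goal, or None if
    the defects cut the two cores off from each other.
    (Hypothetical toy model; the real interconnect is not public here.)
    """
    rows, cols = len(grid), len(grid[0])
    if not (grid[start[0]][start[1]] and grid[goal[0]][goal[1]]):
        return None
    parent = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None

# 3x3 mesh with a defective core in the middle.
mesh = [
    [True, True,  True],
    [True, False, True],
    [True, True,  True],
]
path = route(mesh, (1, 0), (1, 2))
print(path)
```

    With a healthy middle core the trip from (1, 0) to (1, 2) would take two hops; with the defect it takes four, which is the cost of the detour in this toy model.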

    --
    I a master of typographic, grammatical and miscellaneous errors !
    • (Score: 2) by Rupert Pupnick on Wednesday August 21 2019, @10:11AM

      by Rupert Pupnick (7277) on Wednesday August 21 2019, @10:11AM (#883042) Journal

      Yeah, read that. Was asking about the ideal failure-free topology, not how easy it is to reconfigure when a piece fails.