
SoylentNews is people

posted by LaminatorX on Tuesday April 14 2015, @01:54PM   Printer-friendly
from the sound-of-one-hand-clapping dept.

Fudzilla have 'obtained' a slide showing details of a forthcoming APU from AMD based on their new "Zen" architecture.

The highest-end compute HSA part has up to 16 Zen x86 cores and supports 32 threads, or two threads per core. This is something Intel architectures have offered for a while, and it seems to be working just fine. This will be the first exciting processor from the house of AMD in the server / HSA market in years, and if AMD delivers it on time it might be a big break for the company.

Each Zen core gets 512KB of L2 cache, and each cluster of four Zen cores shares 8MB of L3 cache. If we are talking about a 16-core, 32-thread next-generation Zen-based x86 processor, the total amount of L2 cache comes to a whopping 8MB, backed by 32MB of L3 cache.
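The cache totals quoted above follow directly from the per-core and per-cluster figures. A minimal sketch of the arithmetic, where the function and parameter names are illustrative and not taken from the slide:

```python
KB = 1024
MB = 1024 * KB

def cache_totals(cores, l2_per_core_kb=512, cores_per_cluster=4, l3_per_cluster_mb=8):
    """Return (total L2 in MB, total L3 in MB) for a Zen-style layout.

    Defaults are the figures quoted in the summary: 512KB of L2 per core
    and 8MB of L3 shared by each cluster of four cores.
    """
    clusters = cores // cores_per_cluster
    l2_total_mb = cores * l2_per_core_kb * KB / MB
    l3_total_mb = clusters * l3_per_cluster_mb
    return l2_total_mb, l3_total_mb

l2_mb, l3_mb = cache_totals(16)
print(f"16 cores: {l2_mb:g}MB L2, {l3_mb}MB L3")
```

For the 16-core part this gives 8MB of L2 and 32MB of L3, matching the numbers in the summary.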

This new APU also includes the Greenland Graphics and Multimedia Engine, with HBM memory on the side. The specs we saw indicate that there can be up to 16GB of HBM memory with 512GB/s of bandwidth packed on the interposer. This is definitely a lot of memory for an APU GPU, and it also comes with 1/2 rate double precision compute, enhanced ECC and RAS, and HSA support.

The new APU sports quad-channel DDR4 support, with up to 256GB per channel at speeds of up to 3.2GHz. No information yet on which processor socket this APU will use, but it's safe to assume the DDR4 support alone will render it incompatible with all of AMD's current motherboards. Support is also included for secure boot and AMD's encryption co-processor.
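For a rough sense of scale, the DDR4 figures can be turned into a peak capacity and bandwidth estimate. This is a back-of-the-envelope sketch under two assumptions that the slide does not state: that "3.2GHz" means DDR4-3200 (3200 mega-transfers per second), and that each channel is the usual 64 bits (8 bytes) wide. The names below are illustrative.

```python
CHANNELS = 4
GB_PER_CHANNEL = 256          # from the summary
MEGA_TRANSFERS_PER_S = 3200   # assumption: "3.2GHz" means DDR4-3200
BYTES_PER_TRANSFER = 8        # assumption: standard 64-bit DDR4 channel
HBM_GB_PER_S = 512            # from the summary

def ddr4_capacity_gb(channels=CHANNELS, gb_per_channel=GB_PER_CHANNEL):
    """Maximum addressable DDR4 capacity in GB."""
    return channels * gb_per_channel

def ddr4_peak_gb_per_s(channels=CHANNELS,
                       mega_transfers=MEGA_TRANSFERS_PER_S,
                       bytes_per_transfer=BYTES_PER_TRANSFER):
    """Theoretical peak DDR4 bandwidth in GB/s (decimal gigabytes)."""
    return channels * mega_transfers * bytes_per_transfer / 1000

print(f"DDR4 capacity: {ddr4_capacity_gb()}GB")
print(f"DDR4 peak bandwidth: {ddr4_peak_gb_per_s():.1f}GB/s")
print(f"HBM / DDR4 bandwidth ratio: {HBM_GB_PER_S / ddr4_peak_gb_per_s():.1f}x")
```

Under those assumptions the DDR4 side tops out at 1TB and roughly 102GB/s, about a fifth of the quoted HBM bandwidth.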

 
  • (Score: 2) by tibman (134) on Tuesday April 14 2015, @08:23PM (#170530)

    AMD already ships 16-core server processors. This looks like an existing Piledriver with more SMT. I read that Bulldozer already had SMT for the FPU and L2 cache, so this doesn't seem like a massive anything to me. To make room, it looks like they cut the L2 cache down from 1MB to 512KB per core. It's an incremental upgrade, but to a new milestone, which means it may also fit the G34 socket, or at a minimum a socket with a very similar footprint.

    http://www.amd.com/en-us/products/server/opteron/6000/6300 [amd.com]

    --
    SN won't survive on lurkers alone. Write comments.
  • (Score: 3, Informative) by gman003 (4155) on Tuesday April 14 2015, @09:41PM (#170564)

    That could be the case, but if so, their terminology seems weird.

    Piledriver (and Steamroller, etc.) were based on a "module" as the fundamental unit, and each module contained two "cores" which shared a few elements and cache. The effect was that each "module" behaved roughly like a single big core with very rough SMT. Adding SMT to what, in Piledriver parlance, is termed a "core" would really not work - there's too little superscalar performance for any real gains from SMT.

    Here, a "core" has SMT, so it quite clearly is not the same thing as a Piledriver "core". Maybe that's just a change in terminology, but if Zen is based on Piledriver at all, I expect its "cores" are derived from Piledriver "modules", perhaps trimmed down and with more resources shared. But I find that mildly implausible.

    There is one other possibility - that this SMT is not truly SMT, but instead a primitive form of barrel processing, as employed on late SPARC chips. In that case, the cores could very well be Piledriver cores, or even Jaguar cores. But I also find that implausible, because barrel processing is what you turn to when memory latency is high and cache misses frequent - quite unlikely given their memory system.
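To illustrate the point about barrel processing and memory latency, here is a toy model, not a description of any real AMD or SPARC pipeline. It assumes a single-issue core that rotates round-robin over its hardware threads and skips any thread waiting on a cache miss; the function name, the miss interval and the latency are all made-up parameters.

```python
def utilization(num_threads, miss_every, miss_latency, cycles=10_000):
    """Fraction of cycles in which the toy core issues an instruction.

    Each thread misses the cache on every `miss_every`-th instruction and
    is then unavailable for `miss_latency` cycles. Every cycle the core
    issues from the next ready thread in round-robin order, if any.
    """
    ready_at = [0] * num_threads   # cycle at which each thread can issue again
    issued = [0] * num_threads     # instructions issued per thread
    busy = 0
    next_thread = 0
    for cycle in range(cycles):
        for offset in range(num_threads):
            t = (next_thread + offset) % num_threads
            if ready_at[t] <= cycle:
                issued[t] += 1
                busy += 1
                if issued[t] % miss_every == 0:
                    # Simulated cache miss: the thread stalls for miss_latency cycles.
                    ready_at[t] = cycle + 1 + miss_latency
                next_thread = (t + 1) % num_threads
                break
    return busy / cycles

for threads in (1, 2, 4, 8):
    frequent = utilization(threads, miss_every=5, miss_latency=20)
    rare = utilization(threads, miss_every=500, miss_latency=20)
    print(f"{threads} thread(s): frequent misses {frequent:.2f}, rare misses {rare:.2f}")
```

In this model extra threads raise utilization a great deal when misses are frequent, and add little when misses are rare, which is the trade-off the comment describes.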