
SoylentNews is people

posted by LaminatorX on Tuesday April 14 2015, @01:54PM
from the sound-of-one-hand-clapping dept.

Fudzilla have 'obtained' a slide showing details of a forthcoming APU from AMD based on their new "Zen" architecture.

The highest-end compute HSA part has up to 16 Zen x86 cores and supports 32 threads, or two threads per core. This is something we have seen on Intel architectures for a while, and it seems to be working just fine. This will be the first exciting processor from the house of AMD in the server / HSA market in years, and if AMD delivers it on time it might be a big break for the company.

Each Zen core gets 512 KB of L2 cache, and each cluster of four Zen cores shares 8MB of L3 cache. For a 16-core, 32-thread next-generation Zen-based x86 processor, the total amount of L2 cache comes to a whopping 8MB, backed by 32MB of L3 cache.
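
The cache totals quoted above follow directly from the per-core and per-cluster figures. A minimal sketch of the arithmetic, using only the numbers from the slide as reported; the constant and function names are illustrative, not anything from AMD:

```python
# Figures as quoted in the article; names are hypothetical, for illustration only.
CORES = 16
THREADS_PER_CORE = 2
L2_PER_CORE_KB = 512
CORES_PER_CLUSTER = 4
L3_PER_CLUSTER_MB = 8


def cache_totals(cores=CORES):
    """Return thread count, cluster count and total L2/L3 for a given core count."""
    clusters = cores // CORES_PER_CLUSTER
    return {
        "threads": cores * THREADS_PER_CORE,
        "clusters": clusters,
        "l2_total_mb": cores * L2_PER_CORE_KB / 1024,
        "l3_total_mb": clusters * L3_PER_CLUSTER_MB,
    }


if __name__ == "__main__":
    print(cache_totals())
```

Passing a smaller core count (for example `cache_totals(8)`) shows how the totals would scale if the four-core clusters turn out to be the building block for smaller parts, which the slide does not confirm.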

This new APU also includes the Greenland Graphics and Multimedia Engine, with HBM memory on the side. The specs we saw indicate that there can be up to 16GB of HBM memory with 512GB/s of bandwidth packed on the interposer. This is definitely a lot of memory for an APU GPU, and it also comes with 1/2 rate double precision compute, enhanced ECC and RAS and HSA support.

The new APU sports quad-channel DDR4 support, with up to 256GB per channel at speeds of up to 3.2GHz. No information yet on which processor socket this APU will use, but it's safe to assume the DDR4 support alone will render it incompatible with all of AMD's current motherboards. Support is also included for secure boot and AMD's encryption co-processor.
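
For a rough sense of scale on the memory figures above, here is a back-of-the-envelope sketch. It assumes the quoted "3.2GHz" means DDR4-3200 (3200 MT/s) and that each channel is the standard 64 bits (8 bytes) wide; neither assumption is stated on the slide, and the names are illustrative:

```python
# Assumptions (not from the slide): "3.2GHz" is read as 3200 MT/s,
# and each DDR4 channel is the standard 64 bits (8 bytes) wide.
CHANNELS = 4
GB_PER_CHANNEL = 256
TRANSFERS_PER_SEC_M = 3200      # mega-transfers per second, assumed
BYTES_PER_TRANSFER = 8          # 64-bit channel, assumed
HBM_BANDWIDTH_GBS = 512         # as quoted in the article


def ddr4_peak_bandwidth_gbs():
    """Theoretical peak DDR4 bandwidth across all channels, in GB/s."""
    return CHANNELS * BYTES_PER_TRANSFER * TRANSFERS_PER_SEC_M / 1000


def ddr4_max_capacity_gb():
    """Maximum DDR4 capacity across all channels, in GB."""
    return CHANNELS * GB_PER_CHANNEL


if __name__ == "__main__":
    ddr4 = ddr4_peak_bandwidth_gbs()
    print(f"DDR4 peak: {ddr4} GB/s, capacity: {ddr4_max_capacity_gb()} GB")
    print(f"HBM is roughly {HBM_BANDWIDTH_GBS / ddr4:.1f}x the DDR4 peak")
```

These are theoretical peaks only; sustained bandwidth on real hardware would be lower.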

 
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 5, Interesting) by gman003 on Tuesday April 14 2015, @05:33PM

    by gman003 (4155) on Tuesday April 14 2015, @05:33PM (#170465)

    The only solid reasonable conclusion I can make from this information is that it's going to be a massive die size, unless they somehow leapfrog to 10nm or something.

    Two threads per core implies fairly wide, beefy cores. You don't gain anything from SMT unless you're already superscalar enough that you have idling execution units during single-threaded operation. So this Zen core is almost certainly bigger than Jaguar, possibly bigger than Steamroller, and I can't rule out it being bigger than Haswell, at least not just based on the SMT.

    The high memory bandwidth (both internal and external), large caches and generous PCIe lane allocation also seem to indicate this is a high-end chip being discussed, not a desktop-grade one.

    Four-core modules generally imply that each core is very weak, though, if that module cannot be subdivided or reduced. But they didn't say that it cannot, so my guess is that the module organization is just for cache organization and maybe internal routing, and that dual-core or single-core chips could still be made.

    Sixteen cores on one chip isn't impossible for big server-grade chips, and I'm sure AMD would love HSA-in-the-datacenter to become the next big thing. But such server chips do tend to be rather big in the die-size department, and that's without having the GPU there as well.

    All in all, there's not enough data to say whether it's good or not, but what we do know is not incompatible with AMD becoming good again.

  • (Score: 2) by wantkitteh on Tuesday April 14 2015, @08:22PM

    by wantkitteh (3362) on Tuesday April 14 2015, @08:22PM (#170529) Homepage Journal

    I'm hoping the CPU portion of this APU is aiming to kick the top-end Xeons off the top spot, while the GPU portion sounds like the kind of powerhouse you'd pair with a Xeon in a workstation like the Mac Pro - maybe AMD are pushing the APU up a few rungs on the ladder? *shrugs* Going to be interesting to find out.

  • (Score: 2) by tibman on Tuesday April 14 2015, @08:23PM

    by tibman (134) Subscriber Badge on Tuesday April 14 2015, @08:23PM (#170530)

    AMD already ships 16-core server processors. This looks like an existing Piledriver with more SMT. I read that Bulldozer already had SMT for the FPU and L2 cache. So this doesn't seem like a massive anything to me. To make room, it looks like they cut the L2 cache down from 1MB to 512KB per core. It's an incremental upgrade but to a new milestone, which means it may also fit the G34 socket, or at a minimum a socket with a very similar footprint.

    http://www.amd.com/en-us/products/server/opteron/6000/6300 [amd.com]

    --
    SN won't survive on lurkers alone. Write comments.
    • (Score: 3, Informative) by gman003 on Tuesday April 14 2015, @09:41PM

      by gman003 (4155) on Tuesday April 14 2015, @09:41PM (#170564)

      That could be the case, but if so, their terminology seems weird.

      Piledriver (and Steamroller, etc.) were based on a "module" as the fundamental unit, and each module contained two "cores" that shared a few elements and cache. The effect was that each "module" behaved roughly like a single big core with very rough SMT. Adding SMT to what, in Piledriver parlance, is termed a "core" would really not work - there's too little superscalar performance for any real gains from SMT.

      Here, a "core" has SMT, so it quite clearly is not the same thing as a Piledriver "core". Maybe that's just a change in terminology, but if Zen is based on Piledriver at all, I expect its "cores" are derived from Piledriver "modules", perhaps trimmed down and with more resources shared. But I find that mildly implausible.

      There is one other possibility - that this SMT is not truly SMT, but instead a primitive form of barrel processing, as employed on late SPARC chips. In that case, the cores could very well be Piledriver cores, or even Jaguar cores. But I also find that implausible, because barrel processing is what you turn to when memory latency is high and cache misses frequent - quite unlikely given their memory system.

  • (Score: 2) by TheRaven on Wednesday April 15 2015, @11:12AM

    by TheRaven (270) on Wednesday April 15 2015, @11:12AM (#170892) Journal

    Two threads per core implies fairly wide, beefy cores. You don't gain anything from SMT unless you're already superscalar enough that you have idling execution units during single-threaded operation

    Or if you're not out-of-order / speculative enough that you can saturate your execution units. SMT is a big win for simple in-order processors, just switching when one thread issues a load instruction. If power and overall throughput are more important to you than single-threaded performance, then stalling a thread when the branch predictor determines that a branch is unpredictable (data dependent or just not enough profiling built up yet) can give you better ALU utilisation.
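
The switch-on-load case can be illustrated with a toy model. This is only a sketch under loud assumptions: a single-issue in-order core, a fixed load latency, no caches, and a made-up instruction mix of ALU ops ("A") and loads ("L"). It is not a model of Zen or of any real chip:

```python
def run(threads, load_latency=4):
    """Simulate a single-issue in-order core; return (cycles, utilisation).

    Each thread is a list of ops: "A" (ALU, 1 cycle) or "L" (load, which
    stalls the issuing thread for load_latency cycles). Each cycle the core
    issues from the lowest-numbered thread that is not stalled.
    """
    pcs = [0] * len(threads)         # next instruction index per thread
    ready_at = [0] * len(threads)    # first cycle each thread may issue again
    total_ops = sum(len(t) for t in threads)
    issued = 0
    cycle = 0
    while issued < total_ops:
        for i, prog in enumerate(threads):
            if pcs[i] < len(prog) and ready_at[i] <= cycle:
                op = prog[pcs[i]]
                pcs[i] += 1
                issued += 1
                if op == "L":
                    # Thread i waits for its load; other threads may issue.
                    ready_at[i] = cycle + 1 + load_latency
                break
        cycle += 1
    return cycle, issued / cycle


if __name__ == "__main__":
    prog = ["A", "L"] * 4            # hypothetical load-heavy instruction mix
    for n in (1, 2):
        cycles, util = run([prog] * n)
        print(f"{n} thread(s): {cycles} cycles, utilisation {util:.2f}")
```

In this model the second thread fills part of the issue slots that would otherwise sit idle behind each load, so utilisation rises without the core getting any wider. How much of that carries over to real hardware depends on cache behaviour and latencies that the toy ignores.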

    --
    sudo mod me up