SoylentNews is people

posted by takyon on Thursday May 07 2015, @02:15AM
from the making-a-comeback dept.

Today was Advanced Micro Devices' (AMD) 2015 Financial Analyst Day. The last one was held in 2012. Since then, the company has changed leadership, put its APUs in the major consoles, and largely abandoned the high-end chip market to Intel. Now AMD says it is focusing on gaming, virtual reality, and datacenters. AMD has revealed details of upcoming CPUs and GPUs at the event:

Perhaps the biggest announcement relates to AMD's x86 Zen CPUs, coming in 2016. AMD is targeting a 40% increase in instructions-per-clock (IPC) with Zen cores. By contrast, Intel's Haswell (a "Tock") increased IPC by about 10-11%, and Broadwell (a "Tick") increased IPC by about 5-6%. AMD is also abandoning the maligned Bulldozer modules with Clustered Multithreading in favor of a Simultaneous Multithreading design, similar to Intel's Hyperthreading. Zen is a high priority for AMD to the extent that it is pushing back its ARM K12 chips to 2017. AMD is also shifting focus away from Project Skybridge, an "ambidextrous framework" that combined x86 and ARM cores in SoCs. Zen cores will target a wide range of designs from "top-to-bottom", including both sub-10W TDPs and up to 100W. The Zen architecture will be followed by Zen+ at some point.
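To put the IPC percentages above side by side, here is a rough back-of-the-envelope sketch. It assumes successive generational gains compound multiplicatively and uses the midpoints of the quoted ranges; the function name and the constants are illustrative assumptions, not vendor data. Each vendor's figure is measured against its own previous core, so this compares relative gains only and says nothing about absolute IPC.

```python
def compound_gain(*gains_percent):
    """Combine successive percentage gains into one overall percentage gain."""
    factor = 1.0
    for gain in gains_percent:
        factor *= 1.0 + gain / 100.0
    return (factor - 1.0) * 100.0


# Assumed midpoints of the ranges quoted in the summary (illustrative only).
HASWELL_GAIN = 10.5     # "about 10-11%"
BROADWELL_GAIN = 5.5    # "about 5-6%"
ZEN_TARGET_GAIN = 40.0  # AMD's stated target for Zen

intel_two_generations = compound_gain(HASWELL_GAIN, BROADWELL_GAIN)
print(f"Haswell and Broadwell combined: {intel_two_generations:.1f}%")
print(f"Zen target: {ZEN_TARGET_GAIN:.1f}%")
```

Under these assumptions the two Intel generations together come to roughly 16.6%, which is why a single 40% step sounds large even before asking where each vendor started.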

On the GPU front, AMD's 2016 GPUs will use FinFETs. AMD plans to be the first vendor to use High Bandwidth Memory (HBM), a 3D/stacked memory standard that enables much higher bandwidth (hence the name) and saves power. NVIDIA also plans to use HBM in its Pascal GPUs slated for 2016. The HBM will be positioned around the processor, as the GPU's thermal output would make cooling the RAM difficult if it were on top. HBM is competing against the similar Hybrid Memory Cube (HMC) standard.

Although High Bandwidth Memory is on track for 2016, it will actually be featured in an AMD desktop GPU to be released this quarter. AnandTech expects HBM to become a standard feature in AMD APUs, which benefit from higher memory bandwidth:

Coupled with the fact that any new GPU from AMD should also include AMD's latest color compression technology, and the implication is that the effective increase in memory bandwidth should be quite large. For AMD, they see this as being one of the keys of delivering better 4K performance along with better VR performance.

Finally, while talking about HBM on GPUs, AMD is also strongly hinting that they intend to bring HBM to other products as well. Given their product portfolio, we consider this to be a pretty transparent hint that the company wants to build HBM-equipped APUs. AMD's APUs have traditionally struggled to reach peak performance due to their lack of memory bandwidth – 128-bit DDR3 only goes so far – so HBM would be a natural extension to APUs.

AMD's Carrizo APUs will be released beginning this quarter, but it may be worth it to wait:

Badging aside, AMD still will have to face the fact that they're launching a 28nm notebook APU versus Intel's 14nm notebook CPUs, the company is once again banking on their strong GPU performance to help drive sales. Coupled with the combination of low power optimizations in Carrizo and full fixed-function hardware decoding of HEVC, and AMD will be relying on Carrizo to carry them through to 2016 and Zen.

AMD also announced Radeon M300 discrete GPUs for notebooks, promising "refined efficiency and power management" as well as DirectX 12 support.

One of the more interesting chips on AMD's roadmap may be a "high-performance server APU" intended for both high-performance computing and workstations.

Alternate coverage at Tom's Hardware and The Register.

Related Stories

AMD Shares More Details on High Bandwidth Memory

Advanced Micro Devices (AMD) has shared more details about the High Bandwidth Memory (HBM) in its upcoming GPUs.

HBM in a nutshell takes the wide & slow paradigm to its fullest. Rather than building an array of high speed chips around an ASIC to deliver 7Gbps+ per pin over a 256/384/512-bit memory bus, HBM at its most basic level involves turning memory clockspeeds way down – to just 1Gbps per pin – but in exchange making the memory bus much wider. How wide? That depends on the implementation and generation of the specification, but the examples AMD has been showcasing so far have involved 4 HBM devices (stacks), each featuring a 1024-bit wide memory bus, combining for a massive 4096-bit memory bus. It may not be clocked high, but when it's that wide, it doesn't need to be.
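The "wide and slow" trade-off described above is simple arithmetic: peak bandwidth is the bus width multiplied by the per-pin data rate. The sketch below works through it. The function name is made up for illustration, and the GDDR5 comparison point (a 512-bit bus at 5 Gbps per pin, as on an R9 290X-class card) is an assumption rather than a figure from the quoted text.

```python
def peak_bandwidth_gb_per_s(bus_width_bits, gbps_per_pin):
    """Peak bandwidth in GB/s: (pins x Gbit/s per pin) / 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8


# HBM example from the text: 4 stacks, 1024 bits each, 1 Gbps per pin.
hbm_four_stacks = peak_bandwidth_gb_per_s(4 * 1024, 1.0)

# Assumed GDDR5 comparison: 512-bit bus at 5 Gbps per pin.
gddr5_512bit = peak_bandwidth_gb_per_s(512, 5.0)

print(f"HBM, 4096-bit at 1 Gbps/pin:  {hbm_four_stacks:.0f} GB/s")  # 512 GB/s
print(f"GDDR5, 512-bit at 5 Gbps/pin: {gddr5_512bit:.0f} GB/s")     # 320 GB/s
```

An eight-times-wider bus more than makes up for a five-times-slower pin rate, which is the whole point of the design.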

AMD will be the only manufacturer using the first generation of HBM, and will be joined by NVIDIA in using the second generation in 2016. HBM2 will double memory bandwidth over HBM1. The benefits of HBM include increased total bandwidth (from 320 GB/s for the R9 290X to 512 GB/s in AMD's "theoretical" 4-stack example) and reduced power consumption. Although HBM1's memory bandwidth per watt is tripled compared to GDDR5, the memory in AMD's example uses a little less than half the power (30 W for the R9 290X down to 14.6 W) due to the increased bandwidth. HBM stacks will also use 5-10% as much area of the GPU to provide the same amount of memory that GDDR5 would. That could potentially halve the size of the GPU:

By AMD's own estimate, a single HBM-equipped GPU package would be less than 70mm × 70mm (4900mm²), versus 110mm × 90mm (9900mm²) for R9 290X.
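The "tripled" bandwidth-per-watt and "halved" package-size claims can be checked directly from the numbers quoted above. This is a minimal sketch using only those figures; the function and variable names are illustrative, and the HBM values come from AMD's "theoretical" 4-stack example, not a shipping product.

```python
def bandwidth_per_watt(gb_per_s, watts):
    """Memory bandwidth delivered per watt of memory power, in GB/s per W."""
    return gb_per_s / watts


# Figures quoted in the summary.
gddr5_efficiency = bandwidth_per_watt(320.0, 30.0)  # R9 290X: 320 GB/s at 30 W
hbm_efficiency = bandwidth_per_watt(512.0, 14.6)    # HBM example: 512 GB/s at 14.6 W
efficiency_ratio = hbm_efficiency / gddr5_efficiency

# Package footprints in square millimetres, per AMD's estimate.
hbm_package_area = 70 * 70      # 4900 mm^2 (upper bound, "less than")
gddr5_package_area = 110 * 90   # 9900 mm^2
area_ratio = hbm_package_area / gddr5_package_area

print(f"GDDR5: {gddr5_efficiency:.1f} GB/s per W")
print(f"HBM:   {hbm_efficiency:.1f} GB/s per W")
print(f"Efficiency ratio: {efficiency_ratio:.2f}x")
print(f"Package area ratio: {area_ratio:.2f}")
```

The efficiency ratio comes out a little above 3 and the area ratio a little below 0.5, consistent with the wording in the summary.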

HBM will likely be featured in high-performance computing GPUs as well as accelerated processing units (APUs). HotHardware reckons that Radeon 300-series GPUs featuring HBM will be released in June.

Samsung Announces Mass Production of HBM2 DRAM

Samsung has announced the mass production of dynamic random access memory (DRAM) packages using the second generation High Bandwidth Memory (HBM2) interface.

AMD was the first and only company to introduce products using HBM1. AMD's Radeon R9 Fury X GPUs featured 4 gigabytes of HBM1 using four 1 GB packages. Both AMD and Nvidia will introduce GPUs equipped with HBM2 memory this year. Samsung's first HBM2 packages will contain 4 GB of memory each, and the press release states that Samsung intends to manufacture 8 GB HBM2 packages within the year. GPUs could include 8 GB of HBM2 using half of the die space used by AMD's Fury X, or just one-quarter of the die space if 8 GB HBM2 packages are used next year. Correction: HBM2 packages may be slightly physically larger than HBM1 packages. For example, SK Hynix will produce a 7.75 mm × 11.87 mm (91.99 mm²) HBM2 package, compared to 5.48 mm × 7.29 mm (39.94 mm²) HBM1 packages.

The 4GB HBM2 package is created by stacking a buffer die at the bottom and four 8-gigabit (Gb) core dies on top. These are then vertically interconnected by TSV holes and microbumps. A single 8Gb HBM2 die contains over 5,000 TSV holes, which is more than 36 times that of a 8Gb TSV DDR4 die, offering a dramatic improvement in data transmission performance compared to typical wire-bonding based packages.

Samsung's new DRAM package features 256GBps of bandwidth, which is double that of a HBM1 DRAM package. This is equivalent to a more than seven-fold increase over the 36GBps bandwidth of a 4Gb GDDR5 DRAM chip, which has the fastest data speed per pin (9Gbps) among currently manufactured DRAM chips. Samsung's 4GB HBM2 also enables enhanced power efficiency by doubling the bandwidth per watt over a 4Gb-GDDR5-based solution, and embeds ECC (error-correcting code) functionality to offer high reliability.
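The capacity and bandwidth figures in the press release can be reproduced with a few lines of arithmetic. This is a hedged sketch: the 1024-bit interface at 2 Gbps per pin for an HBM2 package and the 32-bit interface for a single GDDR5 chip are assumptions that are not stated in the quoted text, and the function names are illustrative.

```python
def peak_bandwidth_gb_per_s(bus_width_bits, gbps_per_pin):
    """Peak bandwidth in GB/s: (pins x Gbit/s per pin) / 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8


def stack_capacity_gbytes(core_dies, gigabits_per_die):
    """Capacity of a stacked package in GB, given the number and density of core dies."""
    return core_dies * gigabits_per_die / 8


# From the text: four 8-gigabit core dies per package.
hbm2_capacity = stack_capacity_gbytes(4, 8)

# Assumed interfaces: 1024 bits at 2 Gbps/pin (HBM2), 32 bits at 9 Gbps/pin (GDDR5 chip).
hbm2_bandwidth = peak_bandwidth_gb_per_s(1024, 2.0)
gddr5_chip_bandwidth = peak_bandwidth_gb_per_s(32, 9.0)
speedup = hbm2_bandwidth / gddr5_chip_bandwidth

print(f"HBM2 package capacity:  {hbm2_capacity:.0f} GB")              # 4 GB
print(f"HBM2 package bandwidth: {hbm2_bandwidth:.0f} GB/s")           # 256 GB/s
print(f"GDDR5 chip bandwidth:   {gddr5_chip_bandwidth:.0f} GB/s")     # 36 GB/s
print(f"Ratio: {speedup:.1f}x")
```

Under those assumptions the results line up with the quoted 4 GB, 256GBps, 36GBps and "more than seven-fold" figures.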

TSV refers to through-silicon via, a vertical electrical connection used to build 3D chip packages such as High Bandwidth Memory.

Update: HBM2 has been formalized in JEDEC's JESD235A standard, and Anandtech has an article with additional technical details.

AMD Teases x86 Improvements, High Bandwidth Memory GPUs
AMD Shares More Details on High Bandwidth Memory
Samsung Mass Produces 128 GB DDR4 Server Memory

CERN Engineer Accidentally Leaks Details of AMD 32-Core Zen Dual CPU

A CERN engineer has leaked a few details of an unreleased 32-core AMD "Zen" processor featuring support for 8 channels of DDR4 memory. The processor connects two 16-core CPUs with an on-die interconnect, and could be a replacement for older AMD Opteron chips and competitor to Intel's Xeon chips. Zen is the name of AMD's upcoming 14nm architecture:

AMD is long overdue for a major architecture update, though one is coming later this year. Featuring the codename "Zen," AMD's already provided a few details, such as that it will be built using a 14nm FinFET process technology and will have high core counts. In time, AMD will reveal all there is to know about Zen, but in the meantime, we now have a few additional details to share thanks to a computer engineer at CERN.

CERN engineer Liviu Valsan recently gave a presentation on technology and market trends for the data center. At around 2 minutes into the discussion, he brought up AMD's Zen architecture with a slide that contained some previously undisclosed details (along with a few things we already knew). One of the more interesting revelations was that upcoming x86 processors based on Zen will feature up to 32 physical cores.

Before you get too excited about the high core count, there are two things to note. The first is that AMD is employing a "bit of a trick," to use Valsan's words. To achieve a 32-core design, Valsan says AMD will use two 16-core CPUs on a single die with a next-generation interconnect, presumably one that would reduce or be void of bottlenecks.

The second thing to consider is that it's highly unlikely AMD would release a 32-core processor into the consumer market. Zen-based Opterons aren't out of the question—servers and workstations could take real advantage of the additional cores—but as far as FX processors go, it's more realistic to expect offerings to boast up to 8 cores, maybe even 16 at some point.

Previously: AMD's Upcoming "Zen" APU - 16 Cores, 32 Threads

This discussion has been archived. No new comments can be posted.
  • (Score: 2) by Gravis on Thursday May 07 2015, @03:22AM

    by Gravis (4596) on Thursday May 07 2015, @03:22AM (#179759)

    i'm not sure if this is the right move but what they are saying is that they are making improvements on Zen and then taking what they got right and putting it into the K12 as well. if this means overall power consumption for the K12 decreases and/or its speed increases, i'm all for waiting for it to be ready. i just hope they aren't waiting too long and thus miss their window to cut into the market.

  • (Score: 1, Funny) by Anonymous Coward on Thursday May 07 2015, @03:34AM

    by Anonymous Coward on Thursday May 07 2015, @03:34AM (#179761)

    First I've heard of that, it sounds like a strategy right out of business school. Too bad they haven't found any compelling applications (yet).

    • (Score: 2) by takyon on Thursday May 07 2015, @04:25AM

      by takyon (881) <{takyon} {at} {}> on Thursday May 07 2015, @04:25AM (#179768) Journal

      Compelling application: running a pure Android app within Windows 11.

      [SIG] 10/28/2017: Soylent Upgrade v14 []
      • (Score: 2) by LoRdTAW on Thursday May 07 2015, @12:35PM

        by LoRdTAW (3755) on Thursday May 07 2015, @12:35PM (#179867) Journal

        Is that really an issue? I have yet to come across a situation where I said: "Damn, I wish this android app ran on my desktop." Besides, most apps are crappier versions of a website. Why would I want that kind of unnecessary redundancy? All that engineering effort spent mashing two incompatible CPU architectures together to run a "low resolution" website.

      • (Score: 1, Informative) by Anonymous Coward on Thursday May 07 2015, @01:59PM

        by Anonymous Coward on Thursday May 07 2015, @01:59PM (#179911)

        > Compelling application: running a pure Android app within Windows 11.

        x86 is now a standard target for android, there are tons of x86 android tablets out there using intel atom processors.
        I'm sure there are still apps that are only compiled for arm, but at this rate by the time Win11 is out, there won't be many.

  • (Score: 2) by gnuman on Thursday May 07 2015, @03:54AM

    by gnuman (5013) on Thursday May 07 2015, @03:54AM (#179767)

    I'm still waiting for the unified memory for the CPU and the GPU. I should not have to copy to and from GPU part of the APU just to run OpenCL code on the GPU bits.

    All I want is to do an aligned malloc, normal malloc not special OpenCL malloc, then tell OpenCL to just use that and the APU does its stuff.

    • (Score: 3, Informative) by gman003 on Thursday May 07 2015, @04:44AM

      by gman003 (4155) on Thursday May 07 2015, @04:44AM (#179770)

      Uh, they already have zero-copy, coherent, unified memory. That came out with Kaveri, IIRC.

      • (Score: 2, Informative) by Anonymous Coward on Thursday May 07 2015, @06:51AM

        by Anonymous Coward on Thursday May 07 2015, @06:51AM (#179786)

        Unified memory (specifically hUMA, heterogeneous uniform memory access) shipped first on the PS4, and is now out on consumer APUs (as parent mentioned with Kaveri). It's all part of HSA, which is well underway. It's not even new anymore really.

  • (Score: -1, Troll) by Anonymous Coward on Thursday May 07 2015, @04:43AM

    by Anonymous Coward on Thursday May 07 2015, @04:43AM (#179769)

    AMD hasn't made anything significant since x64 (perhaps the only significant thing they created). AMD fails in comparison to Intel in every way except the price point. For $75 more, why not buy Intel instead of piece of shit AMD.

    Fuck you AMD and all you swindlers that bought that junk, built "computers" out of it, and sold it to people. It's shit, and you are shit.

    • (Score: 5, Insightful) by TheRaven on Thursday May 07 2015, @08:23AM

      by TheRaven (270) on Thursday May 07 2015, @08:23AM (#179804) Journal
      Troll, but with some valid points. From their 8086 clones right up until the Core 2, AMD was highly competitive. Their designs were even more impressive considering that Intel was an entire process generation ahead of them for most of this time. The K6 series was slightly slower clock-for-clock than the PII / PIII, but was much faster dollar-for-dollar, especially when Slot 1 motherboards for the Pentium were twice the price of Socket 7 (or Super 7) boards for the AMD CPUs - you could buy a reasonable AMD CPU and motherboard for the price of the Intel motherboard.

      When Intel went insane with the P4 (tying it to expensive RAMBUS memory and then pushing ahead with a design that assumed that it could be quickly scaled to 10GHz when it was clear that over 2GHz would be problematic) they were amazing in comparison. The K7 was a solid chip and the Opterons (with their on-die memory controllers when the Xeons still had them off-chip in the northbridge) were impressive.

      Since the Core 2 launched, AMD has struggled a lot. Their main advantage now is that they don't try to do artificial market segmentation. For example, when I bought the board for my NAS, it had an AMD CPU because Intel wouldn't sell Atom boards with more than 2 SATA ports, AMD would happily sell their equivalent with 4 or 6. Intel would disable the virtualisation extensions on their low-end chips, AMD kept them enabled.

      Laptops have almost always been Intel dominated, but since the Pentium M there really hasn't been an AMD chip that's seemed competitive and that's been the largest growth area for the past decade. At the high end, we've been buying Intel systems since the first i7s for places that used to be exclusively Opterons.

      Maybe AMD's ARM processors will make a difference to the company. A 40% IPC improvement sounds more impressive than a 10% improvement, but only if you don't consider how far behind they were before.

      sudo mod me up
      • (Score: 2) by wantkitteh on Thursday May 07 2015, @08:38AM

        by wantkitteh (3362) on Thursday May 07 2015, @08:38AM (#179809) Homepage Journal

        Great minds think alike ;)

        I'm building a cheap G3258 rig to tide me over until Zen comes out and give me an LGA1150 upgrade path if it's a performance flop, but I'll give AMD the benefit of the doubt. I'm no fanboi, but over the years I've built systems around the K6-233, Celeron 400, Duron 750, Duron 1400, Ath64 3700+, Pentium E2180 and Phenom II X6. None of them have been high-flyers exactly, but it illustrates how AMD have historically offered better value than Intel; they still do today, even if they don't even compete at the top end of the market any more.

        • (Score: 2) by bzipitidoo on Thursday May 07 2015, @01:41PM

          by bzipitidoo (4388) on Thursday May 07 2015, @01:41PM (#179896) Journal

          Don't forget, AMD was first with 64bit x86.

          I've been rooting for AMD for years. Kept getting pushed back into the Intel fold. First attempt was a K6 on a motherboard that could go either way. But, the K6 would hang, and the Pentium would not, so back to Intel. Tried again when Intel released the Pentium III with those unique identifiers burned into each processor. No way was I going to have a computer with a new "feature" that could help rat me out for supposed piracy. But by the time of the Pentium 4, Intel chips did floating point math every clock cycle, while AMD chips were only every other clock cycle, and I needed fast math, so back to Intel I went.

          I've also been watching ATI vs Nvidia, ready to go with whichever one would finally allow an open graphics driver for Linux that has decent 3D acceleration. I'm still waiting for that. Meantime, Intel really improved their integrated graphics. Their HD 4000 actually has decent performance, for a low end graphics offering. And they offer open source drivers. I didn't like having to turn back to Intel, again, but Nvidia and ATI/AMD weren't delivering. As for other graphics companies, who is there? Matrox? They don't do fast 3D. 3dfx with the fondly remembered Voodoo line? Long gone, as are many other graphics card companies. SiS? Way more hostile to Linux than NVidia and AMD.

    • (Score: 2) by wantkitteh on Thursday May 07 2015, @08:29AM

      by wantkitteh (3362) on Thursday May 07 2015, @08:29AM (#179807) Homepage Journal

      Haters gotta hate. But there is something to take away from this - AMD's market share is the lowest it's been since the GHz Wars, primarily because the hobbyist/enthusiast system builder market likes to get "The Best, as seen in Benchmarks", and AMD have only had price/performance value products on the market for a while now. Even if AMD's next product is superior to Intel's gear, AMD have to face Intel's entrenched position in the hearts, minds and gaming rigs of this core target market. Let's hope the 40% IPC improvement they're touting doesn't turn out to be overhype like the 50% benchmark lead they claimed pre-release for the FX 8150.

  • (Score: 3, Informative) by bob_super on Thursday May 07 2015, @04:22PM

    by bob_super (1357) on Thursday May 07 2015, @04:22PM (#179969)

    > HBM is competing against the similar Hybrid Memory Cube (HMC) standard

    Well... no.
    No, and no, and actually no. And in case you were wondering, NO!

    HMC is a discrete chip with lots of transceivers which provides memory at Meh! latency. If you wanted 16GB of RAM, you'd still need quite a few of them. They are bulky and somewhat hot. You also don't control the controller, so you get data whenever it wants (usually, pretty dang fast). Indications I saw say that you cannot exceed 70% bandwidth efficiency until Micron bothers to do a Si spin. It's a lot smaller footprint to route multiple HMC to a CPU than multiple DRAMs, but you gotta know your 15-gigabit signal integrity.

    HBM is on-substrate memory. Either on top of the CPU via TSVs or next to it, depending on the CPU heat dissipation. It's like having a PCB with the CPU die and the Memory dies inside the "CPU" package. It's very limited by physical size and CPU heat, but current tech allows for thousands of parallel low-voltage connections to the RAM, so it allows fantastic bandwidth at lower latency. The PCB doesn't have to even know it exists, because it's all taken care of by the CPU vendor. On the other hand, you still need external RAM of other kinds because capacity isn't there yet.

    They are in no way, shape or form "similar". They could actually be complementary.

    • (Score: 3, Informative) by takyon on Thursday May 07 2015, @05:13PM

      by takyon (881) <{takyon} {at} {}> on Thursday May 07 2015, @05:13PM (#179988) Journal

      Yes, yes, yes, and yes.

      They could actually be complementary.

      Hybrid Memory Cube (HMC) is a high-performance RAM interface for through-silicon vias (TSV)-based stacked DRAM memory competing with the incompatible rival interface High Bandwidth Memory (HBM).

      At the foundation of HMC is a small logic layer which sits below vertical stacks of DRAM die connected by through-silicon via (TSV) bonds. An energy optimized DRAM array provides efficient access to memory bits via the logic layer, providing an intelligent memory device truly optimized for performance and energy efficiencies. This elemental change in how memory is built into a system is paramount. By placing intelligent memory on the same substrate as the processing unit, each part of the system can do what it's designed to do far more optimally than any previous technology.

      Designers envision placing the Micron stack on a chip substrate next to a server or network processor to provide new levels of fast memory access for high performance systems. Micron says it will deliver early next year 2 and 4 Gbyte versions of the stack providing aggregate bi-directional bandwidth of up to 160 Gbytes/second.

      • (Score: 2) by bob_super on Thursday May 07 2015, @05:38PM

        by bob_super (1357) on Thursday May 07 2015, @05:38PM (#179994)

        Designers envision taking off from my driveway and commuting to Mars too.

        As of today, HMC is a discrete chip, and not a small one at that. The chaining feature only works if you have physical space for multiple ones, and would be counter-productive on an on-substrate implementation.

        • (Score: 2) by takyon on Thursday May 07 2015, @05:42PM

          by takyon (881) <{takyon} {at} {}> on Thursday May 07 2015, @05:42PM (#179996) Journal

          If HBM can do it, I'm sure HMC can do it. The specification is already on version 2 and no products are out yet.

          I stand by my use of "competing" and "similar" to describe the relationship between HBM and HMC.
