Today was Advanced Micro Devices' (AMD) 2015 Financial Analyst Day. The last one was held in 2012. Since then, the company has changed leadership, put its APUs in the major consoles, and largely abandoned the high-end chip market to Intel. Now AMD says it is focusing on gaming, virtual reality, and datacenters. AMD has revealed details of upcoming CPUs and GPUs at the event:
Perhaps the biggest announcement relates to AMD's x86 Zen CPUs, coming in 2016. AMD is targeting a 40% increase in instructions per clock (IPC) with Zen cores. By contrast, Intel's Haswell (a "Tock") increased IPC by about 10-11%, and Broadwell (a "Tick") by about 5-6%. AMD is also abandoning the maligned Bulldozer modules, with their Clustered Multithreading, in favor of a Simultaneous Multithreading (SMT) design similar to Intel's Hyper-Threading. Zen is such a high priority for AMD that the company is pushing back its ARM K12 chips to 2017. AMD is also shifting focus away from Project Skybridge, an "ambidextrous framework" that combined x86 and ARM cores in SoCs. Zen cores will target a wide range of designs "top-to-bottom", from sub-10W TDPs up to around 100W. The Zen architecture will eventually be followed by Zen+.
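To put the 40% figure in context, the article's own percentages can be compounded. This is back-of-envelope arithmetic on vendor/press estimates, not measured IPC, and Zen's 40% is relative to AMD's own previous core, a much lower starting point:

```python
# Compounding the IPC gains cited in the article (illustrative only; these
# are vendor/press estimates, not measured, workload-independent numbers).
haswell = 0.105    # midpoint of the ~10-11% cited for Haswell
broadwell = 0.055  # midpoint of the ~5-6% cited for Broadwell

intel_two_generations = (1 + haswell) * (1 + broadwell) - 1
zen_target = 0.40  # AMD's stated target for Zen over its predecessor

print(f"Intel over two generations: {intel_two_generations:.1%}")  # ~16.6%
print(f"AMD's Zen target:           {zen_target:.0%}")             # 40%
```

So even granting Intel both generations of gains, AMD's target is more than double the improvement, which is why the claim drew attention.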
On the GPU front, AMD's 2016 GPUs will use FinFETs. AMD plans to be the first vendor to use High Bandwidth Memory (HBM), a 3D/stacked memory standard that enables much higher bandwidth (hence the name) while saving power. NVIDIA also plans to use HBM in its Pascal GPUs slated for 2016. The HBM stacks will be positioned around the processor rather than on top of it, as the GPU's thermal output would make cooling RAM stacked directly on the die difficult. HBM is competing against the similar Hybrid Memory Cube (HMC) standard.
Although High Bandwidth Memory's wider rollout is on track for 2016, it will actually debut in an AMD desktop GPU to be released this quarter. AnandTech expects HBM to become a standard feature in AMD APUs, which would benefit from the extra memory bandwidth:
Coupled with the fact that any new GPU from AMD should also include AMD's latest color compression technology, the implication is that the effective increase in memory bandwidth should be quite large. AMD sees this as one of the keys to delivering better 4K performance along with better VR performance.
Finally, while talking about HBM on GPUs, AMD is also strongly hinting that they intend to bring HBM to other products as well. Given their product portfolio, we consider this to be a pretty transparent hint that the company wants to build HBM-equipped APUs. AMD's APUs have traditionally struggled to reach peak performance due to their lack of memory bandwidth – 128-bit DDR3 only goes so far – so HBM would be a natural extension to APUs.
AMD's Carrizo APUs will be released beginning this quarter, but it may be worth it to wait:
Badging aside, AMD still will have to face the fact that they're launching a 28nm notebook APU versus Intel's 14nm notebook CPUs; the company is once again banking on their strong GPU performance to help drive sales. Between the low power optimizations in Carrizo and full fixed-function hardware decoding of HEVC, AMD will be relying on Carrizo to carry them through to 2016 and Zen.
AMD also announced Radeon M300 discrete GPUs for notebooks, promising "refined efficiency and power management" as well as DirectX 12 support.
One of the more interesting chips on AMD's roadmap may be a "high-performance server APU" intended for both high-performance computing and workstations.
Alternate coverage at Tom's Hardware and The Register.
Related Stories
Advanced Micro Devices (AMD) has shared more details about the High Bandwidth Memory (HBM) in its upcoming GPUs.
HBM in a nutshell takes the wide & slow paradigm to its fullest. Rather than building an array of high speed chips around an ASIC to deliver 7Gbps+ per pin over a 256/384/512-bit memory bus, HBM at its most basic level involves turning memory clockspeeds way down – to just 1Gbps per pin – but in exchange making the memory bus much wider. How wide? That depends on the implementation and generation of the specification, but the examples AMD has been showcasing so far have involved 4 HBM devices (stacks), each featuring a 1024-bit wide memory bus, combining for a massive 4096-bit memory bus. It may not be clocked high, but when it's that wide, it doesn't need to be.
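The wide-and-slow trade-off above is easy to check numerically. A sketch of the peak-bandwidth arithmetic, using the figures quoted in this article (the R9 290X runs its GDDR5 at 5 Gbps per pin, consistent with the 320 GB/s figure cited elsewhere in the article):

```python
# Peak memory bandwidth in GB/s: (bus width in bits * per-pin rate in Gbps) / 8.
def peak_bandwidth_gbs(bus_width_bits, gbps_per_pin):
    return bus_width_bits * gbps_per_pin / 8

# R9 290X: 512-bit GDDR5 bus at 5 Gbps per pin -- narrow(er) and fast.
gddr5 = peak_bandwidth_gbs(512, 5.0)
# AMD's example: 4 HBM stacks x 1024 bits at just 1 Gbps per pin -- wide and slow.
hbm = peak_bandwidth_gbs(4 * 1024, 1.0)

print(gddr5)  # 320.0
print(hbm)    # 512.0
```

Despite running at one-fifth the per-pin speed, the 4096-bit HBM configuration delivers 60% more total bandwidth.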
AMD will be the only manufacturer using the first generation of HBM, and will be joined by NVIDIA in using the second generation in 2016. HBM2 will double memory bandwidth over HBM1. The benefits of HBM include increased total bandwidth (from 320 GB/s for the R9 290X to 512 GB/s in AMD's "theoretical" 4-stack example) and reduced power consumption. HBM1 triples memory bandwidth per watt compared to GDDR5, so even though total bandwidth rises, the memory in AMD's example draws a little less than half the power (14.6 W, down from 30 W for the R9 290X). HBM stacks will also occupy only 5-10% of the board area that GDDR5 would need to provide the same amount of memory. That could potentially halve the size of the GPU package:
By AMD's own estimate, a single HBM-equipped GPU package would be less than 70 mm × 70 mm (4,900 mm²), versus 110 mm × 90 mm (9,900 mm²) for the R9 290X.
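Those power and size claims check out against the article's own numbers:

```python
# Bandwidth per watt, using the figures quoted in the article.
gddr5_eff = 320 / 30.0   # R9 290X: 320 GB/s at ~30 W of memory power
hbm_eff = 512 / 14.6     # 4-stack HBM example: 512 GB/s at ~14.6 W
print(hbm_eff / gddr5_eff)  # ~3.29 -- consistent with "tripled" bandwidth/watt

# Package footprint, from AMD's estimate.
hbm_package = 70 * 70      # 4,900 mm^2
gddr5_package = 110 * 90   # 9,900 mm^2
print(hbm_package / gddr5_package)  # ~0.49 -- consistent with "halved"
```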
HBM will likely be featured in high-performance computing GPUs as well as accelerated processing units (APUs). HotHardware reckons that Radeon 300-series GPUs featuring HBM will be released in June.
Samsung has announced the mass production of dynamic random access memory (DRAM) packages using the second generation High Bandwidth Memory (HBM2) interface.
AMD was the first and, so far, only company to introduce products using HBM1: its Radeon R9 Fury X GPUs featured 4 gigabytes of HBM1 using four 1 GB packages. Both AMD and NVIDIA will introduce GPUs equipped with HBM2 memory this year. Samsung's first HBM2 packages will contain 4 GB of memory each, and the press release states that Samsung intends to manufacture 8 GB HBM2 packages within the year. GPUs could thus include 8 GB of HBM2 using half the package area of AMD's Fury X, or just one-quarter of it once 8 GB HBM2 packages arrive next year. Correction: HBM2 packages may be slightly physically larger than HBM1 packages. For example, SK Hynix will produce a 7.75 mm × 11.87 mm (91.99 mm²) HBM2 package, compared to 5.48 mm × 7.29 mm (39.94 mm²) HBM1 packages.
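Even with the correction, the per-gigabyte arithmetic still favors HBM2, using the SK Hynix package dimensions above:

```python
# Package area per gigabyte, from the SK Hynix dimensions in the correction.
hbm1_mm2 = 5.48 * 7.29    # ~39.95 mm^2 for a 1 GB HBM1 package
hbm2_mm2 = 7.75 * 11.87   # ~91.99 mm^2 for a 4 GB HBM2 package

print(hbm1_mm2 / 1)  # ~39.9 mm^2 per GB
print(hbm2_mm2 / 4)  # ~23.0 mm^2 per GB -- denser despite the larger package
```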
The 4GB HBM2 package is created by stacking a buffer die at the bottom and four 8-gigabit (Gb) core dies on top. These are then vertically interconnected by TSV holes and microbumps. A single 8Gb HBM2 die contains over 5,000 TSV holes, which is more than 36 times that of an 8Gb TSV DDR4 die, offering a dramatic improvement in data transmission performance compared to typical wire-bonding based packages.
Samsung's new DRAM package features 256GBps of bandwidth, which is double that of an HBM1 DRAM package. This is equivalent to a more than seven-fold increase over the 36GBps bandwidth of a 4Gb GDDR5 DRAM chip, which has the fastest data speed per pin (9Gbps) among currently manufactured DRAM chips. Samsung's 4GB HBM2 also enables enhanced power efficiency by doubling the bandwidth per watt over a 4Gb-GDDR5-based solution, and embeds ECC (error-correcting code) functionality to offer high reliability.
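The seven-fold comparison follows directly from the per-chip numbers: a GDDR5 chip has a 32-bit interface, so at 9 Gbps per pin it delivers the 36 GB/s the press release cites:

```python
# One GDDR5 chip: 32-bit interface at 9 Gbps per pin.
gddr5_chip_gbs = 32 * 9 / 8   # 36.0 GB/s, matching the press release
hbm2_package_gbs = 256        # Samsung's stated per-package HBM2 bandwidth

print(hbm2_package_gbs / gddr5_chip_gbs)  # ~7.1 -- "more than seven-fold"
```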
TSV refers to through-silicon via, a vertical electrical connection used to build 3D chip packages such as High Bandwidth Memory.
Update: HBM2 has been formalized in JEDEC's JESD235A standard, and AnandTech has an article with additional technical details.
Previously:
AMD Teases x86 Improvements, High Bandwidth Memory GPUs
AMD Shares More Details on High Bandwidth Memory
Samsung Mass Produces 128 GB DDR4 Server Memory
A CERN engineer has leaked a few details of an unreleased 32-core AMD "Zen" processor featuring support for 8 channels of DDR4 memory. The processor connects two 16-core CPUs with an on-die interconnect, and could be a replacement for older AMD Opteron chips and competitor to Intel's Xeon chips. Zen is the name of AMD's upcoming 14nm architecture:
AMD is long overdue for a major architecture update, though one is coming later this year. Featuring the codename "Zen," AMD's already provided a few details, such as that it will be built using a 14nm FinFET process technology and will have high core counts. In time, AMD will reveal all there is to know about Zen, but in the meantime, we now have a few additional details to share thanks to a computer engineer at CERN.
CERN engineer Liviu Valsan recently gave a presentation on technology and market trends for the data center. At around 2 minutes into the discussion, he brought up AMD's Zen architecture with a slide that contained some previously undisclosed details (along with a few things we already knew). One of the more interesting revelations was that upcoming x86 processors based on Zen will feature up to 32 physical cores.
Before you get too excited about the high core count, there are two things to note. The first is that AMD is employing a "bit of a trick," to use Valsan's words. To achieve a 32-core design, Valsan says AMD will use two 16-core CPUs on a single die with a next-generation interconnect, presumably one that would minimize or eliminate bottlenecks.
The second thing to consider is that it's highly unlikely AMD would release a 32-core processor into the consumer market. Zen-based Opterons aren't out of the question—servers and workstations could take real advantage of the additional cores—but as far as FX processors go, it's more realistic to expect offerings to boast up to 8 cores, maybe even 16 at some point.
Previously: AMD's Upcoming "Zen" APU - 16 Cores, 32 Threads
(Score: 2) by Gravis on Thursday May 07 2015, @03:22AM
i'm not sure if this is the right move, but what they are saying is that they are making improvements on Zen and then taking what they got right and putting it into the K12 as well. if this means overall power consumption for the K12 decreases and/or its speed increases, i'm all for waiting for it to be ready. i just hope they aren't waiting too long and thus miss their window to cut into the market.
(Score: 4, Funny) by GungnirSniper on Thursday May 07 2015, @03:38AM
If they don't they'll have to go to community college.
Tips for better submissions to help our site grow. [soylentnews.org]
(Score: 1, Funny) by Anonymous Coward on Thursday May 07 2015, @03:34AM
First I've heard of that, it sounds like a strategy right out of business school. Too bad they haven't found any compelling applications (yet).
(Score: 2) by takyon on Thursday May 07 2015, @04:25AM
Compelling application: running a pure Android app within Windows 11.
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 2) by LoRdTAW on Thursday May 07 2015, @12:35PM
Is that really an issue? I have yet to come across a situation where I said: "Damn, I wish this Android app ran on my desktop." Besides, most apps are crappier versions of a website. Why would I want that kind of unnecessary redundancy? All that engineering effort spent mashing two incompatible CPU architectures together to run a "low resolution" website.
(Score: 1, Informative) by Anonymous Coward on Thursday May 07 2015, @01:59PM
> Compelling application: running a pure Android app within Windows 11.
x86 is now a standard target for Android; there are tons of x86 Android tablets out there using Intel Atom processors.
I'm sure there are still apps that are only compiled for ARM, but at this rate, by the time Win11 is out, there won't be many.
(Score: 2) by gnuman on Thursday May 07 2015, @03:54AM
I'm still waiting for the unified memory for the CPU and the GPU. I should not have to copy to and from GPU part of the APU just to run OpenCL code on the GPU bits.
All I want is to do an aligned malloc, normal malloc not special OpenCL malloc, then tell OpenCL to just use that and the APU does its stuff.
(Score: 3, Informative) by gman003 on Thursday May 07 2015, @04:44AM
Uh, they already have zero-copy, coherent, unified memory. That came out with Kaveri, IIRC.
(Score: 2, Informative) by Anonymous Coward on Thursday May 07 2015, @06:51AM
Unified memory (specifically hUMA, heterogeneous uniform memory access) shipped first on the PS4, and is now out on consumer APUs (as parent mentioned, with Kaveri). It's all part of HSA, which is well underway. It's not even new anymore, really.
(Score: -1, Troll) by Anonymous Coward on Thursday May 07 2015, @04:43AM
AMD hasn't made anything significant since x64 (perhaps the only significant thing they created). AMD fails in comparison to Intel in every way except the price point. For $75 more, why not buy Intel instead of piece of shit AMD.
Fuck you AMD and all you swindlers that bought that junk, built "computers" out of it, and sold it to people. It's shit, and you are shit.
(Score: 5, Insightful) by TheRaven on Thursday May 07 2015, @08:23AM
When Intel went insane with the P4 (tying it to expensive RAMBUS memory and then pushing ahead with a design that assumed that it could be quickly scaled to 10GHz when it was clear that over 2GHz would be problematic) they were amazing in comparison. The K7 was a solid chip and the Opterons (with their on-die memory controllers when the Xeons still had them off-chip in the northbridge) were impressive.
Since the Core 2 launched, AMD has struggled a lot. Their main advantage now is that they don't try to do artificial market segmentation. For example, when I bought the board for my NAS, it had an AMD CPU because Intel wouldn't sell Atom boards with more than 2 SATA ports, AMD would happily sell their equivalent with 4 or 6. Intel would disable the virtualisation extensions on their low-end chips, AMD kept them enabled.
Laptops have almost always been Intel dominated, but since the Pentium M there really hasn't been an AMD chip that's seemed competitive and that's been the largest growth area for the past decade. At the high end, we've been buying Intel systems since the first i7s for places that used to be exclusively Opterons.
Maybe AMD's ARM processors will make a difference to the company. A 40% IPC improvement sounds more impressive than a 10% improvement, but only if you don't consider how far behind they were before.
sudo mod me up
(Score: 2) by wantkitteh on Thursday May 07 2015, @08:38AM
Great minds think alike ;)
I'm building a cheap G3258 rig to tide me over until Zen comes out and give me an LGA1150 upgrade path if it's a performance flop, but I'll give AMD the benefit of the doubt. I'm no fanboi, but over the years I've built systems around the K6-233, Celeron 400, Duron 750, Duron 1400, Ath64 3700+, Pentium E2180 and Phenom II X6. None of them have been high-flyers exactly, but it illustrates how AMD have historically offered better value than Intel; they still do today, even if they don't even compete at the top end of the market any more.
(Score: 2) by bzipitidoo on Thursday May 07 2015, @01:41PM
Don't forget, AMD was first with 64bit x86.
I've been rooting for AMD for years. Kept getting pushed back into the Intel fold. First attempt was a K6 on a motherboard that could go either way. But, the K6 would hang, and the Pentium would not, so back to Intel. Tried again when Intel released the Pentium III with those unique identifiers burned into each processor. No way was I going to have a computer with a new "feature" that could help rat me out for supposed piracy. But by the time of the Pentium 4, Intel chips could do floating point math every clock cycle, while AMD chips could only do it every other clock cycle, and I needed fast math, so back to Intel I went.
I've also been watching ATI vs Nvidia, ready to go with whichever one would finally allow an open graphics driver for Linux that has decent 3D acceleration. I'm still waiting for that. Meantime, Intel really improved their integrated graphics. Their HD 4000 actually has decent performance, for a low end graphics offering. And they offer open source drivers. I didn't like having to turn back to Intel, again, but Nvidia and ATI/AMD weren't delivering. As for other graphics companies, who is there? Matrox? They don't do fast 3D. 3dfx with the fondly remembered Voodoo line? Long gone, as are many other graphics card companies. SiS? Way more hostile to Linux than NVidia and AMD.
(Score: 2) by wantkitteh on Thursday May 07 2015, @01:58PM
You know, I always thought the Athlon had better floating point performance than the Pentium 4 - I guess not. [tomshardware.com]
(Score: 2) by wantkitteh on Thursday May 07 2015, @08:29AM
Haters gotta hate. But there is something to take away from this - AMD's market share is the lowest it's been since the GHz Wars, primarily because the hobbyist/enthusiast system builder market likes to get "The Best, as seen in Benchmarks", and AMD have only had price/performance value products on the market for a while now. Even if AMD's next product is superior to Intel's gear, AMD have to face Intel's entrenched position in the hearts, minds and gaming rigs of this core target market. Let's hope the 40% IPC improvement they're touting doesn't turn out to be overhype like the 50% benchmark lead they claimed pre-release for the FX 8150.
(Score: 3, Informative) by bob_super on Thursday May 07 2015, @04:22PM
> HBM is competing against the similar Hybrid Memory Cube (HMC) standard
Well... no.
No, and no, and actually no. And in case you were wondering, NO!
HMC is a discrete chip with lots of transceivers which provides memory at Meh! latency. If you wanted 16GB of RAM, you'd still need quite a few of them. They are bulky and somewhat hot. You also don't control the controller, so you get data whenever it wants (usually, pretty dang fast). Indications I saw say that you cannot exceed 70% bandwidth efficiency until Micron bothers to do a Si spin. It's a lot smaller footprint to route multiple HMC to a CPU than multiple DRAMs, but you gotta know your 15-gigabit signal integrity.
HBM is on-substrate memory. Either on top of the CPU via TSVs or next to it, depending on the CPU heat dissipation. It's like having a PCB with the CPU die and the Memory dies inside the "CPU" package. It's very limited by physical size and CPU heat, but current tech allows for thousands of parallel low-voltage connections to the RAM, so it allows fantastic bandwidth at lower latency. The PCB doesn't have to even know it exists, because it's all taken care of by the CPU vendor. On the other hand, you still need external RAM of other kinds because capacity isn't there yet.
They are in no way, shape or form "similar". They could actually be complementary.
(Score: 3, Informative) by takyon on Thursday May 07 2015, @05:13PM
Yes, yes, yes, and yes.
http://en.wikipedia.org/wiki/Hybrid_Memory_Cube [wikipedia.org]
http://www.hybridmemorycube.org/faq.html [hybridmemorycube.org]
http://www.eetimes.com/document.asp?doc_id=1261415 [eetimes.com]
(Score: 2) by bob_super on Thursday May 07 2015, @05:38PM
Designers envision taking off from my driveway and commuting to Mars too.
As of today, HMC is a discrete chip, and not a small one at that. The chaining feature only works if you have physical space for multiple ones, and would be counter-productive in an on-substrate implementation.
(Score: 2) by takyon on Thursday May 07 2015, @05:42PM
If HBM can do it, I'm sure HMC can do it. The specification is already on version 2 and no products are out yet.
I stand by my use of "competing" and "similar" to describe the relationship between HBM and HMC.