SoylentNews is people

posted by martyb on Wednesday November 07 2018, @04:49AM
from the moah-powah dept.

AMD has announced the next generation of its Epyc server processors, with up to 64 cores (128 threads) each. Instead of an 8-core "core complex" (CCX), AMD's 64-core chips will feature 8 "chiplets" with 8 cores each:

AMD on Tuesday formally announced its next-generation EPYC processor code-named Rome. The new server CPU will feature up to 64 cores featuring the Zen 2 microarchitecture, thus providing at least two times higher performance per socket than existing EPYC chips.

As discussed in a separate story covering AMD's new 'chiplet' design approach, AMD EPYC 'Rome' processor will carry multiple CPU chiplets manufactured using TSMC's 7 nm fabrication process as well as an I/O die produced at a 14 nm node. As it appears, high-performance 'Rome' processors will use eight CPU chiplets offering 64 x86 cores in total.

Why chiplets?

Separating CPU chiplets from the I/O die has its advantages because it enables AMD to make the CPU chiplets smaller as physical interfaces (such as DRAM and Infinity Fabric) do not scale that well with shrinks of process technology. Therefore, instead of making CPU chiplets bigger and more expensive to manufacture, AMD decided to incorporate DRAM and some other I/O into a separate chip. Besides lower costs, the added benefit that AMD is going to enjoy with its 7 nm chiplets is ability to easier[sic] bin new chips for needed clocks and power, which is something that is hard to estimate in case of servers.

AMD also announced that Zen 4 is under development. It could be made on a "5nm" node, although that is speculation. The Zen 3 microarchitecture will be made on TSMC's N7+ process ("7nm" with more extensive use of extreme ultraviolet lithography).

AMD's Epyc CPUs will now be offered on Amazon Web Services.

AnandTech live blog of Next Horizon event.

Previously: AMD Epyc 7000-Series Launched With Up to 32 Cores
TSMC Will Make AMD's "7nm" Epyc Server CPUs
Intel Announces 48-core Xeons Using Multiple Dies, Ahead of AMD Announcement

Related: Cray CS500 Supercomputers to Include AMD's Epyc as a Processor Option
Oracle Offers Servers with AMD's Epyc to its Cloud Customers


Original Submission

Related Stories

AMD Epyc 7000-Series Launched With Up to 32 Cores 19 comments

AMD has launched its Ryzen-based take on x86 server processors to compete with Intel's Xeon CPUs. All of the Epyc 7000-series CPUs support 128 PCIe 3.0 lanes and 8 channels (2 DIMMs per channel) of DDR4-2666 DRAM:

A few weeks ago AMD announced the naming of the new line of enterprise-class processors, called EPYC, and today marks the official launch with configurations up to 32 cores and 64 threads per processor. We also got an insight into several features of the design, including the AMD Infinity Fabric.

Today's announcement of the AMD EPYC product line sees the launch of the top four CPUs, focused primarily at dual socket systems. The full EPYC stack will contain twelve processors, with three for single socket environments, with the rest of the stack being made available at the end of July. It is worth taking a few minutes to look at how these processors look under the hood.

On the package are four silicon dies, each one containing the same 8-core silicon we saw in the AMD Ryzen processors. Each silicon die has two core complexes, each of four cores, and supports two memory channels, giving a total maximum of 32 cores and 8 memory channels on an EPYC processor. The dies are connected by AMD's newest interconnect, the Infinity Fabric, which plays a key role not only in die-to-die communication but also processor-to-processor communication and within AMD's new Vega graphics. AMD designed the Infinity Fabric to be modular and scalable in order to support large GPUs and CPUs in the roadmap going forward, and states that within a single package the fabric is overprovisioned to minimize any issues with non-NUMA aware software (more on this later).

With a total of 8 memory channels, and support for 2 DIMMs per channel, AMD is quoting a 2TB per socket maximum memory support, scaling up to 4TB per system in a dual processor system. Each CPU will support 128 PCIe 3.0 lanes, suitable for six GPUs with full bandwidth support (plus IO) or up to 32 NVMe drives for storage. All the PCIe lanes can be used for IO devices, such as SATA drives or network ports, or as Infinity Fabric connections to other devices. There are also 4 IO hubs per processor for additional storage support.
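The package arithmetic in the excerpt above can be checked with a short sketch. The die, core, channel, and lane counts come from the quoted text; the x16-per-GPU and x4-per-NVMe link widths and the per-DIMM capacity are assumptions or derived values, not figures stated in the article.

```python
# Figures quoted in the excerpt for a first-generation Epyc 7000 package.
DIES_PER_PACKAGE = 4
CCX_PER_DIE = 2
CORES_PER_CCX = 4
THREADS_PER_CORE = 2                  # 32 cores / 64 threads per processor
CHANNELS_PER_DIE = 2
DIMMS_PER_CHANNEL = 2
MAX_MEMORY_PER_SOCKET_GB = 2 * 1024   # "2TB per socket"
PCIE_LANES = 128

cores = DIES_PER_PACKAGE * CCX_PER_DIE * CORES_PER_CCX
threads = cores * THREADS_PER_CORE
channels = DIES_PER_PACKAGE * CHANNELS_PER_DIE
dimm_slots = channels * DIMMS_PER_CHANNEL

# Derived, not stated in the article: DIMM size implied by 2 TB per socket.
implied_dimm_gb = MAX_MEMORY_PER_SOCKET_GB // dimm_slots

# Assumed link widths: x16 per GPU, x4 per NVMe drive.
lanes_left_after_six_gpus = PCIE_LANES - 6 * 16
max_nvme_drives = PCIE_LANES // 4

print(cores, threads, channels, dimm_slots, implied_dimm_gb)
print(lanes_left_after_six_gpus, max_nvme_drives)
```

Under these assumptions the numbers reproduce the quoted 32 cores, 64 threads, 8 memory channels, and 32 NVMe drives. They also imply 128 GB DIMMs and leave 32 lanes for other IO after six full-width GPUs.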

AMD's slides at Ars Technica.


Original Submission

Cray CS500 Supercomputers to Include AMD's Epyc as a Processor Option 10 comments

Cray supercomputers with AMD Epyc processors will start shipping in the summer:

Cray is adding an AMD processor option to its CS500 line of clustered supercomputers.

The CS500 supports more than 11,000 nodes which can use Intel Xeon SP CPUs, optionally accelerated by Nvidia Tesla GPUs or Intel Phi co-processors. Intel Stratix FPGA acceleration is also supported.

There can be up to 72 nodes in a rack, interconnected by EDR/FDR InfiniBand or Intel's OmniPath fabric.

Cray has now added an AMD Epyc 7000 option to the CPU mix:

  • Systems provide four dual-socket nodes in a 2U chassis
  • Each node supports two PCIe 3.0 x 16 slots (200Gb network capability) and HDD/SSD options
  • Epyc 7000 processors support up to 32 cores and eight DDR4 memory channels per socket

Top-of-the-line Epyc chips have 32 cores and 64 threads. An upcoming generation of 7nm Epyc chips is rumored to have up to 48 or 64 cores, using 6 or 8 cores per Core Complex (CCX) instead of the current 4.

Related: AMD Epyc 7000-Series Launched With Up to 32 Cores
Intel's Skylake-SP vs AMD's Epyc
Data Centers Consider Intel's Rivals


Original Submission

TSMC Will Make AMD's "7nm" Epyc Server CPUs 4 comments

AMD "Rome" EPYC CPUs to Be Fabbed By TSMC

AMD CEO Lisa Su has announced that the second-generation "Rome" EPYC CPU that the company is wrapping up work on is being produced at TSMC. This is a notable departure from how things have gone for AMD with the Zen 1 generation, as GlobalFoundries has produced all of AMD's Zen CPUs, both for consumer Ryzen and professional EPYC parts.

[...] As it stands, AMD seems rather optimistic about how things are currently going. Rome silicon is already back in the labs, and indeed AMD is already sampling the parts to certain partners for early validation. Which means AMD remains on track to launch their second-generation EPYC processors in 2019.

[...] Ultimately however if they are meeting their order quota from GlobalFoundries, then AMD's situation is ultimately much more market driven: which fab can offer the necessary capacity and performance, and at the best prices. Which will be an important consideration as GlobalFoundries has indicated that it may not be able to keep up with 7nm demand, especially with the long manufacturing process their first-generation DUV-based 7nm "7LP" process requires.

See also: No 16-core AMD Ryzen AM4 Until After 7nm EPYC Launch (2019)

Related: TSMC Holds Groundbreaking Ceremony for "5nm" Fab, Production to Begin in 2020
Cray CS500 Supercomputers to Include AMD's Epyc as a Processor Option
AMD Returns to the Datacenter, Set to Launch "7nm" Radeon Instinct GPUs for Machine Learning in 2018
AMD Ratcheting Up the Pressure on Intel
More on AMD's Licensing of Epyc Server Chips to Chinese Companies


Original Submission

Oracle Offers Servers with AMD's Epyc to its Cloud Customers 1 comment

Oracle puts AMD EPYC in the Cloud

The process of AMD ramping up its EPYC efforts involves a lot of 'first-step' vendor interaction. Having been a very minor player for so long, all the big guns are taking it slowly with AMD's newest hardware in verifying whether it is suitable for their workloads and customers. The next company to tick that box is Oracle, which is announcing today that it will be making bare metal EPYC instances available in its cloud offering.

The new E-series instances will start with Standard E2, costing around $0.03 per core per hour, up to 64 cores per server. Oracle is stating that this pricing structure is 66% less than the average per-core instance on the market. One bare metal standard instance, BM.Standard E2.52, will offer dual EPYC 7551 processors at 2.0 GHz, with 512 GB of DDR4, dual 25GbE networking, and up to 1PB of remote block storage. Another offering is the E2.64 instance, which will offer 16 cores by comparison.

Related: AMD Epyc 7000-Series Launched With Up to 32 Cores
Data Centers Consider Intel's Rivals
Cray CS500 Supercomputers to Include AMD's Epyc as a Processor Option
AMD Returns to the Datacenter, Set to Launch "7nm" Radeon Instinct GPUs for Machine Learning in 2018
Chinese Company Produces Chips Closely Based on AMD's Zen Microarchitecture
More on AMD's Licensing of Epyc Server Chips to Chinese Companies
TSMC Will Make AMD's "7nm" Epyc Server CPUs


Original Submission

Intel Announces 48-core Xeons Using Multiple Dies, Ahead of AMD Announcement 23 comments

Intel announces Cascade Lake Xeons: 48 cores and 12-channel memory per socket

Intel has announced the next family of Xeon processors that it plans to ship in the first half of next year. The new parts represent a substantial upgrade over current Xeon chips, with up to 48 cores and 12 DDR4 memory channels per socket, supporting up to two sockets.

These processors will likely be the top-end Cascade Lake processors; Intel is labelling them "Cascade Lake Advanced Performance," with a higher level of performance than the Xeon Scalable Processors (SP) below them. The current Xeon SP chips use a monolithic die, with up to 28 cores and 56 threads. Cascade Lake AP will instead be a multi-chip processor with multiple dies contained within a single package. AMD is using a similar approach for its comparable products; the Epyc processors use four dies in each package, with each die having 8 cores.

The switch to a multi-chip design is likely driven by necessity: as the dies become bigger and bigger it becomes more and more likely that they'll contain a defect. Using several smaller dies helps avoid these defects. Because Intel's 10nm manufacturing process isn't yet good enough for mass market production, the new Xeons will continue to use a version of the company's 14nm process. Intel hasn't yet revealed what the topology within each package will be, so the exact distribution of those cores and memory channels between chips is as yet unknown. The enormous number of memory channels will demand an enormous socket, currently believed to be a 5903 pin connector.
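The defect argument in the excerpt above can be illustrated with a textbook Poisson yield model, in which the chance of a defect-free die falls exponentially with die area. This is only a sketch: the defect density and die areas below are hypothetical values chosen for illustration, not Intel's or AMD's figures, and the model ignores packaging yield and interconnect overhead.

```python
import math

def poisson_yield(area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of dies with zero defects under a simple Poisson model."""
    return math.exp(-area_mm2 * defects_per_mm2)

DEFECT_DENSITY = 0.001   # hypothetical: 0.1 defects per cm^2
BIG_DIE_MM2 = 800.0      # hypothetical monolithic die
SMALL_DIE_MM2 = 200.0    # hypothetical: the same silicon split into four dies

big_yield = poisson_yield(BIG_DIE_MM2, DEFECT_DENSITY)
small_yield = poisson_yield(SMALL_DIE_MM2, DEFECT_DENSITY)

print(f"monolithic {BIG_DIE_MM2:.0f} mm^2 die yield: {big_yield:.1%}")
print(f"single {SMALL_DIE_MM2:.0f} mm^2 die yield: {small_yield:.1%}")
```

With these made-up inputs, roughly 45% of the large dies come out defect-free against roughly 82% of the small ones. Because a defect only scraps the small die it lands on, a much larger share of the wafer ends up in sellable parts.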

Intel also announced tinier 4-6 core E-2100 Xeons with ECC memory support.

Meanwhile, AMD is holding a Next Horizon event on Nov. 6, where it is expected to announce 64-core Epyc processors.

Related: AMD Epyc 7000-Series Launched With Up to 32 Cores
AVX-512: A "Hidden Gem"?
Intel's Skylake-SP vs AMD's Epyc
Intel Teases 28 Core Chip, AMD Announces Threadripper 2 With Up to 32 Cores
TSMC Will Make AMD's "7nm" Epyc Server CPUs
Intel Announces 9th Generation Desktop Processors, Including a Mainstream 8-Core CPU


Original Submission

AMD Announces "7nm" Vega GPUs for the Enterprise Market 3 comments

AMD Announces Radeon Instinct MI60 & MI50 Accelerators: Powered By 7nm Vega

As part of this morning's Next Horizon event, AMD formally announced the first two accelerator cards based on the company's previously revealed 7nm Vega GPU. Dubbed the Radeon Instinct MI60 and Radeon Instinct MI50, the two cards are aimed squarely at the enterprise accelerator market, with AMD looking to significantly improve their performance competitiveness in everything from HPC to machine learning.

Both cards are based on AMD's 7nm GPU, which although we've known about at a high level for some time now, we're only finally getting some more details on. The GPU is based on a refined version of AMD's existing Vega architecture, essentially adding compute-focused features to the chip that are necessary for the accelerator market. Interestingly, in terms of functional blocks here, 7nm Vega is actually rather close to the existing 14nm "Vega 10" GPU: both feature 64 CUs and HBM2. The difference comes down to these extra accelerator features, and the die size itself.

With respect to accelerator features, 7nm Vega and the resulting MI60 & MI50 cards differentiates itself from the previous Vega 10-powered MI25 in a few key areas. 7nm Vega brings support for half-rate double precision – up from 1/16th rate – and AMD is supporting new low precision data types as well. These INT8 and INT4 instructions are especially useful for machine learning inferencing, where high precision isn't necessary, with AMD able to get up to 4x the perf of an FP16/INT16 data type when using the smallest INT4 data type. However it's not clear from AMD's presentation how flexible these new data types are – and with what instructions they can be used – which will be important for understanding the full capabilities of the new GPU. All told, AMD is claiming a peak throughput of 7.4 TFLOPS FP64, 14.7 TFLOPS FP32, and 118 TOPS for INT4.
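The peak throughput figures in the excerpt above are consistent with the stated rate ratios, as this short check suggests. The three claimed numbers come from the quoted text; the FP16 rate is derived here from the "up to 4x" INT4 claim and is not a figure the excerpt states directly.

```python
# Peak figures claimed in the excerpt.
FP32_TFLOPS = 14.7
FP64_TFLOPS_CLAIMED = 7.4
INT4_TOPS_CLAIMED = 118.0

# Half-rate double precision: FP64 runs at half the FP32 rate.
fp64_half_rate = FP32_TFLOPS / 2

# For comparison, the older 1/16th rate applied to the same FP32 throughput.
fp64_sixteenth_rate = FP32_TFLOPS / 16

# Derived: if INT4 is 4x an FP16/INT16 data type, the implied FP16 rate.
implied_fp16_rate = INT4_TOPS_CLAIMED / 4

print(f"FP64 at half rate: {fp64_half_rate:.2f} TFLOPS")
print(f"FP64 at 1/16 rate: {fp64_sixteenth_rate:.2f} TFLOPS")
print(f"implied FP16 rate: {implied_fp16_rate:.1f} TFLOPS")
```

The half-rate result lands within rounding of the claimed 7.4 TFLOPS, and the implied FP16 rate is about twice the FP32 figure.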

Previously: AMD Returns to the Datacenter, Set to Launch "7nm" Radeon Instinct GPUs for Machine Learning in 2018

Related: AMD Previews Zen 2 Epyc CPUs with up to 64 Cores, New "Chiplet" Design


Original Submission

AnandTech Interview With AMD CTO Mark Papermaster Regarding Rome (Zen 2 Epyc) 23 comments

Naples, Rome, Milan, Zen 4: An Interview with AMD CTO, Mark Papermaster

The goal of AMD's event in the middle of the fourth quarter of the year was to put into perspective two elements of AMD's strategy: firstly, its commitment to delivering a 7nm Vega based product by the end of the year, as the company promised in early 2018, but also to position its 7nm capabilities as some of the best by disclosing the layout of its next generation enterprise processor set to hit shelves in 2019. [...] We sat down with AMD's CTO, Mark Papermaster, to see if we could squeeze some of the finer details about both AMD's strategy and the finer points of some of the products from the morning sessions.

[...] Ian Cutress: Forrest explained on the stage that the datacenter of today is very different to the datacenter ten years ago (or even 3-5 years ago). What decisions are you making today to predict the datacenter of the future?

Mark Papermaster: We believe we will be positioned very well – it all ties back to my opening comments on Moore's Law. We all accept that the traditional Moore's Law is slowing down, and that while process does still matter you have to be agile about how you put the pieces together, otherwise you cannot win. We leveraged ourselves to have scalability in our first EPYC launch. We leveraged our ability in our chiplet approach here to combine really small 7nm CPU dies with tried and proven 14nm for the IO die. That modularity only grows in importance going forward. We've stated our case as to where we believe it is necessary to keep pace on a traditional Moore's Law growth despite the slowing of the process gains per node and the length of time between major semiconductor nodes. I think you'll see others adopt what we've done with the chiplet approach, and I can tell you we are committed.

[...] IC: Where does Rome sit with CCIX support?

MP: We didn't announce specifically those attributes beyond PCIe 4.0 today, but I can say we are a member of CCIX as we are with Gen Z. Any further detail there you will have to wait until launch. Any specific details about the speeds, feeds, protocols, are coming in 2019.

AMD Announces Radeon VII GPU, Teases Third-Generation Ryzen CPU 15 comments

At AMD's CES 2019 keynote, CEO Lisa Su revealed the Radeon VII, a $700 GPU built on TSMC's "7nm" process. The GPU should have around the same performance and price as Nvidia's already-released RTX 2080. While it does not have any dedicated ray-tracing capabilities, it includes 16 GB of High Bandwidth Memory.

Nvidia's CEO has trashed his competitor's new GPU, calling it "underwhelming" and "lousy". Meanwhile, Nvidia has announced that it will support Adaptive Sync, the standardized version of AMD's FreeSync dynamic refresh rate and anti-screen tearing technology. Lisa Su also says that AMD is working on supporting ray tracing in future GPUs, but that the ecosystem is not ready yet.

Su also showed off a third-generation Ryzen CPU at the CES keynote, but did not announce a release date or lineup details. Like the second generation of Epyc server CPUs, the new Ryzen CPUs will be primarily built on TSMC's "7nm" process, but will include a "14nm" GlobalFoundries I/O part that includes the memory controllers and PCIe lanes. The CPUs will support PCIe 4.0.

The Ryzen 3000-series ("Matisse") should provide a roughly 15% single-threaded performance increase while significantly lowering power consumption. However, it has been speculated that the chips could include up to 16 cores or 8 cores with a separate graphics chiplet. AMD has denied that there will be a variant with integrated graphics, but Lisa Su has left the door open for 12- or 16-core versions of Ryzen, saying that "There is some extra room on that package, and I think you might expect we'll have more than eight cores". Here's "that package".

Also at The Verge.

Previously: Watch AMD's CES 2019 Keynote Live: 9am PT/12pm ET/5pm UK


Original Submission

Leaked Intel Discrete Graphics Roadmap Reveals Plans for "Seamless" Dual, Quad, and Octa-GPUs 14 comments

Intel has teased* plans to return to the discrete graphics market in 2020. Now, some of those plans have leaked. Intel's Xe branded GPUs will apparently use an architecture capable of scaling to "any number" of GPUs that are connected by a multi-chip module (MCM). The "e" in Xe is meant to represent the number of GPU dies, with one of the first products being called X2/X²:

Developers won't need to worry about optimizing their code for multi-GPU, the OneAPI will take care of all that. This will also allow the company to beat the foundry's usual lithographic limit of dies that is currently in the range of ~800mm². Why have one 800mm² die when you can have two 600mm² dies (the lower the size of the die, the higher the yield) or four 400mm² ones? Armed with One API and the Xe macroarchitecture Intel plans to ramp all the way up to Octa GPUs by 2024. From this roadmap, it seems like the first Xe class of GPUs will be X2.

The tentative timeline for the first X2 class of GPUs was also revealed: June 31st[sic], 2020. This will be followed by the X4 class sometime in 2021. It looks like Intel plans to add two more cores [dies] every year so we should have the X8 class by 2024. Assuming Intel has the scaling solution down pat, it should actually be very easy to scale these up. The only concern here would be the packaging yield – which Intel should be more than capable of handling and binning should take care of any wastage issues quite easily. Neither NVIDIA nor AMD have yet gone down the MCM path and if Intel can truly deliver on this design then the sky's the limit.

AMD has made extensive use of MCMs in its Zen CPUs, but will reportedly not use an MCM-based design for its upcoming Navi GPUs. Nvidia has published research into MCM GPUs but has yet to introduce products using such a design.

Intel will use an MCM for its upcoming 48-core "Cascade Lake" Xeon CPUs. They are also planning on using "chiplets" in other CPUs and mixing big and small CPU cores and/or cores made on different process nodes.

*Previously: Intel Planning a Return to the Discrete GPU Market, Nvidia CEO Responds
Intel Discrete GPU Planned to be Released in 2020
Intel Announces "Sunny Cove", Gen11 Graphics, Discrete Graphics Brand Name, 3D Packaging, and More

Related: Intel Integrates LTE Modem Into Custom Multi-Chip Module for New HP Laptop
Intel Promises "10nm" Chips by the End of 2019, and More


Original Submission

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 2) by takyon on Wednesday November 07 2018, @04:53AM (1 child)

    by takyon (881) <takyonNO@SPAMsoylentnews.org> on Wednesday November 07 2018, @04:53AM (#758837) Journal

    https://arstechnica.com/gadgets/2018/11/amd-outlines-its-future-7nm-gpus-with-pcie-4-zen-2-zen-3-zen-4/ [arstechnica.com]

    Zen 2 will also address certain weak aspects of the original Zen. For example, the original Zen used 128-bit data paths to handle 256-bit AVX2 operations; each operation was split into two parts and processed sequentially. In workloads using AVX2, this gave Intel, with its native 256-bit implementation, a huge advantage. Zen 2 doubles the floating-point execution units and data paths to be 256-bit, doubling the bandwidth available and greatly improving the performance of this code. For integer workloads, branch prediction and prefetching have been made more accurate, and some caches enlarged.

    Zen 2 will also offer improved hardware protection against some variants of the Spectre attacks.

    Intel may have also played around in order to claim a "3.4x" improvement [arstechnica.com] over the older Epyc with its 48-core Xeon. See update on this article. [tomshardware.com]

    --
    [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
    • (Score: 1, Interesting) by Anonymous Coward on Wednesday November 07 2018, @10:45AM

      by Anonymous Coward on Wednesday November 07 2018, @10:45AM (#758899)

      In workloads using AVX2, this gave Intel, with its native 256-bit implementation, a huge advantage

      From what I can tell ( https://www.phoronix.com/scan.php?page=article&item=6-linux-eoy2017&num=7 [phoronix.com] ), a consumer needs to use a specialized distro and a limited selection of libraries that Intel is optimizing for you to enjoy the benefits. Once we're talking about running a specific piece of software you maintain yourself, you can match Intel's optimized assembly with AMD optimized assembly for 99.999% of the use cases and would, regardless, be better off with GPU compute.

  • (Score: 1, Interesting) by Anonymous Coward on Wednesday November 07 2018, @06:50AM (3 children)

    by Anonymous Coward on Wednesday November 07 2018, @06:50AM (#758858)

    https://www-03.ibm.com/ibm/history/exhibits/mainframe/mainframe_PP4341.html [ibm.com]

    This had 5"x5" 'chips' that housed up to 100 actual chips. Each module had a water cooled jacket and up to a 56 layer "back plane" the chips talked through. It was built to increase the speed and size of 270 family line. It is also one of the "freak-out" techs that Apple and Motorola saw when they three computer worked for 68000 series processor and motherboard.

    Looking for a Scientific American from the late 1970's / early 1980's that talked about the tech in more detail.

    • (Score: 4, Informative) by takyon on Wednesday November 07 2018, @07:05AM (2 children)

      by takyon (881) <takyonNO@SPAMsoylentnews.org> on Wednesday November 07 2018, @07:05AM (#758861) Journal

      AMD's CEO Lisa Su used to be Director of Emerging Products at IBM.

      Ryzen uses simultaneous multithreading [wikipedia.org], which was developed at IBM way back in 1968.

      --
      [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
      • (Score: 0) by Anonymous Coward on Wednesday November 07 2018, @07:23AM (1 child)

        by Anonymous Coward on Wednesday November 07 2018, @07:23AM (#758867)

        Yup everything old is new again.

        Now if INTEL and AMD will get multiple byte (256 to 32k) in a single instruction charged with a single clock "tick". That was nice of earlier IBM ASM. In reality it was microcode running a loop, but it was set up and ran as a single assembly instruction (like MV(255) from,to, compiled to up to 6 total bytes, or 4 bytes if reg to reg pointing). That allowed for very tight code size and was simple to read. When I started, 64kB was the largest a program could compile to, but on some of my early machines (System 3) we had 12kB of total storage.

        • (Score: 2, Funny) by Anonymous Coward on Wednesday November 07 2018, @12:40PM

          by Anonymous Coward on Wednesday November 07 2018, @12:40PM (#758929)

          Yup everything old is new again.

          I wanna be new again [sigh]

  • (Score: 3, Insightful) by Anonymous Coward on Wednesday November 07 2018, @10:58AM

    by Anonymous Coward on Wednesday November 07 2018, @10:58AM (#758904)

    This amd vs intel story is like a textbook argument in favor of competition. Simultaneously, it also shows how a monopoly position naturally leads to stagnation, which in turn opens the door to competition. I think it is also a good case of c-level actual "diversity" (merit based) vs c-level lip service to diversity (identity based). There's just so many interesting angles here.

  • (Score: 0) by Anonymous Coward on Wednesday November 07 2018, @05:30PM

    by Anonymous Coward on Wednesday November 07 2018, @05:30PM (#759061)

    and they won't sell you an epyc chip without their closed source management BS/backdoor.

  • (Score: 3, Funny) by bob_super on Wednesday November 07 2018, @06:59PM

    by bob_super (1357) on Wednesday November 07 2018, @06:59PM (#759101)

    "Move the memory controller inside the chip"
    "Okay, I put it on the die"
    "I want more of them"
    "Okay, but it's gonna cause you to have a huge chip, bad yields"
    "Fine, use multiple chips"
    "Now you get asymmetric latency"
    "Ok, let's pull the controller back off-die"
    "Can I keep it in the chip?"
    "Sure, the chip is the new PCB, as the embedded guys already know"
