AnandTech compared Intel's Skylake-SP chips to AMD's Epyc chips:
We can continue to talk about Intel's excellent mesh topology and AMD's strong new Zen architecture, but at the end of the day, the "how" will not matter to infrastructure professionals. Depending on your situation, performance, performance-per-watt, and/or performance-per-dollar are what matters.
The current Intel pricing draws the first line. If performance-per-dollar matters to you, AMD's EPYC pricing is very competitive for a wide range of software applications. With the exception of database software and vectorizable HPC code, AMD's EPYC 7601 ($4200) offers slightly less or slightly better performance than Intel's Xeon 8176 ($8000+). However the real competitor is probably the Xeon 8160, which has 4 (-14%) fewer cores and slightly lower turbo clocks (-100 or -200 MHz). We expect that this CPU will likely offer 15% lower performance, and yet it still costs about $500 more ($4700) than the best EPYC. Of course, everything will depend on the final server system price, but it looks like AMD's new EPYC will put some serious performance-per-dollar pressure on the Intel line.
The Intel chip is indeed able to scale up in 8-socket systems, but frankly that market is shrinking fast, and dual-socket buyers could not care less.
Meanwhile, although we have yet to test it, AMD's single socket offering looks even more attractive. We estimate that a single EPYC 7551P would indeed outperform many of the dual Silver Xeon solutions. Overall the single-socket EPYC gives you about 8 cores more at similar clockspeeds than the 2P Intel, and AMD doesn't require explicit cross socket communication - the server board gets simpler and thus cheaper. For price conscious server buyers, this is an excellent option.
However, if your software is expensive, everything changes. In that case, you care less about the heavy price tags of the Platinum Xeons. For those scenarios, Intel's Skylake-EP Xeons deliver the highest single threaded performance (courtesy of the 3.8 GHz turbo clock), high throughput without much (hardware) tuning, and server managers get the reassurance of Intel's reliable track record. And if you use expensive HPC software, you will probably get the benefits of Intel's beefy AVX 2.0 and/or AVX-512 implementations.
AMD's flagship Epyc CPU has 32 cores, while the largest Skylake-SP Xeon CPU has 28 cores.
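The performance-per-dollar argument in the quoted comparison can be made concrete with a rough sketch. The prices are the ones quoted above; the relative performance figures are assumptions read off the quote (the EPYC 7601 treated as on par with the Xeon 8176, the Xeon 8160 as 15% slower), not benchmark results.

```python
# Rough performance-per-dollar sketch based on the quoted figures.
# ASSUMPTIONS: rel_perf values are illustrative, not measured. The EPYC 7601
# is treated as equal to the Xeon 8176 (baseline 1.0) and the Xeon 8160 as
# 15% slower. Prices are the quoted CPU prices ("$8000+" taken as 8000).
chips = {
    "EPYC 7601": {"price": 4200, "rel_perf": 1.00},
    "Xeon 8176": {"price": 8000, "rel_perf": 1.00},
    "Xeon 8160": {"price": 4700, "rel_perf": 0.85},
}


def perf_per_kilodollar(chip):
    """Relative performance delivered per $1,000 of CPU price."""
    return chip["rel_perf"] / chip["price"] * 1000


# Sort from best to worst value under the assumptions above.
ranking = sorted(
    chips, key=lambda name: perf_per_kilodollar(chips[name]), reverse=True
)

for name in ranking:
    print(f"{name}: {perf_per_kilodollar(chips[name]):.3f} per $1,000")
```

As the quote itself notes, the final server system price matters more than the CPU price alone, so this only illustrates the direction of the argument.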
https://www.hpcwire.com/2017/02/27/google-gets-first-dibs-new-skylake-chips/
As part of an ongoing effort to differentiate its public cloud services, Google made good this week on its intention to bring custom Xeon Skylake chips from Intel Corp. to its Google Compute Engine. The cloud provider is the first to offer the next-gen Xeons, and is getting access ahead of traditional server-makers like Dell and HPE.
Google announced plans to incorporate the next-generation Intel server chips into its public cloud last November. On Friday (Feb. 24), Urs Hölzle, Google's senior vice president for cloud infrastructure, said the Skylake upgrade would deliver a significant performance boost for demanding applications and workloads ranging from genomic research to machine learning.
The cloud vendor noted that Skylake includes Intel Advanced Vector Extensions (AVX-512) that target workloads such as data analytics, engineering simulations and scientific modeling. When compared to previous generations, the Skylake extensions are touted as doubling floating-point performance "for the heaviest calculations," Hölzle noted in a blog post.
Recently, Intel was rumored to be releasing 10 and 12 core "Core i9" CPUs to compete with AMD's 10-16 core "Threadripper" CPUs. Now, Intel has confirmed these as well as 14, 16, and 18 core Skylake-X CPUs. Every CPU with 6 or more cores appears to support quad-channel DDR4:
| Intel Core | Cores/Threads | Price | $/core |
|---|---|---|---|
| i9-7980XE | 18/36 | $1,999 | $111 |
| i9-7960X | 16/32 | $1,699 | $106 |
| i9-7940X | 14/28 | $1,399 | $100 |
| i9-7920X | 12/24 | $1,199 | $100 |
| i9-7900X | 10/20 | $999 | $100 |
| i7-7820X | 8/16 | $599 | $75 |
| i7-7800X | 6/12 | $389 | $65 |
| i7-7740X | 4/8 | $339 | $85 |
| i5-7640X | 4/4 | $242 | $61 (fewer threads) |
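The $/core column is simply the list price divided by the core count, rounded to the nearest dollar. A minimal sketch that reproduces it for a few rows of the table (the variable and function names are illustrative):

```python
# A few rows from the table above: (model, cores, list price in USD).
lineup = [
    ("i9-7980XE", 18, 1999),
    ("i9-7960X", 16, 1699),
    ("i9-7900X", 10, 999),
    ("i7-7820X", 8, 599),
    ("i7-7800X", 6, 389),
]


def price_per_core(cores, price):
    """List price per core, rounded half up to a whole dollar.

    int(x + 0.5) is used instead of round() because Python's round()
    rounds exact halves to the nearest even number.
    """
    return int(price / cores + 0.5)


per_core = {model: price_per_core(cores, price) for model, cores, price in lineup}

for model, dollars in per_core.items():
    print(f"{model}: ${dollars}/core")
```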
Last year at Computex, the flagship Broadwell-E enthusiast chip was launched: the 10-core i7-6950X at $1,723. Today at Computex, the 10-core i9-7900X costs $999, and the 16-core i9-7960X costs $1,699. Clearly, AMD's Ryzen CPUs have forced Intel to become competitive.
Although the pricing of AMD's 10-16 core Threadripper CPUs is not known yet, the 8-core Ryzen R7 launched at $500 (available now for about $460). The Intel i7-7820X has 8 cores for $599, and will likely have better single-threaded performance than the AMD equivalent. So while Intel's CPUs are still more expensive than AMD's, they may have similar price/performance.
For what it's worth, Intel also announced quad-core Kaby Lake-X processors.
Welcome to the post-quad-core era. Will you be getting any of these chips?
AMD has launched its Ryzen-based take on x86 server processors to compete with Intel's Xeon CPUs. All of the Epyc 7000-series CPUs support 128 PCIe 3.0 lanes and 8 channels (2 DIMMs per channel) of DDR4-2666 DRAM:
A few weeks ago AMD announced the naming of the new line of enterprise-class processors, called EPYC, and today marks the official launch with configurations up to 32 cores and 64 threads per processor. We also got an insight into several features of the design, including the AMD Infinity Fabric.
Today's announcement of the AMD EPYC product line sees the launch of the top four CPUs, focused primarily at dual socket systems. The full EPYC stack will contain twelve processors, with three for single socket environments, with the rest of the stack being made available at the end of July. It is worth taking a few minutes to look at how these processors look under the hood.
On the package are four silicon dies, each one containing the same 8-core silicon we saw in the AMD Ryzen processors. Each silicon die has two core complexes, each of four cores, and supports two memory channels, giving a total maximum of 32 cores and 8 memory channels on an EPYC processor. The dies are connected by AMD's newest interconnect, the Infinity Fabric, which plays a key role not only in die-to-die communication but also processor-to-processor communication and within AMD's new Vega graphics. AMD designed the Infinity Fabric to be modular and scalable in order to support large GPUs and CPUs in the roadmap going forward, and states that within a single package the fabric is overprovisioned to minimize any issues with non-NUMA aware software (more on this later).
With a total of 8 memory channels, and support for 2 DIMMs per channel, AMD is quoting a 2TB per socket maximum memory support, scaling up to 4TB per system in a dual processor system. Each CPU will support 128 PCIe 3.0 lanes, suitable for six GPUs with full bandwidth support (plus IO) or up to 32 NVMe drives for storage. All the PCIe lanes can be used for IO devices, such as SATA drives or network ports, or as Infinity Fabric connections to other devices. There are also 4 IO hubs per processor for additional storage support.
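The headline numbers in the excerpt follow from the package layout it describes. The sketch below works through the arithmetic; the x16 link per GPU and x4 link per NVMe drive are common widths assumed here, and the 128 GB module size is derived from the quoted 2TB figure rather than stated in the article.

```python
# Package layout as described in the excerpt.
DIES_PER_PACKAGE = 4
CCX_PER_DIE = 2          # core complexes per die
CORES_PER_CCX = 4
THREADS_PER_CORE = 2     # 32 cores / 64 threads per processor
CHANNELS_PER_DIE = 2
DIMMS_PER_CHANNEL = 2
PCIE_LANES = 128
MAX_MEMORY_GB = 2 * 1024  # quoted 2TB per socket

cores = DIES_PER_PACKAGE * CCX_PER_DIE * CORES_PER_CCX
threads = cores * THREADS_PER_CORE
memory_channels = DIES_PER_PACKAGE * CHANNELS_PER_DIE
dimm_slots = memory_channels * DIMMS_PER_CHANNEL

# Derived, not stated in the article: module size needed to reach 2TB.
gb_per_dimm = MAX_MEMORY_GB // dimm_slots

# ASSUMPTION: each GPU uses a x16 link and each NVMe drive a x4 link.
LANES_PER_GPU = 16
LANES_PER_NVME = 4
lanes_left_after_six_gpus = PCIE_LANES - 6 * LANES_PER_GPU
max_nvme_drives = PCIE_LANES // LANES_PER_NVME

print(f"{cores} cores / {threads} threads")
print(f"{memory_channels} memory channels, {dimm_slots} DIMM slots")
print(f"{gb_per_dimm} GB per DIMM to reach 2TB per socket")
print(f"{lanes_left_after_six_gpus} lanes left for IO after six GPUs")
print(f"up to {max_nvme_drives} NVMe drives")
```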
AMD's slides at Ars Technica.
Arthur T Knackerbracket has found the following story:
During April and May, Intel started updating processor documentation with a new errata note, and over the weekend we learned why: Skylake and Kaby Lake silicon has a microcode bug.
The errata is described in detail on the Debian mailing list, and affects Skylake and Kaby Lake Intel Core processors (in desktop, high-end desktop, embedded and mobile platforms), Xeon v5 and v6 server processors, and some Pentium models.
The Debian advisory says affected users need to disable hyper-threading "immediately" in their BIOS or UEFI settings, because the processors can "dangerously misbehave when hyper-threading is enabled."
Symptoms can include "application and system misbehaviour, data corruption, and data loss".
Henrique de Moraes Holschuh, who authored the Debian post, notes that all operating systems, not only Linux, are subject to the bug.
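On Linux, one way to see whether hyper-threading is currently active is to compare the "siblings" and "cpu cores" fields in /proc/cpuinfo: more siblings than cores means each core is exposing more than one hardware thread. The sketch below is not taken from the Debian advisory; the helper names and the sample text are hypothetical, and it does not check whether a given CPU model is actually affected by the erratum.

```python
def parse_first_cpu(cpuinfo_text):
    """Return the key/value fields of the first processor entry."""
    fields = {}
    for line in cpuinfo_text.splitlines():
        if not line.strip():
            if fields:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def hyperthreading_active(cpuinfo_text):
    """True if the first CPU reports more hardware threads than cores."""
    fields = parse_first_cpu(cpuinfo_text)
    siblings = int(fields.get("siblings", 1))
    cores = int(fields.get("cpu cores", 1))
    return siblings > cores


# Hypothetical sample in /proc/cpuinfo format (4 cores, 8 threads).
SAMPLE = """\
processor\t: 0
model name\t: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
siblings\t: 8
cpu cores\t: 4
"""

print("Hyper-threading active:", hyperthreading_active(SAMPLE))

# On a real Linux system, pass in the file contents instead:
#   with open("/proc/cpuinfo") as f:
#       print(hyperthreading_active(f.read()))
```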
Also at Tom's Hardware and Ars Technica.
Upcoming Intel processors will support scalable AVX-512 instructions, which one former Intel employee calls a "hidden gem":
Imagine if we could use vector processing on something other than just floating point problems. Today, GPUs and CPUs work tirelessly to accelerate algorithms based on floating point (FP) numbers. Algorithms can definitely benefit from basing their mathematics on bits and integers (bytes, words) if we could just accelerate them too. FPGAs can do this, but the hardware and software costs remain very high. GPUs aren't designed to operate on non-FP data. Intel AVX introduced some support, and now Intel AVX-512 is bringing a great deal of flexibility to processors. I will share why I'm convinced that the "AVX512VL" capability in particular is a hidden gem that will let AVX-512 be much more useful for compilers and developers alike.
Fortunately for software developers, Intel has done a poor job keeping the "secret" that AVX-512 is coming to Intel's recently announced Xeon Scalable processor line very soon. Amazon Web Services has publicly touted AVX-512 on Skylake as coming soon!
It is timely to examine the new AVX-512 capabilities and their ability to impact beyond the more regular HPC needs for floating point only workloads. The hidden gem in all this, which enables shifting to AVX-512 more easily, is the "VL" (vector length) extensions which allow AVX-512 instructions to behave like SSE or AVX/AVX2 instructions when that suits us. This is a clever and powerful addition to enable its adoption in a wider assortment of software more quickly. The VL extensions mean that programmers (and compilers) do not need to shift immediately from 256-bits (AVX/AVX2) to 512-bits to use the new bit/byte/word manipulations. This transitional benefit is useful not only for an interim, but also for applications which find 256-bits more natural (perhaps a small, but important, subset of problems).
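AVX-512 is a family of subsets rather than a single feature, and the VL extensions are reported separately from the AVX-512 foundation instructions. On Linux the subsets appear as flags such as avx512f and avx512vl in /proc/cpuinfo. A minimal sketch for listing them follows; the function name and sample flags line are hypothetical.

```python
def avx512_features(cpuinfo_text):
    """Return the sorted AVX-512 feature flags from /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Only the first "flags" line is read; cores normally match.
            return sorted(f for f in value.split() if f.startswith("avx512"))
    return []


# Hypothetical flags line in /proc/cpuinfo format, heavily shortened.
SAMPLE_FLAGS = (
    "flags\t\t: fpu sse2 avx avx2 avx512f avx512dq avx512cd avx512bw avx512vl"
)

features = avx512_features(SAMPLE_FLAGS)
print("AVX-512 subsets:", ", ".join(features) or "none")
print("Vector length extensions (VL):", "avx512vl" in features)

# On a real Linux system, pass in the file contents instead:
#   with open("/proc/cpuinfo") as f:
#       print(avx512_features(f.read()))
```

A CPU that lists avx512f without avx512vl would only offer the new instructions at the full 512-bit width, which is the limitation the excerpt says VL removes.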
Will it be enough to stave off "Epyc"?