
SoylentNews is people

posted by martyb on Tuesday December 03 2019, @11:50PM

Amazon Announces Graviton2 SoC Along With New AWS Instances: 64-Core Arm With Large Performance Uplifts

The new Graviton2 SoC is a custom design by Amazon's own in-house silicon design teams and is a successor to the first-generation Graviton chip. The new chip quadruples the core count from 16 cores to 64 cores and employs Arm's newest Neoverse N1 cores. Amazon is using the highest performance configuration available, with 1MB L2 caches per core, with all 64 cores connected by a mesh fabric supporting 2TB/s aggregate bandwidth as well as integrating 32MB of L3 cache.
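As a rough illustration of what those cache and fabric figures add up to, here is a small Python sketch. It only uses the numbers quoted above; reading "2TB/s" as 2000 GB/s and dividing it evenly across cores are simplifying assumptions, not anything Amazon has stated.

```python
# Figures quoted in the summary above for Graviton2.
CORES = 64
L2_PER_CORE_MB = 1
L3_SHARED_MB = 32
# Assumption: "2TB/s" is treated as 2000 GB/s (decimal units).
MESH_AGGREGATE_GB_S = 2000


def cache_summary(cores, l2_per_core_mb, l3_mb):
    """Return total private L2, shared L3 per core, and combined L2+L3 (all in MB)."""
    total_l2 = cores * l2_per_core_mb
    return {
        "total_l2_mb": total_l2,
        "l3_per_core_mb": l3_mb / cores,
        "total_l2_l3_mb": total_l2 + l3_mb,
    }


summary = cache_summary(CORES, L2_PER_CORE_MB, L3_SHARED_MB)
# Even split of the aggregate mesh bandwidth; real traffic is not this uniform.
mesh_share_gb_s = MESH_AGGREGATE_GB_S / CORES

print(summary)
print(f"mesh bandwidth per core (even split): {mesh_share_gb_s} GB/s")
```

On these numbers the private L2 in total (64 MB) is twice the size of the shared L3, which leaves 0.5 MB of L3 per core.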

Amazon claims the new Graviton2 chip is[sic] can deliver up to 7x higher performance than the first generation based A1 instances in total across all cores, up to 2x the performance per core, and delivers memory access speed of up to 5x compared to its predecessor. The chip comes in at a massive 30B transistors on a 7nm manufacturing node - if Amazon is using similar high density libraries to mobile chips (they have no reason to use HPC libraries), then I estimate the chip to fall around 300-350mm² if I was forced to put out a figure.
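The die-size guess and the scaling claims can be sanity-checked with a little arithmetic. The sketch below is only that: it takes the transistor count, the 300-350mm² estimate and the "up to" multipliers at face value, and the "implied per-core gain at full load" is an inference that assumes the 7x and 2x figures describe comparable workloads, which the announcement does not confirm.

```python
# Figures quoted in the summary above.
TRANSISTORS = 30e9            # "30B transistors"
AREA_RANGE_MM2 = (300, 350)   # the article author's die-size estimate


def density_mtr_per_mm2(transistors, area_mm2):
    """Average density in millions of transistors per square millimetre."""
    return transistors / area_mm2 / 1e6


densities = [density_mtr_per_mm2(TRANSISTORS, area) for area in AREA_RANGE_MM2]

# Scaling claims: 16 -> 64 cores, up to 2x per core, up to 7x in total.
CORE_COUNT_RATIO = 64 / 16
PER_CORE_UPLIFT = 2.0
TOTAL_UPLIFT = 7.0

# Upper bound if every core kept its full 2x gain with all cores busy.
ceiling = CORE_COUNT_RATIO * PER_CORE_UPLIFT
# Per-core gain implied by the 7x total when all cores are loaded.
implied_per_core_at_full_load = TOTAL_UPLIFT / CORE_COUNT_RATIO

print([round(d, 1) for d in densities])
print(ceiling, implied_per_core_at_full_load)
```

The estimate works out to roughly 86 to 100 million transistors per mm², and the 7x total sits just under the 8x ceiling that 4x the cores at 2x each would allow.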

The memory subsystem of the new chip is supported by 8 DDR4-3200 channels with support for hardware AES256 memory encryption. Peripherals of the system are supported by 64 PCIe4 lanes.
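For a sense of scale, the theoretical peak bandwidth implied by those interfaces can be worked out as follows. This is a sketch using the standard DDR4 and PCIe 4.0 signalling parameters (64-bit channels, 16 GT/s per lane with 128b/130b encoding); achievable throughput will be lower, and nothing here is a measured Graviton2 result.

```python
# Memory: 8 channels of DDR4-3200, each channel 64 bits (8 bytes) wide.
DDR_CHANNELS = 8
DDR_TRANSFERS_PER_S = 3200e6
DDR_BYTES_PER_TRANSFER = 8

# PCIe 4.0: 16 GT/s per lane, 128b/130b line encoding.
PCIE_LANES = 64
PCIE_TRANSFERS_PER_S = 16e9
PCIE_ENCODING_EFFICIENCY = 128 / 130


def ddr_peak_gb_s(channels, transfers_per_s, bytes_per_transfer):
    """Theoretical peak DRAM bandwidth in GB/s (decimal)."""
    return channels * transfers_per_s * bytes_per_transfer / 1e9


def pcie_peak_gb_s(lanes, transfers_per_s, efficiency):
    """Theoretical peak PCIe bandwidth per direction in GB/s (decimal)."""
    return lanes * transfers_per_s * efficiency / 8 / 1e9


ddr_peak = ddr_peak_gb_s(DDR_CHANNELS, DDR_TRANSFERS_PER_S, DDR_BYTES_PER_TRANSFER)
pcie_peak = pcie_peak_gb_s(PCIE_LANES, PCIE_TRANSFERS_PER_S, PCIE_ENCODING_EFFICIENCY)

print(f"DDR4 peak: {ddr_peak:.1f} GB/s")
print(f"PCIe 4.0 peak per direction: {pcie_peak:.1f} GB/s")
```

That gives about 204.8 GB/s of peak memory bandwidth and about 126 GB/s of PCIe bandwidth in each direction, before protocol overhead.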


Original Submission

Related Stories

Marvell Announces ThunderX3, an ARM Server CPU With 96 Cores, 384 Threads 10 comments

Marvell Announces ThunderX3: 96 Cores & 384 Thread 3rd Gen Arm Server Processor

The Arm server ecosystem is well alive and thriving, finally getting into serious motion after several years of false-start attempts. Among the original pioneers in this space was Cavium, which went on to be acquired by Marvell in 2018. Among the company's server CPU products is the ThunderX line; while the first generation ThunderX left quite a lot to be desired, the ThunderX2 was the first Arm server silicon that we deemed viable and competitive against Intel and AMD products. Since then, the ecosystem has accelerated quite a lot, and only last week we saw how impressive the new Amazon Graviton2 with the N1 chips ended up. Marvell didn't stop at the ThunderX2, and had big ambitions for its newly acquired CPU division, and today is announcing the new ThunderX3.

The ThunderX3 is a continuation and successor to then-Cavium's custom microarchitecture found in the TX2, adopting a lot of the key characteristics, most notably the capability of 4-way SMT. Adopting a new microarchitecture with higher IPC capabilities, the new TX3 also ups the clock frequencies, and now hosts up to a whopping 96 CPU cores, allowing the chip to scale up to 384 threads in a single socket.

Related: Marvell Technology to Buy Cavium for $6 Billion
ARM "Project Trillium", Cambricon MLU-100, and Cavium ThunderX2
HPE Delivers World's Largest Arm Supercomputer for U.S. Department of Energy
Ampere Launches its First ARM-Based Server Processors in Challenge to Intel
Amazon Announces 64-core Graviton2 Arm CPU
80-Core Arm CPU To Bring Lower Power, Higher Density To A Rack Near You


Original Submission

Ampere Announces Altra ARM CPUs with Up to 80 Cores, Going to 128 Cores by 2021 41 comments

Ampere's Product List: 80 Cores, up to 3.3 GHz at 250 W; 128 Core in Q4

The Ampere Altra range, as part of today's release, will offer parts from 32 cores up to 80 cores, up to 3.3 GHz, with a variety of TDPs up to 250 W. As we've described in our previous news items on the chip, this is an Arm v8.2 core with a few 8.3+8.5 features, offers support for FP16 and INT8, supports 8 channels of DDR4-3200 ECC at 2 DIMMs per channel, and up to 4 TiB of memory per socket in a 1P or 2P configuration. Each CPU will offer 128 PCIe 4.0 lanes, 32 of which can be used for socket-to-socket communications implemented with the CCIX protocol over PCIe. This means 50 GB/s in each direction, and 192 PCIe 4.0 lanes in a dual socket system for add-in cards. Each of the PCIe lanes can bifurcate down to x2.
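The lane and memory figures in that paragraph are consistent with each other, as the short sketch below shows. It uses only the quoted numbers; the per-DIMM capacity is an inference that assumes every slot is filled with identical modules, and the 50 GB/s link figure is taken as given rather than re-derived.

```python
# PCIe lane accounting, from the quoted figures.
LANES_PER_CPU = 128
LANES_FOR_SOCKET_LINK = 32   # lanes each CPU gives up for the CCIX link
SOCKETS = 2

# Lanes left for add-in cards in a dual-socket system.
lanes_for_cards = SOCKETS * (LANES_PER_CPU - LANES_FOR_SOCKET_LINK)

# Memory: 8 channels, 2 DIMMs per channel, up to 4 TiB per socket.
CHANNELS = 8
DIMMS_PER_CHANNEL = 2
MAX_MEMORY_TIB = 4

dimm_slots = CHANNELS * DIMMS_PER_CHANNEL
# Assumption: all slots populated with equal-sized modules.
gib_per_dimm = MAX_MEMORY_TIB * 1024 // dimm_slots

print(f"lanes for add-in cards (2P): {lanes_for_cards}")
print(f"DIMM slots per socket: {dimm_slots}, GiB per DIMM at max capacity: {gib_per_dimm}")
```

So the 192-lane figure is simply 2 x (128 - 32), and reaching 4 TiB per socket would take sixteen 256 GiB modules.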

[...] Previously Ampere had stated they were going for 80 cores at 3.0 GHz at 210 W, however the Q80-33 is pushing that frequency another 300 MHz for another 40 W, and we understand that the tapeout of silicon from TSMC performed better than expected, hence this new top processor.
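To put the frequency-for-power trade in numbers, here is a quick sketch comparing the earlier target with the Q80-33 figures quoted above. It treats TDP as if it were actual power draw, which is an assumption made only for illustration.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


# Earlier target vs. the Q80-33, as quoted above.
OLD_GHZ, OLD_WATTS = 3.0, 210
NEW_GHZ, NEW_WATTS = 3.3, 250

freq_gain_pct = pct_change(OLD_GHZ, NEW_GHZ)
power_gain_pct = pct_change(OLD_WATTS, NEW_WATTS)

# Clock per watt, in MHz/W, as a crude efficiency proxy.
old_mhz_per_watt = OLD_GHZ * 1000 / OLD_WATTS
new_mhz_per_watt = NEW_GHZ * 1000 / NEW_WATTS

print(f"frequency: +{freq_gain_pct:.1f}%, power: +{power_gain_pct:.1f}%")
print(f"MHz per watt: {old_mhz_per_watt:.1f} -> {new_mhz_per_watt:.1f}")
```

The extra 300 MHz is a 10% clock increase bought with about 19% more TDP, which is the usual shape of the top end of a frequency/voltage curve.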

[...] If that wasn't enough, Ampere dropped a sizeable nugget into our pre-announcement briefing. The company is set to launch a 128-core version of Altra later this year.

This will be a new silicon design, beyond Ampere's initial layout of 80 cores for Altra, however Ampere states that while they are using the same platform as the regular Altra, they have done extensive tweaking and optimizations within the mesh interconnect for Altra Max to hide the additional contention that might occur when using the same main memory speeds.

Altra Max will be socket and pin-compatible with Altra, also support dual socket deployments, and Ampere states that the silicon will be ready for early sampling with partners in Q4, and is looking to move into high volume in mid-2021.

Previously: Ampere Launches its First ARM-Based Server Processors in Challenge to Intel
80-Core Arm CPU To Bring Lower Power, Higher Density To A Rack Near You

Related: Amazon Announces 64-core Graviton2 Arm CPU
Marvell Announces ThunderX3, an ARM Server CPU With 96 Cores, 384 Threads
AMD and Intel Have a Formidable New Foe (Amazon)


Original Submission

ARM Announces Neoverse V1 and N2 Cores 7 comments

Arm Announces Neoverse V1 & N2 Infrastructure CPUs: +50% IPC, SVE Server Cores

Amazon's Graviton2 64-core Neoverse N1 server chip is the first of what should become a wider range of designs that will be driving the Arm server ecosystem forward and actively assaulting the infrastructure CPU market share that's currently dominated by the x86 players such as Intel and AMD.

[...] Today, we're ready to take the next step towards the next generation of the Neoverse platform, not only revealing the CPU microarchitecture previously known as Zeus, but a whole new product category that goes beyond the Neoverse N-series: Introducing the new Neoverse V-series and the Neoverse V1 (Zeus), as well as a new roadmap insertion in the form of the Neoverse N2 (Perseus).

[...] In terms of generational performance uplift, it's akin to Arm throwing down the gauntlet to the competition, achieving a ground-breaking +50[sic, % obvs] IPC boost compared to Neoverse N1 that we're seeing in silicon today. The performance uplift potential here is tremendous, as this is merely a same-process ISO-frequency upgrade, and actual products based on the V1 will in all likelihood also see additional performance gains thanks to increased frequencies through process node advancements.

If we take the conservatively clocked Graviton2 with its 2.5GHz N1 cores as a baseline, a theoretical 3GHz V1 chip would represent an 80% uplift in per-core single-threaded performance. Not only would such a performance uptick vastly exceed any current x86 competition in the server space in terms of per-core performance, it would be enough to match the current best high-performance desktop chips from AMD and Intel today (Though we have to remember it'll compete against next-gen Zen3 Milan and Sapphire Rapids products).

[...] Alongside the Neoverse V1 platform, we've seen a roadmap insertion that previously wasn't there. The Perseus design will become the Neoverse N2, and will be the effective product-positioning successor to the N1. This new CPU IP represents a 40% IPC uplift compared to the N1, however still maintains the same design philosophy of maximising performance within the lowest power and smallest area.
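The uplift figures quoted above follow from treating single-threaded performance as IPC times clock frequency. The sketch below reproduces them under that simplification; the 3GHz V1 clock is the article's hypothetical, and running the N2 at the same 2.5GHz as the baseline is an assumption made here just to isolate the IPC effect.

```python
def per_core_uplift(ipc_ratio, base_ghz, new_ghz):
    """Relative single-threaded performance, modelled as IPC ratio x clock ratio."""
    return ipc_ratio * (new_ghz / base_ghz)


BASELINE_GHZ = 2.5  # Graviton2's N1 cores, as quoted above

# Neoverse V1: +50% IPC, hypothetical 3GHz part.
v1_uplift = per_core_uplift(1.5, BASELINE_GHZ, 3.0)
# Neoverse N2: +40% IPC, assumed here to run at the baseline clock.
n2_uplift = per_core_uplift(1.4, BASELINE_GHZ, BASELINE_GHZ)

print(f"V1 at 3GHz vs N1 at 2.5GHz: +{v1_uplift - 1:.0%}")
print(f"N2 at 2.5GHz vs N1 at 2.5GHz: +{n2_uplift - 1:.0%}")
```

This is where the 80% figure comes from: 1.5 x (3.0 / 2.5) = 1.8. Real workloads rarely scale perfectly with clock speed, so it is best read as an upper-end estimate.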

Neoverse V1 is basically the server-oriented equivalent of the Cortex-X1 core, where performance is prioritized at the cost of less power efficiency and a greater die area (more cache, etc.). Neoverse N2 is more like (an unannounced successor of) Cortex-A78.

Also at TechPowerUp.

Related: Amazon Announces 64-core Graviton2 Arm CPU


Original Submission

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 2, Insightful) by Snotnose on Wednesday December 04 2019, @12:45AM (8 children)

    by Snotnose (1623) on Wednesday December 04 2019, @12:45AM (#927911)

    Amazon is making server chips. They don't need floating point, that's a big chunk of silicon. They don't need default graphics for those cheapskates that are trying to avoid buying a $500 graphics card, that's a huge chunk of silicon. Their cores don't need to talk to each other so much, so they are saving a hell of a lot on multi-core interconnects. They don't care so much about other threads sniffing cryptographic keys, so instruction pipelines et al. can go full speed.

    Not having a problem seeing how Amazon's custom built chips can give Intel's general purpose chips a good ass whipping. In fact, it's pretty much a textbook case of why you make custom silicon.

    --
    When the dust settled America realized it was saved by a porn star.
    • (Score: 5, Informative) by takyon on Wednesday December 04 2019, @01:39AM

      by takyon (881) <takyonNO@SPAMsoylentnews.org> on Wednesday December 04 2019, @01:39AM (#927929) Journal

      https://aws.amazon.com/about-aws/whats-new/2019/12/announcing-new-amazon-ec2-m6g-c6g-and-r6g-instances-powered-by-next-generation-arm-based-aws-graviton2-processors/ [amazon.com]

      AWS Graviton2 processors deliver several performance optimizations over the first generation AWS Graviton processors such as 7x performance, 4x the number of compute cores, 2x larger private caches per core, 5x faster memory, and 2x faster floating-point performance per core.

      --
      [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
    • (Score: 5, Informative) by sgleysti on Wednesday December 04 2019, @01:42AM (1 child)

      by sgleysti (56) Subscriber Badge on Wednesday December 04 2019, @01:42AM (#927932)

      Their cores don't need to talk to each other so much, so they are saving a hell of a lot on multi-core interconnects.

      vs.

      ...with all 64 cores connected by a mesh fabric supporting 2TB/s aggregate bandwidth...

    • (Score: 1, Touché) by Anonymous Coward on Wednesday December 04 2019, @01:46AM

      by Anonymous Coward on Wednesday December 04 2019, @01:46AM (#927935)

      Almost everything in this post is wrong. I'll add to the others that have been pointed out and note that giving an ass whipping to Intel's chips is not interesting. AMD Epyc CPUs are the benchmark now.

    • (Score: 0) by Anonymous Coward on Wednesday December 04 2019, @04:11AM (1 child)

      by Anonymous Coward on Wednesday December 04 2019, @04:11AM (#927982)

      They don't care so much about other threads sniffing cryptographic keys

      Amazon server chips go into AWS, where everybody's code runs, including amazon's - so they care.

    • (Score: 2) by takyon on Friday December 06 2019, @12:10AM

      by takyon (881) <takyonNO@SPAMsoylentnews.org> on Friday December 06 2019, @12:10AM (#928671) Journal

      Any follow-up on these guesses?

      --
      [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
    • (Score: 2) by Snotnose on Friday December 06 2019, @12:40AM

      by Snotnose (1623) on Friday December 06 2019, @12:40AM (#928680)

      The dozen or so followups said exactly how I was wrong in this post. Yet I'm still insightful?

      sigh

      Those followups reminded me I retired 10 years ago and I'm not exactly up to date anymore. Which is sad, I bought a TRS-80 in '78 and grew up with personal computing, making my living with embedded software.

      I'm old. At least I still have a warm lap for a cat to curl up in.

      --
      When the dust settled America realized it was saved by a porn star.
  • (Score: 3, Interesting) by Rosco P. Coltrane on Wednesday December 04 2019, @02:53AM (4 children)

    by Rosco P. Coltrane (4757) on Wednesday December 04 2019, @02:53AM (#927952)

    What undocumented data collection features are baked into the silicon?

    • (Score: 2) by takyon on Wednesday December 04 2019, @08:59AM

      by takyon (881) <takyonNO@SPAMsoylentnews.org> on Wednesday December 04 2019, @08:59AM (#928028) Journal

      These chips are running at Amazon... you don't even have physical access.

      --
      [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
    • (Score: 0) by Anonymous Coward on Wednesday December 04 2019, @01:26PM (2 children)

      by Anonymous Coward on Wednesday December 04 2019, @01:26PM (#928093)

      1-Click with new and improved cyber blockchain AI, is now baked into the silicon. Every time you click your mouse, type on your keyboard, or tap on a mobile device, a portion of your money is transferred directly from your bank account or cryptowallet directly into Amazon's.

      • (Score: 0) by Anonymous Coward on Thursday December 05 2019, @02:20AM (1 child)

        by Anonymous Coward on Thursday December 05 2019, @02:20AM (#928317)

        cyber blockchain AI, is now baked into the silicon

        Source? This is what I have been looking for but do not see it in TFA.

        • (Score: 2) by Walzmyn on Friday December 06 2019, @02:36PM

          by Walzmyn (987) on Friday December 06 2019, @02:36PM (#928840)

          [insert gif of joke flying over stick figure]
