Arm Announces Mobile Armv9 CPU Microarchitectures: Cortex-X2, Cortex-A710 & Cortex-A510
It's that time of the year again, and after last month's unveiling of Arm's newest infrastructure Neoverse V1 and Neoverse N2 CPU IPs, it's now time to cover the client and mobile side of things. This year, Arm is shaking things up quite a bit more than usual as we're seeing three new-generation microarchitectures for mobile and client: The flagship Cortex-X2 core, a new A78 successor in the form of the Cortex-A710, and for the first time in years, a brand-new little core with the new Cortex-A510. The three new CPUs form a new trio of Armv9 compatible designs that aim to mark a larger architectural/ISA shift that comes very seldom in the industry.
Alongside the new CPU cores, we're also seeing a new L3 and cluster design with the DSU-110, and Arm is also making a big upgrade in its interconnect IP with the new cache coherent CI-700 mesh network and NI-700 network-on-chip IPs.
The Cortex-X2, A710 and A510 follow up on last year's X1, A78 and A55. For the new Cortex-X2 and A710 in particular, these are direct microarchitectural successors to their predecessors. These parts, while iterating on generational improvements in IPC and efficiency, also incorporate brand-new architectural features in the form of Armv9 and new extensions such as SVE2.
The Cortex-X2 is a large, power-hungry core. Arm claims 16% more integer performance than its predecessor, in a comparison that gives the X2 double the L3 cache (8 MB, versus 4 MB for the Cortex-X1). The improvement may not be realized in next year's smartphones due to thermal issues.
The Cortex-A710 can improve performance by 10% at the same power usage, or use 30% less power than the Cortex-A78 while delivering the same performance. These figures may depend on the L3 cache, since Arm compares an A710 with 8 MB to an A78 with 4 MB, and SoC designers may choose to stick with 4 MB. The A710 will be the only core of the trio to retain 32-bit (AArch32) support, in order to give the Chinese market more time to shift to 64-bit only applications, because it "lacks the homogeneous ecosystem capabilities of the global Play Store markets".
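As a rough illustration of what the two A710 claims imply for performance per watt, here is a minimal sketch. It assumes the figures can be read as simple ratios against the A78 (1.10x performance at equal power, 0.70x power at equal performance); the function name is made up for this example, and the numbers carry the same 8 MB vs. 4 MB L3 caveat as Arm's comparison.

```python
def perf_per_watt_gain(perf_ratio: float, power_ratio: float) -> float:
    """Fractional change in performance per watt, given new/old ratios."""
    return perf_ratio / power_ratio - 1.0

# Claim 1: 10% more performance at the same power as the A78.
iso_power = perf_per_watt_gain(perf_ratio=1.10, power_ratio=1.00)

# Claim 2: the same performance as the A78 at 30% less power.
iso_perf = perf_per_watt_gain(perf_ratio=1.00, power_ratio=0.70)

print(f"iso-power perf/W gain: {iso_power:.1%}")  # 10.0%
print(f"iso-perf  perf/W gain: {iso_perf:.1%}")   # 42.9%
```

The two results differ because they describe different points on the power/performance curve, so neither should be read as a single efficiency figure for the core.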
The Cortex-A510 is Arm's long-awaited update to the Cortex-A55, which was launched in 2017. It employs a "merged-core architecture", similar to AMD's maligned Bulldozer microarchitecture, except only the FP/SIMD back-end and L2 cache are shared between core pairs. Using pairs of merged A510 cores in a design is actually optional, but would be expected due to the smaller die size it can achieve. Arm's graph comparing performance and power usage for the A510 and A55 (again with 8 MB vs. 4 MB L3 cache) shows that performance and efficiency are nearly identical until the cores reach higher frequencies, where the A510 pulls ahead, using 20% less power or delivering 10% more performance at certain points on the curve.
See also: Arm Announces Armv9 Architecture: SVE2, Security, and the Next Decade
Previously: ARM Cortex-A75, Cortex-A55, and Mali-G72 Announced
ARM Announces Cortex-A78 and Cortex-X1
Related Stories
ARM has announced two new CPU cores, the Cortex-A75 and Cortex-A55. According to ARM, the A75 increases performance by around 22% over the A73 at the same level of power consumption. It can also scale to use more power per core (1-2 W rather than 0.75 W) which could slightly improve the performance of ARM laptops and tablets.
The smaller core, the Cortex-A55, increases performance by around 18% compared to the Cortex-A53, but also increases power consumption by 3%. Thus, power efficiency is about 14-15% better than the A53.
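The 14-15% efficiency figure follows from dividing the performance ratio by the power ratio. A minimal sketch of that arithmetic, assuming the 18% and 3% figures are simple multiplicative changes relative to the A53 (the function name is made up for this example):

```python
def efficiency_gain(perf_gain: float, power_gain: float) -> float:
    """Fractional change in performance per watt.

    Both arguments are fractional changes, e.g. 0.18 for +18%.
    """
    return (1.0 + perf_gain) / (1.0 + power_gain) - 1.0

# Cortex-A55 vs. Cortex-A53: +18% performance, +3% power.
a55_vs_a53 = efficiency_gain(perf_gain=0.18, power_gain=0.03)
print(f"{a55_vs_a53:.1%}")  # 14.6%
```

Simply subtracting the percentages (18 - 3 = 15) gives a similar answer here only because the power increase is small.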
ARM's successor to big.LITTLE, DynamIQ, allows for up to 8 cores of any size (which for now means either the A75 or A55) inside of a single cluster. This means that a configuration including 1x Cortex-A75 and 7x Cortex-A55 cores would be possible, or even optimal according to ARM.
ARM also announced its Mali-G72 GPU, an incremental upgrade to the Mali-G71:
ARM says that the Mali-G72 will see a 25 percent boost to energy efficiency compared with the G71, meaning that SoC designers will have more power to play with to boost performance or increase battery life.
Similarly, the G72 offers 20 percent better performance density, meaning that manufacturers can pack more GPU cores into the same die area as before, giving further potential for a performance boost without an increase in cost. Previously ARM was targeting 16 to 20 Mali-G71 cores as the optimum for mobile, and expects to see the number push closer to the 32 shader core maximum supported by the G72 this time around.
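To put the density claim in concrete terms, here is a minimal sketch. It assumes that "20 percent better performance density" means about 20% more cores of equal per-core performance fit in the same die area, which is only one possible reading of the quote. The helper name is made up for this example, and the 32-core cap is the maximum quoted above.

```python
G72_MAX_CORES = 32  # maximum shader core count quoted for the Mali-G72

def cores_in_same_area(old_cores: int, density_gain_pct: int) -> int:
    """Whole cores that fit in the area previously used by `old_cores`.

    Integer arithmetic avoids floating-point rounding surprises.
    """
    scaled = old_cores * (100 + density_gain_pct) // 100
    return min(scaled, G72_MAX_CORES)

# Arm's previous 16-20 core target for the Mali-G71, scaled by 20%.
for g71_cores in (16, 20):
    print(g71_cores, "->", cores_in_same_area(g71_cores, 20))
# 16 -> 19
# 20 -> 24
```

Under this reading, the same die area would hold roughly 19 to 24 cores, still well short of the 32-core ceiling.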
Arm's New Cortex-A78 and Cortex-X1 Microarchitectures: An Efficiency and Performance Divergence
Today for Arm's 2020 TechDay announcements, the company is not just releasing a single new CPU microarchitecture, but two. The long-expected Cortex-A78 is indeed finally making an appearance, but Arm is also introducing its new Cortex-X1 CPU as the company's new flagship performance design. The move is not only surprising, but marks an extremely important divergence in Arm's business model and design methodology, finally addressing some of the company's years-long product line compromises.
[...] The new Cortex-A78 pretty much continues Arm's traditional design philosophy, that being that it's built with a stringent focus on a balance between performance, power, and area (PPA). PPA is the name of the game for the wider industry, and here Arm is pretty much the leading player on the scene, having been able to provide extremely competitive performance with low power consumption and small die areas. These design targets are the bread & butter of Arm as the company has an incredible range of customers who aim for very different product use-cases – some favoring performance while some others have cost as their top priority.
All in all (we'll get into the details later), the Cortex-A78 promises a 20% improvement in sustained performance under an identical power envelope. This figure is meant to be a product performance projection, combining the microarchitecture's improvements as well as the upcoming 5nm node advancements. The IP should represent a pretty straightforward successor to the already big jump that were the A76 and A77.
[...] The Cortex-X1 was designed within the frame of a new program at Arm, which the company calls the "Cortex-X Custom Program". The program is an evolution of what the company had previously already done with the "Built on Arm Cortex Technology" program released a few years ago. As a reminder, that license allowed customers to collaborate early in the design phase of a new microarchitecture, and request customizations to the configurations, such as a larger re-order buffer (ROB), differently tuned prefetchers, or interface customizations for better integration into the SoC designs. Qualcomm was the predominant beneficiary of this license, fully taking advantage of the core re-branding options.
[...] At the end of the day, what we're getting are two different microarchitectures – both designed by the same team, and both sharing the same fundamental design blocks – but with the A78 focusing on maximizing the PPA metric and having a big focus on efficiency, while the new Cortex-X1 is able to maximize performance, even if that means compromising on higher power usage or a larger die area.
While Cortex-A78 will only improve performance by around 7% from microarchitectural changes alone, Cortex-X1 will improve performance by up to 30% due to a wider design, doubling of most cache sizes, and other changes. Cortex-X1 cores are also expected to reach 3 GHz on a "5nm" node, delivering even more performance. The Cortex-X1 cores could use up to 50-100% more power than Cortex-A77/A78. Cores could be arranged in a 1+3+4 or 2+2+4 setup of Cortex-X1, Cortex-A78, and Cortex-A55 cores.
See also: Arm Announces The Mali-G78: Evolution to 24 Cores
(Score: 3, Informative) by takyon on Wednesday May 26 2021, @12:44AM
Arm Announces New Mali-G710, G610, G510 & G310 Mobile GPU Families [anandtech.com]
Arm freezes hiring until Nvidia takeover, cancels everyone's 'wellbeing' allowance [theregister.com]
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 1, Funny) by Anonymous Coward on Wednesday May 26 2021, @12:50AM (1 child)
staring into the lights of the oncoming RISC V
(Score: 2) by takyon on Wednesday May 26 2021, @01:03AM
Not anytime soon. People are getting excited over slow single-core RISC-V SBCs.
That said, the more you look at these new Arm cores, the worse it gets.
(Score: 0) by Anonymous Coward on Wednesday May 26 2021, @10:35AM
From the article: