SoylentNews is people

posted by Fnord666 on Wednesday November 04 2020, @06:03PM

Arm Cortex-A78C core supports up to 8 cores per cluster, 8MB L3 cache for always-on laptops

The Arm Cortex-A78 CPU core was first introduced in May 2020 with a focus on mobile devices like smartphones, and was followed in September by the Cortex-A78AE for automotive and industrial embedded applications.

The company has now introduced a new variant, the Arm Cortex-A78C, which supports up to eight cores per cluster, a larger L3 cache of up to 8MB for higher performance, and advanced security features, all designed for always-on laptops and other "on-the-go" devices.

[...] All those improvements will provide increased performance in laptops, likely at the cost of higher power consumption, but considering Arm laptops often get over 20 hours of battery life, it may be a worthwhile compromise to lose a couple of hours of battery life for higher performance.

This is being seen as a reaction to Apple's custom ARM SoCs for Macs, which are expected to be announced within a week. A successor to the Qualcomm Snapdragon 8cx could use 8 "big" cores.

Also at Wccftech.

Previously: ARM Announces Cortex-A78 and Cortex-X1



 
  • (Score: 3, Interesting) by takyon (881) on Thursday November 05 2020, @09:01PM (#1073541)

    This is from today's AnandTech 5950X and 5900X review:

    Page 3 [anandtech.com]

    Being an x86 core, one of the difficulties of the ISA is the fact that instructions are of a variable length, with encoding varying from 1 byte to 15 bytes. This has been a legacy side-effect of the continuous extensions to the instruction set over the decades, and as modern CPU microarchitectures become wider in their execution throughput, it has become an issue for architects to design efficient wide decoders. For Zen3, AMD opted to remain with a 4-wide design, as going wider would have meant additional pipeline cycles which would have reduced the performance of the whole design.
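To illustrate why variable-length encoding makes wide decode harder, here is a minimal Python sketch. It is not how x86 or any real decoder works: the toy encoding (low 4 bits of the first byte give the instruction's total length) and both function names are assumptions made up for this example. The point is only that fixed-width instruction starts can each be computed independently, while each variable-length start depends on the length of the instruction before it.

```python
def fixed_boundaries(code: bytes, width: int = 4) -> list[int]:
    """Start offsets for a fixed-width ISA.

    Every offset is a multiple of `width`, so each one can be
    computed without looking at any other instruction.
    """
    return list(range(0, len(code), width))


def variable_boundaries(code: bytes) -> list[int]:
    """Start offsets for a toy variable-length ISA.

    Assumption (hypothetical encoding): the low 4 bits of an
    instruction's first byte give its total length in bytes (1..15).
    Real x86 length decoding is far more involved.
    """
    offsets = []
    pos = 0
    while pos < len(code):
        length = code[pos] & 0x0F
        if length == 0:
            raise ValueError(f"invalid length at offset {pos}")
        offsets.append(pos)
        # The next start is only known once this length is decoded,
        # which is the serial dependency a wide decoder must work around.
        pos += length
    return offsets


if __name__ == "__main__":
    # Four toy instructions of lengths 3, 1, 5 and 2 bytes.
    stream = bytes([0x03, 0, 0, 0x01, 0x05, 0, 0, 0, 0, 0x02, 0])
    print(variable_boundaries(stream))
    print(fixed_boundaries(bytes(12)))
```

In the fixed-width case the start of the Nth instruction is simply `N * width`; in the variable-length case a decoder has to either walk the stream serially or speculatively decode at many byte offsets at once, which is where the extra hardware cost comes from.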

    Page 4 [anandtech.com]

    I do hope that these designs come in a timely fashion with impressive changes, as the competition from the Arm side is definitely heating up, with designs such as the Cortex-X1 or the Neoverse-V1 appearing to be more than a match for lower-clocked Zen3 designs (such as in the server/enterprise space). On the consumer side of things, AMD appears to be currently unrivalled, although we’ll be keeping an eye open for the upcoming Apple silicon.

    Page 9 [anandtech.com]

    https://images.anandtech.com/graphs/graph16214/119125.png [anandtech.com]

    In the performance per clock uplifts, measured at peak performance, we’re seeing a 20.87% median and 24.99% average improvement for the new Zen3 microarchitecture when compared to last year’s Zen2 design. AMD is still quite behind Apple’s A13 and A14 (review coming soon), but that’s natural given almost double the microarchitectural width of Apple’s design, running at lower frequencies. It’ll be interesting to get Apple Silicon Mac devices tested and compared against the new AMD parts.

    [...] What I hope to see from AMD in future designs is a more aggressive push towards a wider core design with even larger IPC jumps. In workloads that are more execution bound, Zen3 isn’t all that big of an uplift. The move from a 16MB to a 32MB L3 cache isn’t something that’ll be repeated any time soon in terms of improvement magnitude, and it’s also very doubtful we’ll see significant frequency uplifts with coming generations. As Moore’s Law is slowing, going wider and smarter seems to be the only way forward for advancing performance.

    https://twitter.com/andreif7/status/1324431700663930889 [twitter.com]
    https://twitter.com/andreif7/status/1324436277970829315 [twitter.com]

    Obviously, there are things that can be improved, and advantages that could be leveraged to perform better when compared to Apple/ARM. One of the things rumored for Zen 4 is a nice big (gigabytes) L4 cache (e.g. HBM) stacked on top of the I/O die. Ultimately, decreasing the distance between cores and memory is one of the best ways to improve performance, and putting towers of cache around is an intermediate step. The L3 cache could be layered up [soylentnews.org], for instance.
