Date: 2022-08-17

Today at the Hot Chips conference, Microsoft released schematics and details about the internal workings of the SoC that will power its upcoming 4K-ready gaming console. We already knew much of what the company discussed in the presentation, including the core counts, clock speeds, and bandwidth specifications of the CPU, GPU, and memory used in the system, but now we know how the components interact with each other.

[...] The Scorpio Engine is a monster of an SoC developed by AMD, featuring a 359mm² die with seven billion transistors built on TSMC's 16nm FinFET+ technology. The GPU compute units (the yellow section of the layout) consume most of the large die's surface area. The Scorpio Engine's GPU components include four shader arrays that each offer 11 compute units. Microsoft said that one compute unit per shader array is left inactive to compensate for yield problems that may occur.
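
The compute unit arithmetic is easy to check. The following is only a rough sketch using the figures quoted above; the function and variable names are ours, not Microsoft's or AMD's.

```python
# Figures quoted in the article; names here are purely illustrative.
SHADER_ARRAYS = 4
CUS_PER_ARRAY = 11
DISABLED_PER_ARRAY = 1  # one compute unit per array is left inactive for yield


def compute_unit_counts(arrays, per_array, disabled_per_array):
    """Return (physical, active) compute unit totals for the GPU."""
    physical = arrays * per_array
    active = arrays * (per_array - disabled_per_array)
    return physical, active


physical_cus, active_cus = compute_unit_counts(
    SHADER_ARRAYS, CUS_PER_ARRAY, DISABLED_PER_ARRAY
)
print(f"{physical_cus} physical CUs, {active_cus} active")
# prints: 44 physical CUs, 40 active
```

In other words, the die carries 44 compute units, of which 40 are enabled.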

The right side of the SoC die features the two four-core 2.3GHz CPU clusters (represented in dark green on the diagram). A pair of cache controllers flanks each CPU cluster. Twelve GDDR5 memory controllers line the top, bottom, and right edges of the SoC. The retail Xbox One X features 12GB of memory. Developer kits offer 2GB per channel for a total of 24GB of system memory.
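
The memory totals follow from the channel count. Here is a small sketch, assuming one channel per memory controller and inferring the retail figure of 1GB per channel from the 12GB total (the article states only the developer kit's per-channel capacity); the names are illustrative.

```python
MEMORY_CHANNELS = 12  # assumption: one channel per GDDR5 memory controller


def total_memory_gb(channels, gb_per_channel):
    """Total system memory in GB for a given per-channel capacity."""
    return channels * gb_per_channel


# Retail: 1GB per channel is inferred from the 12GB total, not stated outright.
retail_gb = total_memory_gb(MEMORY_CHANNELS, 1)
# Developer kit: 2GB per channel, as stated in the article.
devkit_gb = total_memory_gb(MEMORY_CHANNELS, 2)

print(f"retail: {retail_gb}GB, developer kit: {devkit_gb}GB")
# prints: retail: 12GB, developer kit: 24GB
```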

[...] When Microsoft announced Project Scorpio, the company boasted that the new console would be the first to deliver 6Tflops of 32-bit floating point performance. During the Hot Chips presentation, the company said that it managed to squeeze out "just a hair more than 6Tflops." Each of the 40 compute units can perform 128 floating point operations per clock cycle. Multiplied by the 1,172MHz core clock, that's a total of 6,000,640 Mflops, or about 6.0006Tflops.
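
For readers who want to verify the headline number, here is a minimal sketch of the calculation. It assumes the 128 operations are per compute unit per clock cycle; because the clock is expressed in MHz, the product comes out in Mflops. The names are our own.

```python
ACTIVE_CUS = 40
FLOPS_PER_CU_PER_CLOCK = 128  # assumption: per compute unit, per clock cycle
CORE_CLOCK_MHZ = 1_172


def peak_mflops(cus, flops_per_clock, clock_mhz):
    """Theoretical peak throughput in Mflops (clock given in MHz)."""
    return cus * flops_per_clock * clock_mhz


mflops = peak_mflops(ACTIVE_CUS, FLOPS_PER_CU_PER_CLOCK, CORE_CLOCK_MHZ)
tflops = mflops / 1_000_000  # 1 Tflops = 1,000,000 Mflops

print(f"{mflops:,} Mflops = {tflops:.5f} Tflops")
# prints: 6,000,640 Mflops = 6.00064 Tflops
```

That 0.00064Tflops margin is the "hair" over 6Tflops.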

[...] The new console features an eight-core Jaguar-derived CPU like the one found in the Xbox One S console, but it operates 31% faster than the previous version. Microsoft said that most of the CPU performance optimizations revolve around memory latency improvements in the main memory controllers (up to 20%). The company attributes the improvement to tripling the available memory channels and increasing the number of main memory banks by a factor of six. It also credits the rearrangement and enlargement of the TLB, and the introduction of a redesigned and larger Page Descriptor Cache, which "caches information about nesting page translations" and improves performance by "up to 4.3%."