from the Computex-Roundup dept.
AMD announced the Radeon RX 6800M, 6700M, and 6600M discrete GPUs for laptops, promising better performance and efficiency, including when running on battery power. The Radeon RX 6800M is a 40 compute unit design (equivalent to the Radeon RX 6700 XT on desktop) with 12 GB of VRAM.
AMD's biggest announcements were the introduction of FidelityFX Super Resolution (FSR) and the demonstration of a 3D chiplet design. FSR uses a spatial scaling algorithm to upscale game graphics for higher frame rates at a given resolution. The algorithm competes with Nvidia's Deep Learning Super Sampling (DLSS), but will be released as open source and work with some older AMD GPUs and integrated graphics, as well as competing products from Nvidia and Intel (it was shown running on an Nvidia GTX 1060).
AMD CEO Lisa Su also showed off a modified, delidded Ryzen 9 5900X CPU prototype with "3D V-Cache technology". It was identical to the standard 5900X with the exception of through-silicon via (TSV) stacked L3 cache. This allowed the 5900X prototype to have 192 MB of total L3 cache instead of 64 MB (96 MB per 8-core chiplet). AMD claims it can run games with 15% higher performance on average (simply due to the larger cache size), and some version of this will appear in products that are starting production at the end of 2021.
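The cache figures above can be checked with simple arithmetic. This is a minimal sketch, assuming the standard 5900X splits its 64 MB of L3 evenly across two chiplets; the size of the stacked cache per chiplet is inferred from the 96 MB figure rather than stated above, and the function name is illustrative.

```python
def vcache_totals(chiplets=2, base_total_mb=64, per_chiplet_with_vcache_mb=96):
    """Return (base L3 per chiplet, stacked L3 per chiplet, total L3), all in MB."""
    # Assumption: the base L3 is split evenly across the chiplets.
    base_per_chiplet = base_total_mb // chiplets
    # Whatever exceeds the base on each chiplet is taken to be the stacked cache.
    stacked_per_chiplet = per_chiplet_with_vcache_mb - base_per_chiplet
    total = per_chiplet_with_vcache_mb * chiplets
    return base_per_chiplet, stacked_per_chiplet, total


base, stacked, total = vcache_totals()
print(f"{base} MB base + {stacked} MB stacked per chiplet, {total} MB total")
```

Under these assumptions the total is triple the standard chip's 64 MB.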
Taiwan Semiconductor Manufacturing Company (TSMC) announced a number of node scaling details and technological advancements at its 2020 Technology Symposium:
TSMC's first "5nm" node (N5) has a lower defect rate than its initial "7nm" node did at the same point in its development cycle (high volume manufacturing, which N5 is now in). This is due in part to increasing use of extreme ultraviolet lithography (EUV). "5nm" will represent 11% of TSMC's sub-"16nm" wafer production in 2020.
TSMC's "3nm" node (N3) will continue to use FinFETs rather than gate-all-around (GAA) transistors, and is scheduled for volume production in mid-late 2022. Performance is expected to improve 10-15% at the same power (compared to N5), or power consumption will be reduced 25-30% for the same performance. Logic area density improvement will be 1.7x, but SRAM density will only increase by 1.2x, leading to a 1.27x overall density increase for chips that are 70% SRAM and 30% logic.
Intel's EMIB (Embedded Die Interconnect Bridge) connects "chiplets" together without using a full silicon interposer. TSMC has its own version that it is calling Local Si Interconnect (LSI), and it will be combined with other packaging technologies. TSMC has demonstrated 12-layer stacking of chips using through-silicon vias (TSVs), although cooling or doing anything useful with them could be somebody else's job.
See also: TSMC Updates on Node Availability Beyond Logic: Analog, HV, Sensors, RF
TSMC Launches New N12e Process: FinFET at 0.4V for IoT
2023 Interposers: TSMC Hints at 3400mm2 + 12x HBM in one Package
TSMC and Graphcore Prepare for AI Acceleration on 3nm
TSMC Has Reportedly Secured Orders for Its 2nm Node – Samsung May Not Beat Its Foundry Rival Until 2030, Claims New Report
AMD has announced "Cezanne" desktop APUs with Zen 3 cores and Vega integrated graphics:
AMD has officially launched its next-generation Ryzen 5000G APUs, codenamed Cezanne, which feature the brand-new Zen 3 core architecture. The AMD Ryzen 5000G APUs are aimed at the consumer segment with an initial supply coming to OEM PCs first and later heading out to the gaming & mainstream DIY segment.
The AMD Ryzen 7 5700G will be the flagship offering within the lineup. It will feature 8 cores and 16 threads. The clock speeds are reported at a 3.8 GHz base and a 4.6 GHz boost. The CPU will carry a total of 16 MB L3 and 4 MB L2 cache with the TDP being set at 65W. The APU will also carry a Vega integrated GPU with 8 CUs or 512 stream processors running at clock speeds around 2.0 GHz. The 35W Ryzen 7 5700GE will feature the same specs but reduced core clocks of 3.2 GHz base and 4.6 GHz boost. The CPU should retail at around $350-$400 US.
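The integrated GPU figures can be related with a short calculation. This is a rough sketch, assuming 64 stream processors per compute unit (which matches 8 CUs giving 512) and the common convention of counting one fused multiply-add per stream processor per clock as 2 FLOPs. The throughput it prints is a theoretical peak derived from the "around 2.0 GHz" clock, not a figure from AMD, and the function names are illustrative.

```python
def stream_processors(compute_units, sp_per_cu=64):
    """Stream processor count, assuming 64 per compute unit."""
    return compute_units * sp_per_cu


def peak_fp32_tflops(sps, clock_ghz, flops_per_clock=2):
    """Theoretical peak FP32 throughput in TFLOPS (one FMA = 2 FLOPs per clock)."""
    return sps * flops_per_clock * clock_ghz / 1000


sps = stream_processors(8)
print(sps, "stream processors")
print(round(peak_fp32_tflops(sps, 2.0), 3), "TFLOPS theoretical peak")
```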
Unlike the previous-generation "Renoir" desktop APUs, these processors should eventually see a retail release, but will only be offered by OEMs for now.
AMD has also launched OEM-only Ryzen 9 5900 and Ryzen 7 5800 CPUs with lower TDPs and clock speeds than their X counterparts.
Also at AnandTech.
AMD has announced its "Milan-X" Epyc CPUs, which reuse the same Zen 3 chiplets found in "Milan" Epyc CPUs with up to 64 cores, but with triple the L3 cache using stacked "3D V-Cache" technology designed in partnership with TSMC. This means that some Epyc CPUs will go from having 256 MiB of L3 cache to a whopping 768 MiB (804 MiB of cache when including L1 and L2 cache). 2-socket servers using Milan-X can have over 1.5 gigabytes of L3 cache. The huge amount of additional cache results in average performance gains in "targeted workloads" of around 50% according to AMD. Microsoft found an 80% improvement in some workloads (e.g. computational fluid dynamics) due to the increase in effective memory bandwidth.
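The cache totals quoted above add up as follows. This is a sketch under stated assumptions: the per-core L1 (32 KiB instruction + 32 KiB data) and L2 (512 KiB) sizes are Zen 3 values that are not given in the summary, and the function names are illustrative.

```python
def total_cache_mib(cores=64, l3_mib=768, l2_kib_per_core=512, l1_kib_per_core=64):
    """Total L1 + L2 + L3 in MiB for one socket."""
    # Assumption: Zen 3 per-core sizes (L1 = 32 KiB I + 32 KiB D, L2 = 512 KiB).
    l2_mib = cores * l2_kib_per_core / 1024
    l1_mib = cores * l1_kib_per_core / 1024
    return l3_mib + l2_mib + l1_mib


def dual_socket_l3(l3_mib=768):
    """L3 across two sockets, as (GiB, decimal GB)."""
    mib = 2 * l3_mib
    return mib / 1024, mib * 2**20 / 1e9


print(total_cache_mib(), "MiB per socket including L1 and L2")
gib, gb = dual_socket_l3()
print(f"{gib} GiB ({gb:.2f} GB) of L3 in a 2-socket server")
```

In binary units the two-socket L3 is exactly 1.5 GiB; it exceeds 1.5 only when expressed in decimal gigabytes.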
AMD's next generation of Instinct high-performance computing GPUs will use a multi-chip module (MCM) design, essentially chiplets for GPUs. The Instinct MI250X includes two "CDNA 2" dies for a total of 220 compute units, compared to 120 compute units for the previous MI100 monolithic GPU. Performance is roughly doubled (FP32 Vector/Matrix, FP16 Matrix, INT8 Matrix), quadrupled (FP64 Vector), or octupled (FP64 Matrix). VRAM has been quadrupled to 128 GB of High Bandwidth Memory. Power consumption of the world's first MCM GPU will be high, as it has a 560 Watt TDP.
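A quick comparison of the compute unit counts, using only the figures quoted above (the function name is illustrative, and an even split of compute units across the two dies is an assumption):

```python
def cu_comparison(total_cus=220, dies=2, previous_cus=120):
    """Return (compute units per die, ratio of new to previous CU count)."""
    # Assumption: compute units are split evenly across the dies.
    per_die = total_cus // dies
    ratio = total_cus / previous_cus
    return per_die, ratio


per_die, ratio = cu_comparison()
print(f"{per_die} CUs per die, {ratio:.2f}x the MI100's CU count")
```

The unit count alone grows by less than 2x, so the quoted 4x and 8x FP64 gains would have to come from more than the extra compute units.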
The Frontier exascale supercomputer will use both Epyc CPUs and Instinct MI200 GPUs.
AMD officially confirmed that upcoming Zen 4 "Genoa" Epyc CPUs made on a TSMC "5nm" node will have up to 96 cores. AMD also announced "Bergamo", a 128-core "Zen 4c" Epyc variant, with the 'c' indicating "cloud-optimized". This is a denser, more power-efficient version of Zen 4 with a smaller cache. According to a recent leak, Zen 4c chiplets will have 16 cores instead of 8, will retain simultaneous multithreading (SMT), and will be used in future Zen 5 Ryzen desktop CPUs as AMD's answer to Intel's Alder Lake heterogeneous ("big.LITTLE") x86 microarchitecture.
Also at Tom's Hardware (Milan-X).
Previously: AMD Reveals 'Instinct' for Machine Intelligence
AMD Launches "Milan" Epyc Server CPUs, with Zen 3 and up to 64 Cores
AMD at Computex 2021: 5000G APUs, 6000M Mobile GPUs, FidelityFX Super Resolution, and 3D Chiplets
AMD Unveils New Ryzen V-Cache Details at HotChips 33
AMD Aims to Increase Energy Efficiency of Epyc CPUs and Instinct AI Accelerators 30x by 2025
The most prominent form of [upscaling algorithm] today is Nvidia's [Deep Learning Super Sampling (DLSS)], the company's proprietary AI-based temporal upscaling solution that runs on GeForce RTX GPUs. Temporal upscaling means data is accumulated from multiple frames and combined into the final image, with the AI component running on Nvidia's Tensor cores to assist with this reconstruction. DLSS has gone through more than one iteration, and right now at version 2.0, it's a major improvement over the initial release and it's also gathered decent game support after much work from Nvidia.
[AMD's FidelityFX Super Resolution (FSR)] takes a different approach. Instead of using temporal upscaling, FSR relies exclusively on spatial upscaling. AMD tells us that AI is not used at any stage of the FSR process (so FSR is not the technology described in that patent that's been floating around). This greatly simplifies the algorithm – spatial upscaling does not rely on data from multiple frames, or motion vectors, which makes it easier to integrate into games as there are less data inputs. However, with less data to work with, spatial upscaling algorithms need to be really good at figuring out how to reconstruct the image, and traditionally this is where they've fallen short.
AMD hasn't gone into great detail on how their algorithm works, but they tell us this is not a simple rehash of bilinear upscaling, which is the 'standard' method for spatial upscaling. AMD calls their technique an "advanced edge reconstruction algorithm," which is combined with a sharpening pass to create the final image. There is only one input to the algorithm, which is the lower resolution frame.
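For reference, the bilinear baseline mentioned above can be sketched in a few lines. This is not AMD's algorithm, whose details are not public in the excerpt. It is a minimal pure-Python illustration of single-frame spatial upscaling on a grayscale image stored as a list of rows, and the function name and corner-aligned sampling are choices made for this example.

```python
def bilinear_upscale(frame, out_w, out_h):
    """Upscale a 2D grid of grayscale values using bilinear interpolation.

    The only input is the low-resolution frame: no history, no motion vectors.
    """
    in_h, in_w = len(frame), len(frame[0])
    out = []
    for y in range(out_h):
        # Map the output row back to a fractional source row (corners aligned).
        sy = y * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, in_h - 1)
        fy = sy - y0
        row = []
        for x in range(out_w):
            sx = x * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, in_w - 1)
            fx = sx - x0
            # Interpolate horizontally on the two source rows, then vertically.
            top = frame[y0][x0] * (1 - fx) + frame[y0][x1] * fx
            bottom = frame[y1][x0] * (1 - fx) + frame[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


low_res = [[0, 10], [20, 30]]
for row in bilinear_upscale(low_res, 3, 3):
    print(row)
```

Every output pixel is a weighted average of its neighbours, so hard edges are blurred. That blurring is the weakness an edge reconstruction step and a sharpening pass are meant to address.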
AMD announced FidelityFX Super Resolution 2.0 back in March and, as of today, has made good on its word to open-source it.
This temporal upscaling solution for game engines is now available under an MIT license. AMD describes FidelityFX Super Resolution 2.0 as follows: "FSR 2 uses cutting-edge temporal algorithms to reconstruct fine geometric and texture detail, producing anti-aliased output from aliased input. FSR 2 technology has been developed from the ground up, and is the result of years of research from AMD. It has been designed to provide higher image quality compared to FSR 1, our original open source spatial upscaling solution launched in June 2021."
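To illustrate what "temporal" means here, the toy sketch below blends each new frame into a running history buffer. It is not FSR 2's algorithm: real temporal upscalers typically reproject the history with motion vectors and discard stale samples, both of which are omitted. The function name and blend factor are hypothetical.

```python
def accumulate(history, current, blend=0.1):
    """Blend the current frame into the running history, pixel by pixel.

    Frames are lists of rows of grayscale values. A small blend factor keeps
    more of the history, which smooths noise but reacts slowly to change.
    """
    if history is None:
        # First frame: there is nothing to blend with yet.
        return [row[:] for row in current]
    return [
        [h * (1 - blend) + c * blend for h, c in zip(h_row, c_row)]
        for h_row, c_row in zip(history, current)
    ]


history = None
# Hypothetical input: the same pixel sampled with alternating error.
for sample in [8.0, 12.0, 8.0, 12.0, 8.0, 12.0]:
    history = accumulate(history, [[sample]], blend=0.5)
print(history)
```

Because information is combined across frames, the accumulated value stays between the individual noisy samples, which is the basic reason temporal methods can recover detail that a single frame lacks.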