from the top-that! dept.
AMD revealed at a recent high performance computing event that it is working on new designs that use 3D-stacked DRAM and SRAM on top of its processors to improve performance.
[...] Intel whipped the covers off its Foveros 3D chip stacking technology during its recent Architecture Day event and revealed it already has a leading-edge product ready to enter production. The package consists of a 10nm CPU and an I/O chip mated with TSVs (Through-Silicon Vias) that connect the die through vertical electrical connections in the center of the die. Intel also added a memory chip to the top of the stack using a conventional PoP (Package on Package) implementation.
Not to be left behind, AMD is also turning its eyes toward 3D chip stacking techniques, albeit from a slightly different angle. AMD SVP and GM Forrest Norrod recently presented at the Rice Oil and Gas HPC conference and revealed that the company has its own 3D stacking initiative underway.
[...] [True] 3D stacking consists of two die (in this case, memory and a processor) placed on top of each other and connected through vertical TSV connections that mate the die directly together. These TSV connections, which transfer data between the two die at the fastest speeds possible, typically reside in the center of the die. That direct mating increases performance and reduces power consumption (all data movement requires power, but direct connections streamline the process). 3D stacking also affords density advantages.
Where are the CPUs with attached High Bandwidth Memory?
Intel has announced new developments at its Architecture Day 2018:
Sunny Cove, built on 10nm, will come to market in 2019 and offer increased single-threaded performance, new instructions, and 'improved scalability'. Intel went into more detail about the Sunny Cove microarchitecture, which is in the next part of this article. To avoid doubt, Sunny Cove will have AVX-512. We believe that these cores, when paired with Gen11 graphics, will be called Ice Lake.
Willow Cove looks like it will be a 2020 core design, most likely also on 10nm. Intel lists the highlights here as a cache redesign (which might mean L1/L2 adjustments), new transistor optimizations (manufacturing based), and additional security features, likely referring to further protections against new classes of side-channel attacks. Golden Cove rounds out the trio, and is firmly in that 2021 segment in the graph. Process node here is a question mark, but we're likely to see it on 10nm and/or 7nm. Golden Cove is where Intel adds another slice of the serious pie onto its plate, with an increase in single-threaded performance, a focus on AI performance, and potential networking and AI additions to the core design. Security features also look like they get a boost.
Intel says that GT2 Gen11 integrated graphics with 64 execution units will reach 1 teraflops of performance. It compared the graphics solution to previous-generation GT2 graphics with 24 execution units, but did not mention Iris Plus Graphics GT3e, which already reached around 800-900 gigaflops with 48 execution units. The GPU will support Adaptive Sync, which is the standardized version of AMD's FreeSync, enabling variable refresh rates over DisplayPort and reducing screen tearing.
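As a rough sanity check on the figures above, theoretical peak FP32 throughput can be estimated as execution units × FLOPs per EU per clock × clock speed. The sketch below assumes 16 FP32 FLOPs per EU per clock (two SIMD-4 units, each issuing a fused multiply-add), which is how Intel has described its recent EUs but is not stated in the quoted articles. The clock speeds are illustrative guesses, not announced specifications.

```python
# Assumption: each EU retires 16 FP32 FLOPs per clock
# (2 SIMD-4 units x 4 lanes x 2 FLOPs per fused multiply-add).
FLOPS_PER_EU_PER_CLOCK = 16


def peak_fp32_gflops(execution_units, clock_ghz,
                     flops_per_eu_per_clock=FLOPS_PER_EU_PER_CLOCK):
    """Theoretical peak FP32 throughput in GFLOPS."""
    return execution_units * flops_per_eu_per_clock * clock_ghz


def clock_needed_ghz(execution_units, target_gflops,
                     flops_per_eu_per_clock=FLOPS_PER_EU_PER_CLOCK):
    """Clock speed (GHz) needed to hit a target peak throughput."""
    return target_gflops / (execution_units * flops_per_eu_per_clock)


if __name__ == "__main__":
    # Clock values below are hypothetical, chosen only for illustration.
    configs = [
        ("GT2, 24 EUs", 24, 1.1),
        ("GT3e, 48 EUs", 48, 1.1),
        ("Gen11 GT2, 64 EUs", 64, 1.0),
    ]
    for name, eus, ghz in configs:
        print(f"{name} @ {ghz} GHz: {peak_fp32_gflops(eus, ghz):.0f} GFLOPS")

    # Clock a 64-EU part would need for 1 teraflops (1000 GFLOPS).
    print(f"64 EUs need {clock_needed_ghz(64, 1000):.3f} GHz for 1 TFLOPS")
```

Under these assumptions, a 64-EU part would need to run at just under 1 GHz to reach 1 teraflops, and a 48-EU part at similar clocks would land in the 800-900 gigaflops range quoted for Iris Plus.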
Intel's upcoming discrete graphics cards, planned for release around 2020, will be branded Xe. Xe will cover configurations from integrated and entry-level cards all the way up to datacenter-oriented products.
Like AMD, Intel will also organize cores into "chiplets". But it also announced FOVEROS, a 3D packaging technology that will allow it to mix chips from different process nodes, stack DRAM on top of components, etc. A related development is Intel's demonstration of "hybrid x86" CPUs. Like ARM's big.LITTLE and DynamIQ heterogeneous computing architectures, Intel can combine its large "Core" with smaller Atom cores. In fact, it created a 12mm×12mm×1mm SoC (compare to a US dime, which has a diameter of 17.91mm and a thickness of 1.35mm) with a single "Sunny Cove" core, four Atom cores, Gen11 graphics, and just 2 mW of standby power draw.
Intel Lakefield is based around Foveros technology which helps connect chips and chiplets in a single package that matches the functionality and performance of a monolithic SOC. Each die is then stacked using FTF micro-bumps on the active interposer through which TSVs are drilled to connect with solder bumps and eventually the final package. The whole SOC is just 12×12 mm, which is 144 mm².
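To put the 12×12 mm footprint in perspective, the short sketch below compares it with the face of a US dime, taking 17.91 mm as the dime's diameter. The function names are made up for illustration.

```python
import math


def square_area_mm2(side_mm):
    """Area of a square package, in mm^2."""
    return side_mm ** 2


def circle_area_mm2(diameter_mm):
    """Area of a circle (e.g. a coin face), in mm^2."""
    return math.pi * (diameter_mm / 2) ** 2


if __name__ == "__main__":
    package = square_area_mm2(12.0)   # Lakefield package footprint
    dime = circle_area_mm2(17.91)     # US dime, 17.91 mm diameter
    print(f"Package: {package:.1f} mm^2")
    print(f"Dime:    {dime:.1f} mm^2")
    print(f"Package covers {package / dime:.0%} of a dime's face")
```

The package works out to a bit under 60% of the dime's area, although its 12 mm sides are still shorter than the coin's diameter.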
Talking about the SOC itself and its individual layers, the Lakefield SOC that has been previewed consists of at least four layers or dies, each serving a different purpose. The top two layers are composed of the DRAM which will supplement the processor as the main system memory. This is done through the PoP (Package on Package) memory layout which stacks two BGA DRAMs on top of each other as illustrated in the preview video. The SOC won't have to rely on socketed DRAM in this case which saves a lot of footprint on the main board.
The second layer is the Compute Chiplet with a Hybrid CPU architecture and graphics, based on the 10nm process node. The Hybrid CPU architecture has a total of five individual Cores, one of them is labeled as the Big Core which features the Sunny Cove architecture. That's the same CPU architecture that will be featured on Intel's upcoming 10nm Ice Lake processors. The Sunny Cove Core is optimized for high-performance throughput. There are also four small CPUs that are based on the 10nm process but optimized for power efficiency. The same die [has] Intel's Gen 11 graphics engine with 64 Execution Units.
[...] [Last] of all is the base die which serves as the cache and I/O block of the SOC. Labeled as the P1222 and based on a 22FFL process node, the base die comes with a low cost and low leakage design while providing a feature-rich array of I/O capabilities.
It would be nice to finally see some consumer CPUs with stacked DRAM, although the amount was not specified (8 GB?).
Intel's Lakefield chips will be the first processors to feature the chipmaker's 3D Foveros packaging. Foveros is a technology that essentially allows Intel to stack chips one on top of the other, similar to what storage manufacturers are doing with some new types of 3D NAND (string stacking).
According to 3DMark's report, the unidentified processor is equipped with five cores, which concurs with the core configuration for Intel's Lakefield chips. As you recall, Lakefield utilizes a design that's similar to ARM's big.LITTLE architecture. Intel complements the powerful core with other slower and more energy-efficient cores.
In Lakefield's case, Intel plans to endow the processor with one Sunny Cove core and four accompanying Atom Tremont cores. The chipmaker will cook up Lakefield chips with a combination of manufacturing processes. Intel uses the 10nm node for the compute die and the 22nm node for the base die.
I'd like to see configurations with 1 small core for every 4 big cores, with the small cores handling low-level and background tasks.
Previously: Intel Details Lakefield CPU SoC With 3D Packaging and Big/Small Core Configuration
AMD Plans to Stack DRAM and SRAM on Top of its Future Processors
Intel Reveals Three New Packaging Technologies for Stitching Multiple Dies Into One Processor
Yesterday, Samsung Electronics announced a new 3D IC packaging technology called eXtended-Cube, or "X-Cube", allowing chip-stacking of SRAM dies on top of a base logic die through TSVs.
Current TSV deployments in the industry mostly come in the form of stacking memory dies on top of a memory controller die in high-bandwidth-memory (HBM) modules that are then integrated with more complex packaging technologies, such as silicon interposers, which we see in today's high-end GPUs and FPGAs, or through other complex packaging such as Intel's EMIB.
Samsung's X-Cube is quite different to these existing technologies in that it does away with intermediary interposers or silicon bridges, and directly connects a stacked chip on top of the primary logic die of a design.
Samsung has built a 7nm EUV test chip using this methodology by integrating an SRAM die on top of a logic die. The logic die is designed with TSV pillars which then connect to µ-bumps with only 30µm pitch, allowing the SRAM die to be directly connected to the main die without intermediary mediums. The company says this is the industry's first such design with an advanced process node technology.
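To give a sense of what a 30µm bump pitch means, the sketch below estimates connection density, assuming the bumps sit on a regular square grid (the article does not say how they are arranged). The 50µm comparison value is purely illustrative and does not refer to any specific product.

```python
def bumps_per_mm2(pitch_um):
    """Bump density for a square grid with the given pitch in micrometres."""
    bumps_per_mm = 1000.0 / pitch_um  # bumps along one millimetre
    return bumps_per_mm ** 2


if __name__ == "__main__":
    for pitch in (50, 30):  # 50 um is a hypothetical comparison point
        print(f"{pitch} um pitch: ~{bumps_per_mm2(pitch):.0f} bumps per mm^2")
```

Because density scales with the inverse square of the pitch, going from 50µm to 30µm in this model nearly triples the number of vertical connections available per unit area.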
[...] Stacking more valuable SRAM instead of DRAM on top of the logic chip would likely represent a higher value proposition and return-on-investment to chip designers, as this would allow smaller die footprints for the base logic dies, with larger SRAM cache structures being able to reside on the stacked die. Such a large SRAM die would naturally also provide significantly more SRAM, allowing for higher performance and lower power usage for a chip.
3D SRAM is not a new idea, but this kind of stacking could become commonplace in CPUs within a few years. SRAM takes up a large amount of CPU die area, so stacking it into layers above or near cores could be beneficial.