from the and-the-little-core-could-outcompute-a-90s-mainframe dept.
Intel Lakefield will be the first processors to feature the chipmaker's 3D Foveros packaging. Foveros is a technology that essentially allows Intel to stack chips one on top of the other, equivalent to what storage manufacturers are doing with some new types of 3D NAND (string stacking).
According to 3DMark's report, the unidentified processor is equipped with five cores, which concurs with the core configuration for Intel's Lakefield chips. As you recall, Lakefield utilizes a design that's similar to ARM's big.LITTLE architecture. Intel complements the powerful core with other slower and more energy-efficient cores.
In Lakefield's case, Intel plans to endow the processor with one Sunny Cove core and four accompanying Atom Tremont cores. The chipmaker will cook up Lakefield chips with a combination of manufacturing process. Intel uses the 10nm node for the compute die and the 22nm node for the base die.
I'd like to see configurations with 1 small core for every 4 big cores, with the small cores handling low-level and background tasks.
Previously: Intel Details Lakefield CPU SoC With 3D Packaging and Big/Small Core Configuration
AMD Plans to Stack DRAM and SRAM on Top of its Future Processors
Intel Reveals Three New Packaging Technologies for Stitching Multiple Dies Into One Processor
Intel Lakefield is based around Foveros technology which helps connect chips and chiplets in a single package that matches the functionality and performance of a monolithic SOC. Each die is then stacked using FTF micro-bumps on the active interposer through which TSVs are drilled to connect with solder bumps and eventually the final package. The whole SOC is just 12×12 (mm) which is 144mm2.
Talking about the SOC itself and its individual layers, the Lakefield SOC that has been previewed consists of at least four layers or dies, each serving a different purpose. The top two layers are composed of the DRAM which will supplement the processor as the main system memory. This is done through the PoP (Package on Package) memory layout which stacks two BGA DRAMs on top of each other as illustrated in the preview video. The SOC won't have to rely on socketed DRAM in this case which saves a lot of footprint on the main board.
The second layer is the Compute Chiplet with a Hybrid CPU architecture and graphics, based on the 10nm process node. The Hybrid CPU architecture has a total of five individual Cores, one of them is labeled as the Big Core which features the Sunny Cove architecture. That's the same CPU architecture that will be featured on Intel's upcoming 10nm Ice Lake processors. The Sunny Cove Core is optimized for high-performance throughput. There are also four small CPUs that are based on the 10nm process but optimized for power efficiency. The same die [has] Intel's Gen 11 graphics engine with 64 Execution Units.
[...] [Last] of all is the base die which serves as the cache and I/O block of the SOC. Labeled as the P1222 and based on a 22FFL process node, the base die comes with a low cost and low leakage design while providing a feature-rich array of I/O capabilities.
It would be nice to finally see some consumer CPUs with stacked DRAM, although the amount was not specified (8 GB?).
AMD revealed at a recent high performance computing event that it is working on new designs that use 3D-stacked DRAM and SRAM on top of its processors to improve performance.
[...] Intel whipped the covers off its Foveros 3D chip stacking technology during its recent Architecture Day event and revealed it already has a leading-edge product ready to enter production. The package consists of a 10nm CPU and an I/O chip mated with TSVs (Through Silicon Via) that connect the die through vertical electrical connections in the center of the die. Intel also added a memory chip to the top of the stack using a conventional PoP (Package on Package) implementation.
Not to be left behind, AMD is also turning its eyes toward 3D chip stacking techniques, albeit from a slightly different angle. AMD SVP and GM Forrest Norrod recently presented at the Rice Oil and Gas HPC conference and revealed that the company has its own 3D stacking intiative underway.
[...] [True] 3D stacking consists of two die (in this case, memory and a processor) placed on top of each other and connected through vertical TSV connections that mate the die directly together. These TSV connections, which transfer data between the two die at the fastest speeds possible, typically reside in the center of the die. That direct mating increases performance and reduces power consumption (all data movement requires power, but direct connections streamline the process). 3D stacking also affords density advantages.
Where are the CPUs with attached High Bandwidth Memory?
Intel will be using a few packaging technologies to connect CPU core "chiplets":
Intel revealed three new packaging technologies at SEMICON West: Co-EMIB, Omni-Directional Interconnect (ODI) and Multi-Die I/O (MDIO). These new technologies enable massive designs by stitching together multiple dies into one processor. Building upon Intel's 2.5D EMIB and 3D Foveros tech, the technologies aim to bring near-monolithic power and performance to heterogeneous packages. For the data-center, that could enable a platform scope that far exceeds the die-size limits of single dies.
[...] Compared to interposers, which can be reticle-sized (832mm2) or even larger, [EMIB (Embedded Multi-die Interconnect Bridge)] is just a small (hence, cheap) piece of silicon. It provides the same bandwidth and energy-per-bit advantages of an interposer compared to standard package traces, which are traditionally used for multi-chip packages (MCPs), such as AMD's Infinity Fabric. (To some extent, because the PCH is a separate die, chiplets have actually been around for a very long time.)
[...] Intel showed off a concept product that contains four Foveros stacks, with each stack having eight small compute chiplets that are connected via TSVs to the base die. (So the role of Foveros there is to connect the chiplets as if it were a monolithic die.) Each Foveros stack is then interconnected via two (Co-)EMIB links with its two adjacent Foveros stacks. Co-EMIB is further used to connect the HBM and transceivers to the compute stacks.
Evidently, the cost of such a product would be enormous, as it essentially contains multiple traditional monolithic-class products in a single package. That's likely why Intel categorized it as a data-centric concept product, aimed mainly at the cloud players that are more than happy to absorb those costs in exchange for the extra performance.
[...] When they are ready, these technologies will provide Intel with powerful capabilities for the heterogeneous and data-centric era. On the client side, the benefits of advanced packaging include smaller package size and lower power consumption (for Lakefield, Intel claims a 10x SoC standby power improvement at 2.6mW). In the data center, advanced packaging will help to build very large and powerful platforms on a single package, with performance, latency, and power characteristics close to what a monolithic die would yield. The yield advantage of small chiplets and the establishment of chipset ecosystem are major drivers, too.
Related: Intel Core i7-8809G with Radeon Graphics and High Bandwidth Memory: Details Leaked
Intel Announces "Sunny Cove", Gen11 Graphics, Discrete Graphics Brand Name, 3D Packaging, and More
Intel Promises "10nm" Chips by the End of 2019, and More
Intel Details Lakefield CPU SoC With 3D Packaging and Big/Small Core Configuration
Intel's Jim Keller Promises That "Moore's Law" is Not Dead, Outlines 50x Improvement Plan
While Intel has been discussing a lot about its mainstream Core microarchitecture, it can become easy to forget that its lower power Atom designs are still prevalent in many commercial verticals. Last year at Intel's Architecture Summit, the company unveiled an extended roadmap showing the next three generations of Atom following Goldmont Plus: Tremont, Gracemont, and 'Future Mont'. Tremont is set to be launched this year, coming first in a low powered hybrid x86 design called Lakefield for notebooks, and using a new stacking technology called Foveros built on 10+ nm. At the Linley Processor Conference today, Intel unveiled more about the microarchitecture behind Tremont.
[...] The Atom core within a given family is usually identical (L2 [cache] configuration might change), and because of the SoC in play, it might get a different name based on the market where it was headed. Intel scrapped the smartphone program back with Broxton in 2016, and the tablet type of SoC has also gone away. With Lakefield, combining Core and Atom, it could be used in Tablets again for 2019/2020, but we will see it in Notebooks with the Surface Pro Neo and in networking/embedded markets as Snow Ridge.
[...] The interesting thing here in our briefing with Intel is that they specifically stated that Tremont was built with performance in mind, and the aim was for a sizeable uptick in the raw clock-for-clock throughput compared to the previous generation Atom, Goldmont Plus. Based on Intel's own metrics, namely using SPEC, Intel is going to claim an average 30% iso-frequency performance uplift in core performance for Tremont over Goldmont Plus. It's worth noting here that this data is from an early Tremont design we were told, and should represent minimum uplifts.
[...] A 30% average jump in performance is a sizeable jump for any generation-to-generation cadence. Just taking it as-is feels premature: aside from microarchitectural advancements and a jump to 10nm, there has to be something at play here – either the power budget of Atom has ballooned, or the die area. With Intel explicitly out of the gate stating that their focusing on performance, a cynic is going to suggested that something else has paid that price, and to that end Intel wasn't prepared to talk about power windows or die area, though they did point to the already announced Lakefield CPU, which has a 1 x Core + 4 x Tremont design