
posted by chromas on Tuesday September 11 2018, @03:03AM
from the nano-SoCs dept.

Samsung Foundry Updates: 8LPU Added, EUVL on Track for HVM in 2019

Samsung recently hosted its Samsung Foundry Forum 2018 in Japan, where it made several significant foundry announcements. Besides reiterating plans to start high-volume manufacturing (HVM) using extreme ultraviolet lithography (EUVL) tools in the coming quarters, along with reaffirming plans to use gate-all-around FETs (GAAFETs) with its 3 nm node, the company also added its brand-new 8LPU process technology to its roadmap. Samsung Foundry's general roadmap was announced earlier this year, so at SFF in Japan the contract maker of semiconductors reiterated some of its plans, made certain corrections, and provided additional details.

First up, Samsung added another fabrication technology to its family of manufacturing processes based on its 10 nm node. The new tech is called 8LPU (low power ultimate) and, according to Samsung's usual classification, this is a process for SoCs that require both high clocks and high transistor density. Samsung's 8LPP technology, which qualified for production last year, is a development of Samsung's 10 nm node that uses narrower metal pitches to deliver a 10% area reduction (at the same complexity) as well as 10% lower power consumption (at the same frequency and complexity) compared to the 10LPP process. 8LPU is a further evolution of the technology platform that likely increases transistor density and frequency potential vs. 8LPP. Meanwhile, Samsung does not disclose how it managed to improve 8LPU vs. 8LPP and whether it involved advances in design rules, the use of a new library, or a shrink of metal pitches. Samsung's 8LPP and 8LPU technologies are aimed at customers who need higher performance, lower power, and/or higher transistor density than what Samsung's 10LPP, 10LPC, and 10LPU processes can offer, but who cannot gain access to Samsung's 7LPP or more advanced manufacturing technologies that use EUVL. Risk production using 8LPU was set to start in 2018, so expect high-volume manufacturing to commence next year at Samsung's Fab S1 in Giheung, South Korea.

[...] By the time the new production line in Hwaseong becomes operational, Samsung Foundry promises to start risk production using its 5/4 nm node. As reported earlier this year, Samsung is prepping 5LPE, 4LPE, and 4LPP fabrication technologies, but eventually this list will likely expand. Based on what Samsung has disclosed about all three manufacturing processes so far, they will have certain similarities, which will simplify migration from 5LPE all the way to 4LPP, though the company does not elaborate. [...] One of the unexpected things that Samsung Foundry announced was the start of risk production on its 3 nm node as early as 2020, at least a year ahead of earlier expectations. Samsung's 3 nm will be the first node to use the company's own GAAFET implementation, called MBCFET (multi-bridge-channel FET), and will officially include at least two process technologies: 3GAAE and 3GAAP (3 nm gate-all-around early/plus).

Previously: Samsung Roadmap Includes "5nm", "4nm" and "3nm" Manufacturing Nodes
Samsung Plans to Make "5nm" Chips Starting in 2019-2020

Related: GlobalFoundries Abandons "7nm LP" Node, TSMC and Samsung to Pick Up the Slack


Original Submission

 
This discussion has been archived. No new comments can be posted.
  • (Score: 2) by LoRdTAW on Tuesday September 11 2018, @04:23PM (1 child)

    by LoRdTAW (3755) on Tuesday September 11 2018, @04:23PM (#733185) Journal

    Like, why not a 128bit or 256bit CPU architecture for one thing?

    More bits != a faster processor unless you work with a lot of really big numbers. Besides, that's what AVX instructions are for.
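
    (A minimal C sketch of that point, assuming an AVX-capable x86 CPU and a GCC/Clang-style compiler with -mavx; the array and variable names are just for illustration. Eight 32-bit floats are added by one 256-bit instruction, so the wide math already exists without needing a 256-bit general-purpose architecture.)

    #include <immintrin.h>
    #include <stdio.h>

    int main(void)
    {
        float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
        float b[8] = {8, 7, 6, 5, 4, 3, 2, 1};
        float c[8];

        __m256 va = _mm256_loadu_ps(a);    /* load 8 floats (256 bits) */
        __m256 vb = _mm256_loadu_ps(b);
        __m256 vc = _mm256_add_ps(va, vb); /* one instruction, 8 additions */
        _mm256_storeu_ps(c, vc);

        for (int i = 0; i < 8; i++)
            printf("%g ", c[i]);           /* prints 9 eight times */
        printf("\n");
        return 0;
    }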

    Maybe another new memory bus, DDR6. Or, TriDR1 for Triple Data Rate 1, or more like QDR1, since computers work so much better with base 2 and 4 than with base 3?

    Maybe. But for now the problem is stuffing too many CPUs on one die with a comparatively dinky memory controller. We only get away with it at the desktop level because most people don't crunch through huge sets of data in memory across multiple CPUs. The larger caches on the multi "core" monsters hide this shortcoming for most desktop loads. But some loads have demonstrated the weakness of the monster multicore, like the bandwidth issue of the 32-core AMD Threadripper. It's a balancing act that the user has to consider. Someone building an HPC system might use more physical processor sockets with lower core counts to mitigate memory bandwidth issues.
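
    (A hedged sketch of that bandwidth wall, assuming GCC with OpenMP; the kernel and array sizes are illustrative. A STREAM-style triad is memory-bound, so on a chip with many cores but few memory channels the measured GB/s stops climbing once the controller saturates, no matter how many threads you add.)

    #include <omp.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define N (1 << 25)   /* 32M doubles per array, far larger than any cache */

    int main(void)
    {
        double *a = malloc(N * sizeof *a);
        double *b = malloc(N * sizeof *b);
        double *c = malloc(N * sizeof *c);
        for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

        for (int t = 1; t <= omp_get_max_threads(); t *= 2) {
            double t0 = omp_get_wtime();
            #pragma omp parallel for num_threads(t)
            for (long i = 0; i < N; i++)
                a[i] = b[i] + 3.0 * c[i];   /* 2 loads + 1 store per element */
            double gbytes = 3.0 * N * sizeof(double) / 1e9;
            printf("%2d threads: %.1f GB/s\n", t, gbytes / (omp_get_wtime() - t0));
        }
        free(a); free(b); free(c);
        return 0;
    }

    (Build with gcc -O2 -fopenmp; once the printed GB/s flattens, the extra cores are just waiting on the same memory controller.)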

    And maybe PCI Express is getting a bit long in the tooth? How can a freaking _serial_ standard have beaten a parallel one? We may be seeing a "Parallel PCI" (PPCI) bus next decade.

    No need for parallel PCI, as a PCIe link may consist of multiple lanes, up to x32. Video cards use 16 lanes to make one x16 link, so there's your parallel PCI. The serial protocol allows for this flexibility, whereas parallel PCI is stuck at a fixed bus width. The advantages are numerous, the big ones being a much lower pin count and far fewer traces, which makes trace management at the PCB level easier when matching trace lengths to keep signals in phase. It can also lower board costs by reducing the number of layers needed. Win-win.
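
    (A back-of-the-envelope sketch of how lane count scales a serial link, assuming PCIe 3.0 signalling at 8 GT/s per lane with 128b/130b encoding; the figures are per direction and ignore protocol overhead above the line code.)

    #include <stdio.h>

    int main(void)
    {
        const double gt_per_s = 8.0;            /* PCIe 3.0 raw rate per lane   */
        const double encoding = 128.0 / 130.0;  /* 128b/130b line-code overhead */
        const int widths[] = {1, 4, 8, 16, 32};

        for (int i = 0; i < 5; i++) {
            int lanes = widths[i];
            /* usable Gbit/s per lane times lane count, divided by 8 for GB/s */
            double gbytes = lanes * gt_per_s * encoding / 8.0;
            printf("x%-2d link: ~%.2f GB/s per direction\n", lanes, gbytes);
        }
        return 0;
    }

    (Under those assumptions an x16 link works out to roughly 15.8 GB/s each way, which is the "parallel" bandwidth coming out of 16 serial lanes.)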

    Or perhaps the whole mess will be ditched in favor of "Super SoC".

    We're there already. AMD has a nice set of SoCs for embedded use, the G series (I have a PC Engines APU2). Intel also has SoCs for embedded and, I believe, mobile/industrial use as well. The desktop still demands a bit of flexibility and performance, so the pin count on those packages is needed for lots of PCIe lanes (NVMe uses up to 4 lanes per "disk"), processor links, and two or more 64-bit memory channels. I'm sure we will see low/mid-range desktop and mobile SoCs in the future as the desktop shrinks down to smaller sizes.

    Then there's the old trick of implementing common software routines in hardware. Lot more mining to do there, why should only video decoding be implemented in hardware?

    The APU and heterogeneous computing are going to come into play here eventually. And while hardware video decoders may appear redundant when paired with beefy multicore CPUs, they actually level the playing field by removing the CPU variable from the equation. With a hardware decoder, it doesn't matter if your CPU is a 1GHz single core or a 4GHz 16 core; it's going to play back properly.

    And of course our software has gotten very bloated. Would be a lot of effort, but there is doubtless an awful lot of fat to trim there.

    Personally I hold these two projects as pinnacles of software simplicity: OpenBSD and Plan 9.

  • (Score: 2) by bzipitidoo on Tuesday September 11 2018, @08:17PM

    by bzipitidoo (4388) on Tuesday September 11 2018, @08:17PM (#733287) Journal

    Really it comes down to more parallelism. The question is which combination of a wider (128-bit or more) CPU, more cores, more AVX-style instructions, wider buses, offloading common algorithms to dedicated hardware, and other techniques will make for the fastest computer.

    The parallelism in a 1980s-era home computer was practically none. For instance, the 6502 CPU in the Apple II had to do everything, and I do mean everything, even low-level stuff such as pulsing the floppy drive arm stepper motor at the correct intervals and in the correct order to move the arm in or out, reading the individual bits on the floppy disk, and generating sound by clicking the speaker hundreds of times per second. It didn't take long for falling hardware costs and the desire for more speed to make it viable to move everything they could to what were basically built-in embedded computer systems designed to run whatever piece of hardware was being offloaded.