Arthur T Knackerbracket has processed the following story:
Insufficient packaging capacity is to blame.
The chairman of TSMC admitted that the ongoing short supply of compute GPUs for artificial intelligence (AI) and high-performance computing (HPC) applications is caused by constraints on its chip-on-wafer-on-substrate (CoWoS) packaging capacity. This shortage is expected to persist for around 18 months due to rising demand for generative AI applications and the relatively slow expansion of CoWoS capacity at TSMC.
"It is not the shortage of AI chips, it is the shortage of our CoWoS capacity," said Mark Liu, the chairman of TSMC, in a conversation with Nikkei at Semicon Taiwan. "Currently, we cannot fulfill 100% of our customers' needs, but we try to support about 80%. We think this is a temporary phenomenon. After our expansion of [advanced chip packaging capacity], it should be alleviated in one and a half years."
TSMC produces the majority of AI processors, including Nvidia's A100 and H100 compute GPUs, which are integral to AI tools like ChatGPT and predominantly deployed in AI data centers. These processors, like competing solutions from AMD, AWS, and Google, pair HBM (high-bandwidth memory), which is essential for the bandwidth demands of large AI language models, with CoWoS packaging, putting additional strain on TSMC's advanced packaging facilities.
Liu said that demand for CoWoS surged unexpectedly earlier this year, tripling year-over-year and leading to the current supply constraints. TSMC recognizes that demand for generative AI services is growing, and with it demand for the appropriate hardware, so it is speeding up expansion of CoWoS capacity to meet demand for compute GPUs as well as specialized AI accelerators and processors.
At present, the company is installing additional CoWoS tools at its existing advanced packaging facilities, but this takes time, and TSMC does not expect its CoWoS capacity to double until the end of 2024.
TSMC warns AI chip crunch will last another 18 months
Arthur T Knackerbracket has also processed the following story:
Until TSMC can bring additional capacity online, Nvidia's H100 and older A100 – which power many popular generative AI models, such as GPT-4 – are at the heart of this shortage. However, it's not just Nvidia. AMD's upcoming Instinct MI300-series accelerators – which it showed off during its Datacenter and AI event in June – make extensive use of CoWoS packaging technology.
AMD's MI300A APU is currently sampling with customers and is slated to power Lawrence Livermore National Laboratory's El Capitan system, while the MI300X GPU is due to start making its way into customers' hands in Q3.
We've reached out to AMD for comment on whether the shortage of CoWoS packaging capacity could impact availability of the chip and we'll let you know if we hear anything back.
It's worth noting that TSMC's CoWoS isn't the only advanced packaging tech out there. Samsung, which is rumored to be picking up some of the slack in the production of Nvidia GPUs, has I-Cube and H-Cube for 2.5D packaging and X-Cube for 3D packaging.
Intel, meanwhile, packages several of the chiplets used in its Ponte Vecchio GPU Max cards, but doesn't rely on CoWoS tech to stitch them together. Chipzilla has developed its own advanced packaging technologies, which can combine chips from different fabs or process nodes: embedded multi-die interconnect bridge (EMIB) for 2.5D packaging, and Foveros for vertically stacking chiplets on top of one another.
(Score: 3, Interesting) by Mojibake Tengu on Sunday September 10 2023, @02:13PM (6 children)
I am happily waiting for the moment when all those AIs start mining bitcoins on themselves...
What really annoys me is that production capacity should be used for more useful devices rather than for amplifiers of stupidity.
There is a huge unsaturated market for 64-bit microcontrollers for example.
Rust programming language offends both my Intelligence and my Spirit.
(Score: 3, Informative) by Snotnose on Sunday September 10 2023, @02:52PM (2 children)
Unfortunately there is also a huge unsaturated market for stupidity, artificial or otherwise.
Of course I'm against DEI. Donald, Eric, and Ivanka.
(Score: 4, Touché) by Opportunist on Sunday September 10 2023, @03:07PM (1 child)
One should think Florida alone could satisfy any demand in this regard.
(Score: 1, Funny) by Anonymous Coward on Monday September 11 2023, @09:14AM
Strict export bans.
(Score: 2) by ElizabethGreene on Monday September 11 2023, @01:04PM (2 children)
These chips are going into AI, not crypto mining. I'll grant you that AI is still immature, but it's already incredibly useful. On a scale of long-term impact, I'd rate the development of transformer-based LLMs right up there with the development of VisiCalc. VisiCalc's (distant) descendants are Excel, Google Sheets, and LibreOffice's Calc. Could you imagine the world today without them? It's going to be that way for LLMs specifically and AIs generally. I'm excited to get to live through that.
(Score: 2) by Mojibake Tengu on Tuesday September 12 2023, @04:51AM (1 child)
I remember VisiCalc. But I also realize that the people with Excel now do the most damage to all humanity.
What exactly prevents those AIs from inventing bitcoin mining on their own personality hardware?
You think that's not possible? Let the apparition of Turing haunt you...
Rust programming language offends both my Intelligence and my Spirit.
(Score: 2) by ElizabethGreene on Tuesday September 12 2023, @01:33PM
As currently implemented, the big hardware is running stateless web services. The only way to give the AI a "stream of thought" is to feed the last thing it said back to it. The model has no memory of anything after training. Unless it was trained to trick humans into inventing bitcoin mining, it doesn't have the continuity of consciousness to do it.
I've tried to fix that. I have built scripts that let the AI talk to itself, maintain some level of memory state, plan responses beyond the immediate action, self-prompt to carry out the planned actions, evaluate the data they return, and loop. It helps, but it's not enough.
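For the curious, the skeleton of those scripts is roughly this (a minimal Python sketch, not my actual code; call_model is a hypothetical stand-in for whatever stateless completion endpoint the model sits behind, so every name here is illustrative):

def call_model(prompt: str) -> str:
    # Hypothetical stand-in for a stateless LLM endpoint. Every call
    # starts from zero; the model keeps no state between requests.
    return f"(model output for: {prompt[:40]}...)"

def agent_loop(goal: str, max_steps: int = 5) -> list[str]:
    memory: list[str] = []  # the only persistence lives out here, in the script
    plan = call_model(f"Goal: {goal}\nWrite a step-by-step plan.")
    memory.append(f"PLAN: {plan}")
    for step in range(max_steps):
        # Feed the accumulated memory back in on every call, trimmed so
        # it still fits in the context window; this is the "cassette tape".
        context = "\n".join(memory[-10:])
        action = call_model(f"{context}\nSelf-prompt: what is the next action?")
        evaluation = call_model(f"{context}\nEvaluate the result of: {action}")
        memory.append(f"STEP {step}: {action} -> {evaluation}")
    return memory

if __name__ == "__main__":
    for entry in agent_loop("carry out a multi-step task"):
        print(entry)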
At the risk of anthropomorphizing, it's like being interrupted while programming. I have a clear mental picture of what I'm trying to do, then someone comes up and asks about a TPS report. Even if I could punch play on a cassette and replay the last 30 seconds of thought before the interruption, it wouldn't put me back in the right headspace to pick up where I left off. Everything the GPT-based AIs do is based on that 30-second cassette tape.
Memento was a better movie than I originally thought, I guess.