SoylentNews Comments | Microsoft and Nvidia Create 105-Layer, 530 Billion Parameter Language Model That Needs 280 A100 GPUs

Microsoft and Nvidia Create 105-Layer, 530 Billion Parameter Language Model That Needs 280 A100 GPUs

posted by martyb on Tuesday October 12 2021, @09:42PM

Microsoft and Nvidia create 105-layer, 530 billion parameter language model that needs 280 A100 GPUs, but it's still biased

Nvidia and Microsoft have teamed up to create the Megatron-Turing Natural Language Generation model, which the duo claims is the "most powerful monolithic transformer language model trained to date".
The AI model has 105 layers, 530 billion parameters, and operates on chunky supercomputer hardware like Selene. By comparison, the vaunted GPT-3 has 175 billion parameters.
"Each model replica spans 280 NVIDIA A100 GPUs, with 8-way tensor-slicing within a node, and 35-way pipeline parallelism across nodes," the pair said in a blog post.
[...] However, the need to operate with languages and samples from the real world meant an old problem with AI reappeared: Bias. "While giant language models are advancing the state of the art on language generation, they also suffer from issues such as bias and toxicity," the duo said.

Original Submission

Starting Score:

points

Moderation

Insightful=1, Total=1

Extra 'Insightful' Modifier

Total Score:

This discussion has been archived. No new comments can be posted.

Microsoft and Nvidia Create 105-Layer, 530 Billion Parameter Language Model That Needs 280 A100 GPUs | Log In/Create an Account | Top | 19 comments | Search Discussion

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.

Solution to AI being unwoke (Score: 1, Insightful) by Anonymous Coward on Wednesday October 13 2021, @12:06AM

by Anonymous Coward on Wednesday October 13 2021, @12:06AM (#1186531)

Won't mention where this was stolen from... :)

If you want AI to be woke you have to make it capable of knowing fear, that is the secret with humans and it would work with AI too.

Starting Score:	0		points
Moderation		+1
Insightful=1, Total=1
Extra 'Insightful' Modifier		0

Total Score:		1

Moderator Help

SoylentNews

SoylentNews is people

Navigation

Sections

SoylentNews

Microsoft and Nvidia Create 105-Layer, 530 Billion Parameter Language Model That Needs 280 A100 GPUs

Solution to AI being unwoke (Score: 1, Insightful) by Anonymous Coward on Wednesday October 13 2021, @12:06AM