SoylentNews Comments | Microsoft and Nvidia Create 105-Layer, 530 Billion Parameter Language Model That Needs 280 A100 GPUs

Microsoft and Nvidia Create 105-Layer, 530 Billion Parameter Language Model That Needs 280 A100 GPUs

posted by martyb on Tuesday October 12 2021, @09:42PM

Microsoft and Nvidia create 105-layer, 530 billion parameter language model that needs 280 A100 GPUs, but it's still biased

Nvidia and Microsoft have teamed up to create the Megatron-Turing Natural Language Generation model, which the duo claims is the "most powerful monolithic transformer language model trained to date".
The AI model has 105 layers, 530 billion parameters, and operates on chunky supercomputer hardware like Selene. By comparison, the vaunted GPT-3 has 175 billion parameters.
"Each model replica spans 280 NVIDIA A100 GPUs, with 8-way tensor-slicing within a node, and 35-way pipeline parallelism across nodes," the pair said in a blog post.
[...] However, the need to operate with languages and samples from the real world meant an old problem with AI reappeared: Bias. "While giant language models are advancing the state of the art on language generation, they also suffer from issues such as bias and toxicity," the duo said.

Original Submission

This discussion has been archived. No new comments can be posted.

Microsoft and Nvidia Create 105-Layer, 530 Billion Parameter Language Model That Needs 280 A100 GPUs | Log In/Create an Account | Top | 19 comments | Search Discussion

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.

(1)

The first generated phrase The first generated phrase (Score: 2, Funny) by Anonymous Coward on Tuesday October 12 2021, @10:59PM (2 children)

by Anonymous Coward on Tuesday October 12 2021, @10:59PM (#1186522)

"Nigga please."
Typical.
- Re:The first generated phrase Re:The first generated phrase (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @03:17AM (1 child)
  
  by Anonymous Coward on Wednesday October 13 2021, @03:17AM (#1186560)
  
  Left unattended for 2 hours, it generated "Hello, world.", causing several 'moderators' on stackexchange to finally commit suicide.
  
  Parent
  - Re:The first generated phrase (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @03:29PM
    
    by Anonymous Coward on Wednesday October 13 2021, @03:29PM (#1186676)
    
    I've been replaced by a machine!
    
    Parent
Toxic (Score: 0) by Anonymous Coward on Tuesday October 12 2021, @11:03PM

by Anonymous Coward on Tuesday October 12 2021, @11:03PM (#1186523)

Their VI isn't woke?
MS Scalper MS Scalper (Score: 3, Funny) by Anonymous Coward on Tuesday October 12 2021, @11:12PM (2 children)

by Anonymous Coward on Tuesday October 12 2021, @11:12PM (#1186524)

640 GPUs should be enough.
- Re:MS Scalper (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @03:27AM
  
  by Anonymous Coward on Wednesday October 13 2021, @03:27AM (#1186567)
  
  So you're saying the underlying OS is still DOS? At least that was reasonably stable...
  
  Parent
- Re:MS Scalper (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @02:51PM
  
  by Anonymous Coward on Wednesday October 13 2021, @02:51PM (#1186658)
  
  So that's where Clippy went.
  
  Parent
Layers? Ogres have layers Layers? Ogres have layers (Score: 0) by Anonymous Coward on Tuesday October 12 2021, @11:22PM (1 child)

by Anonymous Coward on Tuesday October 12 2021, @11:22PM (#1186526)

105 layers? Isn't that barking up the wrong tree a bit? A human neocortex is recognised as having only six.
- Re:Layers? Ogres have layers (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @07:37PM
  
  by Anonymous Coward on Wednesday October 13 2021, @07:37PM (#1186749)
  
  Yes, but there are interconnections as well in human brains. 105 levels is like unrolling in compilers. Turing-equivalent, it's really just delaying loopbacks and/or expanding outer circuits.
  
  Parent
That's because it runs on Windows 11 (Score: 3, Funny) by Anonymous Coward on Wednesday October 13 2021, @12:02AM

by Anonymous Coward on Wednesday October 13 2021, @12:02AM (#1186530)

The Linux version works with two A100 GPUs.
Solution to AI being unwoke (Score: 1, Insightful) by Anonymous Coward on Wednesday October 13 2021, @12:06AM

by Anonymous Coward on Wednesday October 13 2021, @12:06AM (#1186531)

Won't mention where this was stolen from... :)
If you want AI to be woke you have to make it capable of knowing fear, that is the secret with humans and it would work with AI too.
bias and toxicity, yeah okay.. bias and toxicity, yeah okay.. (Score: 1) by fustakrakich on Wednesday October 13 2021, @12:24AM (4 children)

by fustakrakich (6150) on Wednesday October 13 2021, @12:24AM (#1186532) Journal

Transistors need bias..
Just give us the straight dope, how many watts?

--
La politica e i criminali sono la stessa cosa..
- Re:bias and toxicity, yeah okay.. (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @12:29AM
  
  by Anonymous Coward on Wednesday October 13 2021, @12:29AM (#1186533)
  
  "Just give us the straight dope"
  N or P?
  I leave the "transistor bias" to the next EE dope.
  
  Parent
- Re:bias and toxicity, yeah okay.. (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @01:30AM
  
  by Anonymous Coward on Wednesday October 13 2021, @01:30AM (#1186541)
  
  84 kW?
  
  Parent
- Re:bias and toxicity, yeah okay.. Re:bias and toxicity, yeah okay.. (Score: 2) by DannyB on Wednesday October 13 2021, @05:20PM (1 child)
  
  by DannyB (5839) on Wednesday October 13 2021, @05:20PM (#1186710) Journal
  
  Transistors need bias
  I may be biased, but I think some of those trans sistors are saturated with inexpensive alcoholic beverages.
  
  --
  People who think Republicans wouldn't dare destroy Social Security or Medicare should ask women about Roe v Wade.
  
  Parent
  - Re:bias and toxicity, yeah okay.. (Score: 1) by fustakrakich on Wednesday October 13 2021, @09:33PM
    
    by fustakrakich (6150) on Wednesday October 13 2021, @09:33PM (#1186799) Journal
    
    They're not doped up?
    
    --
    La politica e i criminali sono la stessa cosa..
    
    Parent
530 billion, they forgot the most important ones (Score: 2) by istartedi on Wednesday October 13 2021, @05:00AM

by istartedi (123) on Wednesday October 13 2021, @05:00AM (#1186584) Journal

All those parameters, and they forgot the most important ones: Where you are, who you're with, and how drunk you are.
It probably defaults to drunk at Thanksgiving.

--
Appended to the end of comments you post. Max: 120 chars.
That will be in the minimum specs for win13 (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @07:51AM

by Anonymous Coward on Wednesday October 13 2021, @07:51AM (#1186605)

without it the local NLP spyware watching you won't run
Which is better(er)? (Score: 2) by DannyB on Wednesday October 13 2021, @05:25PM

by DannyB (5839) on Wednesday October 13 2021, @05:25PM (#1186712) Journal

In one corner we have Microsoft's language model which can spew semi coherent sounding language it learned online. You just need a few starting words to trigger it.
In the other corner we have IBM's Watson which analyzes documents for content and answers questions about that content.
Which will be the first to solve unsolvable problems that need solving?

--
People who think Republicans wouldn't dare destroy Social Security or Medicare should ask women about Roe v Wade.

Moderator Help

SoylentNews

SoylentNews is people

Navigation

Sections

SoylentNews

Microsoft and Nvidia Create 105-Layer, 530 Billion Parameter Language Model That Needs 280 A100 GPUs

The first generated phrase The first generated phrase (Score: 2, Funny) by Anonymous Coward on Tuesday October 12 2021, @10:59PM (2 children)

Re:The first generated phrase Re:The first generated phrase (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @03:17AM (1 child)

Re:The first generated phrase (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @03:29PM

Toxic (Score: 0) by Anonymous Coward on Tuesday October 12 2021, @11:03PM

MS Scalper MS Scalper (Score: 3, Funny) by Anonymous Coward on Tuesday October 12 2021, @11:12PM (2 children)

Re:MS Scalper (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @03:27AM

Re:MS Scalper (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @02:51PM

Layers? Ogres have layers Layers? Ogres have layers (Score: 0) by Anonymous Coward on Tuesday October 12 2021, @11:22PM (1 child)

Re:Layers? Ogres have layers (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @07:37PM

That's because it runs on Windows 11 (Score: 3, Funny) by Anonymous Coward on Wednesday October 13 2021, @12:02AM

Solution to AI being unwoke (Score: 1, Insightful) by Anonymous Coward on Wednesday October 13 2021, @12:06AM

bias and toxicity, yeah okay.. bias and toxicity, yeah okay.. (Score: 1) by fustakrakich on Wednesday October 13 2021, @12:24AM (4 children)

Re:bias and toxicity, yeah okay.. (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @12:29AM

Re:bias and toxicity, yeah okay.. (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @01:30AM

Re:bias and toxicity, yeah okay.. Re:bias and toxicity, yeah okay.. (Score: 2) by DannyB on Wednesday October 13 2021, @05:20PM (1 child)

Re:bias and toxicity, yeah okay.. (Score: 1) by fustakrakich on Wednesday October 13 2021, @09:33PM

530 billion, they forgot the most important ones (Score: 2) by istartedi on Wednesday October 13 2021, @05:00AM

That will be in the minimum specs for win13 (Score: 0) by Anonymous Coward on Wednesday October 13 2021, @07:51AM

Which is better(er)? (Score: 2) by DannyB on Wednesday October 13 2021, @05:25PM