Nvidia and Microsoft have teamed up to create the Megatron-Turing Natural Language Generation model, which the duo claims is the "most powerful monolithic transformer language model trained to date".
The AI model has 105 layers, 530 billion parameters, and operates on chunky supercomputer hardware like Selene. By comparison, the vaunted GPT-3 has 175 billion parameters.
"Each model replica spans 280 NVIDIA A100 GPUs, with 8-way tensor-slicing within a node, and 35-way pipeline parallelism across nodes," the pair said in a blog post.
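The 280-GPU figure follows from the two parallelism degrees in the quote (8 × 35). A minimal sketch of that arithmetic is below. The function names are hypothetical, and the even split of layers and parameters across GPUs is an assumption made for illustration, not something the blog post excerpt states.

```python
# Figures quoted in the summary and the blog post excerpt above.
TENSOR_PARALLEL = 8               # 8-way tensor slicing within a node
PIPELINE_PARALLEL = 35            # 35-way pipeline parallelism across nodes
NUM_LAYERS = 105
NUM_PARAMS = 530_000_000_000


def gpus_per_replica(tensor_parallel: int, pipeline_parallel: int) -> int:
    """GPUs needed to hold one copy (replica) of the model."""
    return tensor_parallel * pipeline_parallel


def layers_per_stage(num_layers: int, pipeline_parallel: int) -> int:
    """Layers per pipeline stage. ASSUMPTION: layers are split evenly."""
    if num_layers % pipeline_parallel != 0:
        raise ValueError("layers do not divide evenly across pipeline stages")
    return num_layers // pipeline_parallel


def params_per_gpu(num_params: int, gpus: int) -> float:
    """Average parameters per GPU. ASSUMPTION: parameters are sharded evenly."""
    return num_params / gpus


if __name__ == "__main__":
    gpus = gpus_per_replica(TENSOR_PARALLEL, PIPELINE_PARALLEL)
    print(gpus)  # 280
    print(layers_per_stage(NUM_LAYERS, PIPELINE_PARALLEL))  # 3
    print(f"{params_per_gpu(NUM_PARAMS, gpus) / 1e9:.2f} billion parameters per GPU")
```

Under the even-split assumption, each pipeline stage would hold 3 of the 105 layers; the real layer placement may differ.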
[...] However, the need to operate with languages and samples from the real world meant an old problem with AI reappeared: Bias. "While giant language models are advancing the state of the art on language generation, they also suffer from issues such as bias and toxicity," the duo said.
Related: OpenAI's New Language Generator GPT-3 is Shockingly Good
A College Student Used GPT-3 to Write a Fake Blog Post that Ended Up at the Top of Hacker News
A Robot Wrote This Entire Article. Are You Scared Yet, Human?
OpenAI's Text-Generating System GPT-3 Is Now Spewing Out 4.5 Billion Words a Day
(Score: 2, Funny) by Anonymous Coward on Tuesday October 12 2021, @10:59PM (2 children)
"Nigga please."
Typical.
(Score: 0) by Anonymous Coward on Wednesday October 13 2021, @03:17AM (1 child)
Left unattended for 2 hours, it generated "Hello, world.", causing several 'moderators' on stackexchange to finally commit suicide.
(Score: 0) by Anonymous Coward on Wednesday October 13 2021, @03:29PM
I've been replaced by a machine!
(Score: 0) by Anonymous Coward on Tuesday October 12 2021, @11:03PM
Their VI isn't woke?
(Score: 3, Funny) by Anonymous Coward on Tuesday October 12 2021, @11:12PM (2 children)
640 GPUs should be enough.
(Score: 0) by Anonymous Coward on Wednesday October 13 2021, @03:27AM
So you're saying the underlying OS is still DOS? At least that was reasonably stable...
(Score: 0) by Anonymous Coward on Wednesday October 13 2021, @02:51PM
So that's where Clippy went.
(Score: 0) by Anonymous Coward on Tuesday October 12 2021, @11:22PM (1 child)
105 layers? Isn't that barking up the wrong tree a bit? A human neocortex is recognised as having only six.
(Score: 0) by Anonymous Coward on Wednesday October 13 2021, @07:37PM
Yes, but there are interconnections as well in human brains. 105 levels is like unrolling in compilers. Turing-equivalent, it's really just delaying loopbacks and/or expanding outer circuits.
(Score: 3, Funny) by Anonymous Coward on Wednesday October 13 2021, @12:02AM
The Linux version works with two A100 GPUs.
(Score: 1, Insightful) by Anonymous Coward on Wednesday October 13 2021, @12:06AM
Won't mention where this was stolen from... :)
If you want AI to be woke you have to make it capable of knowing fear, that is the secret with humans and it would work with AI too.
(Score: 1) by fustakrakich on Wednesday October 13 2021, @12:24AM (4 children)
Transistors need bias..
Just give us the straight dope, how many watts?
Politics and criminals are the same thing..
(Score: 0) by Anonymous Coward on Wednesday October 13 2021, @12:29AM
"Just give us the straight dope"
N or P?
I leave the "transistor bias" to the next EE dope.
(Score: 0) by Anonymous Coward on Wednesday October 13 2021, @01:30AM
84 kW?
(Score: 2) by DannyB on Wednesday October 13 2021, @05:20PM (1 child)
I may be biased, but I think some of those trans sistors are saturated with inexpensive alcoholic beverages.
Fact: We get heavier as we age due to more information in our heads. When no more will fit it accumulates as fat.
(Score: 1) by fustakrakich on Wednesday October 13 2021, @09:33PM
They're not doped up?
Politics and criminals are the same thing..
(Score: 2) by istartedi on Wednesday October 13 2021, @05:00AM
All those parameters, and they forgot the most important ones: Where you are, who you're with, and how drunk you are.
It probably defaults to drunk at Thanksgiving.
Appended to the end of comments you post. Max: 120 chars.
(Score: 0) by Anonymous Coward on Wednesday October 13 2021, @07:51AM
without it the local NLP spyware watching you won't run
(Score: 2) by DannyB on Wednesday October 13 2021, @05:25PM
In one corner we have Microsoft's language model, which can spew semi-coherent-sounding language it learned online. You just need a few starting words to trigger it.
In the other corner we have IBM's Watson which analyzes documents for content and answers questions about that content.
Which will be the first to solve unsolvable problems that need solving?
Fact: We get heavier as we age due to more information in our heads. When no more will fit it accumulates as fat.