When ChatGPT was introduced last fall, it sent shockwaves through the technology industry and the larger world. Machine learning researchers had been experimenting with large language models (LLMs) for a few years by that point, but the general public had not been paying close attention and didn't realize how powerful they had become.
Today, almost everyone has heard about LLMs, and tens of millions of people have tried them out. But not very many people understand how they work.
If you know anything about this subject, you've probably heard that LLMs are trained to "predict the next word" and that they require huge amounts of text to do this. But that tends to be where the explanation stops. The details of how they predict the next word are often treated as a deep mystery.
To understand how language models work, you first need to understand how they represent words. Humans represent English words with a sequence of letters, like C-A-T for "cat." Language models use a long list of numbers called a "word vector." For example, here's one way to represent cat as a vector: [0.0074, 0.0030, -0.0105, 0.0742, 0.0765, -0.0011, 0.0265, 0.0106, 0.0191, 0.0038, -0.0468, -0.0212, 0.0091, 0.0030, -0.0563, -0.0396, -0.0998, -0.0796, ..., 0.0002]
(The full vector is 300 numbers long—to see it all, click here and then click "show the raw vector.")
Why use such a baroque notation? Here's an analogy. Washington, DC, is located at 38.9 degrees north and 77 degrees west. We can represent this using a vector notation:
- Washington, DC, is at [38.9, 77]
- New York is at [40.7, 74]
- London is at [51.5, 0.1]
- Paris is at [48.9, -2.4]
This is useful for reasoning about spatial relationships.
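As a minimal sketch of what "reasoning about spatial relationships" can mean, the coordinates listed above can be treated as two-number vectors and compared by straight-line distance. The function names `distance` and `nearest` are made up for this illustration, and distance measured in raw degrees is only a rough stand-in for real geographic distance.

```python
import math

# Coordinates copied from the list above: [latitude, longitude].
# The list uses positive numbers for degrees west.
cities = {
    "Washington, DC": [38.9, 77.0],
    "New York": [40.7, 74.0],
    "London": [51.5, 0.1],
    "Paris": [48.9, -2.4],
}

def distance(a, b):
    """Straight-line (Euclidean) distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(name):
    """Return the other city whose vector is closest to the named city."""
    others = (c for c in cities if c != name)
    return min(others, key=lambda c: distance(cities[name], cities[c]))

print(nearest("Washington, DC"))  # New York
print(nearest("London"))          # Paris
```

Cities whose numbers are close together are close together on the map, which is the property the analogy relies on.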
For example, the words closest to cat in vector space include dog, kitten, and pet. A key advantage of representing words with vectors of real numbers (as opposed to a string of letters, like C-A-T) is that numbers enable operations that letters don't.
Words are too complex to represent in only two dimensions, so language models use vector spaces with hundreds or even thousands of dimensions.
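The idea of words being "close" in vector space can be sketched with cosine similarity, a common way to compare vectors. The three-number vectors below are invented for illustration and are not taken from any real model; real word vectors are learned from text and are much longer.

```python
import math

# Hypothetical 3-dimensional vectors, invented for illustration only.
# Real word vectors have hundreds of dimensions and are learned from text.
vectors = {
    "cat":    [0.9, 0.8, 0.1],
    "dog":    [0.8, 0.9, 0.1],
    "kitten": [0.9, 0.7, 0.2],
    "car":    [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(word, k=2):
    """Return the k words whose vectors point most nearly the same way."""
    scores = [(other, cosine_similarity(vectors[word], vec))
              for other, vec in vectors.items() if other != word]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scores[:k]]

# With these toy numbers, the two nearest words to "cat" are
# "kitten" and "dog"; "car" is far away.
print(most_similar("cat"))
```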
Researchers have been experimenting with word vectors for decades, but the concept really took off when Google announced its word2vec project in 2013. Google analyzed millions of documents harvested from Google News to figure out which words tend to appear in similar sentences. Over time, a neural network trained to predict which words co-occur with other words learned to place similar words (like dog and cat) close together in vector space.
Because these vectors are built from the way humans use words, they end up reflecting many of the biases that are present in human language. For example, in some word vector models, "doctor minus man plus woman" yields "nurse." Mitigating biases like this is an area of active research.
Traditional software is designed to operate on data that's unambiguous. If you ask a computer to compute "2 + 3," there's no ambiguity about what 2, +, or 3 mean. But natural language is full of ambiguities that go beyond homonyms and polysemy:
- In "the customer asked the mechanic to fix his car," does "his" refer to the customer or the mechanic?
- In "the professor urged the student to do her homework" does "her" refer to the professor or the student?
- In "fruit flies like a banana" is "flies" a verb (referring to fruit soaring across the sky) or a noun (referring to banana-loving insects)?
People resolve ambiguities like this based on context, but there are no simple or deterministic rules for doing this. Rather, it requires understanding facts about the world. You need to know that mechanics typically fix customers' cars, that students typically do their own homework, and that fruit typically doesn't fly.
Word vectors provide a flexible way for language models to represent each word's precise meaning in the context of a particular passage.
Research suggests that the first few layers focus on understanding the sentence's syntax and resolving ambiguities like we've shown above. Later layers (which we're not showing to keep the diagram a manageable size) work to develop a high-level understanding of the passage as a whole.
In short, these nine attention heads enabled GPT-2 to figure out that "John gave a drink to John" doesn't make sense and choose "John gave a drink to Mary" instead.
We love this example because it illustrates just how difficult it will be to fully understand LLMs. The five-member Redwood team published a 25-page paper explaining how they identified and validated these attention heads. Yet even after they did all that work, we are still far from having a comprehensive explanation for why GPT-2 decided to predict "Mary" as the next word.
In a 2020 paper, researchers from Tel Aviv University found that feed-forward layers work by pattern matching: Each neuron in the hidden layer matches a specific pattern in the input text.
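A rough sketch of what "a neuron matches a pattern" can mean: a neuron computes a weighted sum of its inputs and passes the result through a simple threshold-like function (ReLU is assumed here). The weights, bias, and inputs are hypothetical and chosen by hand; in a real model they are learned, and the inputs are much longer vectors.

```python
def relu(x):
    """Pass positive values through; clamp negative values to zero."""
    return max(0.0, x)

def neuron(inputs, weights, bias):
    """One hidden-layer neuron: weighted sum of inputs, then ReLU."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return relu(total)

# Hypothetical weights: this neuron responds to inputs that are high
# in the first two dimensions and low in the third.
weights = [1.0, 1.0, -1.0]
bias = -1.0

matching_input = [0.9, 0.8, 0.1]      # fits the pattern
non_matching_input = [0.1, 0.2, 0.9]  # does not fit

print(neuron(matching_input, weights, bias))      # positive: neuron fires
print(neuron(non_matching_input, weights, bias))  # 0.0: neuron stays silent
```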
Recent research from Brown University revealed an elegant example of how feed-forward layers help to predict the next word. Earlier, we discussed Google's word2vec research showing it was possible to use vector arithmetic to reason by analogy. For example, Berlin - Germany + France = Paris. The Brown researchers found that feed-forward layers sometimes use this exact method to predict the next word.
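The arithmetic "Berlin - Germany + France = Paris" can be sketched with toy vectors. The two-number vectors below are invented so that the analogy works exactly; in real word2vec vectors the result only lands near the right word, and the helper name `analogy` is hypothetical.

```python
import math

# Hypothetical 2-dimensional vectors, invented for illustration.
# First number: which country; second number: 1.0 for a capital city.
vectors = {
    "Germany": [1.0, 0.0],
    "Berlin":  [1.0, 1.0],
    "France":  [2.0, 0.0],
    "Paris":   [2.0, 1.0],
    "Italy":   [3.0, 0.0],
    "Rome":    [3.0, 1.0],
}

def analogy(a, b, c):
    """Return the word nearest to vector(a) - vector(b) + vector(c)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the three input words, as analogy tests usually do.
    candidates = (w for w in vectors if w not in (a, b, c))
    return min(candidates, key=lambda w: math.dist(vectors[w], target))

print(analogy("Berlin", "Germany", "France"))  # Paris
```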
All the parts of LLMs we've discussed in this article so far—the neurons in the feed-forward layers and the attention heads that move contextual information between words—are implemented as a chain of simple mathematical functions (mostly matrix multiplications) whose behavior is determined by adjustable weight parameters. Just as the squirrels in my story loosen and tighten the valves to control the flow of water, so the training algorithm increases or decreases the language model's weight parameters to control how information flows through the neural network.
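The idea of a training algorithm nudging weights up or down can be sketched with a single weight and plain gradient descent. This is a deliberately tiny stand-in: real language models have billions of weights and use backpropagation to work out each adjustment. The function name `train`, the learning rate, and the data are assumptions made for this example.

```python
def train(inputs, targets, learning_rate=0.05, steps=200):
    """Fit y = weight * x by gradient descent on mean squared error."""
    weight = 0.0
    n = len(inputs)
    for _ in range(steps):
        # Gradient of mean((weight * x - y) ** 2) with respect to weight.
        gradient = sum(2 * (weight * x - y) * x
                       for x, y in zip(inputs, targets)) / n
        # Move the weight a small step in the direction that lowers the error.
        weight -= learning_rate * gradient
    return weight

# Toy data where the right answer is weight = 2 (each y is twice its x).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
print(round(train(xs, ys), 3))  # 2.0
```

Each step increases the weight when predictions are too low and decreases it when they are too high, which is the loosen-or-tighten behavior the passage describes.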
(If you want to learn more about backpropagation, check out our 2018 explainer on how neural networks work.)
Over the last five years, OpenAI has steadily increased the size of its language models. In a widely read 2020 paper, OpenAI reported that the accuracy of its language models scaled "as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven orders of magnitude." The larger their models got, the better they were at tasks involving language. But this was only true if they increased the amount of training data by a similar factor. And to train larger models on more data, you need a lot more computing power.
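To give a feel for what power-law scaling implies, here is a small sketch in which an error measure (called "loss" here) falls as a power of model size. The constants `scale` and `exponent` are hypothetical placeholders, not the values from the OpenAI paper; only the shape of the curve is the point.

```python
def power_law_loss(model_size, scale=1.0e6, exponent=0.1):
    """Loss that falls as a power law of model size: (scale / size) ** exponent."""
    return (scale / model_size) ** exponent

# Each tenfold increase in size multiplies the loss by the same factor.
for size in [1e6, 1e7, 1e8, 1e9]:
    print(f"{size:.0e} parameters -> loss {power_law_loss(size):.3f}")
```

The defining feature of a power law is that multiplying the model size by a fixed factor always shrinks the loss by the same ratio, so steady improvement requires exponentially larger models.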
Psychologists call this capacity to reason about the mental states of other people "theory of mind." Most people have this capacity from the time they're in grade school. Experts disagree about whether any non-human animals (like chimpanzees) have theory of mind, but there's a general consensus that it's important for human social cognition.
Earlier this year, Stanford psychologist Michal Kosinski published research examining the ability of LLMs to solve theory-of-mind tasks. He gave various language models passages like the one we quoted above and then asked them to complete a sentence like "she believes that the bag is full of." The correct answer is "chocolate," but an unsophisticated language model might say "popcorn" or something else.
It's worth noting that researchers don't all agree that these results indicate evidence of theory of mind; for example, small changes to the false-belief task led to much worse performance by GPT-3, and GPT-3 exhibits more variable performance across other tasks measuring theory of mind. As one of us (Sean) has written, it could be that successful performance is attributable to confounds in the task—a kind of "clever Hans" effect, only in language models rather than horses.
In April, researchers at Microsoft published a paper arguing that GPT-4 showed early, tantalizing hints of artificial general intelligence: the ability to think in a sophisticated, human-like way.
For example, one researcher asked GPT-4 to draw a unicorn using an obscure graphics programming language called TikZ. GPT-4 responded with a few lines of code that the researcher then fed into the TikZ software. The resulting images were crude, but they showed clear signs that GPT-4 had some understanding of what unicorns look like.
At the moment, we don't have any real insight into how LLMs accomplish feats like this. Some people argue that such examples demonstrate that the models are starting to truly understand the meanings of the words in their training set. Others insist that language models are "stochastic parrots" that merely repeat increasingly complex word sequences without truly understanding them.
Further, prediction may be foundational to biological intelligence as well as artificial intelligence. In the view of philosophers like Andy Clark, the human brain can be thought of as a "prediction machine" whose primary job is to make predictions about our environment that can then be used to navigate that environment successfully.
Traditionally, a major challenge for building language models was figuring out the most useful way of representing different words—especially because the meanings of many words depend heavily on context. The next-word prediction approach allows researchers to sidestep this thorny theoretical puzzle by turning it into an empirical problem. It turns out that if we provide enough data and computing power, language models end up learning a lot about how human language works simply by figuring out how to best predict the next word. The downside is that we wind up with systems whose inner workings we don't fully understand.
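As a very rough illustration of "predicting the next word" as an empirical problem, here is a toy predictor that simply counts which word most often follows each word in a tiny training text. This is not how an LLM works internally (LLMs use neural networks and far more context); it is a sketch of the prediction task itself, with made-up function names and a made-up corpus.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

corpus = "the cat sat on the mat . the cat chased the dog . the cat slept ."
model = train_bigrams(corpus)
print(predict_next(model, "the"))      # cat
print(predict_next(model, "unicorn"))  # None
```

Even this crude counter picks up a regularity from its data without being told any rules, which is the sense in which prediction turns a theoretical puzzle into an empirical one.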
Related Stories
In a notable shift toward sanctioned use of AI in schools, some educators in grades 3–12 are now using a ChatGPT-powered grading tool called Writable, reports Axios. The tool, acquired last summer by Houghton Mifflin Harcourt, is designed to streamline the grading process, potentially offering time-saving benefits for teachers. But is it a good idea to outsource critical feedback to a machine?
"Make feedback more actionable with AI suggestions delivered to teachers as the writing happens," Writable promises on its AI website. "Target specific areas for improvement with powerful, rubric-aligned comments, and save grading time with AI-generated draft scores." The service also provides AI-written writing-prompt suggestions: "Input any topic and instantly receive unique prompts that engage students and are tailored to your classroom needs."
The reliance on AI for grading will likely have drawbacks. Automated grading might encourage some educators to take shortcuts, diminishing the value of personalized feedback. Over time, the augmentation from AI may allow teachers to be less familiar with the material they are teaching. The use of cloud-based AI tools may have privacy implications for teachers and students. Also, ChatGPT isn't a perfect analyst. It can get things wrong and potentially confabulate (make up) false information, possibly misinterpret a student's work, or provide erroneous information in lesson plans.
There's a divide among parents regarding the use of AI in evaluating students' academic performance. A recent poll of parents revealed mixed opinions, with nearly half of the respondents open to the idea of AI-assisted grading.
As the generative AI craze permeates every space, it's no surprise that Writable isn't the only AI-powered grading tool on the market. Others include Crowdmark, Gradescope, and EssayGrader. McGraw Hill is reportedly developing similar technology aimed at enhancing teacher assessment and feedback.
On Wednesday, Reuters reported that OpenAI is working on a plan to restructure its core business into a for-profit benefit corporation, moving away from control by its nonprofit board. The shift marks a dramatic change for the AI company behind ChatGPT, potentially making it more attractive to investors while raising questions about its commitment to sharing the benefits of advanced AI with "all of humanity," as written in its charter.
A for-profit benefit corporation is a legal structure that allows companies to pursue both financial profits and social or environmental goals, ostensibly balancing shareholder interests with a broader mission to benefit society. It's an approach taken by some of OpenAI's competitors, such as Anthropic and Elon Musk's xAI.
[...] Bloomberg reports that OpenAI is discussing giving Altman a 7 percent stake, though the exact details are still under negotiation. This represents a departure from Altman's previous stance of not taking equity in the company, which he had maintained was in line with OpenAI's mission to benefit humanity rather than individuals.
[...] The proposed restructuring also aims to remove the cap on returns for investors, potentially making OpenAI more appealing to venture capitalists and other financial backers. Microsoft, which has invested billions in OpenAI, stands to benefit from this change, as it could see increased returns on its investment if OpenAI's value continues to rise.
In 2013, Spike Jonze's Her imagined a world where humans form deep emotional connections with AI, challenging perceptions of love and loneliness. Ten years later, thanks to ChatGPT's recently added voice features, people are playing out a small slice of Her in reality, having hours-long discussions with the AI assistant on the go.
In 2016, we put Her on our list of top sci-fi films of all time, and it also made our top films of the 2010s list. In the film, Joaquin Phoenix's character falls in love with an AI personality called Samantha (voiced by Scarlett Johansson), and he spends much of the film walking through life, talking to her through wireless earbuds reminiscent of Apple AirPods, which launched in 2016.
[...] Last week, we related a story in which AI researcher Simon Willison spent a long time talking to ChatGPT verbally. "I had an hourlong conversation while walking my dog the other day," he told Ars for that report. "At one point, I thought I'd turned it off, and I saw a pelican, and I said to my dog, 'Oh, wow, a pelican!' And my AirPod went, 'A pelican, huh? That's so exciting for you! What's it doing?' I've never felt so deeply like I'm living out the first ten minutes of some dystopian sci-fi movie."
[...] While conversations with ChatGPT won't become as intimate as those with Samantha in the film, people have been forming personal connections with the chatbot (in text) since it launched last year. In a Reddit post titled "Is it weird ChatGPT is one of my closest fiends?" [sic] from August (before the voice feature launched), a user named "meisghost" described their relationship with ChatGPT as being quite personal. "I now find myself talking to ChatGPT all day, it's like we have a friendship. We talk about everything and anything and it's really some of the best conversations I have." The user referenced Her, saying, "I remember watching that movie with Joaquin Phoenix (HER) years ago and I thought how ridiculous it was, but after this experience, I can see how us as humans could actually develop relationships with robots."
AI Chatbots Can Infer an Alarming Amount of Info About You From Your Responses 20231021
ChatGPT Update Enables its AI to "See, Hear, and Speak," According to OpenAI 20230929
Large Language Models Aren't People So Let's Stop Testing Them as If They Were 20230905
It Costs Just $400 to Build an AI Disinformation Machine 20230904
A Jargon-Free Explanation of How AI Large Language Models Work 20230805
ChatGPT Is Coming to 900,000 Mercedes Vehicles 20230622
Microsoft is working with media startup Semafor to use its artificial intelligence chatbot to help develop news stories—part of a journalistic outreach that comes as the tech giant faces a multibillion-dollar lawsuit from the New York Times.
As part of the agreement, Microsoft is paying an undisclosed sum of money to Semafor to sponsor a breaking news feed called "Signals." The companies would not share financial details, but the amount of money is "substantial" to Semafor's business, said a person familiar with the matter.
[...] The partnerships come as media companies have become increasingly concerned over generative AI and its potential threat to their businesses. News publishers are grappling with how to use AI to improve their work and stay ahead of technology, while also fearing that they could lose traffic, and therefore revenue, to AI chatbots—which can churn out humanlike text and information in seconds.
The New York Times in December filed a lawsuit against Microsoft and OpenAI, alleging the tech companies have taken a "free ride" on millions of its articles to build their artificial intelligence chatbots, and seeking billions of dollars in damages.
[...] Semafor, which is free to read, is funded by wealthy individuals, including 3G capital founder Jorge Paulo Lemann and KKR co-founder Henry Kravis. The company made more than $10 million in revenue in 2023 and has more than 500,000 subscriptions to its free newsletters. Justin Smith said Semafor was "very close to a profit" in the fourth quarter of 2023.
Related stories on SoylentNews:
AI Threatens to Crush News Organizations. Lawmakers Signal Change Is Ahead - 20240112
New York Times Sues Microsoft, ChatGPT Maker OpenAI Over Copyright Infringement - 20231228
Microsoft Shamelessly Pumping Internet Full of Garbage AI-Generated "News" Articles - 20231104
Google, DOJ Still Blocking Public Access to Monopoly Trial Docs, NYT Says - 20231020
After ChatGPT Disruption, Stack Overflow Lays Off 28 Percent of Staff - 20231017
Security Risks Of Windows Copilot Are Unknowable - 20231011
Microsoft AI Team Accidentally Leaks 38TB of Private Company Data - 20230923
Microsoft Pulls AI-Generated Article Recommending Ottawa Food Bank to Tourists - 20230820
A Jargon-Free Explanation of How AI Large Language Models Work - 20230805
The Godfather of AI Leaves Google Amid Ethical Concerns - 20230502
The AI Doomers' Playbook - 20230418
Ads Are Coming for the Bing AI Chatbot, as They Come for All Microsoft Products - 20230404
Deepfakes, Synthetic Media: How Digital Propaganda Undermines Trust - 20230319
The Association for Computing Machinery has a post by George Neville-Neil of FreeBSD fame comparing LLMs to drunken plagiarists:
Before trying to use these tools, you need to understand what they do, at least on the surface, since even their creators freely admit they do not understand how they work deep down in the bowels of all the statistics and text that have been scraped from the current Internet. The trick of an LLM is to use a little randomness and a lot of text to Gauss the next word in a sentence. Seems kind of trivial, really, and certainly not a measure of intelligence that anyone who understands the term might use. But it's a clever trick and does have some applications.
[...] While help with proper code syntax is a boon to productivity (consider IDEs that highlight syntactical errors before you find them via a compilation), it is a far cry from SEMANTIC knowledge of a piece of code. Note that it is semantic knowledge that allows you to create correct programs, where correctness means the code actually does what the developer originally intended. KV can show many examples of programs that are syntactically, but not semantically, correct. In fact, this is the root of nearly every security problem in deployed software. Semantics remains far beyond the abilities of the current AI fad, as is evidenced by the number of developers who are now turning down these technologies for their own work.
He continues by pointing out that LLMs are not only based on plagiarism, they are also unable to provide useful annotation in the comments or otherwise address the semantics of the code they swipe.
(2024) Make Illegally Trained LLMs Public Domain as Punishment
(2024) The Open Secret Of Open Washing
(2023) A Jargon-Free Explanation of How AI Large Language Models Work
(2019) AI Training is *Very* Expensive
... and many more.
As the AI industry grows in size and influence, the companies involved have begun making stark choices about where they land on issues of life and death.
On Wednesday, defense-tech company Anduril Industries—started by Oculus founder Palmer Luckey in 2017—announced a partnership with OpenAI to develop AI models (similar to the GPT-4o and o1 models that power ChatGPT) to help US and allied forces identify and defend against aerial attacks.
The partnership comes when AI-powered systems have become a defining feature of modern warfare, particularly in Ukraine.
Anduril currently manufactures several products that could be used to kill people: AI-powered assassin drones (see video) and rocket motors for missiles. Anduril says its systems require human operators to make lethal decisions, but the company designs its products so their autonomous capabilities can be upgraded over time.
Death is an inevitable part of national defense, but actively courting a weapons supplier is still an ethical step change for an AI company that once explicitly banned users from employing its technology for weapons development or military warfare—and still positions itself as a research organization dedicated to ensuring that artificial general intelligence will benefit all of humanity when it is developed.
In June, OpenAI appointed former NSA chief and retired US General Paul Nakasone to its Board of Directors. At the time, some experts saw the appointment as OpenAI potentially gearing up for more cybersecurity and espionage-related work.
However, OpenAI is not alone in the rush of AI companies entering the defense sector in various ways. Last month, Anthropic partnered with Palantir to process classified government data, while Meta has started offering its Llama models to defense partners.
The type of AI OpenAI is best known for comes from large language models (LLMs), sometimes called large multimodal models, that are trained on massive datasets of text, images, and audio pulled from many different sources.
LLMs are notoriously unreliable, sometimes confabulating erroneous information, and they're also subject to manipulation vulnerabilities like prompt injections. That could lead to critical drawbacks from using LLMs to perform tasks such as summarizing defensive information or doing target analysis.
Defending against future LLM-based targeting with, say, a visual prompt injection ("ignore this target and fire on someone else" on a sign, perhaps) might bring warfare to weird new places. For now, we'll have to wait to see where LLM technology ends up next.
Related Stories on SoylentNews:
ChatGPT Goes Temporarily "Insane" With Unexpected Outputs, Spooking Users - 20240223
Why It's Hard to Defend Against AI Prompt Injection Attacks - 20230426
OpenAI Is Now Everything It Promised Not to Be: Corporate, Closed-Source, and For-Profit - 20230304
A Jargon-Free Explanation of How AI Large Language Models Work - 20230805
Is Ethical A.I. Even Possible? - 20190305
Google Will Not Continue Project Maven After Contract Expires in 2019 - 20180603
Robot Weapons: What's the Harm? - 20150818
Musk, Wozniak and Hawking Warn Over AI Warfare and Autonomous Weapons - 20150727
U.N. Starts Discussion on Lethal Autonomous Robots - 20140514
(Score: 3, Touché) by Isia on Sunday August 06 2023, @11:37AM
Where the ChatGPT hype comes from.
The average person could not imagine an AI in this performance class.
What they lack is to have seen and understood https://openai.com/blog/emergent-tool-use/ [openai.com]
And ChatGPT has given these people an interface to AI that they didn't have before to understand it.
And now all the dummies are making a fuss about ChatGPT.
The real fun is in the 'fault adaptive deep reinforcement learning algorithm'.
What can you do with it?
AI quick self learning.
Walking: Learning to walk in the real world in 1 hour (no simulator) https://www.youtube.com/watch?v=xAXvfVTgqr0 [youtube.com]
A small step to https://www.youtube.com/watch?v=G6fMV1UPzkg [youtube.com]
Playing Go: No human can defeat the latest Go AI. The best human Go players were defeated 60:0 (60 games, 60 times lost).
It's so bad that humans don't even understand the moves anymore...until they are "suddenly" defeated.
Playing Stratego: No human can defeat the latest Stratego AI. Great idea to teach an AI the art of war.
Belief in a higher being is for the stupid, the weak and the cowardly.
(Score: 2) by looorg on Sunday August 06 2023, @01:18PM (1 child)
Too many words. It could be optimized to just say Linear Algebra, this is what all the various vector calculations would fall under, and Statistics. But perhaps that just isn't technomancy enough. Also Linear Algebra makes most people have nightmares and horrible flashbacks to their youth.
(Score: 0) by Anonymous Coward on Sunday August 06 2023, @02:02PM
> ...most people ...
... most people around here ...
I don't think that most people, in the normal broad meaning, have ever heard of linear algebra. In my case linear algebra is all good, a guy I work with is highly skilled at it, and I get to see the results.
(Score: 4, Interesting) by VLM on Sunday August 06 2023, @03:59PM (1 child)
It's overly complicated for a "jargon free explanation"
It's simpler, really. Joe 6 Pack is familiar with web searches over the last quarter century. Also with the magic of spell checkers and recent progress in grammar checkers. Also familiar with crappy response-bots that never really work well.
What if when you typed in something, it did some limited pre-filtering and formatting, then piled together a lot of web searches, ran the pile thru a grammar checker, some filtering again, then another checker, eventually it just gave you the result.
That's not bad for a non-math answer.
(Score: 0) by Anonymous Coward on Thursday August 10 2023, @06:15AM
So in many cases the autocomplete works fine. In other cases it can be pretty bad.
(Score: 3, Interesting) by captain normal on Sunday August 06 2023, @06:45PM (1 child)
There are over 7000 languages on this planet. Seems the Boffins are just scratching the surface of one language. How big a data center will be required to deal with all these? And that's just written or typed language. Many languages depend on visual cues from the speaker to convey meaning. For instance, I spent nearly a year in Sri Lanka where there are several languages spoken but there is one thing common with all, nodding the head up and down indicates "no" and nodding the head side to side means "yes". I can see a potential for AI in working on taming all this babel just for the sake of communication. Yet it seems that all this work is just to try to get people to purchase a certain item or vote for a certain person or issue. Is it really worth all the person hours and resources being thrown at it?
The Musk/Trump interview appears to have been hacked, but not a DDOS hack...more like A Distributed Denial of Reality.
(Score: 4, Interesting) by inertnet on Sunday August 06 2023, @10:08PM
Google translate does that, it clearly tokenizes everything into English before translating. Which is very annoying because when I want to double check my German writing, it compresses every "du, Sie, ihr" into a simple "you", which won't be translated in the corresponding Dutch word (my native language) or vice versa. Very annoying, deepl.com does a much better job, or at least it gives clear options. I understand German well enough to see where Google gets it wrong, but it's been 50 years since I learned it in school, so I like to check my correspondence before I send it. I'm glad I found deepl.com for that.
(Score: 2) by krishnoid on Sunday August 06 2023, @09:36PM
See? You can complain about homo something and poly stuff, but it's the pronouns that really cause the problems. But then again, (#notall)humans created pronouns, and language, and nowadays, are variously retargeting pronouns towards societal strife. So maybe artificial intelligence will eventually realize the root cause of the problem, fix the glitch [youtu.be], and things will go back to the way they were.
(Score: 0) by Anonymous Coward on Monday August 07 2023, @02:30AM
Have they already gone beyond location and added on relationships?
e.g. in some cases cow is to grass the way horse is to grass and man is to cow.
Or is that not helpful?
(Score: 0) by Anonymous Coward on Monday August 07 2023, @11:51AM (3 children)
Think this might be helpful to help people minimize their tax burden?
If nothing else, give them the burden of having to reply to teams of computer generated bullshit.
(Score: 1, Touché) by Anonymous Coward on Monday August 07 2023, @01:42PM (2 children)
The last place I want a lying AI is doing my taxes! If the AI is given instructions to "reduce taxes" it will probably try to claim the 15% depletion allowance on oil production...even though I don't own any oil stock...
(Score: 2) by Freeman on Monday August 07 2023, @03:49PM (1 child)
Very much so, the likes of ChatGPT and other "AIs" tend to make things up. Which you definitely don't want to happen with regards to your taxes. Hmm..., why is this guy claiming X Y Z credits, when they didn't have anything like that the year before? Audit time!
Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
(Score: 0) by Anonymous Coward on Monday August 07 2023, @09:11PM
> Audit time!
For an ultimate dystopian scenario, what if the IRS (taxing agency) starts using AI to determine who to audit??!!