SoylentNews is people

posted by janrinok on Saturday June 08, @09:55PM
from the I-wonder-what-Betteridge-would-say dept.

Arthur T Knackerbracket has processed the following story:

[Editor's Note: RAG: retrieval-augmented generation]

We’ve been living through the generative AI boom for nearly a year and a half now, following the late 2022 release of OpenAI’s ChatGPT. But despite transformative effects on companies’ share prices, generative AI tools powered by large language models (LLMs) still have major drawbacks that have kept them from being as useful as many would like them to be. Retrieval augmented generation, or RAG, aims to fix some of those drawbacks.

Perhaps the most prominent drawback of LLMs is their tendency toward confabulation (also called “hallucination”), which is a statistical gap-filling phenomenon AI language models produce when they are tasked with reproducing knowledge that wasn’t present in the training data. They generate plausible-sounding text that can veer toward accuracy when the training data is solid but otherwise may just be completely made up.

Relying on confabulating AI models gets people and companies in trouble, as we’ve covered in the past. In 2023, we saw two instances of lawyers citing legal cases, confabulated by AI, that didn’t exist. We’ve covered claims against OpenAI in which ChatGPT confabulated and accused innocent people of doing terrible things. In February, we wrote about Air Canada’s customer service chatbot inventing a refund policy, and in March, a New York City chatbot was caught confabulating city regulations.

[...] “RAG is a way of improving LLM performance, in essence by blending the LLM process with a web search or other document look-up process” to help LLMs stick to the facts, according to Noah Giansiracusa, associate professor of mathematics at Bentley University.
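The look-up-then-generate idea in that quote can be sketched in a few lines. This is a minimal illustration, not how any particular product does it: the `DOCUMENTS` list, the word-overlap `score`, and the prompt wording are all assumptions made up for the example, and a real system would use a search engine or embedding similarity and would pass the prompt to an actual LLM.

```python
import re
from collections import Counter

# Hypothetical mini-corpus standing in for a web search index or document store.
DOCUMENTS = [
    "Refund requests must be submitted within 30 days of purchase.",
    "The cafeteria is open from 8am to 3pm on weekdays.",
    "A refund is issued to the original payment method.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query, doc):
    """Count shared words (a crude stand-in for real search ranking)."""
    return sum((Counter(tokenize(query)) & Counter(tokenize(doc))).values())

def retrieve(query, docs, k=2):
    """Return the k documents that share the most words with the query."""
    return sorted(docs, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query, docs, k=2):
    """Put the retrieved passages in front of the question for the model to read."""
    context = "\n".join(f"- {passage}" for passage in retrieve(query, docs, k))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say you don't know.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

prompt = build_prompt("How do I get a refund?", DOCUMENTS)
print(prompt)
```

The model itself is untouched here; the only thing that changes is what text it is handed alongside the question.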

[...] Although RAG is now seen as a technique to help fix issues with generative AI, it actually predates ChatGPT. The term was coined in a 2020 academic paper by researchers at Facebook AI Research (FAIR, now Meta AI Research), University College London, and New York University.

As we've mentioned, LLMs struggle with facts. Google’s entry into the generative AI race, Bard, made an embarrassing error about the James Webb Space Telescope in its first public demonstration in February 2023. The error wiped around $100 billion off the value of parent company Alphabet. LLMs produce the most statistically likely response based on their training data and don’t understand anything they output, meaning they can present false information that seems accurate if you don't have expert knowledge on a subject.

LLMs also lack up-to-date knowledge and the ability to identify gaps in their knowledge. “When a human tries to answer a question, they can rely on their memory and come up with a response on the fly, or they could do something like Google it or peruse Wikipedia and then try to piece an answer together from what they find there—still filtering that info through their internal knowledge of the matter,” said Giansiracusa.

But LLMs aren’t humans, of course. Their training data can age quickly, particularly in more time-sensitive queries. In addition, the LLM often can’t distinguish specific sources of its knowledge, as all its training data is blended together into a kind of soup.

In theory, RAG should make keeping AI models up to date far cheaper and easier. “The beauty of RAG is that when new information becomes available, rather than having to retrain the model, all that’s needed is to augment the model’s external knowledge base with the updated information,” said Peterson. “This reduces LLM development time and cost while enhancing the model’s scalability.”
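A toy sketch of that update path, under loud assumptions: `KnowledgeBase`, `frozen_model`, and the sample entries are hypothetical names invented for illustration, and the "model" is a stub that only repeats the evidence it is given. The point is only that adding an entry changes the answer while the model code stays the same.

```python
import re

def words(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

class KnowledgeBase:
    """Toy external store; a real system would use a search index or vector database."""

    def __init__(self):
        self.entries = []  # list of (source, text) pairs

    def add(self, source, text):
        self.entries.append((source, text))

    def lookup(self, query):
        """Return the (source, text) entry sharing the most words with the query, or None."""
        best, best_overlap = None, 0
        for source, text in self.entries:
            overlap = len(words(query) & words(text))
            if overlap > best_overlap:
                best, best_overlap = (source, text), overlap
        return best

def frozen_model(query, evidence):
    """Stand-in for an LLM that is never retrained; it only repeats its evidence."""
    if evidence is None:
        return "I don't know."
    source, text = evidence
    return f"{text} [source: {source}]"

kb = KnowledgeBase()
kb.add("handbook-2023", "The office closes at 5pm.")
question = "When does the office close on Fridays?"
before = frozen_model(question, kb.lookup(question))

# New information arrives: only the knowledge base changes, not the model.
kb.add("memo-2024", "On Fridays the office closes at 3pm.")
after = frozen_model(question, kb.lookup(question))

print(before)
print(after)
```

Because each entry carries a source label, the answer can also say where it came from, which the blended training data cannot do.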

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 4, Informative) by drussell on Saturday June 08, @10:09PM (1 child)

    by drussell (2678) on Saturday June 08, @10:09PM (#1359879) Journal


    • (Score: 1, Interesting) by Anonymous Coward on Sunday June 09, @01:33AM

      by Anonymous Coward on Sunday June 09, @01:33AM (#1359898)

      SN is offering an appropriate fortune cookie just now,

      In good speaking, should not the mind of the speaker know the truth of the matter about which he is to speak? -- Plato

  • (Score: 4, Touché) by Rosco P. Coltrane on Saturday June 08, @10:12PM

    by Rosco P. Coltrane (4757) on Saturday June 08, @10:12PM (#1359880)

    AI aren't the only semi-sentient morons on the internet who make stuff up.

  • (Score: 2, Flamebait) by looorg on Saturday June 08, @10:55PM

    by looorg (578) on Saturday June 08, @10:55PM (#1359884)

    So the LLM is on the RAG? Could explain that lack of logic. I'm not even a native English speaker but even I know this is a really poor acronym.

  • (Score: 3, Interesting) by ElizabethGreene on Sunday June 09, @01:49AM

    by ElizabethGreene (6748) Subscriber Badge on Sunday June 09, @01:49AM (#1359902) Journal

    Bing uses RAG, if you use its copilot feature.

  • (Score: 5, Interesting) by gznork26 on Sunday June 09, @01:56AM

    by gznork26 (1159) on Sunday June 09, @01:56AM (#1359903) Homepage Journal

    In order for an AI to be able to recognize the limits of its resources rather than making stuff up, wouldn't it have to understand the subject matter, rather than just make plausible sounding strings of words? For the AI to answer, 'I don't know', we'd need something beyond the robot parrots we have now. Could a test of a true artificial intelligence be its ability to admit ignorance?

    Khipu were Turing complete.
  • (Score: 2, Interesting) by anubi on Sunday June 09, @02:49AM (1 child)

    by anubi (2828) on Sunday June 09, @02:49AM (#1359905) Journal

    When tasked for a response, and you have nothing...

    Make Something Up !

    It's standard operating procedure.


    The problem we have is humans have been strongly ingrained with the meme that computers are always right, which traditionally has always been true. A computer spewing forth error is either operating on erroneous data or has a fault in its logic or a bad power supply.

    Now, the state of the art is advancing to the point a computer can lie just as convincingly as a used-car salesman...ummm...well anyone who has mastered marketing psychology.

    Some people will go to extreme lengths to persuade others and often use deceit to do so. When you know what to look for, people who are inclined to prey on others' goodwill will show obsessions with money, power, rank, personal grooming, anything they can find to virtue signal that they are of rank and superior to others.

    Most of us see through it. May look good, but has no substance. However other leadership types will readily place them into organizational power structure as their ethics are much easier to suppress with rank and privileges. People with ethics will often destroy their own career to expose what they consider to be morally wrong.

    Well, does an AI know right from wrong?

    Does anybody?

    I do know we have wildly different responses to being suppressed when we have an ethical bifurcation.

    Do we really want this attribute in our machines?

    I liked working with the simpler machines because I had problems with subordination to both my own moral code and that sometimes expected of me.

    Would I tolerate a voltmeter that lies to me?

    We have more than enough evil in us already. Use a computer for numerical analysis, fine, but once we start programming it to learn to lie....

    I know these are inference engines. So am I. I hope I retain the ethics to not spew bullshit as truth.

    The last thing I need are scheming machines all planning things ( "sorry, that's classified" ) behind my back.

    "Prove all things; hold fast that which is good." [KJV: I Thessalonians 5:21]
    • (Score: 1) by anubi on Sunday June 09, @08:18AM

      by anubi (2828) on Sunday June 09, @08:18AM (#1359922) Journal

      I did it again.
      I thought the RAG was the daydream conjured up to fill in missing data.
      Only to discover RAG is the attempt to reference where the data comes from. Exactly what I was pontificating for.
      Geez, I am getting quite bad at this.
      I can see why I don't need to get where I can do much damage.

      "Prove all things; hold fast that which is good." [KJV: I Thessalonians 5:21]
  • (Score: 3, Interesting) by istartedi on Sunday June 09, @05:30PM

    by istartedi (123) on Sunday June 09, @05:30PM (#1359951) Journal

    The "hallucinations" were so obvious that some kind of fact-checking pass seemed like it should have been put in before any release. It's a bit like anti-virus where you say, "Why doesn't MS just fix their software?". The answer is, "That's hard, the band-aid is easier". Fixing AI seems much harder, perhaps even not possible at this time since most of them are a black box (though I've seen one company that claims to be capable of analyzing it, forgot the name). Given that, an anti-virus like band-aid approach seems logical.

    Appended to the end of comments you post. Max: 120 chars.
  • (Score: 2) by SomeRandomGeek on Monday June 10, @05:34PM (1 child)

    by SomeRandomGeek (856) on Monday June 10, @05:34PM (#1360050)

    I don't understand why the hallucination problem is so hard to solve. Can't they just train an AI to answer the question "Does this AI-written passage contain hallucinations?"
    Then they can either use that to train the LLM better, or just substitute a failure message for the original hallucination message.

    • (Score: 0) by Anonymous Coward on Monday June 10, @10:56PM

      by Anonymous Coward on Monday June 10, @10:56PM (#1360091)

      Nice idea, add the test before the answer is shown to the requestor (person).

      Or, make it possible to turn on and off, similar to the old option to verify that MS-DOS allowed after a disk operation (adding the verify pass extended the time to make the transfer).
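The switchable check discussed in this thread might look roughly like the sketch below. Everything in it is hypothetical: `looks_unsupported` is a placeholder substring test, not a trained hallucination detector, and `respond` with its `verify` flag is just an illustration of the on/off idea, loosely like the MS-DOS verify option.

```python
def looks_unsupported(answer, evidence):
    """Placeholder detector: flag an answer that does not appear in the evidence.
    A real detector would be a trained model; this stub is only a substring check."""
    return answer.lower() not in evidence.lower()

def respond(answer, evidence, verify=True):
    """Return the answer, or a failure message if the optional verify pass flags it."""
    if verify and looks_unsupported(answer, evidence):
        return "Sorry, I could not verify that answer."
    return answer

evidence = "Refunds are available within 30 days."
draft = "Refunds are available within 90 days."

checked = respond(draft, evidence)                   # verify pass on
unchecked = respond(draft, evidence, verify=False)   # verify pass off

print(checked)
print(unchecked)
```

With verification on, the requestor sees the failure message instead of the unsupported draft; with it off, the draft goes through unchanged and the extra pass is skipped.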