
SoylentNews is people

posted by hubie on Monday May 04, @09:39AM

A ChatGPT AI has proved a conjecture with a method no human had thought of. Experts believe it may have further uses:

Liam Price just cracked a 60-year-old problem that world-class mathematicians have tried and failed to solve. He's 23 years old and has no advanced mathematics training. What he does have is a ChatGPT Pro subscription, which gives him access to the latest large language models from OpenAI.

Artificial intelligence has recently made headlines for solving a number of "Erdős problems," conjectures left behind by the prolific mathematician Paul Erdős. But experts have warned that these problems are an imperfect benchmark of artificial intelligence's mathematical prowess. They range dramatically in both significance and difficulty, and many AI solutions have turned out to be less original than they appeared.

The new solution—which Price got in response to a single prompt to GPT-5.4 Pro and posted on www.erdosproblems.com, a website devoted to the Erdős problems, just over a week ago—is different. The problem it solves has eluded some prominent minds, bestowing it some esteem. And more importantly, the AI seems to have used a totally new method for problems of this kind. It's too soon to say with certainty, but this LLM-conceived connection may be useful for broader applications—something hard to find among recently touted AI triumphs in math.

"This one is a bit different because people did look at it, and the humans that looked at it just collectively made a slight wrong turn at move one," says Terence Tao, a mathematician at the University of California, Los Angeles, who has become a prominent scorekeeper for AI's push into his field. "What's beginning to emerge is that the problem was maybe easier than expected, and it was like there was some kind of mental block."

The question Price solved—or prompted ChatGPT to solve—concerns special sets of whole numbers, where no number in the set can be evenly divided by any other. Erdős called these "primitive sets" because of their connection to similarly indivisible prime numbers.

"A number is prime if it has no other divisors, and this is kind of generalizing that definition from an individual number to a collection of numbers," says Jared Duker Lichtman, a mathematician at Stanford University. Any set of prime numbers is automatically primitive, because primes have no factors (except themselves and the number one).
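The definition of a primitive set quoted above is easy to check mechanically. The following is a minimal sketch, assuming the input is a finite collection of positive integers; the function name `is_primitive` is made up for illustration and is not taken from the article or the proof.

```python
from itertools import combinations

def is_primitive(numbers):
    """Return True if no element of the set divides a different element.

    Assumes every element is a positive integer.
    """
    values = sorted(set(numbers))
    # After sorting, only the smaller member of a pair can divide the larger.
    return all(b % a != 0 for a, b in combinations(values, 2))

print(is_primitive({2, 3, 5, 7, 11}))  # True: any set of primes is primitive
print(is_primitive({6, 10, 15}))       # True: no member divides another
print(is_primitive({2, 3, 6}))         # False: 2 divides 6
```

The second example shows that a primitive set need not consist of primes, which is the sense in which the notion generalizes primality from single numbers to collections.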

[...] "There was kind of a standard sequence of moves that everyone who worked on the problem previously started by doing," Tao says. The LLM took an entirely different route, using a formula that was well known in related parts of math, but which no one had thought to apply to this type of question.

"The raw output of ChatGPT's proof was actually quite poor. So it required an expert to kind of sift through and actually understand what it was trying to say," Lichtman says. But now he and Tao have shortened the proof so that it better distills the LLM's key insight.

More importantly, they already see other potential applications of the AI's cognitive leap. "We have discovered a new way to think about large numbers and their anatomy," Tao says. "It's a nice achievement. I think the jury is still out on the long-term significance."

Lichtman is hopeful because ChatGPT's discovery validates a sense he's had since graduate school. "I had the intuition that these problems were kind of clustered together and they had some kind of unifying feel to them," he says. "And this new method is really confirming that intuition."



This discussion was created by hubie (1068) for logged-in users only.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 4, Interesting) by looorg on Monday May 04, @10:32AM (6 children)

    by looorg (578) on Monday May 04, @10:32AM (#1441517)

    "This one is a bit different because people did look at it, and the humans that looked at it just collectively made a slight wrong turn at move one," says Terence Tao, a mathematician at the University of California, Los Angeles, who has become a prominent scorekeeper for AI's push into his field. "What's beginning to emerge is that the problem was maybe easier than expected, and it was like there was some kind of mental block."

    "There was kind of a standard sequence of moves that everyone who worked on the problem previously started by doing," Tao says. The LLM took an entirely different route, using a formula that was well known in related parts of math, but which no one had thought to apply to this type of question.

    This is the interesting aspect then. Everyone used a way of thinking or "moves". But that was wrong for this type of problem. Perhaps that previous "move" that everyone was making is then wrong? Even for other things. Perhaps it's something we should look at ...

    • (Score: 0) by Anonymous Coward on Monday May 04, @01:23PM (2 children)

      by Anonymous Coward on Monday May 04, @01:23PM (#1441535)

      Tell me you’re not a mathematician, without telling me you’re not a mathematician.

      • (Score: 5, Interesting) by Mojibake Tengu on Monday May 04, @02:30PM (1 child)

        by Mojibake Tengu (8598) on Monday May 04, @02:30PM (#1441548) Journal

        $ ghci
        GHCi, version 9.10.3: https://www.haskell.org/ghc/  :? for help
        ghci> let 1 + 1 = 3
        ghci> 1 + 1
        3

        Well, I am programmer. We programmers can do anything. Literally. We are free. Mathematicians are bound.

        --
        Rust programming language offends both my Intelligence and my Spirit.
    • (Score: 2, Informative) by shrewdsheep on Monday May 04, @01:39PM (1 child)

      by shrewdsheep (5215) on Monday May 04, @01:39PM (#1441539) Journal

      This often happens when proofs of hard problems are finally found. Thinking outside the box is (always) required for hard problems as many extremely smart people have given them a try already. The current problem is about sets of numbers. One can try to check divisibility traditionally. You can also interpret the numbers as coordinates of points and go from there, taking a geometric approach. Or you could interpret them as codes encoding formulas (Gödel style) to analyze the problem that way. And so on. For a certain approach the solution is more obvious than for another. The key insight is not mentioned in TFS. If somebody did read TFA, please share.

      • (Score: 3, Informative) by looorg on Monday May 04, @05:54PM

        by looorg (578) on Monday May 04, @05:54PM (#1441569)

        I did look but I might have missed it. I have not looked at (or for) any kind of published proof. But from the article it seems that a lot of people have looked at the problem. They have apparently tried some agreed-upon previous knowledge in set theory, which has in this case utterly failed. So it would be interesting to know what they did, since it failed here. So this problem is, or could be, an exception to the previous methods. Or when this proof is done it would be interesting to see whether it is applicable to other problems and works better than the previous solutions. But as far as I can tell the article did not really go that far. Perhaps there is something at the homepage in question or Tao put something on his page. I have not looked.

    • (Score: 5, Interesting) by JoeMerchant on Monday May 04, @03:53PM

      by JoeMerchant (3937) on Monday May 04, @03:53PM (#1441554)

      This is the power of "naive perspective".

      It has been a recognized power of human problem solving for many years. A common model is: take two or more experts in their fields and have them collaborate on a problem adjacent to their areas of expertise. None of these experts are "tied down" by the established dogma of the problem space, but they do have insights from their own areas of expertise which may (or more often may not) bring innovative solutions to stubborn aspects of the problem they are collaborating on.

      It has been fashionable since the 1990s for people with M.D. degrees to step out of medicine and get another degree in Engineering or Physics and then come back and apply their "dual pronged approach" to problems in the medical field.

      While I also (just recently, repeatedly with my colleagues) maintain that well established methods are often optimal simply because decades of refinement by hundreds or thousands of practitioners have brought those methods to a "local minimum" of negative aspects, it is also true that there may be other approaches whose "local minima" have even fewer negative aspects or negative aspects of lesser concern... unfortunately, these comparisons are rarely clear-cut and in practical fields there is usually tremendous resistance to change, so even if your novel approach is better in many ways, it's going to have to demonstrate itself as radically undeniably better because of the inherent difficulty (and real costs) of getting practitioners to change.

      However, mathematics - being a wholly invented field with no messy problems of reality intruding with uninvited observations - is less inherently resistant to such changes. A proof is a proof is a proof (although an unconventional and complex proof does meet with significant community resistance even when it is undeniably correct...) As such, I believe the naive perspective is _more_ powerful in mathematical circles exactly because results are concrete, quantifiable, and even a tiny improvement is undeniably an improvement in the metric being measured. Now, naive interlopers will often make errors - but these are again testable in a binary sense: is it an error, or isn't it? If an LLM has been trained in recognition of errors vs correct results, it can apply the infinite number of monkeys banging away on all "likely looking approaches" from all kinds of perspectives both inside and outside the established dogma - and not bother us until it finds something both novel and correct.

      --
      🌻🌻🌻🌻 [google.com]
  • (Score: 4, Informative) by Thexalon on Monday May 04, @11:30AM (14 children)

    by Thexalon (636) on Monday May 04, @11:30AM (#1441521)

    So posting a potential solution to a problem is cool, and I would encourage anyone to try to solve these sorts of things whenever possible because you'll learn a lot from trying. The vast majority of the time, what you'll learn is that your approach has been tried a bunch of times before, and doesn't work, and here's a patient explanation from a person who has studied this much more than you have about why your approach doesn't work. But if you maintain your passion for the problem, you'll go after it with a different approach, and yet another approach, and keep at it until you've become a genuine expert at it.

    But there's a big difference between posting what appears to be a proof to an amateur and what's actually a proof. If it's only been up for a few weeks, I have suspicions that it has not yet had the kind of vetting it needs from other mathematicians before it becomes a canonical proof, and it's likely that a few revisions will be required as clever people find flaws in it. That kind of thing is normal: If I recall, Andrew Wiles had to do that a few times with his proof of Fermat's Last Theorem.

    --
    "Think of how stupid the average person is. Then realize half of 'em are stupider than that." - George Carlin
    • (Score: 2, Insightful) by khallow on Monday May 04, @11:59AM (5 children)

      by khallow (3766) Subscriber Badge on Monday May 04, @11:59AM (#1441524) Journal
      There's a lot of failed proofs out there. That's the usual state of things actually. But a failed proof that tries a new, valid approach is valuable. Similarly, a failed proof that organizes itself into digestible pieces - basically a roadmap that a real proof can follow later - has value.

      The worst failed proofs are endless waves of math with no real organization and a math error or two hidden on page eight. I've seen amateur proofs of Fermat's theorem that fall into that category. Even if you can figure out where the math errors happened, there's no value to the proof. It doesn't illuminate any aspects of the problem you're trying to solve.
      • (Score: 4, Insightful) by SomeGuy on Monday May 04, @01:10PM (2 children)

        by SomeGuy (5632) on Monday May 04, @01:10PM (#1441532)

        So what is the value of a proof that has been probabilistically slopped together using every existing resource that could be hoovered down and hallucinated with no real understanding into a form that looks right to someone with no advanced mathematics training?

        got in response to a single prompt

        Which ignores the who knows how many attempts by an infinite number of monkeys pounding away at shatgpt to produce the same thing. Oh, THIS one looks like it might do something, ok?

        "The raw output of ChatGPT's proof was actually quite poor. So it required an expert to kind of sift through and actually understand what it was trying to say," Lichtman says. But now he and Tao have shortened the proof so that it better distills the LLM's key insight.

        So they were basically reading tea leaves and the prediction that good fortune would come today happened to come true. :P

        Experts believe it may have further uses:

        As fertilizer.

        The real news is that a PERSON came up with a different approach - they just happened to use a creativity tool other than drugs.

        • (Score: 4, Insightful) by JoeMerchant on Monday May 04, @08:36PM

          by JoeMerchant (3937) on Monday May 04, @08:36PM (#1441589)

          >with no real understanding

          Irrelevant. It is either a correct proof, or it is not. If it is a novel correct proof, it has value, whether it was hand penned by Grigori Perelman or Terence Tao, or hallucinated by Grok 4.3, or shat out by seagulls as a spatter pattern on the side of a glass building... the value is in the content, not the origin.

          >into a form that looks right

          That's always a nice touch... again, the value is in what it proves, how it proves it, and whether it is correct or not.

          >got in response to a single prompt

          I suspect this hides a bit of background. I mean, maybe the engine was primed to make this observation in its base state, but I find that the longer I interact with them, the more they shape their output into forms I want to see...

          >>"The raw output of ChatGPT's proof was actually quite poor. So it required an expert to kind of sift through and actually understand what it was trying to say," Lichtman says. But now he and Tao have shortened the proof so that it better distills the LLM's key insight.

          >So they were basically reading tea leaves and the prediction that good fortune would come today happened to come true. :P

          Actually, no, what this is saying is: the first draft received from ChatGPT was not "in a form that looks right"; it was a convoluted mess, which nonetheless contained valuable information in the form of a correct proof that was later refined into a more palatable form.

          > a PERSON came up with a different approach - they just happened to use a creativity tool other than drugs.

          Always. Until the dolphin or other species get bored enough that they decide to start participating in our esoteric games.

          --
          🌻🌻🌻🌻 [google.com]
        • (Score: 1) by khallow on Tuesday May 05, @01:09AM

          by khallow (3766) Subscriber Badge on Tuesday May 05, @01:09AM (#1441599) Journal
          They didn't say the proof was great. But it has been looked at by real experts in the field (peer-reviewed BTW) and cleaned up to the point that it's not a mess.
      • (Score: 2) by Thexalon on Monday May 04, @06:52PM (1 child)

        by Thexalon (636) on Monday May 04, @06:52PM (#1441578)

        I agree with what you're getting at: There's "right", "wrong", and "not even wrong" (attributed to Wolfgang Pauli). And there's useful kinds of wrong, certainly.

        What I'm cautioning against is thinking that (a) this proof is necessarily right before it's been tested heavily, and (b) assuming that this particular kind of wrong is any kind of useful when it may or may not be. Something like this without any kind of peer review, however correct or useful it might wind up being, is a long way from a Fields Medal.

        --
        "Think of how stupid the average person is. Then realize half of 'em are stupider than that." - George Carlin
        • (Score: 2) by lars_stefan_axelsson on Tuesday May 05, @07:15AM

          by lars_stefan_axelsson (3590) on Tuesday May 05, @07:15AM (#1441611)

          You missed the part where Terence Tao helped develop/rewrite the proof? As in the Fields medalist Terence Tao: https://en.wikipedia.org/wiki/Terence_Tao [wikipedia.org]

          That's far beyond "some random guy on the internet"-level of scrutiny.

          I swear, the levels of contortions people go through to make any advancement by LLMs seem like it's just smoke and mirrors completely amazes me. But given the threat to jobs and ego, well probably mostly ego, that's perhaps not surprising.

          --
          Stefan Axelsson
    • (Score: 3, Interesting) by gnuman on Monday May 04, @02:05PM (1 child)

      by gnuman (5013) on Monday May 04, @02:05PM (#1441546)

      from TFA,

      “There was kind of a standard sequence of moves that everyone who worked on the problem previously started by doing,” Tao says. The LLM took an entirely different route, using a formula that was well known in related parts of math, but which no one had thought to apply to this type of question.

      “The raw output of ChatGPT’s proof was actually quite poor. So it required an expert to kind of sift through and actually understand what it was trying to say,” Lichtman says. But now he and Tao have shortened the proof so that it better distills the LLM’s key insight.

      More importantly, they already see other potential applications of the AI’s cognitive leap. “We have discovered a new way to think about large numbers and their anatomy,” Tao says. “It’s a nice achievement. I think the jury is still out on the long-term significance.”

      Lichtman is hopeful because ChatGPT’s discovery validates a sense he’s had since graduate school. “I had the intuition that these problems were kind of clustered together and they had some kind of unifying feel to them,” he says. “And this new method is really confirming that intuition.”

      This looks like verification, more or less. And Jared Duker Lichtman is a mathematician at Stanford University. So, basically, Price vibe-mathed the thing and passed it to some math student. The math student passed it to actual mathematicians, Lichtman and Tao - Terence Tao, a mathematician at the University of California. They shortened and fixed up the LLM mess, but the mess had a proof in it.

      I would say, these proofs are kind of like security issues in software. Not too many people are looking at these problems and there are far more problems than people looking to solve them.

      • (Score: 4, Informative) by Laci on Monday May 04, @03:31PM

        by Laci (2618) on Monday May 04, @03:31PM (#1441553)

        Actually, this proof has more than just "looks like a verification". TFA links to the problem, https://www.erdosproblems.com/1196 [erdosproblems.com], and its status is "proved with LEAN". LEAN is a formal proof system, so if something is proved with LEAN, then it is proved. Period.
        You can actually follow the link to the distilled proof itself: https://www.erdosproblems.com/latex/1196 [erdosproblems.com]. It's a mere 4 pages, and is rather elementary. It's interesting that such a long-standing theorem has such a simple proof.

    • (Score: 2) by epitaxial on Monday May 04, @02:37PM (3 children)

      by epitaxial (3165) on Monday May 04, @02:37PM (#1441550)

      I'm terrible at math so forgive me. Isn't math exacting? As in how could a proof take a peer review? It should either work or not work.

      • (Score: 2) by Thexalon on Monday May 04, @04:00PM

        by Thexalon (636) on Monday May 04, @04:00PM (#1441558)

        Mathematics is incredibly exacting, yes. But there's always a risk of glossing over something that can happen.

        A not-uncommon scenario is:
        1. Someone writes what they believe is a complete proof.
        2. Someone else reading it very carefully, possibly assisted by computers and other tooling, finds some counterexamples.
        3. Someone who may or may not be the original author revises the proof to address the counterexamples in step 2.
        4. Repeat steps 2-3 until nobody can find counterexamples.
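Step 2 of the loop above, where a reader hunts for counterexamples with computer help, can be sketched in a few lines. This is only an illustration: the "theorem" under review is a stand-in chosen here (the classic false claim that n² + n + 41 is always prime), not anything from the Erdős problem, and the helper names are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def first_counterexample(claim, candidates):
    """Return the first candidate for which claim(candidate) is False, else None."""
    for x in candidates:
        if not claim(x):
            return x
    return None

# Stand-in claim under review: n*n + n + 41 is prime for every n >= 0.
def claim(n):
    return is_prime(n * n + n + 41)

n = first_counterexample(claim, range(1000))
print(n, n * n + n + 41)  # 40 1681, and 1681 = 41 * 41
```

A search like this can only refute a claim, never confirm it: the claim above survives the first forty checks and still fails, which is why a clean run of tests is not a substitute for the review loop or a formal proof.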

        --
        "Think of how stupid the average person is. Then realize half of 'em are stupider than that." - George Carlin
      • (Score: 2) by VLM on Monday May 04, @06:33PM (1 child)

        by VLM (445) on Monday May 04, @06:33PM (#1441574)

        Poor math education in the west across all countries.

        Teach kids that math proofs solve problems. "Prove pythagorean theorem" "Prove quadratic formula"

        Real proof is more like boomer book author Polya or zoomette book author Alcock and it looks more like a map route from "known parts" thru semi-unknown parts to some other semi-known part.

        So the proof for FLT isn't like ten lines; it's 130 pages from Wiles about how there's a VERY indirect map route between FLT being true and modular forms of elliptic curves (yeah, related tangentially to crypto elliptic curves; wanna bet the NSA solved FLT "back in the 60s" but it's still top secret, whereas Wiles solved it in public in the 90s?)

        If you don't like 130 pages of elliptic curves, try 1200 pages of Mochizuki's Teichmueller theory, which is total WTF to me, but it's an interesting plausible alternative map route to get from "there" back to "known space", thus proving it's true because if it weren't true there'd be no route back.

        About a century ago some smartasses found a way to add 1 plus 1 using pure set theory and it took like "1000 pages", at least three printed book volumes. The response from the general public was to make fun of stupid mathematicians because we already have arithmetic so who needs a set theory way to add 1+1. The whole point of the exercise to mathematicians is those two areas were not supposed to connect. It would be like abusing the absolute F out of google maps until it returned a 1500 page long land route between Hawaii and California and if you follow all 1500 pages carefully it actually works to everyone's utter shock, because there isn't supposed to BE a land route between Hawaii and California... yet there it is...

        So that's what real math proof is about vs math proof in 8th grade algebra, which is kinda something else.

        Given that "real proof" is more like map routing where in theory every step is accurate thus the entire path must be accurate, there's a lot of battle about brief but crappy individual steps like "turn left on the dirt road after you pass farmer Bob's sick cow" Well yeah that might work but professionals in the field are either going to give the author a bunch of shit for being too detailed or too vague (or both simultaneously if they just want to be jerks, see the response to most of Wolfram's popular science / popular math books)

        • (Score: 1, Informative) by Anonymous Coward on Monday May 04, @09:01PM

          by Anonymous Coward on Monday May 04, @09:01PM (#1441591)

          About a century ago some smartasses found a way to add 1 plus 1 using pure set theory and it took like "1000 pages" at least three printed book volumes.

          The most common model of the natural numbers (Peano Arithmetic) in set theory today is the Von Neumann model, which I think was published around the early 1920s, which defines 0 as the empty set and the successor function S(x) = x ∪ {x}. So 1 + 1 = S(1) = S(S(0)) = {∅,{∅}}. The union operation and the empty set are normally axioms of set theory. There is no way this takes 1000 pages to explain.
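The von Neumann construction described above is small enough to run. Below is a minimal sketch using Python's `frozenset`; the names `EMPTY`, `successor` and `add` are illustrative, and the addition rule is the usual Peano recursion (m + 0 = m, m + S(k) = S(m + k)), which the comment implies but does not spell out.

```python
EMPTY = frozenset()  # 0 is the empty set

def successor(x):
    """Von Neumann successor: S(x) = x ∪ {x}."""
    return x | frozenset({x})

def add(m, n):
    """Peano-style addition: m + 0 = m, m + S(k) = S(m + k)."""
    if n == EMPTY:
        return m
    # In this encoding the predecessor of n is the member of n with the most elements.
    predecessor = max(n, key=len)
    return successor(add(m, predecessor))

one = successor(EMPTY)  # {∅}
two = successor(one)    # {∅, {∅}}
print(add(one, one) == two)  # True
print(len(add(two, two)))    # 4: the set encoding n has exactly n members
```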

    • (Score: 3, Interesting) by JoeMerchant on Monday May 04, @03:59PM

      by JoeMerchant (3937) on Monday May 04, @03:59PM (#1441557)

      My undergraduate introduction to P vs NP was flawed... the professor talked about transforms of expressions and he illustrated the expressions on the chalkboard in sum-of-products form. This bothered me, because I "could see" how to transform sum-of-products expressions algorithmically, in polynomial time. I cornered him after the next class and showed him the approach... his first answer was a dogmatic: "what you're claiming is P=NP, and that's basically impossible with such a simple demonstration..." Finally, he put 2 and 2 together, realized I was demonstrating solutions for sum-of-products (as he wrote out on the chalkboard as an example), and said "no, no, no... the NP hard problem is transformation of product-of-sums expressions." Ah, well, let me think about that - yeah, that is quite a bit more difficult, perhaps even NP hard...

      Sum of Products: AB + BC + ABD + BD

      Product of Sums: (A + B)*(B + C)*(A + B + D)*(B + D)
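The asymmetry between the two forms can be illustrated with Boolean satisfiability. That is an assumption made here, since the comment does not say which "transformation" problem the professor had in mind. Assuming literals may be negated (written with a leading `~`, a convention chosen for this sketch), a sum of products is satisfiable as soon as one product term is free of contradictions, while the product-of-sums checker below simply tries every assignment.

```python
from itertools import product

def negate(lit):
    """Flip a literal: "A" <-> "~A"."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def sop_satisfiable(terms):
    """Sum of products: satisfiable iff some term contains no literal together with its negation."""
    return any(all(negate(lit) not in term for lit in term) for term in terms)

def pos_satisfiable(clauses):
    """Product of sums: brute force over all 2**n truth assignments."""
    variables = sorted({lit.lstrip("~") for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))

        def holds(lit):
            return not env[lit[1:]] if lit.startswith("~") else env[lit]

        if all(any(holds(lit) for lit in clause) for clause in clauses):
            return True
    return False

example = [{"A", "B"}, {"B", "C"}, {"A", "B", "D"}, {"B", "D"}]
print(sop_satisfiable(example))             # True
print(pos_satisfiable(example))             # True
print(sop_satisfiable([{"A", "~A"}]))       # False
print(pos_satisfiable([{"A"}, {"~A"}]))     # False
```

The first function runs in time polynomial in the size of the expression. The second is exponential in the number of variables, and no polynomial-time algorithm for the general product-of-sums case is known. That is consistent with the anecdote, though it may not be exactly the problem discussed in that class.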

      --
      🌻🌻🌻🌻 [google.com]
    • (Score: 2) by VLM on Monday May 04, @06:11PM

      by VLM (445) on Monday May 04, @06:11PM (#1441572)

      The vast majority of the time, what you'll learn is that your approach has been tried a bunch of times before

      From the erdos website "There are 1217 problems in the database" And this is a popular site.

      I suspect there's a lot of room out there for obscure proofs of obscure things for AI to "fill in the blanks" not because no human can solve it but because no human has ever considered the problem.

      Pretty good analogy to humans writing computer programs. There are individual hard to write programs (the linux kernel, etc) but there are near infinite possible programs to write provably correct solutions. I'd propose this news story is more the latter than the former.

  • (Score: 3, Insightful) by shrewdsheep on Monday May 04, @02:34PM (4 children)

    by shrewdsheep (5215) on Monday May 04, @02:34PM (#1441549) Journal

    Results like these show that LLMs can generate original output, i.e. output that is not just pre-existing wording (maybe with changed wording or sentence construction). This is also my own experience when challenging LLMs with tasks. They are of course not yet "thinking" but they are somewhere in between regurgitators and thinkers. I still have very disappointing experiences when the LLM goes round in circles (esp. when doing sys-admin) when they are worse than regurgitators (google search is more efficient) but when discussing scientific concepts, LLMs tend to perform well in my experience.

    • (Score: 3, Interesting) by JoeMerchant on Monday May 04, @04:06PM

      by JoeMerchant (3937) on Monday May 04, @04:06PM (#1441559)

      > they are somewhere in between regurgitators and thinkers.

      As are we all... if you're not mostly a regurgitator then nobody else understands/can relate to you.

      > I still have very disappointing experiences when the LLM goes round in circles

      I used to get that a lot, particularly from CoPilot this time last year. Claude was better but would still do it occasionally. Claude (for software creation) has gotten dramatically better up through the Opus 4.6 release. I haven't used Opus 4.7 too much, I have seen some online critique of 4.7 "backtracking" in its thinking where it will show streams like "no, wait, I need to look at this differently..." and run through a given problem several times before reaching a conclusion - I'm not sure that's a bad thing at all, as long as it doesn't run a backtracking loop so big that it ends up going in circles - better to explore alternatives and then choose the best than to confidently advance on the first thing that looks like it might work, in my opinion. Even if it's "wasting tokens" - better to waste tokens than my time and brain power. I, too, release CO2 when I think about problems...

      --
      🌻🌻🌻🌻 [google.com]
    • (Score: 2, Disagree) by pTamok on Monday May 04, @05:59PM

      by pTamok (3042) on Monday May 04, @05:59PM (#1441570)

      Results like these show that LLMs can generate original output, i.e. output that is not just pre-existing wording

      Dice generate original output all the time. Take a die. Roll it 23 times, recording the result after each roll. That sequence of digits is likely unique, in that the number of different sequences available is about the same as the number of seconds since the Big Bang.
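The comparison with the age of the universe can be checked with a few lines of arithmetic. This sketch assumes a six-sided die and an age of roughly 13.8 billion years; neither figure is stated in the comment.

```python
sequences = 6 ** 23                    # distinct outcomes of 23 rolls of a six-sided die
age_years = 13.8e9                     # assumed age of the universe, in years
seconds_per_year = 365.25 * 24 * 3600  # Julian year
age_seconds = age_years * seconds_per_year

print(sequences)                          # 789730223053602816
print(f"{age_seconds:.2e}")               # 4.35e+17
print(round(sequences / age_seconds, 1))  # 1.8
```

Under those assumptions the two quantities are within a factor of two of each other, so "about the same" holds as an order-of-magnitude statement.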

      Originality means nothing. As the college professor said of a student's thesis, it was both good and original: but the parts that were good were not original, and the parts that were original were not good. LLMs provide an output that has the statistical properties of the corpus they were trained on: rearranging the tokens in statistically probable ways. The sequence you get back from an LLM in response to the prompt is likely original, in that it is unlikely to have been seen before, but whether it is meaningful is another question. Perhaps the resulting sequence is useful to your use-case: but only you can decide that.

    • (Score: 2, Offtopic) by Thexalon on Monday May 04, @07:08PM

      by Thexalon (636) on Monday May 04, @07:08PM (#1441581)

      The best thinkers come up with ideas that are original, correct, good, and useful. A small child with fingerpaints comes up with very "original" paintings, often on surfaces other than the intended medium, but that doesn't mean that the results belong in the Louvre.

      --
      "Think of how stupid the average person is. Then realize half of 'em are stupider than that." - George Carlin
    • (Score: 3, Interesting) by ese002 on Monday May 04, @10:47PM

      by ese002 (5306) on Monday May 04, @10:47PM (#1441593)

      Results like these show that LLMs can generate original output, i.e. output that is not just pre-existing wording (maybe with changed wording or sentence construction).

      I don't think it was ever in doubt that LLMs could generate original output. They do it all the time. They are called hallucinations because they are often wildly incorrect. Output that is reliably correct is seldom original. This seems to be a case of useful hallucination. A messy result that was not actually wrong. Humans were then able to restructure it into something useful.
