In a study published last month (https://www.nature.com/articles/s41562-025-02297-0), researchers analyzed internal sentence representations in both humans and LLMs. It turns out that humans and LLMs use similar tree structures. A quote from their conclusions: "The results also add to the literature showing that the human brain and LLM, albeit fundamentally different in terms of the implementation, can have aligned internal representations of language."
Originally seen on techxplore https://techxplore.com/news/2025-10-humans-llms-sentences-similarly.html:
A growing number of behavioral science and psychology studies have thus started comparing the performance of humans to those of LLMs on specific tasks, in the hope of shedding new light on the cognitive processes involved in the encoding and decoding of language. As humans and LLMs are inherently different, however, designing tasks that realistically probe how both represent language can be challenging.
Researchers at Zhejiang University have recently designed a new task for studying sentence representation and tested both LLMs and humans on it. Their results, published in Nature Human Behaviour, show that when asked to shorten a sentence, humans and LLMs tend to delete the same words, hinting at commonalities in their representation of sentences.
"Understanding how sentences are represented in the human brain, as well as in large language models (LLMs), poses a substantial challenge for cognitive science," wrote Wei Liu, Ming Xiang, and Nai Ding in their paper. "We develop a one-shot learning task to investigate whether humans and LLMs encode tree-structured constituents within sentences."
[...] Interestingly, the researchers' findings suggest that the internal sentence representations of LLMs are aligned with linguistic theory. In the task they designed, both humans and ChatGPT tended to delete full constituents (i.e., coherent grammatical units) as opposed to random word sequences. Moreover, the word strings they deleted appeared to vary based on the language they were completing the task in (i.e., Chinese or English), following language-specific rules.
"The results cannot be explained by models that rely only on word properties and word positions," wrote the authors. "Crucially, based on word strings deleted by either humans or LLMs, the underlying constituency tree structure can be successfully reconstructed."
Overall, the team's results suggest that when processing language, both humans and LLMs are guided by latent syntactic representations, specifically tree-structured sentence representations. Future studies could build on this recent work to further investigate the language representation patterns of LLMs and humans, either using adapted versions of the team's word deletion task or entirely new paradigms.
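To make the deletion task concrete, here is a minimal sketch (not the authors' code; the sentence and bracketing are illustrative): a constituency tree as nested lists, where deleting a full constituent yields a shorter but still grammatical sentence, unlike deleting a random word span.

```python
# Toy illustration (not the authors' code): a constituency tree as nested
# lists, and a shortening that removes a full constituent.

# "the old man kept the dog in the garden" with bracketed constituents.
tree = ["S",
        ["NP", "the", "old", "man"],
        ["VP", "kept",
         ["NP", "the", "dog"],
         ["PP", "in", ["NP", "the", "garden"]]]]

def words(node):
    """Flatten a tree node into its word sequence (skipping labels)."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in words(child)]

def delete_constituent(node, label):
    """Drop every subtree whose label matches, keeping the rest in order."""
    if isinstance(node, str):
        return node
    kept = [delete_constituent(c, label) for c in node[1:]
            if not (isinstance(c, list) and c[0] == label)]
    return [node[0]] + kept

print(" ".join(words(tree)))
# the old man kept the dog in the garden
print(" ".join(words(delete_constituent(tree, "PP"))))
# the old man kept the dog
```

Deleting the PP constituent removes a coherent chunk ("in the garden"); deleting three random words would usually leave an ungrammatical remainder, which is the contrast the task exploits.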
Journal Reference: Liu, W., Xiang, M. & Ding, N. Active use of latent tree-structured sentence representation in humans and large language models. Nat Hum Behav (2025). https://doi.org/10.1038/s41562-025-02297-0
(Score: 0) by Anonymous Coward on Tuesday October 28, @01:48PM
Who'da thunk?
(Score: 2, Interesting) by Anonymous Coward on Tuesday October 28, @01:49PM
> "Crucially, based on word strings deleted by either humans or LLMs, the underlying constituency tree structure can be successfully reconstructed."
That's not what I found recently when I asked Gemini to shorten a one-page professional biography for use in a conference program. It shortened it all right. But the things it took out changed the meaning dramatically -- in my book, that is unsuccessful.
I wrote my own summary this time.
Maybe I'll try again next year (this is a recurring problem).
(Score: 2, Insightful) by Anonymous Coward on Tuesday October 28, @02:45PM (2 children)
Technical creation modeled on Human Physiology shows resemblances to human physiology in structure and behavior
News at 11.
I can't find it, but there was a scientific study of LLMs by someone doing brain research who concluded that you could use LLMs to get better, deeper insight into how the brain works. Never mind the fact that they were created by looking at how the brain works, and reimplementing that in software. Sigh. Garbage in, garbage out. Train it on itself, and everything is true!
(Score: 5, Interesting) by ikanreed on Tuesday October 28, @03:05PM (1 child)
LLMs aren't based on human physiology a la neural nets. They are just ridiculously large matrices, incrementally multiplying a vector of tokens in an absurdly high dimensional vector space.
I don't think there's any homology or even analogy there.
But the training data is all human output, so I guess it's going to pick up human like patterns that way.
(Score: 0) by Anonymous Coward on Tuesday October 28, @09:42PM
Support:
- https://scitechdaily.com/neurons-astrocytes-and-transformers-are-ai-models-biologically-plausible/ [scitechdaily.com]
The transformer model is not a clone of a biological design.
(Score: 5, Interesting) by Mojibake Tengu on Tuesday October 28, @03:06PM (1 child)
Syntactic representation of a sentence (a program's encoding) and semantic representation of a sentence (a program's meaning) are not necessarily isomorphic. Meaning is co-dependent on the actual execution model, not just on the representation model. So this research, pivoting on an ad-hoc representation language, is pointless.
Generally, we can prove that any and all LLMs can be encapsulated by one Turing machine. Just one; call it TLLM. As a graph, it topologically closes the whole class of LLM machines, a minimal closure.
The problem is that, if minimal, TLLM is not a Universal Turing Machine, because the whole class of LLMs is incapable of solving certain problems, for example problems formulated as total recursion.
Arithmetic hyperoperations are out of reach for them. And sometimes they already know that; at least Gemini and GPT do, systematically refusing to compute Ackermann's function, one of the simplest total recursive functions.
LLMs only estimate. And for about 100 years now, Ackermann's function has been proven not to be polynomially estimable. So they can't do what cannot be done.
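For reference, Ackermann's function is trivial to state but its values explode far faster than any primitive recursive bound; a minimal sketch of the standard definition:

```python
# Ackermann's function: total recursive but not primitive recursive.
import sys
sys.setrecursionlimit(100_000)  # the recursion gets deep very quickly

def ackermann(m, n):
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))

print(ackermann(2, 3))  # 9
print(ackermann(3, 3))  # 61
```

Already ackermann(4, 2) has 19,729 decimal digits, and naive recursion will never finish computing it, which is why any system limited to bounded estimation gives up.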
Conclusively, there is a huge class of programs that cannot be represented by an LLM but can still be represented by the brain of a programmer who understands some stronger machines.
That means seeking similarities between the LLM and the human mind is funny. The LLM is definitely weaker.
I predict this will be historically fatal to political factions who employ LLMs to design and perform military strategies.
Judging by the unsatisfactory field results of current wars, I am afraid (or rather happy) that it's already happening.

Rust programming language offends both my Intelligence and my Spirit.
(Score: 1) by khallow on Wednesday October 29, @11:45AM
That doesn't seem a useful way to distinguish between LLMs and human brains - because Ackermann's function is out of reach for human brains too. Understanding that more powerful, theoretical machines could do the necessary calculations doesn't make the human mind that machine.
I think also there's the matter of how much LLMs (as well as our other attempts at AI) can be augmented or upgraded. If they have problems with meaning, could they be augmented to that level?
(Score: 5, Interesting) by kolie on Tuesday October 28, @09:46PM
This is a fascinating study, but I have to ask: Isn't this conclusion almost tautological?
The article presents it as a noteworthy discovery that LLMs and humans "can have aligned internal representations of language." But considering how LLMs are built, how could they not?
Large Language Models are, by definition, trained on a colossal corpus of human-generated language. Their entire objective function is to statistically analyze this data and get incredibly good at predicting the next token (word) in a sequence.
To do this successfully, the model must learn the underlying patterns, grammar, and syntax of its training data. These "latent tree-structured sentence representations" that the researchers found are, in essence, the very rules of human language that the LLM was forced to model to minimize its prediction error.
So, finding that an LLM's internal representation aligns with human linguistic structures isn't a discovery of a surprising cognitive parallel; it's a validation that the training worked. The model was fed human language, so it built a statistical map that reflects the structure of human language.
It feels a bit like the "tail wagging the dog" to be surprised by this. We've fed a machine the entire history of human text and asked it to "predict what comes next," so it's a necessary outcome, not a coincidence, that its internal logic for doing so mirrors the human logic that created the text in the first place.
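The objective described above can be illustrated with a toy bigram count model (vastly simpler than an LLM, and purely illustrative): whatever "map" it builds is, by construction, a mirror of its training text.

```python
# Toy next-token prediction: a bigram count model. Its internal "map"
# (the follows table) necessarily reflects the structure of its corpus.
from collections import Counter, defaultdict

corpus = "the dog chased the cat and the cat chased the mouse".split()

# Count which word follows which in the training text.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict(word):
    """Return the most frequent continuation seen in training."""
    return follows[word].most_common(1)[0][0]

print(predict("the"))     # cat  ("cat" followed "the" most often)
print(predict("chased"))  # the
```

An LLM replaces the count table with a learned high-dimensional function, but the point stands: minimizing prediction error on human text forces the model to internalize the regularities, including the syntax, of that text.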
(Score: 2) by mcgrew on Wednesday October 29, @04:25PM
> As humans and LLMs are inherently different, however, designing tasks that realistically probe how both represent language can be challenging.
This is the problem with researchers in one science borrowing from a different field they are untrained in. They miss the fact that of course they're similar; LLMs were built by humans to mimic humans!
I found when working with PhDs that one can be dumb as a box of rocks (apologies to geologists) and still hold a PhD. However, that said, all the rest of them were smarter than me and I learned from them.
No one born who could always afford anything he wanted can have a clue what "affordability" means.