Every time we speak, we're improvising:
"Humans possess a remarkable ability to talk about almost anything, sometimes putting words together into never-before-spoken or -written sentences," said Morten H. Christiansen, the William R. Kenan, Jr. Professor of Psychology in the College of Arts and Sciences.
We can improvise new sentences so readily, language scientists believe, because we have acquired mental representations of the patterns of language that allow us to combine words into sentences. The nature of those patterns and how they work, however, remains a puzzle in cognitive science, Christiansen said.
[...] For decades, scientists have believed we rely on a complex mental grammar to build sentences that have hierarchically organized structure – like a branching tree. But Christiansen and Nielsen suggest that our mental representations might be more like snapping together pre-assembled LEGO pieces (such as a door frame or a wheel set) into a complete model. Instead of intricate hierarchies, they propose, we use small, linear chunks of word classes like nouns and verbs – including short sequences that can't be formed by way of grammar, such as "in the middle of the" or "wondered if you."
[...] The prevailing theory since at least the 1950s is based on hierarchical, tree-like mental representations, setting humans apart from other animals, Christiansen said. In this view, words and phrases combine according to the principles of grammar into larger units called constituents. For example, in the sentence "She ate the cake," "the" and "cake" combine into a noun phrase "the cake," which then combines with "ate" into the verb phrase "ate the cake," and finally with "she" to make the sentence.
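For illustration only (a toy encoding invented here, not the researchers' formalism), the contrast can be sketched in code: the hierarchical view nests constituents like a tree, while the chunk-based view is just a flat list of short word-class sequences.

```python
# Toy contrast between the two views of "She ate the cake"
# (illustrative only; labels and encoding are invented, not from the paper).

# Hierarchical view: constituents nest like a branching tree.
hierarchical = ("S",
                ("NP", "she"),
                ("VP", "ate",
                 ("NP", "the", "cake")))

# Flat, chunk-based view: a linear sequence of small chunks of
# word classes, with no nesting.
flat_chunks = [("PRON", "she"), ("VERB", "ate"), ("DET+NOUN", "the cake")]

def depth(node):
    """Nesting depth: strings count as 0, each tuple adds one level."""
    if not isinstance(node, tuple):
        return 0
    return 1 + max((depth(child) for child in node), default=0)

print(depth(hierarchical))                  # 3: the tree nests
print(max(depth(c) for c in flat_chunks))   # 1: chunks stay flat
```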
"But not all sequences of words form constituents," Christiansen and Nielsen wrote in a summary of their paper. "In fact, the most common three- or four-word sequences in language are often nonconstituents, such as 'can I have a' or 'it was in the.'"
Because they don't form constituents under traditional grammar, nonconstituent sequences have been overlooked. But they do play a role in a speaker's knowledge of their language, the researchers found.
In experiments, including an eye-tracking study and an analysis of phone conversations, they discovered that linear sequences of word classes can be "primed": when we hear or read them once, we process them faster the next time. That's compelling evidence that such sequences are part of our mental representation of language, one that goes beyond the rules of grammar, Christiansen said.
"I think the main contribution is showing that traditional rules of grammar cannot capture all of the mental representations of language structure," Nielsen said.
"It might even be possible to account for how we use language in general with flatter structure," Christiansen said. "Importantly, if you don't need the more complex machinery of hierarchical syntax, then this could mean that the gulf between human language and other animal communication systems is much smaller than previously thought."
Journal Reference: Nielsen, Y.A., Christiansen, M.H. Evidence for the representation of non-hierarchical structures in language. Nat Hum Behav (2026). https://doi.org/10.1038/s41562-025-02387-z
(Score: 5, Interesting) by JoeMerchant on Monday February 02, @05:30PM (10 children)
There's an old saw about "working memory limits," typically demonstrated like this: the average person starts to struggle to remember a sequence of more than 6 digits, with 7 a typical upper limit... but there are workarounds: https://en.wikipedia.org/wiki/Piphilology [wikipedia.org]
Without going overboard, if you can "chunk" a number sequence like: 3 5 1 5 4 6 7 2 into recognizable entities like 351 54 6 7 2 - now that's just a sequence of 5 to remember: the 351cu in V8, Car 54, and 6 7 2.
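As a sketch of that idea (the "familiar" patterns are the ones from this comment; the greedy matcher is just an illustration, not a cognitive model):

```python
# Toy "chunking": segment a digit string using familiar multi-digit
# patterns so fewer items need to be held in working memory.
FAMILIAR = {"351", "54"}  # the 351 cu-in V8 and Car 54 from above

def chunk(digits, familiar=FAMILIAR):
    """Greedy longest-match segmentation; unmatched digits stay single."""
    out, i = [], 0
    while i < len(digits):
        for j in range(len(digits), i, -1):   # try the longest span first
            if digits[i:j] in familiar:
                out.append(digits[i:j])
                i = j
                break
        else:
            out.append(digits[i])             # no match: keep one digit
            i += 1
    return out

print(chunk("35154672"))  # ['351', '54', '6', '7', '2'] -> 5 items, not 8
```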
My (profoundly autistic communication impaired) son typically speaks in 2 or 3 word sentences (utterances), but sometimes he will recite a whole phrase as a single chunk: "ride in the orange car now please daddy!" I'm pretty sure he's processing that as a single concept.
(Score: 3, Interesting) by pTamok on Monday February 02, @08:10PM (2 children)
Aye, when I was younger, I memorized pi to 50 decimal places. I recall it even now, and it is definitely chunked, and there is a rhythm. Same with memorizing some Shakespeare, Coleridge, and Shelley. I don't think it is implausible that language is 'chunked', and we can shuffle around the chunks.
(Score: 0) by Anonymous Coward on Tuesday February 03, @03:06PM (1 child)
Maybe you have a 50 decimal place pi neuron just like someone has a Halle Berry neuron: https://www.caltech.edu/about/news/single-cell-recognition-halle-berry-brain-cell-1013 [caltech.edu]
(Score: 2, Funny) by pTamok on Wednesday February 04, @07:05AM
Darn. I shall have to continue administering alcohol until it is eradicated and then I can lead a normal life.
(Score: 3, Interesting) by aafcac on Tuesday February 03, @12:31AM (6 children)
That sort of thing has been known for quite a while. Words don't normally show up randomly; they tend to have collocations of other words that go with them. And IMHO, at the start of learning a language, it's better to think in terms of sentence frames, where you swap out a word or two to generate a new meaning. It's more or less what you're doing most of the time anyway, since one of the goals of fluency is to be able to express your thoughts without having to fixate on the words you're using. You'll typically find that most languages front-load their irregular words into the vocabulary you encounter early on, since less commonly used phrases don't get used often enough to maintain their oddities.
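A sketch of the sentence-frame idea (the frame happens to be the common nonconstituent sequence quoted in the article summary; the fillers are made up):

```python
# Sentence frame with one open slot: swap a single word to generate
# new meanings (frame and fillers invented for illustration).
frame = "Can I have a {} please?"
fillers = ["coffee", "receipt", "moment"]

sentences = [frame.format(word) for word in fillers]
for s in sentences:
    print(s)
```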
When there are serious misunderstandings, it's usually not grammar causing them; it's far more likely to be an outright wrong word. Grammar is typically more about efficiency than actual communication, and you can get a lot further than people often realize with just 2- and 3-word sentences, provided the words are correct. Nobody really needs compound, complex, or compound-complex sentences in English. You can use just simple sentences, strip those down from there, and still be largely understood, assuming you don't need to communicate anything too complicated.
(Score: 3, Funny) by Reziac on Tuesday February 03, @02:49AM (5 children)
So, we are just bio-instances of a large language model....
And there is no Alkibiades to come back and save us from ourselves.
(Score: 3, Insightful) by aafcac on Tuesday February 03, @03:13AM (1 child)
Pretty much. The stuff we do and think was, up until quite recently, primarily focused on what got us to live long enough to reproduce. A bunch of the stuff we do for fun is rooted in some sort of evolutionary need.
(Score: 0) by Anonymous Coward on Tuesday February 03, @07:47AM
For very loose definitions of "need".
For example, there's not a strong evolutionary need for music. Sure, it can impress potential mates, but then those mates would have had to evolve that "need" to be impressed by music in the first place.
Perhaps it's like the peacock's tail.
In scenarios where not everything needs to be so close to the min-maxing optimums, there's a lot of room for other stuff.
(Score: 0) by Anonymous Coward on Tuesday February 03, @03:03PM
Once you've trained yourself to do stuff, you can do it without thinking too much about it. Ride a bicycle, type the correct letters for words. Use various words for various thoughts.
But how did we train that? How did we get all those vectors to be closer to useful?
Smarter dogs (and probably even crows) can figure out the difference between a bus and a car without thousands of samples. And definitely they won't mistake those for a traffic light.
(Score: 2, Interesting) by pTamok on Wednesday February 04, @08:02AM (1 child)
While amusing, that gets things somewhat reversed. Large Language Models were developed using a model of how people thought the brain works. That model has turned out to be a poor one, and while LLMs produce interesting results, they are clearly built on an inadequate description of the brain and cognition.
A well-built orrery will give good predictions of where planets will appear in the night sky as seen from Earth. However, it is a model, not reality, and inspection of interplanetary space will not reveal a system of rods and gears. Likewise, LLMs are a model of some of the workings of the brain and will give reasonably good imitations of what output from a real brain looks like; no one expert in the field will claim that the brain is simply an instance of an LLM. Some of the physical features of the cerebellum (multi-layer neural networks) are implemented in software for LLMs, but scaling up something based on an inadequate description of the original does not give you the original, in the same way that putting together a lot of candles does not give you a sun. LLMs are candles, not small suns.
(Score: 2) by Reziac on Wednesday February 04, @02:50PM
Old Beetle Bailey comic:
Sarge (shaking his head over Beetle's myriad deficiencies): "Bailey, you are a model soldier."
Beetle, confused, consults a dictionary and reads: "Model: a small copy of the real thing."
Yeah, LLM to Brain is at best a spotty model. But it makes for an interesting extrapolation, and another way of looking at how we hang words together.