Human speech may have a universal transmission rate: 39 bits per second
Italians are some of the fastest speakers on the planet, chattering at up to nine syllables per second. Many Germans, on the other hand, are slow enunciators, delivering five to six syllables in the same amount of time. Yet in any given minute, Italians and Germans convey roughly the same amount of information, according to a new study. Indeed, no matter how fast or slowly languages are spoken, they tend to transmit information at about the same rate: 39 bits per second, about twice the speed of Morse code.
"This is pretty solid stuff," says Bart de Boer, an evolutionary linguist who studies speech production at the Free University of Brussels, but was not involved in the work. Language lovers have long suspected that information-heavy languages—those that pack more information about tense, gender, and speaker into smaller units, for example—move slowly to make up for their density of information, he says, whereas information-light languages such as Italian can gallop along at a much faster pace. But until now, no one had the data to prove it.
Scientists started with written texts from 17 languages, including English, Italian, Japanese, and Vietnamese. They calculated the information density of each language in bits—the same unit that describes how quickly your cellphone, laptop, or computer modem transmits information. They found that Japanese, which has only 643 syllables, had an information density of about 5 bits per syllable, whereas English, with its 6949 syllables, had a density of just over 7 bits per syllable. Vietnamese, with its complex system of six tones (each of which can further differentiate a syllable), topped the charts at 8 bits per syllable.
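As a rough illustration of the trade-off described above, the sketch below asks how fast each language would need to be spoken to hit 39 bits per second at the quoted density. The densities are the rounded figures from this summary (English is "just over 7", taken as 7 here), the helper name is made up for the example, and the results are implied rates from simple division, not the speech rates the study actually measured.

```python
# Back-of-envelope check: rate (bits/s) = density (bits/syllable) * speed (syllables/s).
# All numbers are the rounded figures quoted in the summary, not the study's data.

AVERAGE_RATE_BITS_PER_SEC = 39.0

BITS_PER_SYLLABLE = {
    "Japanese": 5.0,
    "English": 7.0,      # "just over 7" in the text; rounded down (assumption)
    "Vietnamese": 8.0,
}


def implied_syllables_per_second(bits_per_syllable, rate=AVERAGE_RATE_BITS_PER_SEC):
    """Syllable rate needed to reach `rate` bits/s at the given density."""
    return rate / bits_per_syllable


if __name__ == "__main__":
    for language, density in BITS_PER_SYLLABLE.items():
        speed = implied_syllables_per_second(density)
        print(f"{language}: {speed:.1f} syllables/s")
```

With these inputs the implied speeds come out to about 7.8, 5.6, and 4.9 syllables per second, which is the pattern the article describes: the denser the syllables, the slower the speech.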
From the Abstract:
"Language is universal, but it has few indisputably universal characteristics, with cross-linguistic variation being the norm. For example, languages differ greatly in the number of syllables they allow, resulting in large variation in the Shannon information per syllable. Nevertheless, all natural languages allow their speakers to efficiently encode and transmit information. We show here, using quantitative methods on a large cross-linguistic corpus of 17 languages, that the coupling between language-level (information per syllable) and speaker-level (speech rate) properties results in languages encoding similar information rates (~39 bits/s) despite wide differences in each property individually: Languages are more similar in information rates than in Shannon information or speech rate. These findings highlight the intimate feedback loops between languages' structural properties and their speakers' neurocognition and biology under communicative pressures. Thus, language is the product of a multiscale communicative niche construction process at the intersection of biology, environment, and culture."
(Score: 0) by Anonymous Coward on Friday September 06 2019, @07:09PM
This is a reasonably straightforward estimate to make with some basic simplifying assumptions.
Claude Shannon showed way back in the early 1950s [princeton.edu] that written English has about 1 bit of information per letter.
It's hard to judge spoken English by the same metric because the actual words that come out of your mouth are only a small portion of the information conveyed by spoken communication. But if we completely ignore things like body language (let's imagine we are dealing exclusively with written transcripts of audio recordings taken out of their original context), then a transcript is probably pretty similar to regular written English in terms of its information content per letter.
So by this, 39 bits per second is roughly equivalent to 39 characters per second, or about 2,340 characters per minute (39 × 60).
Typists long ago standardized that a "word" is 5 characters, so 2,340 characters per minute is about 470 words per minute. Suggesting this is merely twice as fast as Morse code seems within the realm of possibility. But it is a bit disingenuous, as it assumes a Morse code operator capable of over 200 words per minute, which is at world-record levels of proficiency...
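For anyone who wants to check the arithmetic, here is a minimal sketch of the conversion. It assumes exactly 1 bit per character (Shannon's rough figure, as cited above) and the 5-character typists' "word"; the function and constant names are just for illustration.

```python
# Convert an information rate in bits/s to words per minute,
# under the simplifying assumptions stated in the comment.

BITS_PER_SECOND = 39      # rate reported in the summary
BITS_PER_CHAR = 1.0       # assumption: Shannon's rough estimate for written English
CHARS_PER_WORD = 5        # assumption: typists' convention for one "word"


def words_per_minute(bits_per_second,
                     bits_per_char=BITS_PER_CHAR,
                     chars_per_word=CHARS_PER_WORD):
    """Words per minute implied by a given bit rate."""
    chars_per_minute = bits_per_second / bits_per_char * 60
    return chars_per_minute / chars_per_word


speech_wpm = words_per_minute(BITS_PER_SECOND)
morse_wpm = speech_wpm / 2   # "about twice the speed of Morse code"

if __name__ == "__main__":
    print(f"speech: {speech_wpm:.0f} wpm, implied Morse: {morse_wpm:.0f} wpm")
```

Changing `BITS_PER_CHAR` shows how sensitive the result is to that one assumption: at 2 bits per character, the same 39 bits per second works out to half as many words per minute.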