posted by janrinok on Friday September 06 2019, @03:35PM
from the intelligent-content-might-be-significantly-lower dept.

Human speech may have a universal transmission rate: 39 bits per second

Italians are some of the fastest speakers on the planet, chattering at up to nine syllables per second. Many Germans, on the other hand, are slow enunciators, delivering five to six syllables in the same amount of time. Yet in any given minute, Italians and Germans convey roughly the same amount of information, according to a new study. Indeed, no matter how fast or slowly languages are spoken, they tend to transmit information at about the same rate: 39 bits per second, about twice the speed of Morse code.

"This is pretty solid stuff," says Bart de Boer, an evolutionary linguist who studies speech production at the Free University of Brussels, but was not involved in the work. Language lovers have long suspected that information-heavy languages—those that pack more information about tense, gender, and speaker into smaller units, for example—move slowly to make up for their density of information, he says, whereas information-light languages such as Italian can gallop along at a much faster pace. But until now, no one had the data to prove it.

Scientists started with written texts from 17 languages, including English, Italian, Japanese, and Vietnamese. They calculated the information density of each language in bits—the same unit that describes how quickly your cellphone, laptop, or computer modem transmits information. They found that Japanese, which has only 643 syllables, had an information density of about 5 bits per syllable, whereas English, with its 6949 syllables, had a density of just over 7 bits per syllable. Vietnamese, with its complex system of six tones (each of which can further differentiate a syllable), topped the charts at 8 bits per syllable.
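As a rough illustration of the arithmetic behind these figures, the sketch below divides the reported ~39 bits per second by each language's approximate bits per syllable to get the syllable rate that would be needed to hit that target. The rounded densities (5, 7, and 8) are taken from the summary above, with "just over 7" treated as 7.0 by assumption; the resulting rates are implied values, not measurements from the study, and the function name is purely illustrative.

```python
# Implied speech rate = information rate / information density.
# Assumption: densities are the rounded values quoted in the summary.
TARGET_BITS_PER_SECOND = 39

BITS_PER_SYLLABLE = {
    "Japanese": 5.0,    # "about 5 bits per syllable"
    "English": 7.0,     # "just over 7", rounded down here
    "Vietnamese": 8.0,  # "8 bits per syllable"
}


def implied_syllables_per_second(bits_per_syllable, bits_per_second=TARGET_BITS_PER_SECOND):
    """Syllables per second needed to reach the given information rate."""
    return bits_per_second / bits_per_syllable


implied_rates = {
    language: implied_syllables_per_second(density)
    for language, density in BITS_PER_SYLLABLE.items()
}

for language, rate in implied_rates.items():
    print(f"{language}: {rate:.1f} syllables/s")
```

Under these assumptions, a denser language needs fewer syllables per second to carry the same number of bits, which is the trade-off the study describes.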

Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche

From the Abstract:

"Language is universal, but it has few indisputably universal characteristics, with cross-linguistic variation being the norm. For example, languages differ greatly in the number of syllables they allow, resulting in large variation in the Shannon information per syllable. Nevertheless, all natural languages allow their speakers to efficiently encode and transmit information. We show here, using quantitative methods on a large cross-linguistic corpus of 17 languages, that the coupling between language-level (information per syllable) and speaker-level (speech rate) properties results in languages encoding similar information rates (~39 bits/s) despite wide differences in each property individually: Languages are more similar in information rates than in Shannon information or speech rate. These findings highlight the intimate feedback loops between languages' structural properties and their speakers' neurocognition and biology under communicative pressures. Thus, language is the product of a multiscale communicative niche construction process at the intersection of biology, environment, and culture."



 
  • (Score: 0) by Anonymous Coward on Friday September 06 2019, @07:09PM (#890655)

    9 syllables per second is what they apparently average 39 bits per second to (5-8 bits per syllable), and then suggest this is twice as fast as Morse Code. Wonder how they made the leap between bits/syllables to words per minute.

    This is a reasonably straightforward estimation to make with some basic simplifying assumptions.

    Claude Shannon showed way back in the early 1950s [princeton.edu] that written English has about 1 bit of information per letter.

    It's hard to judge spoken English by the same metric because the actual words that come out of your mouth are only a small portion of the information conveyed by spoken communication. But if we completely ignore things like body language (let's imagine we are dealing exclusively with written transcripts of audio recordings taken out of their original context), then the transcript is probably pretty similar to regular written English in terms of its information content per letter.

    So by this, 39 bits per second is roughly equivalent to 39 characters per second, or about 2.3k characters per minute.

    Typists long ago standardized that a "word" is 5 characters, so 2.3k characters per minute is roughly 470 words per minute. Suggesting this is merely twice as fast as Morse code seems within the realm of possibility. But it is a bit disingenuous as it assumes a Morse code operator capable of over 200 words per minute, which is at world-record levels of proficiency...
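The back-of-envelope conversion above can be written out as a short Python sketch. It assumes exactly 1 bit per character (the rough Shannon estimate cited in the comment) and the typists' 5-characters-per-word convention; the function and constant names are illustrative only. Exact arithmetic gives 2,340 characters per minute and 468 words per minute, so the figures quoted in the comment are rounded.

```python
# Convert an information rate in bits/s to words per minute.
BITS_PER_SECOND = 39   # rate reported in the study
BITS_PER_CHAR = 1.0    # assumption: rough Shannon estimate for written English
CHARS_PER_WORD = 5     # typists' convention for a "word"


def words_per_minute(bits_per_second, bits_per_char=BITS_PER_CHAR,
                     chars_per_word=CHARS_PER_WORD):
    """Words per minute implied by an information rate, under the assumptions above."""
    chars_per_second = bits_per_second / bits_per_char
    chars_per_minute = chars_per_second * 60
    return chars_per_minute / chars_per_word


speech_wpm = words_per_minute(BITS_PER_SECOND)
# "About twice the speed of Morse code" then implies this Morse rate:
implied_morse_wpm = speech_wpm / 2

print(f"speech: {speech_wpm:.0f} wpm")
print(f"implied Morse: {implied_morse_wpm:.0f} wpm")
```

The result is very sensitive to the bits-per-character assumption: a higher value for `BITS_PER_CHAR` lowers both figures proportionally.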