Acoustics Researchers Decompose Sound Accurately Into its Three Basic Components

Accepted submission by hubie at 2023-10-24 03:47:56
Science

Any sound can now be perfectly replicated by a combination of whistles, clicks, and hisses, with implications for sound processing across the media landscape [aalto.fi]:

Researchers have been looking for ways to decompose sound into its basic ingredients for over 200 years. In the 1820s, French scientist Joseph Fourier proposed that any signal, including sounds, can be built using sufficiently many sine waves. These waves sound like whistles; each has its own frequency, level, and start time, and together they are the basic building blocks of sound.
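
As a rough illustration of the sine-wave idea, the sketch below sums a few sine "whistles", each with its own frequency, level, and start time. The function name `synthesize`, the 8000 Hz sample rate, and the example partials are assumptions made for this illustration; they do not come from the article or the paper.

```python
import math

def synthesize(partials, duration, sample_rate=8000):
    """Sum sine partials, each given as (frequency_hz, level, start_time_s)."""
    n = int(duration * sample_rate)
    out = [0.0] * n
    for freq, level, start in partials:
        first = int(start * sample_rate)  # sample index where this partial begins
        for i in range(first, n):
            t = (i - first) / sample_rate  # time since this partial started
            out[i] += level * math.sin(2 * math.pi * freq * t)
    return out

# Hypothetical example: a 220 Hz tone, an octave above it, and a late third partial.
partials = [(220.0, 1.0, 0.0), (440.0, 0.5, 0.0), (660.0, 0.25, 0.1)]
signal = synthesize(partials, duration=0.5)
```

With only three partials the result sounds like a plain, organ-like tone; the article's point is that more complex sounds need far more of them.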

However, some sounds, such as the flute and a breathy human voice, may require hundreds or even thousands of sines to exactly imitate the original waveform. This is because such sounds have a less harmonic, noisier structure, in which all frequencies occur at once. One solution is to divide sound into two types of components, sines and noise: a smaller number of whistling sine waves is combined with variable noises, or hisses, to complete the imitation.
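
A minimal sketch of the two-component idea, assuming the "hiss" can be stood in for by plain Gaussian white noise at a fixed level. Real sines-plus-noise models shape the noise over time and frequency; the name `sines_plus_noise` and all parameter values here are hypothetical.

```python
import math
import random

def sines_plus_noise(partials, noise_level, duration, sample_rate=8000, seed=0):
    """Mix a few sine partials, given as (frequency_hz, level), with white noise."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    out = []
    for i in range(int(duration * sample_rate)):
        t = i / sample_rate
        tonal = sum(level * math.sin(2 * math.pi * f * t) for f, level in partials)
        hiss = noise_level * rng.gauss(0.0, 1.0)
        out.append(tonal + hiss)
    return out

# Hypothetical "breathy" tone: two whistles plus a layer of hiss.
breathy = sines_plus_noise([(440.0, 0.6), (880.0, 0.2)], noise_level=0.1, duration=0.25)
```

Two sines and one noise source stand in here for what would otherwise take a very large number of sines to imitate.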

Even this 'complete' two-component sound model has issues with the smoothing of the beginnings of sound events, such as consonants in voice or drum sounds in music. A third component, named transient, was introduced around the year 2000 to help model the sharpness of such sounds. Transients alone sound like clicks. Since then, sound has often been divided into three components: sines, noise, and transients.

The three-component model of sines, noise and transients has now been refined by researchers at Aalto University Acoustics Lab, using ideas from auditory perception, fuzzy logic, and perfect reconstruction.

Doctoral researcher Leonardo Fierro and professor Vesa Välimäki realized that the way people hear the different components and separate whistles, clicks, and hisses is important. If a click gets spread in time, it starts to ring and sound noisier; by contrast, focusing on very brief sounds might cause some loss of tonality.

[...] 'The new sound decomposition method opens many exciting possibilities in sound processing,' says professor Välimäki. 'The slowing down of sound is currently our main interest. It is striking that for example in sports news, the slow-motion videos are always silent. The reason is probably that the sound quality in current slow-down audio tools is not good enough. We have already started developing better time-scale modification methods, which use a deep neural network to help stretch some components.'

The high-quality sound decomposition also enables novel types of music remixing techniques. One of them leads to distortion-free dynamic range compression. Namely, the transient component often contains the loudest peaks in the sound waveform, so simply reducing the level of the transient component and mixing it back with the others can limit the peak-to-peak value of audio.
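
The remixing idea can be sketched as follows, assuming the three components are already available as equal-length lists of samples. The toy components, the function names `remix` and `peak`, and the gain of 0.5 are made up for illustration; this is not the decomposition method from the paper, only the mixing step that would follow it.

```python
import math
import random

def remix(sines, transients, noise, transient_gain=1.0):
    """Mix the three components back together, scaling only the transients."""
    return [s + transient_gain * t + n
            for s, t, n in zip(sines, transients, noise)]

def peak(samples):
    """Largest absolute sample value."""
    return max(abs(v) for v in samples)

# Toy components (assumed for illustration, not real decomposition output).
sample_rate = 8000
num_samples = 400
rng = random.Random(0)
sines = [0.3 * math.sin(2 * math.pi * 440 * i / sample_rate)
         for i in range(num_samples)]
noise = [0.01 * rng.gauss(0.0, 1.0) for _ in range(num_samples)]
transients = [0.0] * num_samples
transients[100] = 0.9  # a single loud click

original = remix(sines, transients, noise, transient_gain=1.0)
softened = remix(sines, transients, noise, transient_gain=0.5)
```

In this toy case the click dominates the waveform, so halving the transient gain lowers the peak while samples away from the click are left unchanged. A gain of 1.0 simply sums the components back into the original mix.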

Journal Reference:
Fierro, L. & Välimäki, V. (2023). Enhanced Fuzzy Decomposition of Sound Into Sines, Transients, and Noise. Journal of the Audio Engineering Society. doi: 10.17743/jaes.2022.0077 [doi.org]
