SoylentNews is people

posted by martyb on Wednesday April 07 2021, @07:14PM
from the what-languages-does-it-work-best-and-worst-on? dept.

Google Posts Initial Code For Lyra Speech Codec

Back in February we covered Google's work on the Lyra voice/audio codec, designed for very low bit-rate speech compression in use cases like WebRTC and video chat, even on the most limited Internet connections. By leveraging machine learning, Lyra can function at just 3kbps. The code for Lyra is now public.

[...] The Lyra high-quality, low-bitrate speech codec is open-source with an initial v0.0.1 beta commit made today. Building Lyra requires the Bazel build system as well as a particular revision of LLVM/Clang for ABI compatibility.

GitHub (Apache-2.0 License).

Also at VentureBeat and CNX Software.

Previously:
Google Unveils Lyra Audio Codec with Better Speech Compression than Opus


Original Submission

Related Stories

Google Unveils Lyra Audio Codec with Better Speech Compression than Opus 18 comments

Lyra: A New Very Low-Bitrate Codec for Speech Compression

Since the inception of Lyra, our mission has been to provide the best quality audio using a fraction of the bitrate data of alternatives. Currently, the royalty-free open-source codec Opus is the most widely used codec for WebRTC-based VOIP applications and, with audio at 32kbps, typically obtains transparent speech quality, i.e., indistinguishable from the original. However, while Opus can be used in more bandwidth-constrained environments down to 6kbps, it starts to demonstrate degraded audio quality. Other codecs are capable of operating at comparable bitrates to Lyra (Speex, MELP, AMR), but each suffers from increased artifacts and results in a robotic-sounding voice.

Lyra is currently designed to operate at 3kbps, and listening tests show that Lyra outperforms any other codec at that bitrate and compares favorably to Opus at 8kbps, thus achieving more than a 60% reduction in bandwidth. Lyra can be used wherever the bandwidth conditions are insufficient for higher bitrates and existing low-bitrate codecs do not provide adequate quality.
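The "more than a 60% reduction" figure follows directly from the two bitrates quoted above. A minimal sketch of that arithmetic, where the function name `bandwidth_reduction` is purely illustrative and not part of any Lyra tooling:

```python
def bandwidth_reduction(baseline_kbps: float, codec_kbps: float) -> float:
    """Return the fractional bitrate saving of codec_kbps relative to baseline_kbps."""
    return (baseline_kbps - codec_kbps) / baseline_kbps


# Figures quoted in the summary: Opus at 8 kbps versus Lyra at 3 kbps.
reduction = bandwidth_reduction(baseline_kbps=8.0, codec_kbps=3.0)
print(f"{reduction:.1%}")  # 62.5%
```

This only compares nominal codec bitrates; it ignores packet headers and any other transport overhead.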

[...] The implications of technologies like Lyra are far reaching, both in the short and long term. With Lyra, billions of users in emerging markets can have access to an efficient low-bitrate codec that allows them to have higher quality audio than ever before. Additionally, Lyra can be used in cloud environments enabling users with various network and device capabilities to chat seamlessly with each other. Pairing Lyra with new video compression technologies, like AV1, will allow video chats to take place, even for users connecting to the internet via a 56kbps dial-in modem.

This should help make an 8 MiB copy of Shrek sound even better.

Also at CNX Software and Phoronix.


Original Submission

Google Announces Lyra V2 Low Bit-Rate Speech Codec 12 comments

Google Announces Lyra V2 Low Bit-Rate Voice Codec

Lyra V2 is summed up by Google as being "a better, faster, and more versatile speech codec...a new architecture that enjoys a wider platform support, provides scalable bitrate capabilities, has better performance, and generates higher quality audio."

Lyra V2 makes use of the SoundStream end-to-end neural audio codec and continues to show much better performance than the Opus audio codec, along with improved audio quality and more. The Lyra V2 open-source code is available today.

Lyra 1.2.0 on GitHub. New features:

  • Speed is significantly faster (~5x improvement seen on Android devices).
  • The SoundStream-based model produces significantly higher quality speech (when comparing 3kbps V1 to 3.2 kbps V2).
  • Selectable bitrate (3200, 6000, 9200 bits per second).
  • Codec latency reduced from 100 ms to 20 ms.
  • Mac and Windows support (in addition to continuing support for Linux and Android). Note: we have verified that these build, and run correctly, but have numerous compilation and linker warnings (Windows in particular due to MSVC/gcc mismatch). These issues and support for other platforms like iOS can be addressed by modifying the .bazelrc file. We welcome community contributions for this.
  • More portable code: The TensorFlow Lite model in the .tflite files can be used in other platforms. The TFLite runtime is optimized for individual platforms, replacing the need to write platform specific assembly.
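To get a feel for how small the selectable bitrates listed above are, the sketch below converts each one into payload bytes per packet. It assumes one packet per 20 ms of audio, which is an assumption made here for illustration (the list above states only a 20 ms codec latency); the names `FRAME_MS` and `payload_bytes_per_frame` are likewise hypothetical.

```python
FRAME_MS = 20  # assumption: one packet carries 20 ms of audio
BITRATES_BPS = (3200, 6000, 9200)  # selectable bitrates from the list above


def payload_bytes_per_frame(bitrate_bps: int, frame_ms: int = FRAME_MS) -> float:
    """Return the codec payload size in bytes for one frame at the given bitrate."""
    bits_per_frame = bitrate_bps * frame_ms / 1000
    return bits_per_frame / 8


for bitrate in BITRATES_BPS:
    size = payload_bytes_per_frame(bitrate)
    print(f"{bitrate} bps -> {size:g} bytes per {FRAME_MS} ms frame")
```

Under that assumption the payloads come out to 8, 15 and 23 bytes, before any RTP/UDP/IP headers are added.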

Lyra (codec).

Previously: Google Unveils Lyra Audio Codec with Better Speech Compression than Opus
Google Posts First Beta Code for Lyra Speech Compression Codec


Original Submission

This discussion has been archived. No new comments can be posted.
  • (Score: 0, Interesting) by Anonymous Coward on Wednesday April 07 2021, @07:28PM (1 child)

    by Anonymous Coward on Wednesday April 07 2021, @07:28PM (#1134409)

    >> Thanks to leveraging machine learning, Lyra can function at just 3kbps

    That's fine if you can use machine learning, but most humans are limited to what they learned in grade school. So all you will hear on your end of the conversation is "garble garble garble", while Google hears all the juicy personal information with no problems.

    • (Score: 0) by Anonymous Coward on Wednesday April 07 2021, @07:31PM

      by Anonymous Coward on Wednesday April 07 2021, @07:31PM (#1134411)

      So it's the opposite of homomorphic encryption?

  • (Score: 2) by Booga1 on Wednesday April 07 2021, @07:51PM (9 children)

    by Booga1 (6333) on Wednesday April 07 2021, @07:51PM (#1134418)

    I don't think this would be a big win unless it comes with a significant improvement to correction for packet loss and jitter.
    The tiny bandwidth savings over existing codecs is swamped by the video stream if that's the use case they're pushing. We've got other open and free codecs for low bitrate audio already, so it had better be bringing something to the table beyond "it's from Google".

    • (Score: 2) by istartedi on Wednesday April 07 2021, @08:22PM

      by istartedi (123) on Wednesday April 07 2021, @08:22PM (#1134428) Journal

      This. Digital is great for a lot of things, but it handles interference poorly if it isn't done right. And yes, audio is a small fraction of the bandwidth in audio-visual content so it's like "look, the dots on the i in this font use 10% less ink".

      --
      Appended to the end of comments you post. Max: 120 chars.
    • (Score: 3, Interesting) by knarf on Wednesday April 07 2021, @08:41PM (1 child)

      by knarf (2042) on Wednesday April 07 2021, @08:41PM (#1134431)

      > The tiny bandwidth savings over existing codecs is swamped by the video stream

      ...which is why I'm waiting for a similar effort on video compression. If, instead of a bandwidth-guzzling, compressor-defeating image of a choppy lake or blowing leaves, the compressor just sends "lake, choppy(x,y,z)" or "leaves, blowing(x,y,z)", that video suddenly fits through a drinking straw. A large part of Utube will simply be "painted girl gyrating her behind to bad music", which has the potential to save piles of money in bandwidth not wasted, giving Google a real incentive to get crackin' on this technology.

      • (Score: 2) by Tork on Wednesday April 07 2021, @09:57PM

        by Tork (3914) Subscriber Badge on Wednesday April 07 2021, @09:57PM (#1134470)
        hah. I think you're right. I imagine somewhere in the middle, though, we'll have: "This codec for Michael Bay-type movies, this codec for star wars movies, this codec for rom-coms, and.." ... oh who am I kidding, the fleshtone codec's certainly coming first.
        --
        🏳️‍🌈 Proud Ally 🏳️‍🌈
    • (Score: 2) by takyon on Wednesday April 07 2021, @09:24PM (3 children)

      by takyon (881) <reversethis-{gro ... s} {ta} {noykat}> on Wednesday April 07 2021, @09:24PM (#1134450) Journal

      One of the goals is video chat over a 56 Kbps connection using the AV1 video codec. That's probably easy to achieve at low resolution and less than 30 FPS. Compare to 12.1 Kbps Shrek [soylentnews.org]. The AV2 codec should be around within a few years.

      --
      [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
      • (Score: 1, Interesting) by Anonymous Coward on Wednesday April 07 2021, @09:37PM (2 children)

        by Anonymous Coward on Wednesday April 07 2021, @09:37PM (#1134462)

        I had webcam chats at low resolution on a 56k back in the late 90's. It was just 240x160 @12 fps, but it worked. The camera could do 640x480, but that was too much for 56k at the time. I'm sure it'd look way better now than it did then, but it's already possible. It's not some groundbreaking new tech if it's not improving the quality.

        • (Score: 2) by takyon on Wednesday April 07 2021, @10:57PM

          by takyon (881) <reversethis-{gro ... s} {ta} {noykat}> on Wednesday April 07 2021, @10:57PM (#1134498) Journal

          You can count on it. Codecs are obviously better now, although I think they do better at higher resolutions where there's lots of similar pixels. I assume you need to slash 56K to a 25 Kb/s bitrate to account for two-way communication. Subtract 3 for Lyra, leaving about 22 for AV1.

          --
          [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
        • (Score: 2) by darkfeline on Thursday April 08 2021, @07:37AM

          by darkfeline (1030) on Thursday April 08 2021, @07:37AM (#1134704) Homepage

          It is improving the quality, that's the entire point. RTFA.

          --
          Join the SDF Public Access UNIX System today!
    • (Score: 2, Funny) by Anonymous Coward on Thursday April 08 2021, @01:22AM

      by Anonymous Coward on Thursday April 08 2021, @01:22AM (#1134568)

      And if they really want lowest bandwidth on phone calls they could just use peach wreck ignition and texture messages and then revocalize them using tech stew speech tech.

    • (Score: 2) by bzipitidoo on Thursday April 08 2021, @05:37AM

      by bzipitidoo (4388) on Thursday April 08 2021, @05:37AM (#1134672) Journal

      In theory, you can treat error correction and data compression as completely separate and independent problems. In practice, there is a little bit of savings to be had from "joint source-channel coding" as it's called, that is, combining data compression and error correction into one encoding.

  • (Score: 0) by Anonymous Coward on Wednesday April 07 2021, @09:37PM (1 child)

    by Anonymous Coward on Wednesday April 07 2021, @09:37PM (#1134463)

    And G**gl* is designing smaller deck chairs.

    • (Score: 0) by Anonymous Coward on Thursday April 08 2021, @02:45PM

      by Anonymous Coward on Thursday April 08 2021, @02:45PM (#1134787)

      And the band is playing on smaller violins.

  • (Score: 1) by zion-fueled on Thursday April 08 2021, @02:32PM (1 child)

    by zion-fueled (8646) on Thursday April 08 2021, @02:32PM (#1134783)

    Now google will have the power to censor you real time over the phone. Who knows if what you said will really be what you said. It certainly won't be your voice.

    • (Score: 0) by Anonymous Coward on Thursday April 08 2021, @02:45PM

      by Anonymous Coward on Thursday April 08 2021, @02:45PM (#1134788)

      OMG my mom is going to be so pissed.
