SoylentNews is people

posted by n1 on Monday October 19 2015, @07:02AM
from the your-critique-and-your-help-is-welcome dept.

Some of you may know Ken Starks as an advocate for those people (especially kids) who can't afford a computer of their own. In the process of placing used computers with those folks, Ken and his organization use Linux to cut costs and to avoid proprietary gotchas. As such, you may also know him as an advocate for FOSS.

I have mentioned previously that earlier this year, in his ongoing bout with cancer, Ken had his larynx removed.

As a kid, Ken had a frightening experience due to a guy using one of those buzzers pressed against his throat to "speak". Ken doesn't want any part of freaking out any kids in that way. He has been looking for an out-of-the-box text-to-speech app that runs under Linux.

What Ken has found is that, while there are several nice text-to-speech apps for Android, the state of text-to-speech for Desktop Linux is very sad. The developers of FOSS TTS apps tend to reach a state of suits-my-needs and leave it there.

Someone who has a bit of experience setting up software, is adept with a search engine, and has patience can install one of the existing FOSS TTS apps and get that working at a useful level. A nice out-of-the-box experience for Joe Average, however, has been lacking up to now. Ken has been trying to find developers who can bring MaryTTS to a state of usefulness and ease that will make it on par with the experience you would expect from a payware app.

Via his column at FOSS Force, Ken now reports:

[More after the break.]

Although the numbers behind the name do not reflect it, the currently-named "SpeechLess" front end for MaryTTS is now being released as beta software. I was able to assemble a three man team to create a GUI and, to my way of thinking, it has come along nicely. Although the demo is web-based, these guys have been able to construct it so the entire thing is local. That means little to no latency between hitting Enter and having the text replicated to speech.

I've talked at length about how TTS in the Linuxsphere is less than user-friendly at about every turn. Our goal is to create a front end that makes MaryTTS easy to use for everyone. We're getting there.

[...] The quality of the voices [in our current beta release is] acceptable, especially when measured against the majority of voices already available in Linux TTS applications.

[...] A bit of assistance here. Who can create a butler-type graphic character to represent the current application? The name "speechless" is only temporary. We'll decide on a more permanent name once you show us a great servant for the people.

Does the improvement/expansion of the catalog of FOSS apps interest you? Would an improved version of this specific Linux-compatible project be of use to you? Can you be a beta tester and supply feedback? Even better: Do you know someone who also has this disability and can be a beta tester? Can you lend your software development talents to the effort or do you know of someone who would be interested?


Original Submission

Related Stories

Sébastien Jodogne, Reglue Win FSF 2014 Awards 12 comments

The Free Software Foundation's annual conference LibrePlanet is over. Following tradition, two awards are given annually at the conference. The announcement of the event and the winners is here.

The winner of the Award for the Advancement of Free Software in 2014 is Belgian PhD Sébastien Jodogne for his work on medical imaging with his project Orthanc. The winner of the Award for Projects of Social Benefit in 2014 is Reglue which gives recycled GNU/Linux computers to underprivileged children and their families in Austin, TX.

Meta Says its New Speech-generating AI Tool is Too Dangerous to Release 37 comments

Meta says its new speech-generating AI tool is too dangerous to release:

Meta has unveiled a new AI tool, dubbed 'Voicebox', which it claims represents a breakthrough in AI-powered speech generation. However, the company won't be unleashing it on the public just yet - because doing so could be disastrous.

Voicebox is currently able to produce audio clips of speech in six languages (all of which are European of origin), and - according to a blog post from Meta - is the first AI model of its kind capable of completing tasks beyond what it was 'specifically trained to accomplish'. Meta claims that Voicebox handily outperforms competing speech-generation AIs in virtually every area.

So what exactly is it capable of? Well, for starters, it can spew out reasonably accurate text-to-speech replications of a person's voice using a sample audio file as short as two seconds, a seemingly innocuous ability that holds a huge amount of destructive potential in the wrong hands.

[...] Meta clearly believes its new tool is good enough to fool at least the majority of people [since] it's explicitly not releasing Voicebox to the public, but instead publishing a research paper and detailing a classifier tool that can identify Voicebox-generated speech from real human speech. Meta describes the classifier as "highly effective" - though notably not perfectly effective.

[...] A little caution, patience, and respect for the magnitude of this technology is a welcome sight - although I doubt Meta will sit on Voicebox for too long, since the shareholders will no doubt be wondering how much money it can make them...


Original Submission

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: -1, Offtopic) by Anonymous Coward on Monday October 19 2015, @07:54AM

    by Anonymous Coward on Monday October 19 2015, @07:54AM (#251728)

    frosty piss for all!

  • (Score: 2) by TheRaven on Monday October 19 2015, @09:09AM

    by TheRaven (270) on Monday October 19 2015, @09:09AM (#251743) Journal
    Huh? The Sphinx engine from CMU produces very nice output. Lots of things use it. Pocket Sphinx is a lot less advanced, but still impressive in comparison to its resource usage.
    --
    sudo mod me up
    • (Score: 0) by Anonymous Coward on Monday October 19 2015, @09:20AM

      by Anonymous Coward on Monday October 19 2015, @09:20AM (#251745)

      He doesn't want a speech recognition app.

      -- gewg_

      • (Score: 4, Informative) by TheRaven on Monday October 19 2015, @09:31AM

        by TheRaven (270) on Monday October 19 2015, @09:31AM (#251749) Journal
        Sorry, right research project, wrong library. Festival / Flite were the ones that I was thinking of. They're used by quite a lot of things. In GNUstep, the default NSSpeechSynthesizer implementation (and, therefore, the 'say' command-line tool, which is a vocal equivalent of echo) uses Flite.
        --
        sudo mod me up
        • (Score: 1, Informative) by Anonymous Coward on Monday October 19 2015, @10:14AM

          by Anonymous Coward on Monday October 19 2015, @10:14AM (#251755)

          I've said before that I don't do video, but if someone has a link to a software novice getting this installed from scratch then running it, that would be proof that it's the right stuff.

          The situation Ken encountered is that the end user has to glue all the pieces together and he wants something that a 3 year old can do. [google.com]

          -- gewg_

          • (Score: 4, Informative) by fnj on Monday October 19 2015, @12:52PM

            by fnj (1654) on Monday October 19 2015, @12:52PM (#251783)

            Festival as packaged is just sad. It's a valuable building block, but a solution it ain't. There isn't even a rudimentary GUI. Selecting a non-default voice is not discoverable, and the default voice is pretty bad. There is no "man festival". "festival -h" is terse and not enough to get you going. A naive attempt to run "festival hello" yields the unhelpful error "SIOD ERROR: could not open file hello".

            Yeah, I googled how to get working output from it. "saytime" works fine if some divine intelligence tells you it is available to run after you install festival, but there is no simple "say" - WTF?

            As it stands, we are frozen in time circa mid-1990s.

            • (Score: 0) by Anonymous Coward on Monday October 19 2015, @02:22PM

              by Anonymous Coward on Monday October 19 2015, @02:22PM (#251822)

              I've found the British voice is much better sounding than the default. I also had it read Pidgin messages; there's an add-on for that.

  • (Score: 3, Interesting) by ledow on Monday October 19 2015, @10:04AM

    by ledow (5567) on Monday October 19 2015, @10:04AM (#251754) Homepage

    Slackware had SpeakUp available on their install disks for decades, for blind users. Not to mention Festival and stuff like that.

    What's new and different about this compared to anything else? Ease of setup alone?

    Surely the problem is not the setup (I would expect someone who can't talk to be able to set up their PC, but not someone who was blind), but the day-to-day-usage. How easy is it to browse a modern HTML5 + Javascript website, in a compatible browser, with speech to make the process useful? I'm guessing almost impossible, even if the website has bothered to add hints for speech software. How do you pay with PayPal, do your banking, or anything else that people are fast becoming required to do nowadays?

    The problem is not that it's not a one-click process. It's that it's an almost impossible task to use for anything but plain command-line stuff. And enabling your application to be speech capable is difficult for all but the dumbest of apps.

    And I don't believe that they are able to surpass Festival in terms of quality - that's been ongoing for decades.

  • (Score: 0) by Anonymous Coward on Monday October 19 2015, @10:46AM

    by Anonymous Coward on Monday October 19 2015, @10:46AM (#251763)

    FA-TTS is a Text-to-Speech engine with a REST API by mivoq.it. It uses marytts-server.

    docker pull fic2/fatts

    docker run -d -p 59125:59125 fic2/fatts

    Then you can browse http://localhost:59125 to test it.

    If you don't have Docker, the Dockerfile shows how to install it:

    apt-get -y update \
            && apt-get -y install curl \
            && curl -s http://repository.mivoq.it/mivoq.gpg.key | apt-key add - \
            && echo "deb http://repository.mivoq.it/repositories/apt/debian experimental main" > /etc/apt/sources.list.d/fatts.list \
            && apt-get -y update \
            && apt-get -y install `apt-cache search '^marytts-voice-' | sed -e 's/ .*//'` \
            && apt-get -y clean
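
    Once a server is listening on port 59125, it may also be possible to skip the browser and request audio directly. The sketch below is only that: it assumes the container exposes the stock MaryTTS HTTP interface (a /process endpoint taking INPUT_TEXT, INPUT_TYPE, OUTPUT_TYPE, AUDIO and LOCALE), which FA-TTS might not do unchanged, and that an en_US voice is installed. The names mary_say, MARY_URL and CURL are made up for illustration.

    ```shell
    # Assumption: stock MaryTTS HTTP interface on the port used above.
    MARY_URL=${MARY_URL:-http://localhost:59125/process}

    # Hypothetical helper: synthesize the text in "$1" into the WAV file "$2".
    # CURL may be overridden (e.g. for a dry run); it is left unquoted on
    # purpose so that a value containing flags splits into separate words.
    mary_say() {
        # -G sends the url-encoded parameters as a GET query string;
        # -o writes the returned audio to the requested file.
        ${CURL:-curl} -s -G "$MARY_URL" \
            --data-urlencode "INPUT_TEXT=$1" \
            --data-urlencode "INPUT_TYPE=TEXT" \
            --data-urlencode "OUTPUT_TYPE=AUDIO" \
            --data-urlencode "AUDIO=WAVE_FILE" \
            --data-urlencode "LOCALE=en_US" \
            -o "$2"
    }
    ```

    Something like `mary_say "hello there" hello.wav` followed by whatever audio player is installed should then speak the text, if the assumptions above hold.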

    • (Score: 1, Funny) by Anonymous Coward on Monday October 19 2015, @03:22PM

      by Anonymous Coward on Monday October 19 2015, @03:22PM (#251850)

      Brilliant! Any three-year-old can do that in their sleep!

      • (Score: 0) by Anonymous Coward on Monday October 19 2015, @09:31PM

        by Anonymous Coward on Monday October 19 2015, @09:31PM (#252054)

        After the web server is up, any four year old can use the web app. Snippy much?

  • (Score: 2) by opinionated_science on Monday October 19 2015, @11:59AM

    by opinionated_science (4031) on Monday October 19 2015, @11:59AM (#251779)

    How about just run the android app on the linux desktop? Or maybe in chrome?

    A native app would be nice...but possibly not necessary.

    For all its problems, android is a huge market, and some neat things exist!

    • (Score: 2) by Hairyfeet on Monday October 19 2015, @01:49PM

      by Hairyfeet (75) <bassbeast1968NO@SPAMgmail.com> on Monday October 19 2015, @01:49PM (#251804) Journal

      Or if they absolutely have to have it FOSS, here is a thought: why not simply contact the companies that have ones that work really well, find out how much they want for the rights, and then have a Kickstarter to just buy the thing? We've seen time and time again you can easily raise money for good causes through indie funding campaigns and I seriously doubt they'd have any issue raising the cash for a cheap device that allows those with no voice to speak. It just seems silly to have to reinvent the wheel when there are companies out there with working solutions that I'm sure would be willing to sell it for the right price.

      --
      ACs are never seen so don't bother. Always ready to show SJWs for the racists they are.
  • (Score: 1) by cmdr_tofu on Monday October 19 2015, @03:06PM

    by cmdr_tofu (5669) on Monday October 19 2015, @03:06PM (#251842)

    What's wrong with Festival? It's free, open-source, supports many languages, and can be driven from the command line. For example:

      echo hi | festival --tts
      echo linux rules | festival --tts

    They sound a little mechanical, but they're completely understandable.
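
    For anyone who wants the single "say" command that the thread notes is missing, the pipe above can be wrapped in a small shell function. This is a minimal sketch, not something Festival ships: the function name say and the TTS_CMD variable are invented here, and only the `festival --tts` invocation comes from the commands above.

    ```shell
    # Hypothetical helper: speak the arguments through a TTS command.
    # TTS_CMD defaults to the "festival --tts" invocation shown above;
    # set it to any other command that reads text on standard input.
    say() {
        # "$*" joins all arguments into one space-separated line.
        # ${TTS_CMD:-...} is unquoted on purpose so that it splits into
        # the command name and its flags.
        printf '%s\n' "$*" | ${TTS_CMD:-festival --tts}
    }
    ```

    Dropped into ~/.bashrc or similar, this would allow `say linux rules` instead of the echo-and-pipe form. It does nothing about voice selection or a GUI, which is the larger complaint in this thread.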

    • (Score: 0) by Anonymous Coward on Monday October 19 2015, @03:26PM

      by Anonymous Coward on Monday October 19 2015, @03:26PM (#251852)

      Point -- --- ---- ----- -- ---- ---- -- >
                                      Your
                                      Head

      I think you get the idea.

  • (Score: 0) by Anonymous Coward on Monday October 19 2015, @04:01PM

    by Anonymous Coward on Monday October 19 2015, @04:01PM (#251871)

    The developers of FOSS [...] apps tend to reach a state of suits-my-needs and leave it there.

    That's the open source spirit for ya.

    • (Score: 0) by Anonymous Coward on Tuesday October 20 2015, @11:23AM

      by Anonymous Coward on Tuesday October 20 2015, @11:23AM (#252243)

      "He has a mind so narrow that he can look through a keyhole with both eyes at the same time".

      There are people who give money for permission to use someone else's proprietary software and those folks don't "own" anything but a Certificate of Authenticity.

      Ever try to get something fixed with one of those packages?
      How'd that go?
      I'm betting they told you to wait and buy another CoA in several months--or years--when they release their next version.
      ...which may or may not have your pet peeve sorted.

      In contrast, with FOSS, you can
      1) make the changes to the codebase yourself (assuming you possess the proper skill level).
      2) give money directly to the original developer to get the improvements you need.
      3) hire another developer.
      4) put a bounty on the fix.

      None of those are an option unless you have the source code.

      -- gewg_