
SoylentNews is people

posted by martyb on Sunday September 25 2016, @05:41PM   Printer-friendly
from the Say-"What?" dept.

A decade ago, we in the free and open-source community could build our own versions of pretty much any proprietary software system out there, and we did. Publishing, collaboration, commerce, you name it. Some apps were worse, some were better than closed alternatives, but much of it was clearly good enough to use every day.

But is this still true? For example, voice control is clearly going to be a primary way we interact with our gadgets in the future. Speaking to an Amazon Echo-like device while sitting on my couch makes a lot more sense than using a web browser. Will we ever be able to do that without going through somebody's proprietary silo like Amazon's or Apple's? Where are the free and/or open-source versions of Siri, Alexa and so forth?

The trouble, of course, is not so much the code, but in the training. The best speech recognition code isn't going to be competitive unless it has been trained with about as many millions of hours of example speech as the closed engines from Apple, Google and so forth have been. How can we do that?

[...] Who has a plan, and where can I sign up to it?

Perhaps a distributed computing project (along the lines of Folding@Home, SETI, etc.) would be a viable approach?
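A distributed project would not need to ship raw recordings anywhere: volunteers could train on their own audio locally and contribute only model updates, which a coordinator averages (the idea behind federated averaging). The sketch below is a toy illustration of that aggregation step only; the "model" and the local "training step" are hypothetical stand-ins, not any real speech engine's internals.

```python
# Sketch: volunteers train on private audio locally and share only
# model weight updates; a coordinator averages them (federated averaging).
# All names here are hypothetical, not an existing project's API.

def local_update(model, private_samples, lr=0.1):
    """Nudge each weight toward the mean of the volunteer's local data.
    Stands in for a real gradient step on private audio features."""
    return {k: w - lr * (w - sum(s[k] for s in private_samples) / len(private_samples))
            for k, w in model.items()}

def federated_average(updates):
    """The coordinator sees only weight vectors, never raw recordings."""
    n = len(updates)
    return {k: sum(u[k] for u in updates) / n for k in updates[0]}

global_model = {"w0": 0.0, "w1": 0.0}
# Three volunteers, each with data that never leaves their machine.
volunteers = [
    [{"w0": 1.0, "w1": 2.0}],
    [{"w0": 3.0, "w1": 0.0}],
    [{"w0": 2.0, "w1": 4.0}],
]
updates = [local_update(global_model, data) for data in volunteers]
global_model = federated_average(updates)
print(global_model)
```

The spare-cycles character of Folding@Home carries over naturally: each volunteer's update is cheap, and only the small averaged model ever travels.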


Original Submission

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 0) by Anonymous Coward on Sunday September 25 2016, @05:48PM

    by Anonymous Coward on Sunday September 25 2016, @05:48PM (#406325)

    How's that Google-killer search engine (Nutch/Lucene?) coming along?

  • (Score: 2, Insightful) by Anonymous Coward on Sunday September 25 2016, @05:50PM

    by Anonymous Coward on Sunday September 25 2016, @05:50PM (#406327)

    I doubt they exist in a feasible form. I'm still waiting for https://cloud.open365.io/ [open365.io] to become a real thing that can replace Office365 or Google Docs/crap. But it's expensive to run these things. You need powerful servers that are up all the time in many geo locations.
    All too often, people still think that Open Source == Zero USD (which obviously is wrong), which doesn't help this either. Until people contribute hard cash to Open Source, we won't see this happen. I'm saddened by it, but here we are...
    I think that is the next step Open Source needs to take: get people to donate money to the projects that are crucial, which is not the same as 'projects they like' (or their boat, car, whatever - maybe open source projects and http://www.npr.org/ [npr.org] [wait, what? No https? Really guys?] can exchange some ideas about pledge drives?)

    • (Score: 5, Insightful) by LoRdTAW on Sunday September 25 2016, @06:12PM

      by LoRdTAW (3755) on Sunday September 25 2016, @06:12PM (#406333) Journal

      The true spirit of open source died a long time ago. Originally it was created to allow people the freedom to modify code as they pleased. But today, open source is used as a cheap tool or foundation to build a bigger proprietary product. You will never see open source services like O365, Siri, Google search, etc. But you will see open source doing all the heavy lifting for those commercial services.

      • (Score: 0) by Anonymous Coward on Sunday September 25 2016, @06:14PM

        by Anonymous Coward on Sunday September 25 2016, @06:14PM (#406334)

        OP here
        You speak truth. A sad truth, but truth...

      • (Score: 0) by Anonymous Coward on Monday September 26 2016, @06:45AM

        by Anonymous Coward on Monday September 26 2016, @06:45AM (#406556)

        The ideals of a generation make little sense to the next generation.
        Luckily there is a solution, GPL3.

    • (Score: 4, Insightful) by Adamsjas on Sunday September 25 2016, @07:19PM

      by Adamsjas (4507) on Sunday September 25 2016, @07:19PM (#406362)

      Quoting AC:
      "I think that is the next step Open Source needs to take: get people to donate money to the projects that are crucial,".

      That's probably exactly the wrong approach. Building some huge project that takes mountains of money to sustain is never going to work in the FOSS world.

      What is needed is a small engine you can trust to run 24/7 on your own Linux boxes (ok, maybe Windows and cell phones too) that listens to conversations on an open mic, your own speech, speech on the TV or radio heard in the room, and can run multiple competitive recognition plug-ins simultaneously, while offering periodic training and review sessions. The whole system would be heavily audited to make sure nothing goes across the TCP stack, so people could trust it to run all the time.

      Part two would be some equally heavily audited recognition results, containing no actual recorded conversations (longer than two words), that would submit parameterized reco settings to all the plug-in writers that have the time, interest, and infrastructure to process them.

      Neither part needs to yield any instant translations to text, and probably shouldn't, at least until it builds up some semblance of reliability. It could be strictly a spare-cycles process.
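      The design above — several plug-ins running on the same audio, everything local, audited so nothing leaves the machine — can be sketched roughly as follows. The plug-in host, the trivial "recognizers", and the registration-time network check are all hypothetical stand-ins:

```python
# Sketch of the parent's design: several recognition plug-ins run on the
# same audio frame, results are compared locally, and an auditable guard
# refuses to register any plug-in that wants network access. The plug-in
# names and the trivial "recognizers" are made-up stand-ins.

from collections import Counter

class PluginHost:
    def __init__(self):
        self.plugins = {}

    def register(self, name, fn, uses_network=False):
        # The audit rule: plug-ins that touch the network never get audio.
        if uses_network:
            raise ValueError(f"plug-in {name!r} rejected: network access")
        self.plugins[name] = fn

    def recognize(self, audio_frame):
        """Run every plug-in on the frame and return the majority result."""
        votes = Counter(fn(audio_frame) for fn in self.plugins.values())
        return votes.most_common(1)[0][0]

host = PluginHost()
host.register("engine_a", lambda frame: frame.lower())
host.register("engine_b", lambda frame: frame.lower())
host.register("engine_c", lambda frame: frame.upper())

print(host.recognize("Hello"))  # majority of the plug-ins agree
```

      Running competing engines against each other like this also gives the periodic review sessions something concrete to compare: frames where the plug-ins disagree are exactly the ones worth a human look.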

    • (Score: 2) by NotSanguine on Sunday September 25 2016, @11:46PM

      by NotSanguine (285) <{NotSanguine} {at} {SoylentNews.Org}> on Sunday September 25 2016, @11:46PM (#406432) Homepage Journal

      I'm still waiting for https://cloud.open365.io/ [open365.io] to become a real thing that can replace Office365 or Google Docs/crap. But it's expensive to run these things. You need powerful servers that are up all the time in many geo locations.

      'Cloud' = Someone else's servers.

      Who do you trust with your data or your binaries (or even worse, their binaries)?

      Trust Ivanova. Trust yourself. Anyone else? shoot 'em!

      --JMS

      --
      No, no, you're not thinking; you're just being logical. --Niels Bohr
  • (Score: 2) by opinionated_science on Sunday September 25 2016, @05:55PM

    by opinionated_science (4031) on Sunday September 25 2016, @05:55PM (#406328)

    there was a product from IBM for voice recognition (ViaVoice), which worked quite well - once trained.

    However, I tried it on my father (not a native English speaker) - really bad.

    Retrained it in my father's language - a definite improvement, and perhaps 90% usable. That was 10 years ago...

    I am going to venture the opinion that this is a "solved" problem; however, the general training matrix (the handwavy Deep Mind BS) might be hard to find without feedback.

    For voice recognition to be "good enough", it needs to be engineered to be failsafe - forget robots, you don't want your TV ordering things off the internet!!!

    • (Score: 2) by frojack on Sunday September 25 2016, @10:25PM

      by frojack (1554) on Sunday September 25 2016, @10:25PM (#406413) Journal

      IBM has never been big in this field.

      Nuance (Dragon Dictate) pretty much owned it until Google launched Google Voice to obtain mountains of voice samples and built their own engine. Google's voice reco is at least as good as Nuance's tech (which is used by Apple in the iPhone and OS X).

      --
      No, you are mistaken. I've always had this sig.
  • (Score: 3, Interesting) by Anonymous Coward on Sunday September 25 2016, @06:00PM

    by Anonymous Coward on Sunday September 25 2016, @06:00PM (#406331)

    Will not build.

    Why assume sentence based natural language speech control is the future? I think most developers (like me) are not huge fans of such systems, or even hate them.

    Gaze-based or gesture-based control makes at least as much sense in the proposed scenario, as do simple sound- and/or speech-based menu navigation systems that already work today (I've known 2 developers who independently made such systems using existing open source tools). It's the automatically (and inaccurately) inferring meaning from natural speech that's hard, and as a developer, I prefer to work with my computer on a more exact and well-defined basis than shitty human natural language. I also don't want an AI sifting through all my personal data so it can attempt to guess context better. I'm happy to provide specific context for what kind of action I want.
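    Speech-based menu navigation of the kind mentioned above can be as small as a table of exact keywords per menu state - no inference over free-form language required. The menu tree below is a made-up example:

```python
# Sketch of keyword-driven menu navigation: the recognizer only has to
# match a handful of exact words per state, so no natural-language
# inference is needed. The menu tree here is an invented example.

MENU = {
    "top":   {"music": "music", "mail": "mail"},
    "music": {"play": "playing", "back": "top"},
    "mail":  {"read": "reading", "back": "top"},
}

def navigate(state, word):
    """Advance the menu on an exactly matched keyword; ignore anything else."""
    return MENU.get(state, {}).get(word, state)

state = "top"
for word in ["music", "mumble", "play"]:  # "mumble" is simply ignored
    state = navigate(state, word)
print(state)  # ends in the "playing" action state
```

    Because each state accepts only a few fixed words, even a modest open-source recognizer can be tuned to that tiny vocabulary and stay accurate.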

    • (Score: 3, Interesting) by Gaaark on Sunday September 25 2016, @07:27PM

      by Gaaark (41) on Sunday September 25 2016, @07:27PM (#406365) Journal

      I'd definitely be more comfortable (not privacy-wise or security-wise, unless open source and trusted) with thought recognition, so I could think something and have it happen.
      Talking into my tablet was cute for 2 seconds, and is now unused. I hate phones (I would have bought one of the Ubuntu phones if I'd had the money and it came through on Indiegogo or whatever, simply because 'mobile computer') and hate talking to phones and people. I prefer face to face, I guess so I can more easily recognize sarcasm, humour, etc.

      Talking to my computer would be tiresome for me, I think. Probably give me a headache, and I'd be looking for a keyboard or something else.

      I think ViaVoice, etc. didn't take off partially because of recognition problems, but also because people felt too uncomfortable talking to their computer with other people hanging around.

      (Although, I may prefer talking to a robot that was logical, just like I don't mind talking to intelligent, logical people.... I just hate talking to morons... especially morons on the phone, except you can just hang up on them instead of having to try to walk away from them and have them follow you.)

      Yeah, give me a sexy femsexbot with logic circuits and I'd probably talk more, lol.
      I love my wife, but sometimes logic just ain't there. Intelligence yes, logic not as much.

      --
      --- Please remind me if I haven't been civil to you: I'm channeling MDC. ---Gaaark 2.0 ---
      • (Score: 3, Interesting) by frojack on Sunday September 25 2016, @10:13PM

        by frojack (1554) on Sunday September 25 2016, @10:13PM (#406412) Journal

        Talking to my computer would be tiresome for me, I think. Probably give me a headache, and I'd be looking for a keyboard or something else.

        Tiresome is the least of the problems. Annoying, disruptive, and not-private should be added to the list.

        But right up there at the top of the list is Too Damn Easy to Weaponize for spying.

        Analyzing a lot of speech (sound) is hard and time consuming.
        But recording or real-time speech-to-text (mountains of it) for later analysis or instant analysis has a lot of potential for abuse.
        We just can't seem to keep ourselves from building skynet.

        --
        No, you are mistaken. I've always had this sig.
      • (Score: 1, Funny) by Anonymous Coward on Monday September 26 2016, @07:30AM

        by Anonymous Coward on Monday September 26 2016, @07:30AM (#406564)

        Siri, open web browser.
        Siri, go to pornhub.com

        Siri, hide browser.
        Hi mom, no I was just talking to myself.

    • (Score: 1, Informative) by Anonymous Coward on Sunday September 25 2016, @07:34PM

      by Anonymous Coward on Sunday September 25 2016, @07:34PM (#406370)

      "shitty natural language" - you obviously speak for yourself .

      • (Score: 1) by Francis on Sunday September 25 2016, @11:52PM

        by Francis (5544) on Sunday September 25 2016, @11:52PM (#406435)

        It is shitty though. Natural language was designed for flexible communication at the expense of precision.

        Usually close enough is sufficient. But with computers, you want fast and precise and probably don't need to be able to write poetry or discuss philosophy with a program to get your work done. And there's a limit on what novel uses you can find without resorting to programming new programs.

        Compare that to a proper UI and it's not even close which is better.

  • (Score: 2, Funny) by Ethanol-fueled on Sunday September 25 2016, @06:14PM

    by Ethanol-fueled (2792) on Sunday September 25 2016, @06:14PM (#406335) Homepage

    My friend has an Alexa. You can tell it to suck your dick (or any other boorish phrase) and it will tell you something like "I don't think I want to do that." You can also ask it for a black dildo and receive an instantaneous suggestion from Amazon. It will turn off if you yell at it, "Shut up!" or "Shut the fuck up!"

    Pretty neat actually...can't wait until people hack them or make a fully user-configurable version of it. It's good to see that Alexa's developers at least had a sense of humor.

    • (Score: 0) by Anonymous Coward on Sunday September 25 2016, @06:55PM

      by Anonymous Coward on Sunday September 25 2016, @06:55PM (#406351)

      And every time you say something oddball or insulting to Alexa, that action and the date and time will go into the permanent dossier that Amazon is keeping on you, your family and friends. They'll eventually start looking for why you were grouchy, correlating the data from all the other devices that track your movements and your communications as well as those of all the people around you.

    • (Score: 1) by ShadowSystems on Sunday September 25 2016, @07:07PM

      by ShadowSystems (6185) <ShadowSystemsNO@SPAMGmail.com> on Sunday September 25 2016, @07:07PM (#406360)

      Oddly enough one of my friends likes her Iphone for this very reason.
      She'll instruct the phone to address her as some form of "Goddess High Bitch Muckymuck" or some other demented title, then laugh her ass off when her phone addresses her in public & turns heads in classic "WTF?" whiplash.
      Or she'll ask her phone obviously sexually oriented questions like "Hey Siri, can you help me get laid?" & then laugh her tits off when the phone replies with something like "Probably, but where would we find a chicken at this time of night?"
      I shake my head in amusement at the things she can get up to, the things she can think up to ask her phone, & then the side splitting hilarity the phone can reply with.
      It's not quite up to passing the Turing Test yet, but not for lack of trying.
      It's *almost* enough to make me want to buy an Iphone for myself JUST so I can mess with its A.I.-inspired, machine learning taught, big data mining for clues & context, silicon brain inna chip, glorious metal ass.
      =-)p

      • (Score: 2) by Appalbarry on Sunday September 25 2016, @07:38PM

        by Appalbarry (66) on Sunday September 25 2016, @07:38PM (#406372) Journal

        She'll instruct the phone to address her as some form of "Goddess High Bitch Muckymuck" or some other demented title

        Pffft. Thirty years ago I had a cow-orker called John, who was often referred to as "God" because he actually was right about everything 99% of the time.

        One morning we arrived at work, and fired up our Macs - the pizza box variety, whose model I forget - to discover that any time someone made a mistake, instead of a "bing" noise, the computer would say "This is GOD. You made a mistake, and I KNOW."

        Around the same time I heard stories about people programming office phone systems in a similar way, replacing the provided ring tones with some very interesting alternatives.

        • (Score: 2) by RamiK on Sunday September 25 2016, @08:45PM

          by RamiK (1813) on Sunday September 25 2016, @08:45PM (#406386)

          Pffft. Thirty years ago I had a cow-orker called John, who was often referred to as "God" because he actually was right about everything 99% of the time.

          I think you got your prophetic half-man half-ork Kamadhenu friend mixed up with Brahma. Again [wikipedia.org]...

          --
          compiling...
  • (Score: 4, Insightful) by Dunbal on Sunday September 25 2016, @06:26PM

    by Dunbal (3515) on Sunday September 25 2016, @06:26PM (#406339)

    Solution looking for a problem.

  • (Score: 0) by Anonymous Coward on Sunday September 25 2016, @06:36PM

    by Anonymous Coward on Sunday September 25 2016, @06:36PM (#406343)

    There's a community, now, is there? And we're all one amorphous blob and we should all be doing what the pundits think we should be doing?

  • (Score: 1) by islisis on Sunday September 25 2016, @06:49PM

    by islisis (2901) on Sunday September 25 2016, @06:49PM (#406347) Homepage

    I don't want a computer condemned to hearing my voice until I can hear its.

    Voice is an analog gesture, one more personalised than most. Why I would want to retrain it in exchange for a comparatively vacuum-like response, when as other posts have pointed out I can manageably learn to interact with this generation of machines using far more visceral feedback schemes, I can't begin to imagine.

    I'm not going to perform for machines until they are on their way to meeting me at the same level, which in AI terms will not be for decades. This makes the developmental priority clear.

    An abstract reason to give perhaps, but plainly human enough for me.

  • (Score: 3, Interesting) by garfiejas on Sunday September 25 2016, @06:55PM

    by garfiejas (2072) on Sunday September 25 2016, @06:55PM (#406350)

    Brute force approaches such as Siri, Alexa, Echo etc are fantastic at getting input into a "dumb" machine - if you don't mind the privacy and security issues of every command being analysed and viewable by third parties, some of whom may not have your best interests at heart*

    Newer systems like https://solid.mit.edu/ [mit.edu] or https://maidsafe.net/ [maidsafe.net], which get the end-points doing the learning and a distributed network doing the processing, would be more open - there are hundreds of millions of active iOS devices, many times that running Android, and if we're looking at the near future, a few orders of magnitude more in IoT land

    But if you wanted machines to actually, really, understand you - well, there are "potentially" other ways of doing that https://en.wikipedia.org/wiki/Artificial_general_intelligence/ [wikipedia.org] - and hopefully some of those WILL be OpenSource...

  • (Score: 2, Insightful) by fubari on Sunday September 25 2016, @07:21PM

    by fubari (4551) on Sunday September 25 2016, @07:21PM (#406363)

    Quote: "It takes around five years to train a PhD student, and five years ago there weren’t that many PhD students starting a career in deep learning. What this means now is that those few that there are are being prized very highly." (from Why Google Is Investing In Deep Learning [fastcompany.com]).

    Asking for "open source advanced voice recognition" is like asking for...
    "open source trans-pacific fiber cables"
    "open source space shuttle"
    "open source open heart surgery"

    All of those things, including "state of the art voice recognition", require some non-trivial infrastructure and rather deep human expertise.

    Consider the experts who build things like today's state-of-the-art voice recognition systems.
    Where are you going to find them? I mean the mathematicians, linguists, neuroscientists and programmers who can do these sorts of things? (My guess would be Google, Baidu, Apple, Microsoft Research and maybe some universities.)

    You need lots of processing power.
    You need lots of human brain power.
    What part of this fits "open source" ?

    But hey, if you can do it, Awesome :-) I wish you the best of luck.
    I suppose you could start here... some of the listed projects are open source.
    https://en.wikipedia.org/wiki/List_of_artificial_intelligence_projects [wikipedia.org]

    Good luck. Let us know how it goes.

    • (Score: 0) by Anonymous Coward on Sunday September 25 2016, @09:25PM

      by Anonymous Coward on Sunday September 25 2016, @09:25PM (#406400)

      So what you're saying is open source AI, like every other cool technology, is just five years away from market? WooHoo!

  • (Score: 1, Informative) by Anonymous Coward on Sunday September 25 2016, @07:42PM

    by Anonymous Coward on Sunday September 25 2016, @07:42PM (#406373)

    Check this out...

    http://lucida.ai/ [lucida.ai]

  • (Score: 0) by Anonymous Coward on Monday September 26 2016, @05:04AM

    by Anonymous Coward on Monday September 26 2016, @05:04AM (#406538)
  • (Score: 0) by Anonymous Coward on Monday September 26 2016, @08:24AM

    by Anonymous Coward on Monday September 26 2016, @08:24AM (#406576)

    I know what FOSS is, but where does the L come from?

    If you're going to start inserting new letters into your bloody TLAs and FLAs, then at least tell us what they mean!

    I swear, I'm going crazy trying to keep track of what all the acronyms mean. I work in an industry that is heavy on them, and I hobby in another industry that is heavy on them. It's driving me nuts.

    -grumpy old dude.

    • (Score: 2) by linuxrocks123 on Monday September 26 2016, @09:42AM

      by linuxrocks123 (2557) on Monday September 26 2016, @09:42AM (#406586) Journal

      FLOSS is "libre", as in "Free/Libre Open Source Software". "Libre" is French for "free", but only in the sense of "freedom" and not in the sense of "free beer". I don't know who first put the L in there, but the modified acronym has been in use for at least several years. It hasn't 100% supplanted FOSS, though -- obviously, or you would have probably noticed before now :)

      Bottom line, it means exactly the same thing as FOSS, but expands to a less ambiguous description. And it also spells a cute word.

  • (Score: 2) by Lemming on Monday September 26 2016, @02:17PM

    by Lemming (1053) on Monday September 26 2016, @02:17PM (#406653)

    The Mycroft [mycroft.ai] project aims to develop an open source software and open hardware home A.I. platform using voice / natural language control. A year ago they had a Kickstarter [kickstarter.com] to get things rolling. Now they are getting ready for the first production run, aiming to ship the units to the Kickstarter backers in about two months.

    The hardware is based on the Raspberry Pi 2 and Arduino, and the device runs Snappy Ubuntu Core. The source code will be released under the GPLv3.

  • (Score: 0) by Anonymous Coward on Monday September 26 2016, @04:02PM

    by Anonymous Coward on Monday September 26 2016, @04:02PM (#406678)

    Why would TFA think the purpose of FOSS is to replicate every commercial endeavor out there? I do not think the purpose of FOSS is what TFA's author thinks the purpose is.

    No, voice control is not necessarily going to be a primary way of interacting with devices in the future. It will remain one way. But "primary"? Meh.

    You want an Alexa, a Cortana, a Siri? Buy their products, have at it.

  • (Score: 2) by mcgrew on Monday September 26 2016, @04:14PM

    by mcgrew (701) <publish@mcgrewbooks.com> on Monday September 26 2016, @04:14PM (#406686) Homepage Journal

    I wrote it back in 1984 and had versions for the TS-1000, TRS-80MC10, Apple IIe, and later DOS. I'm sure I've lost the code, though. The US Copyright Office might still have it.

    But considering that almost all supercomputers are running Linux these days, Siri et al are probably running on top of GNU.

    --
    mcgrewbooks.com mcgrew.info nooze.org
  • (Score: 2) by mr_mischief on Monday September 26 2016, @05:19PM

    by mr_mischief (4884) on Monday September 26 2016, @05:19PM (#406701)

    Are BSD and Apache licenses not OSI this week? Are we too lazy to do a quick search and read a Wikipedia article?

    https://en.wikipedia.org/wiki/List_of_speech_recognition_software [wikipedia.org]
    https://jasperproject.github.io/ [github.io]
    https://www.researchgate.net/post/Any_open-source_speech_recognition_system_with_realtime_recognition_focus [researchgate.net]

    Open source voice recognition toolkits exist. Now go build your app. Or go download an existing app like Simon.

    https://simon.kde.org/ [kde.org]

    The hard part isn't in finding the engine anymore. The hard part is in getting the corpus and doing the training. Maybe we could crowdsource that.
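    Crowdsourcing a corpus would at minimum need some automatic gatekeeping on contributions before they enter the shared training set. A toy sketch of such checks follows; the thresholds and the duplicate test are invented for illustration, not any project's actual rules:

```python
# Sketch: a minimal gate for crowdsourced (transcript, clip-duration)
# contributions before they enter a shared training corpus. The
# thresholds and duplicate check are hypothetical examples.

import hashlib

def accept(transcript, duration_s, seen_hashes,
           min_s=1.0, max_s=30.0, max_chars_per_s=25):
    """Accept a contribution only if it is plausibly real speech."""
    if not (min_s <= duration_s <= max_s):
        return False                      # clip too short or too long
    if len(transcript) > max_chars_per_s * duration_s:
        return False                      # nobody talks that fast
    digest = hashlib.sha256(transcript.strip().lower().encode()).hexdigest()
    if digest in seen_hashes:
        return False                      # duplicate submission
    seen_hashes.add(digest)
    return True

seen = set()
print(accept("open the pod bay doors", 2.5, seen))   # plausible clip: accepted
print(accept("open the pod bay doors", 2.5, seen))   # duplicate: rejected
```

    Checks like these are cheap enough to run on the contributor's side, so bad submissions never cost the project bandwidth or reviewer time.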


  • (Score: 2) by DeathMonkey on Monday September 26 2016, @05:31PM

    by DeathMonkey (1380) on Monday September 26 2016, @05:31PM (#406703) Journal

    What are the FLOSS Community's Answers to Siri and AI?

    No