Mozilla wants to crowdsource thousands of hours of voice recordings for an open source voice recognition engine:
The Mozilla Foundation launched "Common Voice," which is a crowdsourced initiative to build an open source data set for voice recognition applications.
Many technology companies believe that voice control will be embedded into most devices in the future. This is why Apple, Google, Amazon, Microsoft, Baidu, and others are all trying to put their own voice-controlled artificial intelligence assistants into as many devices as they can and as fast as they can, in order to gain market share before the competition.
The problem with this, according to Mozilla, is that voice-controlled technologies could end up being dominated by proprietary technology and data sets, which aren't made available to startups and academics. As some large companies already benefit from billion-dollar revenues, it could later become too difficult for startups to catch up with the big players. Through Common Voice, Mozilla aims to democratize voice recognition technology.
You could use this to build (the easy part of) a personal assistant that either does not use the cloud, or does so on your terms.
(Score: 3, Interesting) by krishnoid on Thursday July 20 2017, @04:51PM (5 children)
Why do these voice recognition toolkits start with English, which doesn't use a phonetic alphabet? Shouldn't it be easier and/or more reliable to perform voice recognition on a phonetic-alphabet language, or wouldn't that make a difference?
(Score: 2) by DannyB on Thursday July 20 2017, @06:10PM
For developers and testers fluent in other languages, there is nothing stopping them from building a similar learning / training / verification system. My best wishes, but I only speak English. Well, I mean, the language spoken in North America.
Q. What do you call someone who speaks two languages?
A. Bilingual
Q. What do you call someone who speaks three languages?
A. Trilingual
Q. What do you call someone who speaks only one language?
A. American
To transfer files: right-click on file, pick Copy. Unplug mouse, plug mouse into other computer. Right-click, paste.
(Score: 0) by Anonymous Coward on Friday July 21 2017, @01:05AM (1 child)
The sooner we get everybody speaking the same language, the better.
Supporting junk languages would delay progress.
(Score: 0) by Anonymous Coward on Saturday July 22 2017, @08:42AM
Another aggressive mandarin spotted.
(Score: 0) by Anonymous Coward on Friday July 21 2017, @01:59AM (1 child)
Because English has about 12k unique syllables compared with a language like Mandarin that's only got about 1600. If it can handle English, then chances are it can handle other languages with some adjustment. English is also an incredibly popular language with many speakers who have the time and money necessary to fund the project.
(Score: 1, Informative) by Anonymous Coward on Friday July 21 2017, @04:17AM
Your examples call to mind the fact that Chinese is a tonal language [lexington.ro] whilst English is not. The difference is a stumbling block for English speakers when learning Chinese.
Last year, we had a story [soylentnews.org] about Baidu's effort at an engine that would recognize both English and Chinese.