
SoylentNews is people

posted by martyb on Thursday October 20 2016, @04:45AM
from the nature'll-anguish-wreck-ignition dept.

Microsoft on Tuesday said that its researchers have "made a major breakthrough in speech recognition."

In a paper [PDF] published a day earlier, Microsoft machine learning researchers describe how they developed an automated system that can recognize recorded speech as well as a professional transcriptionist.

Using the NIST 2000 dataset of recorded calls, Microsoft's software performed slightly (0.4 per cent) better than the error rate the company attributes to professional transcriptionists (5.9 per cent) for the Switchboard portion of the data, in which strangers discuss a specified topic.
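For readers wondering how an "error rate" like this is typically scored: the summary does not name the metric, so the sketch below assumes it is the word error rate (WER) commonly used in speech recognition evaluations, i.e. word-level edit distance (substitutions, insertions, deletions) divided by the number of words in the reference transcript. The function name and sample sentences are made up for illustration, and real scoring also normalizes text (case, punctuation, fillers) in ways not shown here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: assumes whitespace tokenization and no text
    normalization, which real evaluations handle more carefully.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))  # prints 0.25
```

Under that assumption, a 5.9 per cent figure would mean roughly 6 word-level mistakes per 100 reference words.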

There goes your bright future as a court recorder...



 
  • (Score: 0) by Anonymous Coward on Thursday October 20 2016, @08:51PM (#416932)

    One too many promises about speech recognition in the world already. And I don't give a damn that their program beat typist(s) at accuracy in a contest. I only care that their program does better than a blooded transcriptionist in everyday working conditions, especially in getting Chinese-, Indian-, and Italian-accented speech (multiple accents, in different dialects) correct in medical or legal transcription. Then I'll buy Microsoft's alleged breakthrough. Maybe.

  • (Score: 2) by goodie (1877) on Friday October 21 2016, @02:53PM (#417271)

    Was thinking the same thing... The day I can hand over interview audio and have it transcribed quickly and properly, it will become interesting. As anybody who has done interview transcriptions knows, it's long, boring, and mind-numbing. Yes, it makes you know your data by heart, but it takes a very long time (about 5 to 6 hours per hour of interview). There can be issues with audio quality, people eating or chewing gum, interruptions, accents, etc. that make the work even harder. In the past, some of the software (e.g., Dragon something, I forgot) could technically work but required a lot of training for each voice to be transcribed. Needless to say, unless you're constantly talking to yourself, it's not very useful... The other thing that a professional transcriber can pick up on much better than an algorithm (for now) is tone, hesitation, etc. These can be very important in some contexts (in fact, more relevant than what people say), so losing them can be detrimental to the analysis. Perhaps that's why a good transcription costs an arm and a leg.

    • (Score: 0) by Anonymous Coward on Friday October 21 2016, @07:40PM (#417382)

      Does Microsoft's software fix its "work" by going through the audio again a few times? Or is the reported figure the result of a single pass? How many passes did the humans get to do for MS's test?

      Is the accuracy/error rate the best Microsoft's system can do, and is it better than a pro human who's given only one pass? Or were the humans allowed multiple passes at deciphering the audio?