By Microsoft’s own account, the days of human transcribers are numbered, and the countdown is moving fast. On Tuesday, Microsoft announced that it had “made a major breakthrough in speech recognition.”
A detailed report on how Microsoft reached this feat is published here. The researchers behind it describe how they created an automated system that can transcribe recorded speech as accurately as a professional human transcriber.
In fact, the figures suggest the speech recognition system could be slightly better than the best human transcribers. In a test on the NIST 2000 dataset of recorded phone calls, the software’s error rate was 0.4% lower than a professional transcriber’s.
In September, Microsoft said the software had achieved a 6.3% word error rate, an impressive improvement considering that the figure stood at 8% as of May 2015. The company’s rapid progress underlines the enormous interest tech companies have in machine learning and AI technology.
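For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only; the function name `word_error_rate` and the sample sentences are made up for illustration, and this is not the scoring tool Microsoft or NIST used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Illustrative WER: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one wrong word out of four.
    print(word_error_rate("call me back tomorrow", "call me back today"))
```

Under this definition, a 6.3% word error rate means roughly six word-level mistakes for every hundred words in the reference transcript.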
Microsoft has reached a significant milestone: the first time a computer system has achieved human parity in recognizing conversational speech. The company says the success was possible thanks to convolutional and LSTM (long short-term memory) neural networks, coupled with techniques that improved the models’ accuracy, such as spatial smoothing.
The researchers also relied on Microsoft’s open source Computational Network Toolkit. According to Geoffrey Zweig, manager of Microsoft’s speech and dialog research group, the achievement is the culmination of more than 20 years of effort.