Google said it has cut transcription errors by 49 percent with the latest improvements it has rolled out to Google Voice and Project Fi. The company’s researchers and engineers achieved the gain using a “long short-term memory deep recurrent neural network.”
Combined with content provided by Google Voice users, that network enabled researchers to find a way to improve the accuracy of voicemail-to-text transcriptions, which in the past could prove either unintelligible or “humorously intelligible.” A phrase like “new transcription,” for example, could end up being written as “neon prescription.”
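The article does not say how Google counted transcription errors. One common measure in speech recognition is word error rate: the number of word substitutions, insertions and deletions needed to turn the transcript into what was actually said, divided by the number of words spoken. The sketch below assumes that metric; the function names and the 20 percent baseline are hypothetical and chosen only to show what a 49 percent relative reduction would mean.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edits needed to turn the first j hypothesis words
    # into the reference words processed so far.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from transcript
            insertion = curr[j - 1] + 1   # extra word in transcript
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def apply_relative_reduction(error_rate: float, reduction: float) -> float:
    """Error rate left after a relative reduction, e.g. 0.49 for 49 percent."""
    return error_rate * (1.0 - reduction)


# Both words are wrong in the article's example, so the rate is 2 / 2.
print(word_error_rate("new transcription", "neon prescription"))  # 1.0

# Hypothetical baseline of 20 percent; Google's actual figure is not given.
print(apply_relative_reduction(0.20, 0.49))
```

Under that assumed baseline, roughly one word in five would be wrong before the upgrade and roughly one in 10 after it.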
Google announced the improved system yesterday in a brief post on its official blog. Software engineer Zander Danko noted in the post that the new capabilities are just the beginning and work will continue on additional improvements to the voice transcription system.
No Software or Service Updates Needed
Launched in 2009, Google Voice lets people with verified U.S. telephone numbers manage calls and messages across devices. Each user gets one Google Voice number that is tied to that individual — rather than to any one phone or other device — so the user can screen calls, create personalized greetings, send text messages to multiple recipients at once, share voicemails and read voicemail transcripts.
Project Fi, offered by Google in partnership with Sprint and T-Mobile, provides “network of networks” Wi-Fi and 4G LTE service to users with Nexus 6 smartphones. Announced in April, Project Fi is available to customers in the U.S. for $20 per month, plus a flat rate of $10 per month per gigabyte of cellular data.
The improved transcription capabilities are available immediately to Google Voice and Project Fi users and require no manual service or software updates, Google said.
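As a rough illustration of that pricing, the sketch below adds the $20 monthly base to $10 for each gigabyte of cellular data. The function name is hypothetical, and the calculation covers only the two figures quoted here; it ignores taxes, fees and any other plan terms.

```python
BASE_FEE_USD = 20.0      # monthly base price quoted in the article
DATA_FEE_PER_GB = 10.0   # flat rate per gigabyte of cellular data


def monthly_cost_usd(data_gb: float) -> float:
    """Estimated Project Fi bill for one month, before taxes and fees."""
    if data_gb < 0:
        raise ValueError("data_gb cannot be negative")
    return BASE_FEE_USD + DATA_FEE_PER_GB * data_gb


print(monthly_cost_usd(3))  # 50.0
```

A customer budgeting for 3 gigabytes a month would therefore expect to pay about $50.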
‘Sandy’s Advertising Maple Road’
“Everyone loves Google Voice, and their voicemail feature is great, especially how it transcribes the voice into a text or e-mail,” the “googlevoicefail” page on Reddit noted. “Well, that’s actually more humorous than it is great.”
Examples of transcription “fails” posted on the page include, “So mom’s I went ahead and Sandy’s advertising maple road”; “I was trying to call me if you do it without Pictage Gail”; and “The part that God will slip it’s been.”
Recurrent neural networks, which feed information from earlier steps back into the network so it can process sequences such as speech, have previously been limited in speech recognition to “phone recognition in small-scale tasks,” according to a 2014 study written by three Google researchers. Networks built on long short-term memory-based architectures, however, make it possible to create improved acoustic models for speech recognition involving much larger vocabularies, the researchers noted.
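For readers curious about what “long short-term memory” means in practice, the sketch below shows one step of a textbook LSTM cell reduced to single numbers. It is not Google’s production model, which stacks many much larger layers; the function names and the parameter layout are assumptions made for illustration. The idea is that a memory value is carried from step to step, and three gates decide how much of it to keep, how much new input to write and how much to reveal.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    params maps each of "input", "forget", "output" and "candidate" to a
    tuple (input weight, recurrent weight, bias). This layout is hypothetical.
    """
    def unit(name, activation):
        w, u, b = params[name]
        return activation(w * x + u * h_prev + b)

    i = unit("input", sigmoid)        # how much new information to write
    f = unit("forget", sigmoid)       # how much old memory to keep
    o = unit("output", sigmoid)       # how much memory to expose
    g = unit("candidate", math.tanh)  # proposed new memory content

    c = f * c_prev + i * g            # updated memory cell
    h = o * math.tanh(c)              # output passed to the next step
    return h, c


def run_sequence(xs, params):
    """Feed a sequence through the cell, starting from empty memory."""
    h, c = 0.0, 0.0
    outputs = []
    for x in xs:
        h, c = lstm_step(x, h, c, params)
        outputs.append(h)
    return outputs


# Gates fixed by large biases: keep all old memory, write nothing new.
remember = {
    "input": (0.0, 0.0, -100.0),
    "forget": (0.0, 0.0, 100.0),
    "output": (0.0, 0.0, 100.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = lstm_step(0.3, 0.0, 0.7, remember)
print(c)  # stays very close to 0.7, the memory carried in
```

In a trained network the gate weights are learned from data rather than set by hand, which is what lets the cell hold on to context across the length of a spoken phrase.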
Earlier this month, Google also revealed that improvements in machine learning have helped it reduce spam and other unwanted e-mails in Gmail inboxes to less than 0.1 percent.