Executive Director of Applied Voice Input Output Society notes recent breakthrough

The accomplishments of Google's AlphaGo demonstrate how rapidly Artificial Intelligence technologies have improved recently, after decades of slow progress, notes William Meisel, Executive Director of the voice industry organization, the Applied Voice Input Output Society (AVIOS). Meisel claims that the technology of interacting with digital systems using human language has reached a “tipping point” that will trigger exponential improvement in accuracy and capability, fundamentally changing humanity’s relation to technology.

If computer power were the only limit, the technology should have seen exponential improvement all along, in line with Moore’s Law, notes Meisel. AVIOS’ Mobile Voice Conference, beginning April 11, illustrates that a barrier has been crossed, one that now allows exponential improvement in commercial applications of speech recognition and natural language understanding.

Meisel explained that technologies such as machine learning depend on huge amounts of data labeled with known results, such as the many Go games by professionals that were used in machine learning by AlphaGo. The power of such techniques depends on models such as Deep Neural Networks being very complex, and estimating the best values of the many parameters in such models requires huge amounts of data. A lack of sufficient data can limit the complexity of the models that can be used, and thus their predictive power and accuracy. For areas such as speech recognition and natural language understanding, improvement was hampered for decades by the cost of collecting and labeling appropriate data.

"Moore's Law has given companies the power to use available data," Meisel said. "And today companies are getting that data by a powerful feedback process as individuals begin to use technology such as digital assistants that interact with human language. As people increasingly use such technologies, often on smartphones, they generate data that can be used to improve the accuracy of the technology. As the technology improves, people use it more. As they increase usage, they generate even more data that can be used for further improvement, in particular by correcting the cases where the technology makes errors. That process can generate exponential improvement in the accuracy of technologies such as speech recognition and natural language understanding."

What caused this accelerated development, Meisel explained, is that natural language interpretation technology reached the tipping point of utility, where it was good enough to trigger significant use by consumers. Deep-pocketed companies including Amazon, Apple, Google, and Microsoft invested in the technology and demonstrated its worth as an intuitive user interface through their digital assistants. Now that consumers have been motivated to use the technology more, the feedback process will cause an exponential growth in data available for analysis and an exponential growth in the complexity of models used in speech recognition and natural language processing.

This increased capability will motivate further investment by many companies not yet involved. Companies will adopt the technology to allow consumers to interact more naturally in the specialized context of dealing with that specific company or a service it provides. The number of specialized resources available to users will multiply quickly, generating data specific to those cases that can be used to improve the performance of those applications, which in turn increases usage. An example is the "skills" that outside companies can add to Amazon's Alexa assistant for its Echo.

Talks at the sixth annual Mobile Voice Conference (April 11-12 in San Jose, CA) show the maturation and rapid adoption of language technologies (speech recognition, natural language understanding, and conversational text and voice interaction). The wide variety of talks and companies at the conference is indicative of this exponential growth in the utility of the technology and in the opportunities for companies to take advantage of it.

About the Applied Voice Input Output Society

AVIOS is a not-for-profit private foundation founded in 1981 with the goals of informing, educating, and providing resources to developers and designers of new and changing speech technologies. AVIOS endeavors to create linkages between users, developers, and researchers to advance speech and multimodal technology with a long tradition of conferences and local chapter meetings around the US and internationally. See http://www.avios.org.

Mobile Voice Conference Sponsors

Primary Sponsor: Interactions
Supporting Sponsors: Sensory, Cyara, Cepstral, Cobalt Speech & Language

Contact Author

Peggie Johnson
AVIOS
+1 (408) 323-1783
@mobilevoice2016