By Amalia Gerpe on Apr 12, 2019 3:10:06 PM
Although it feels like a very current, futuristic topic, the history of voice recognition technology dates back to the 1950s. But it was not until 1987 that the doll Julie reached the market; children could "train" it to respond during playtime conversations. The flaw was that a pause was required between each spoken word, which limited the momentum the technology needed to grow.
In less than 40 years, voice command technology advanced and virtual assistants were invented, generating a vast amount of spoken data. Training on that data gave voice recognition the fundamental push that today allows the technology to become increasingly sophisticated.
However, the biggest obstacle to further progress is the ongoing challenge of understanding human speech with sufficient accuracy. Consider that the world is made up of 7.6 billion people, joined by cultures and languages that each person makes unique with their idioms, dialects, and jargon. That is why, despite how rapidly computers have been able to learn with ever more robust machine learning models, the unique and diverse nature of individual speech hinders the accuracy of these systems. This is where natural language processing (NLP) comes into play: the field that works to crack the code of universal understanding of human speech.
Added to this, the context in which certain words are spoken also changes their meaning, so it is also vital to work on understanding a limitless number of situations.
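To make the task concrete, here is a minimal sketch of automatic transcription in Python using the open-source SpeechRecognition library. The audio file name and language code are placeholders, and this is only an illustration of the general problem, not a description of any particular assistant's internals:

    import speech_recognition as sr  # open-source SpeechRecognition package

    recognizer = sr.Recognizer()
    # "meeting.wav" is a placeholder audio file.
    with sr.AudioFile("meeting.wav") as source:
        audio = recognizer.record(source)  # read the entire file into memory

    try:
        # Send the audio to a generic web speech API and print the best guess.
        print(recognizer.recognize_google(audio, language="en-US"))
    except sr.UnknownValueError:
        # This is exactly the failure mode described above: speech the
        # engine cannot decipher (accents, jargon, background noise).
        print("Could not understand the audio")
    except sr.RequestError as error:
        print(f"Recognition service unavailable: {error}")

Even on clean audio, results vary widely for accented or noisy speech, which is precisely where the diversity problem described above shows up.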
Siri, the pioneer
When Apple introduced Siri in 2011, consumers thought that Back to the Future was just around the corner. We were thrilled with the new feature and even played with the funny answers that certain key questions generated. In reality, though, Siri could only perform simple functions such as placing a call or running a basic online search. The charm was short-lived: its ability to decipher spoken commands in a noisy environment was very limited, and Siri often simply did not "understand". Like almost everything on the Internet, the failings of Siri and its virtual companions have become the material of countless memes.
Despite these setbacks in user experience, other mobile phone manufacturers quickly followed suit and added voice-enabled search to their devices, drawn by the technology's promising potential. More recent advances in integration with other applications have increased the complexity of the commands that can be carried out.
How do we access relevant data, then?
After the launch of Siri, other tech giants began to unveil their own assistant technologies. Each company focused on the unique strengths its products offered to their intended users. Amazon joined the race by bringing Alexa and its Echo smart home devices to market, while IBM's supercomputer, Watson, went after businesses and Microsoft's Cortana was integrated into Windows 10.
In terms of accuracy, Google has the biggest advantage, because data from its search engine serves as the basis for its speech recognition training. Amazon has caught up quickly, holding the largest share of the smart home device market.
That data is matched to real-life experience, which machine learning tools can process and use to build more accurate voice recognition models.
Virtual assistants in the workplace
The global market for speech recognition software will grow at a steady 12% pace in the coming years, according to BBC Research. The use of voice commands in online search will also increase, a trend that has been building for years. In the home, virtual assistance is aimed at household appliances integrated with the Internet of Things.
In the workplace, virtual assistants have become a growing interest for companies because of their ability to optimize workflows. But for them to be truly useful and allow companies to optimize workflows, processes, and, consequently, profits, they need access to large amounts of data, and that data must be of good quality.
With services such as Atexto, which provides large amounts of processed data to teach assistants to listen and understand more and better, voice-based assistant programs can help solve workplace problems more quickly and with greater efficiency, creating a more productive office.
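As a rough illustration of why high-quality human transcriptions matter (a generic sketch, not part of Atexto's actual tooling), word error rate is the standard way to score a recognition engine's output against a human reference transcription:

    def word_error_rate(reference: str, hypothesis: str) -> float:
        """Word error rate: word-level edit distance divided by reference length."""
        ref, hyp = reference.lower().split(), hypothesis.lower().split()
        # Classic dynamic-programming edit distance over words.
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i
        for j in range(len(hyp) + 1):
            d[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,         # deletion
                              d[i][j - 1] + 1,         # insertion
                              d[i - 1][j - 1] + cost)  # substitution
        return d[len(ref)][len(hyp)] / max(len(ref), 1)

    # A human transcription is the reference the engine's output is scored against.
    print(word_error_rate("turn on the kitchen lights",
                          "turn on the kitten lights"))  # 0.2 -> one error in five words

The better and more diverse the human reference transcriptions, the more reliably a metric like this reflects real-world accuracy, which is why the quality of the underlying data matters so much.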
What can we expect for the near future?
Thanks to changes in user behavior toward more voice-driven interactions, the accuracy of speech recognition is expected to improve at an even faster rate than it has so far. This means that virtual assistants will reap the benefits of those improvements and offer more services.
At Atexto, we count on the largest community of transcriptionists in the world, with nearly 250,000 registered people from multiple cultures and languages. This guarantees an integration of technology and human quality that we believe will be the key to achieving full comprehension and precision in the speech recognition engines that underpin virtual assistant technology.
Do you want to know more about Atexto and the services we offer to improve your machine learning system? Complete the form below or email us at info@atexto.com