Voices - An Audio Processing Solution Made by CMC

After extensive research and development, CMC's Institute of Science and Technology (CIST) has released Voices, an audio processing solution. It offers high-performance voice recognition and speech synthesis with fast automated processing, and it is easy for businesses to customize and integrate as part of their digital transformation.

Voices also accurately records the content of conversations and meetings, which speeds up content production, cuts personnel and operating costs, and improves business productivity.

Vietnamese is considered a difficult language for foreigners to learn because of its grammar, tones and regional variations. A computer is like a foreigner: understanding spoken Vietnamese and turning it into text is no easy task. The solution therefore demonstrates the technological capability and effort of CMC engineers.

CMC Voices has two outstanding features:

Text to Speech - This feature converts text to natural-sounding speech with diverse regional characteristics and is compatible with any customer's system. The speech rate, pauses and stress can be easily customized as required. Text to Speech is therefore well suited to audiobooks, call centers, movie narration, video clip production, virtual assistants, etc.


Speech to Text - This feature converts audio to text with high grammatical and spelling accuracy. It distinguishes between regional accents with accuracy of up to 96%, processes audio quickly (10 seconds of audio in 300 milliseconds on a CPU), and copes with noise and varying environments. It is quickly proving its value in healthcare, smart homes, IoT, smart speakers, meeting room notes, etc.
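To put the quoted processing speed in perspective, the short Python sketch below computes the real-time factor (processing time divided by audio duration) from the figures above. The function name is illustrative only and is not part of Voices; the calculation assumes the quoted 300 milliseconds covers the whole 10-second clip.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Figures quoted in the article: 10 s of audio processed in 300 ms on a CPU.
rtf = real_time_factor(0.3, 10.0)
speedup = 1 / rtf
print(f"RTF = {rtf:.2f}, about {speedup:.0f}x faster than real time")
# prints: RTF = 0.03, about 33x faster than real time
```

A real-time factor below 1 means the audio is transcribed faster than it plays back, which is the condition for live use such as meeting notes.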

CMC Voices targets institutional customers that use smart home systems, smart speakers, IoT devices or meeting room note-taking; companies in real estate, finance, retail, healthcare, etc. that run voice-based customer service systems; and organizations that use audiobooks and virtual assistants or need to produce video clips and movie narration.

Representatives of customers who tested the solution said that it has helped a lot in practice. At meetings and conferences, there is no longer any need to take minutes by hand, as speech is converted to text instantly while someone is speaking.

The solution is also a powerful tool that helps video content creators easily add Vietnamese subtitles to their videos. Voices can cut the time needed to publish information to one tenth (for example, for a 60-minute video, the software needs only 6 minutes to complete the conversion) and converts audio files quickly with accuracy of up to 98%, saving time on typing and publishing and reducing the risk of misinformation.
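As an illustration of the subtitle use case, the Python sketch below turns timed transcript segments into the widely used SubRip (SRT) subtitle format. It is a minimal sketch under stated assumptions: the article does not describe the output format of Voices, so the (start, end, text) segment structure, the sample sentences and the function names are all hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Subtitle entries are separated by a blank line.
    return "\n".join(blocks)


# Hypothetical transcript segments; not real Voices output.
segments = [
    (0.0, 2.5, "Xin chào các bạn."),
    (2.5, 6.0, "Hôm nay chúng ta bắt đầu cuộc họp."),
]
print(to_srt(segments))
```

The resulting text can be saved as a UTF-8 `.srt` file, which most video players and editors load alongside the video.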