“Gowajee” — a Thai Speech-Recognition AI from Chula

An engineering professor at Chula has designed “Gowajee”, a Thai-language speech recognition AI that delivers speech-to-text and text-to-speech with the accuracy of a native speaker while keeping users’ data secure. Already rolled out in call centers and in the screening of patients with depression, Gowajee is set to be adapted to many other functions.

‘OK, Google’

We are getting used to giving voice commands to AIs like Google or Siri to search or carry out tasks instead of typing. But as a Thai speaker, have you ever felt that these voice AIs don’t seem to understand the tones of spoken Thai? Often the transcription doesn’t match what we said, so we end up adjusting our Thai pronunciation to suit an AI that a foreign company built for multilingual use, with a focus on widely used languages like English.


Recognizing this problem, a team led by Dr. Ekapol Chuangsuwanich of the Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University has developed “Gowajee”, a genuinely Thai speech-recognition AI that understands and executes commands in Thai more naturally and accurately. In actual use it has shown an error rate of only 9%, compared with 15% for other speech-recognition AIs.

Dr. Ekapol Chuangsuwanich

The name Gowajee combines the words ‘Go’ and ‘Wajee’, the latter meaning ‘words’. It serves as a wake command similar to ‘OK Google’ or ‘Hey Siri’, and was chosen so that it does not duplicate any word already used in Thai.

The Challenge of Developing Thai AI

Foreign-made AI often misunderstands Thai, mainly because the structure of the Thai language differs from that of English. Differences in pronunciation, tones, inflections, and homophones can all lead to misinterpretation. This more complicated structure can be an obstacle to developing Thai speech-to-text and text-to-speech technology. Dr. Ekapol’s solution is to “create the most extensive Thai language database”.

Thai language AI with a Thai sound database

Dr. Ekapol and his team began compiling a Thai sound database in 2017 and have continued to this day. As he recalled,

“We applied a variety of methods and formats, such as creating a website where people log in and read a text aloud to be stored in the sound database, recording people in conversation, and having actors perform emotional speech. Altogether we compiled five thousand hours of recordings, which made us confident that the database was big enough to transcribe Thai accurately and to achieve the best possible Thai speech-to-text, text-to-speech, and speech recognition.”

This database enabled the Gowajee team to develop an accurate Thai speech-recognition AI with three main features:

Automated Speech Recognition (ASR)

ASR turns speech into text. “For example, if we record a lecture, the AI will transcribe it into text for us to read, so we don’t have to transcribe it ourselves,” Dr. Ekapol said.

Text-to-Speech (TTS)

TTS converts a written passage into spoken words, much as Google or Siri does, except that Gowajee delivers more natural-sounding Thai speech thanks to its larger Thai database.

Automatic Speaker Verification (ASV)

ASV verifies a person’s identity from their voice. It can be used when a caller contacts a call center, or to indicate who is speaking and during which time frame.
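
As a rough illustration of how voice-based verification can work in general, the sketch below compares two speaker “embeddings” (lists of numbers that summarize a voice) using cosine similarity and accepts the speaker if the score passes a threshold. This is a common textbook approach and not necessarily how Gowajee does it. The function names, the example vectors, and the 0.8 threshold are all hypothetical; in a real system the embeddings would come from a trained speaker model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def is_same_speaker(enrolled, candidate, threshold=0.8):
    """Accept the candidate if it is close enough to the enrolled voice.

    The threshold is an assumed value for illustration only; real systems
    tune it on held-out recordings.
    """
    return cosine_similarity(enrolled, candidate) >= threshold


# Made-up embeddings standing in for the output of a speaker model.
enrolled_voice = [0.9, 0.1, 0.4]
same_caller = [0.8, 0.2, 0.5]
different_caller = [-0.3, 0.9, 0.1]

print(is_same_speaker(enrolled_voice, same_caller))       # True
print(is_same_speaker(enrolled_voice, different_caller))  # False
```

Raising the threshold makes the check stricter (fewer impostors accepted, more genuine callers rejected), which is the trade-off a call center would have to tune.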

Gowajee – a perfect solution for call centers

Since its development, Gowajee has been used by various organizations, including universities and public and private sector bodies, and especially by call centers, for both Thai speech-to-text and text-to-speech. Gowajee’s error rate is only 9%, compared with 15% for other AIs.


“Most clients have been satisfied with Gowajee’s level of accuracy. It is an improvement on what they previously used, and the price is also more affordable. As for the errors, we are certain that they will decrease as the database grows.”
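
The article does not say how the 9% and 15% error figures were measured. One common way to score a transcript, assumed here purely for illustration, is to count the minimum number of substitutions, insertions, and deletions needed to turn the AI’s output into a correct reference transcript, then divide by the length of the reference. The sketch below does this over any list of tokens (words or characters); the function names and the English example sentences are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: fewest substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance divided by the number of reference tokens."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


# Made-up example: one wrong word and one missing word out of ten.
reference = "please transfer my call to the billing department today thanks".split()
hypothesis = "please transfer my call to the building department today".split()

rate = error_rate(reference, hypothesis)
print(f"{rate:.0%}")  # 20%
```

Because Thai is written without spaces between words, a Thai transcript would first need a word segmenter, or could be scored character by character by passing `list(text)` instead of a list of words.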

In search of meaning in the voice: Gowajee helps to screen patients with depression

Because its data includes voices conveying a wide range of emotions, Gowajee has helped develop the systems used in DMIND, an application for screening patients with depression.

“DMIND proved to be very challenging for us. Aside from transcription, it also needs a model for classifying and decoding emotions from the voices of at-risk groups. Crying is usually involved, which makes voices difficult to transcribe and decode, but Gowajee performed considerably well by identifying the important keywords for decoding.”

DMIND application for screening patients with depression

How can Gowajee be adapted for use in other areas?

Gowajee and AI technology can be used in many other areas, such as:

  • Acting as a dental assistant that takes notes while the dentist is working on a patient.
  • Detecting stroke risk in patients with slurred speech.
  • Acting as a life coach that asks questions and analyzes people’s life goals from video interviews, for use in student and employee orientation.
  • Modifying and amplifying sounds so that the hard-of-hearing can hear more clearly.

Your data is safe with Gowajee

“Data safety” is what sets Gowajee apart from other speech-recognition AIs. As Dr. Ekapol tells us, “Normally, other transcription and speech recognition programs store their data on the cloud or compile it on users’ computers. With Gowajee, all the data is stored in the user’s own database, ensuring its safety. This is useful for organizations like banks, which need high data security.”

AIs are becoming increasingly clever, with linguistic abilities that come closer and closer to those of human beings, and this has caused many people to worry about being replaced by technology. As for AIs for Thai transcription, Dr. Ekapol sees them only as enablers that will make life easier for us now and in the future.

“AIs aren’t that disruptive to our lives. We are disrupting ourselves. Aging societies and a shortage of working-age labor are making it necessary for us to create technologies to do what we can’t find humans to do.” Dr. Ekapol concluded, “I’m not expecting my work to help the elderly of today, but I think that when I reach old age myself, I will be making use of these technologies.”


The Thai speech recognition AI (both speech-to-text and text-to-speech) that Dr. Ekapol has dedicated himself to developing is therefore not a fearsome technology, nor one that will replace human labor. Rather, it will bring more ease and convenience to many people. The ability to convert speech to text and text to speech can, on its own, be applied in many areas. As we become an aging society, speech recognition technologies can be used to improve quality of life.

For more information and a trial of Gowajee Thai speech recognition AI, please visit https://www.gowajee.ai/.

