Shyamala’s Substack
The Future is Spoken
Natural Sounding Synthetic Voices

The Future is Spoken presents Rupal Patel as this week's guest. Rupal is the Founder and CEO of VocaliD, a voice AI company that creates unique synthetic voices. Unlike conventional methods, VocaliD's award-winning technology generates high-quality, natural-sounding voices within hours, not months. The company leverages cutting-edge machine learning techniques, proprietary voice-blending algorithms, and its crowdsourced Voicebank dataset to enable brands and individuals to be heard in a voice that is uniquely theirs. VocaliD is a spin-out from her research lab at Northeastern University, where she is a tenured professor in the Department of Communication Science and Disorders and the Khoury College of Computer Sciences.

Starting with their own experiences, Rupal and the host go on to discuss synthetic voices.

Tune in Now!

Conversation Highlights:

[00:23] The journey to synthetic voices

● Rupal works on making customized synthetic voices for individuals as well as for companies. She started with the mission of creating voices for people who couldn't speak.

● She also explains how quickly the world of voice is growing right now.

[03:32] Identifying the problem

● Rupal explains the reason behind creating VocaliD. She describes the problems she identified while researching people with speech impairment.

● People with limited speech capabilities can still control the prosody of their voice.

● What does it take to create a natural-sounding voice?

[11:47] Tuning the prompt according to your need.

● Rupal speaks about the different ways to tune the prompt for pitch, speed, or even tone. End-to-end synthesis methodologies allow speech to be controlled in different ways.

● They have also started implementing a new method for making changes at the word level. She is also excited about some of the style modifications.

[20:09] The importance of a natural-sounding voice

● She elaborates that almost every way we consume information now is through our ears. With so much content delivered as audio, a natural-sounding voice is essential.

[22:18] What secret skill do you need to enter the 'Text to Speech' world?

● She touches on the skills you need to enter the world of speech and design natural-sounding voices.

● Linguistics is becoming the heart of voice.

[26:11] Research is the most crucial aspect of everything.

● Rupal explains that, apart from running experiments on building voices and making them sound more natural, they also run listening perception experiments to understand how consumers differ in their voice preferences.

● She also touches on how they ensure that the quality remains up to par and the operating system's role in amplifying the quality.

[40:36] How is VocaliD different from others?

● VocaliD is focused on customized voices, as opposed to the specific voice libraries that other companies offer.

● Machine learning can get you 90% of the way, but you will need an understanding of speech to cover that last mile.

[46:40] Must Listen

● Rupal's advice for anyone trying to get into the world of voice.

Special Reminder:

Celebrate The Diversity of Human Voices! Will You Share your Voice?

Join others from around the world in sharing the gift of Voice. Register today. 

Learn more about Rupal at

● Vocalid.ai

● LinkedIn

● Vocalid.ai/voicebank

If you enjoyed this episode of The Future is Spoken Podcast
