Google Cloud Text-to-Speech API Adds Custom Voices

Google Cloud has announced a new feature within its TTS API that lets users generate a unique, new synthetic voice trained from recordings.

In a recent blog post, Google Cloud announced the general availability of Custom Voice within its Cloud Text-to-Speech (TTS) API. The feature offers an alternative to the stock voices that users have grown accustomed to hearing in digital assistants and conversational interfaces.

It lets customers train custom voice models on their own audio recordings to create unique synthetic voice experiences.

The feature can help businesses looking to establish a strong brand identity. A custom voice can, for example, turn the interactive voice response (IVR) prompts of a customer service call into a distinctive customer experience.

Until now, Google’s TTS API offered only a fixed list of predefined voices for speech synthesis.

Users access Custom Voice directly in the TTS API and submit their audio recordings to train a model. Google provides a guide to the audio requirements to help ensure the resulting voice is of the best possible quality. Once training is complete, users can start using the new custom voice by referencing the model ID in their calls to the Cloud TTS API.
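As a rough illustration of what "referencing the model ID" might look like, the sketch below builds the JSON body for a synthesis request using only the Python standard library. It is not taken from Google's announcement: the `customVoice.model` field name, the model ID format, and the helper name `build_synthesize_request` are all assumptions that should be checked against the current API reference, and no network call is made.

```python
import json


def build_synthesize_request(text, language_code, model_name):
    """Build a JSON-serializable body for a speech synthesis request.

    Assumption: the custom voice is selected through a `customVoice.model`
    field inside `voice`. Verify this against the current API reference.
    """
    if not text:
        raise ValueError("text must not be empty")
    return {
        "input": {"text": text},
        "voice": {
            "languageCode": language_code,
            # The trained model replaces the usual predefined voice name.
            "customVoice": {"model": model_name},
        },
        "audioConfig": {"audioEncoding": "LINEAR16"},
    }


# Hypothetical model ID; a real one would be issued after training finishes.
EXAMPLE_MODEL = "projects/my-project/locations/us-central1/models/my-brand-voice"

body = build_synthesize_request("Thanks for calling.", "en-US", EXAMPLE_MODEL)
print(json.dumps(body, indent=2))
```

The point of the sketch is that the rest of the request stays the same as for a predefined voice; only the voice selection changes to point at the trained model.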

Google Cloud has also reassured its users that the company has conducted a deep ethical evaluation of the new feature and its relation to synthetic media in order to “surface and mitigate potential harms that it may create.”

Users interested in creating their personalized synthetic voice will need to go through a review process to ensure each use case is “aligned with [Google’s] AI Principles and adequate voice actor consent is given.”

Furthermore, to ensure that the recordings submitted to generate the new voice belong to the user and not to someone else, the process requires users to read aloud a sentence that Google Cloud chooses, for example: “I agree that my voice will be used to create a synthetic custom Text-to-Speech voice.”

TTS Custom Voice is now generally available in English (US, AU, and UK), Spanish (US and Spain), French (France and Canada), Italian, German, Portuguese (Brazil), and Japanese.

More languages will be made available in the future. Interested users can already contact their Google Cloud seller to begin the review process.
