`<Say>`

TL;DR

<Say> a provided text using text-to-speech.

Description

The <Say> verb converts text to speech that is read back to the caller.

Example

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Say voice="man" language="en">Hello World</Say>
</Response>

An example of <Say> using SSML for describing the speech output may look as follows:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Say>
    Here is an <say-as interpret-as="characters">SSML</say-as> example. You can pause <break time="3s"/>
    the text, play a sound such as <audio src="https://www.example.com/MY_MP3_FILE.mp3">(could not play audio file)</audio>
    or use other commands as specified in the Speech Synthesis Markup Language Version 1.1 specification.
  </Say>
</Response>

Attributes

The following attributes are supported:

Attribute Name	Allowed Values	Default Value
`answer`	`true`, `false`	`true`
`loop`	A number that is 0 or greater	1
`voice`	`man`, `woman`, or a premium voice (see Premium Voices)	`woman`
`language`	See Premium Voices	`en-US`
`statusCallback`	URL	none
`statusCallbackMethod`	`POST` or `GET`	`POST`

Attribute: `answer`

When set to `false` and the call has not yet been answered by another operation, the <Say> verb plays the specified media as "early media" (SIP response code 183) without answering the call. Note that Dial, for example, does not answer the call by itself; the call is answered only when the receiver answers it.

DID YOU KNOW...

Using the <Say> verb with `answer` set to `false`, followed by a <Reject> verb, generates a billable event.
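As a minimal sketch of the early-media behaviour described above, the following document plays a prompt without answering the call and then dials out. The prompt text and the dialed number are placeholders, and the use of a plain number as the <Dial> content is an assumption made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <!-- answer="false": the prompt is sent as early media (SIP 183), the call stays unanswered -->
  <Say answer="false">Please hold while we connect your call.</Say>
  <!-- Placeholder destination (assumption); the call is answered only when the receiver answers -->
  <Dial>12125551234</Dial>
</Response>
```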

Attribute: `loop`

How many times to repeat the same text to the caller.
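For illustration, a sketch that reads the same sentence three times; the text itself is a placeholder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <!-- loop="3": the text is spoken three times in a row -->
  <Say loop="3">Your appointment is confirmed for tomorrow at ten.</Say>
</Response>
```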

Attribute: `voice`

Which voice model to use for generating the synthesized voice. Additional models may be offered in the future.

Attribute: `language`

The language in which to generate the speech, chosen from the supported languages. The language is a hint to the speech synthesizer: the text must already be written in the specified language, as no translation is performed on the text before speech synthesis.

Attribute: `statusCallback`

A URL to be called when the audio output has completed playing. This URL will be called with all the parameters of a standard CXML request, but its output is discarded.

Attribute: `statusCallbackMethod`

The HTTP method to use for the statusCallback URL.
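A sketch of the two callback attributes used together might look like the following. The callback URL is hypothetical; any endpoint you control would do, and whatever it returns is discarded.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <!-- Hypothetical URL: called with GET once the audio has finished playing -->
  <Say statusCallback="https://www.example.com/say-finished" statusCallbackMethod="GET">
    Thank you for calling. Goodbye.
  </Say>
</Response>
```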

We encourage you to use the <Say> verb in development. Production services are encouraged to use studio quality recordings.

Cloudonix supports SSML as the content of the <Say> element. The <Say> element replaces the SSML document element <speak>, so the content of a <speak> element from an SSML document can be used as-is as the content of a <Say> element, with the following exceptions:

<lexicon> and <lookup> are unsupported and it is an error to include these in the SSML content.
<emphasis> and <prosody> are handled as simple text.
<phoneme> will pronounce the display content of the element (if it exists) instead of the IPA code.
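To make the exceptions above concrete, here is a hedged sketch that mixes supported and downgraded elements. The spoken text is a placeholder; per the list above, the <emphasis> element should simply be read as plain text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Say>
    <!-- say-as and break are used as in the SSML example earlier on this page -->
    Your code is <say-as interpret-as="characters">AB12</say-as>. <break time="1s"/>
    <!-- emphasis is handled as simple text, so no added stress is expected -->
    <emphasis level="strong">Thank you</emphasis> for calling.
  </Say>
</Response>
```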

Premium Voices

Cloudonix supports multiple Text-To-Speech engines, available either Over-The-Top (OTT) or as Bring-Your-Own-Voice (BYOV).

Over The Top Voices

Cloudonix provides unified access to its built-in voice engines. The currently supported engines are `aws` (Amazon Polly) and `gcp` (Google AI Voices). By default, Cloudonix uses Amazon Polly with a female voice; you can select a different engine and voice through the `voice` attribute, as in the following examples.

Example - Google Voice AI with Female Journey Voice

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Say voice="Google:en-US-Journey-F" language="en-US">Hello World</Say>
</Response>

Example - Amazon Polly with Male Neural Voice

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Say voice="Polly:Gregory" language="en-US">Hello World</Say>
</Response>

List of supported Over-The-Top voices

Bring Your Own Voice

The BYOV feature allows you to pay for TTS services directly by providing your own credentials. Some TTS services are available only through BYOV. To provide your own credentials for the TTS service you want to use, log in to the Cloudonix Cockpit and use the 3rd-party Authorizations settings page to add the passwords or API keys you received from the TTS service.

Retrieving list of voices

Once the credentials have been set up properly, you can retrieve a list of all voices available through the BYOV feature by calling the Cloudonix REST API endpoint /domains/{domain}/resources/voices.

The result will be an array of JSON objects, one for each TTS voice that you can use, as per the credentials you have configured. For each JSON object, the following properties are displayed:

Property Name	Description
`provider`	The name of the Text-To-Speech (TTS) service provider through which this voice is available.
`voice`	The value to use with the `<Say>` verb's `voice` attribute to use this voice.
`languages`	An array of language codes, any of which can be used for the `<Say>` verb's `language` attribute, with this voice.
`gender`	A description of the gender this voice may sound like.
`pricing`	The Cloudonix pricing for this voice, one of: `standard` (included in the Cloudonix billing plan), `premium` (consumes AI "usage minutes"), or `customer-pay` (available through customer-provided 3rd-party credentials).

Example

$ curl 'https://api.cloudonix.io/domains/cloudonix-demo-customer.cloudonix.net/resources/voices' \
  --header 'Authorization: Bearer XI•••••••••••••••'

[
  {
    "voice": "AWS:Patrick",
    "gender": "male",
    "languages": [
      "en-US"
    ],
    "provider": "Polly",
    "pricing": "customer-pay"
  },
  …
  {
    "voice": "AWS-Neural:Inês",
    "gender": "female",
    "languages": [
      "pt-PT"
    ],
    "provider": "Neural",
    "pricing": "customer-pay"
  },
  …
  {
    "voice": "Eleven:Eric",
    "gender": "male",
    "languages": [],
    "provider": "Eleven",
    "pricing": "customer-pay"
  },
  …
  {
    "voice": "Azure:en-AU-CarlyNeural",
    "gender": "female",
    "languages": [
      "en-AU"
    ],
    "provider": "Azure",
    "pricing": "customer-pay"
  },
  …
  {
    "voice": "Google:da-DK-Wavenet-C",
    "gender": "male",
    "languages": [
      "da-DK"
    ],
    "provider": "Google",
    "pricing": "customer-pay"
  },
  …
]
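
As an illustration of how a listing entry maps onto the <Say> attributes, the following sketch takes the `voice` and `languages` values of the `AWS-Neural:Inês` entry from the sample response above. It assumes that this voice is actually available under the credentials you have configured. The sentence is placeholder text, written in Portuguese because no translation is performed before synthesis.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <!-- voice and language copied from the "voice" and "languages" fields of the listing entry -->
  <Say voice="AWS-Neural:Inês" language="pt-PT">Olá, bem-vindo.</Say>
</Response>
```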
