Matxa Multispeaker Catalan Central Endpoint

Input parameters

| Parameter | Description | Type | Options/Range | Recommended Values |
|---|---|---|---|---|
| text | Input text to be converted to speech | string | Any text input | N/A |
| voice | Speaker ID for voice output | int | 47 different speakers available | N/A |
| temperature | Controls sampling variance during inference. Lower values yield higher quality but less variability | float | 0.2 to 0.67 | 0.2 to 0.67 |
| length_scale | Related to speech speed. Higher values make the speech slower, while lower values make it faster | float | 0.75 to 0.9 | 0.75 to 0.9 |
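
The documented ranges can be checked on the client before a request is sent. The following is a minimal sketch, not the endpoint's own validation: the function name `build_payload` is hypothetical, the flat dictionary layout is an assumption about the request body, and the voice bounds of 0 to 46 are taken from the speaker list in this document.

```python
# Ranges as documented in the parameter table.
VOICE_RANGE = (0, 46)              # 47 speakers, IDs 0..46 (from the speaker list)
TEMPERATURE_RANGE = (0.2, 0.67)
LENGTH_SCALE_RANGE = (0.75, 0.9)


def _check_range(name, value, bounds):
    """Raise ValueError if value lies outside the inclusive bounds."""
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def build_payload(text, voice, temperature, length_scale):
    """Validate the four input parameters and return them as a dict.

    The flat key layout is an assumption, not a confirmed request schema.
    """
    if not isinstance(text, str) or not text:
        raise ValueError("text must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(voice, bool) or not isinstance(voice, int):
        raise ValueError("voice must be an integer speaker ID")
    _check_range("voice", voice, VOICE_RANGE)
    _check_range("temperature", temperature, TEMPERATURE_RANGE)
    _check_range("length_scale", length_scale, LENGTH_SCALE_RANGE)
    return {
        "text": text,
        "voice": voice,
        "temperature": float(temperature),
        "length_scale": float(length_scale),
    }
```

Rejecting out-of-range values locally gives a clearer error than a failed inference call, at the cost of having to update the constants if the documented ranges change.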

Each integer value corresponds to one of the speakers in the list below. Names of the form “caf_XXXXX” (female) and “cam_XXXXX” (male) belong to speakers from the OpenSLR69 dataset; the short Catalan names (for example, “ona” or “pol”) belong to speakers from the Festcat dataset.

  "cam_03115": 0,
  "caf_04247": 1,
  "caf_05450": 2,
  "cam_08935": 3,
  "caf_09901": 4,
  "ona": 5,
  "pol": 6,
  "cam_02689": 7,
  "caf_06042": 8,
  "jan": 9,
  "caf_08106": 10,
  "cam_04910": 11,
  "cam_08664": 12,
  "caf_07803": 13,
  "cam_06582": 14,
  "caf_06311": 15,
  "caf_07245": 16,
  "cam_06279": 17,
  "caf_09598": 18,
  "caf_09796": 19,
  "eva": 20,
  "cam_00762": 21,
  "caf_09204": 22,
  "caf_03944": 23,
  "caf_05147": 24,
  "uri": 25,
  "mar": 26,
  "cam_00459": 27,
  "teo": 28,
  "caf_03655": 29,
  "bet": 30,
  "cam_06705": 31,
  "caf_05739": 32,
  "caf_06008": 33,
  "cam_04484": 34,
  "cam_03386": 35,
  "cam_08967": 36,
  "caf_06942": 37,
  "cam_07140": 38,
  "pau": 39,
  "caf_08001": 40,
  "pep": 41,
  "cam_04787": 42,
  "eli": 43,
  "caf_01591": 44,
  "caf_02452": 45,
  "cam_02992": 46
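
The mapping above can be used to resolve a speaker name to the integer expected by the `voice` parameter. The sketch below is illustrative only: it copies a five-entry excerpt of the list rather than all 47 entries, and the helper names `speaker_id` and `describe_speaker` are hypothetical.

```python
# Excerpt of the speaker list above; extend with the remaining entries as needed.
SPEAKER_IDS = {
    "cam_03115": 0,
    "caf_04247": 1,
    "ona": 5,
    "pol": 6,
    "cam_02992": 46,
}


def speaker_id(name):
    """Return the integer voice ID for a speaker name."""
    try:
        return SPEAKER_IDS[name]
    except KeyError:
        raise ValueError(f"unknown speaker name: {name!r}") from None


def describe_speaker(name):
    """Infer (dataset, gender) from the naming convention described above.

    Gender is only encoded in the OpenSLR69 prefixes, so it is None
    for the short Festcat names.
    """
    if name.startswith("caf_"):
        return ("OpenSLR69", "female")
    if name.startswith("cam_"):
        return ("OpenSLR69", "male")
    return ("Festcat", None)
```

For example, `speaker_id("ona")` yields the value to pass as `voice`, and `describe_speaker("caf_04247")` reports that the speaker comes from OpenSLR69.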

Code examples

Python
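
A request with the four input parameters might look like the sketch below, which uses only the Python standard library. Several details are assumptions rather than documented behaviour: the URL is a placeholder, the parameters are sent as a flat JSON object, the optional bearer token is hypothetical, and the response body is assumed to be raw audio bytes. Adjust each of these to match the actual deployment.

```python
import json
import urllib.request

# Placeholder URL: replace with the real endpoint address.
ENDPOINT_URL = "https://example.invalid/matxa"


def build_request(text, voice, temperature, length_scale,
                  url=ENDPOINT_URL, token=None):
    """Build (but do not send) a POST request carrying the input parameters.

    The flat JSON body is an assumed schema, not a confirmed one.
    """
    body = json.dumps(
        {
            "text": text,
            "voice": voice,
            "temperature": temperature,
            "length_scale": length_scale,
        },
        ensure_ascii=False,  # keep Catalan characters readable in the body
    ).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token is not None:
        # Assumption: the endpoint accepts a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def synthesize(text, voice, temperature, length_scale, out_path="output.wav"):
    """Send the request and save the response, assumed to be audio bytes."""
    request = build_request(text, voice, temperature, length_scale)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

Calling `synthesize("Bon dia", 5, 0.2, 0.9)` would then write the returned audio to `output.wav`, provided the assumptions above hold for the deployed endpoint.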

Curl

Take care with apostrophes in the input text, since they can break shell quoting. A reliable approach is to write the text and input parameters to a temporary JSON file and pass that file to curl for inference.
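
The temporary-file approach can be sketched in Python as follows. The helper name `write_payload_file` is hypothetical, and the flat JSON layout is an assumption about the request body. The resulting file can be handed to curl with its `-d @<path>` option, so the text never passes through shell quoting.

```python
import json
import tempfile


def write_payload_file(text, voice, temperature, length_scale):
    """Write the input parameters to a temporary JSON file and return its path.

    json.dump handles the serialization, so apostrophes in the text
    need no manual escaping.
    """
    payload = {
        "text": text,
        "voice": voice,
        "temperature": temperature,
        "length_scale": length_scale,
    }
    # delete=False keeps the file on disk after closing so curl can read it;
    # remove it manually once the request has been made.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".json", delete=False, encoding="utf-8"
    ) as f:
        json.dump(payload, f, ensure_ascii=False)
        return f.name
```

For instance, `write_payload_file("L'aigua d'aquí", 5, 0.2, 0.9)` returns a path to a file whose contents preserve both apostrophes exactly.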

Javascript

This example runs on Node.js. Install npm (Node Package Manager), then use npm to install the fetch-node library.
