Matxa Multispeaker Catalan Central Endpoint

Input parameters

| Parameter | Description | Type | Options/Range | Recommended Values |
| --- | --- | --- | --- | --- |
| text | Input text to be converted to speech | string | Any text input | N/A |
| voice | Speaker ID for voice output | int | 47 different speakers available | N/A |
| temperature | Controls sampling variance during inference. Lower values yield higher quality but less variability | float | 0.2 to 0.67 | 0.2 to 0.67 |
| length_scale | Related to speech speed. Higher values make the speech slower, while lower values make it faster | float | 0.75 to 0.9 | 0.75 to 0.9 |

Each integer value of the voice parameter corresponds to one of the speakers in the list below. Names of the form “caf_XXXXX” (female) and “cam_XXXXX” (male) belong to speakers in the OpenSLR69 dataset; the short Catalan names (such as “ona” or “pol”) belong to speakers in the festcat dataset.

"cam_03115": 0,
  "caf_04247": 1,
  "caf_05450": 2,
  "cam_08935": 3,
  "caf_09901": 4,
  "ona": 5,
  "pol": 6,
  "cam_02689": 7,
  "caf_06042": 8,
  "jan": 9,
  "caf_08106": 10,
  "cam_04910": 11,
  "cam_08664": 12,
  "caf_07803": 13,
  "cam_06582": 14,
  "caf_06311": 15,
  "caf_07245": 16,
  "cam_06279": 17,
  "caf_09598": 18,
  "caf_09796": 19,
  "eva": 20,
  "cam_00762": 21,
  "caf_09204": 22,
  "caf_03944": 23,
  "caf_05147": 24,
  "uri": 25,
  "mar": 26,
  "cam_00459": 27,
  "teo": 28,
  "caf_03655": 29,
  "bet": 30,
  "cam_06705": 31,
  "caf_05739": 32,
  "caf_06008": 33,
  "cam_04484": 34,
  "cam_03386": 35,
  "cam_08967": 36,
  "caf_06942": 37,
  "cam_07140": 38,
  "pau": 39,
  "caf_08001": 40,
  "pep": 41,
  "cam_04787": 42,
  "eli": 43,
  "caf_01591": 44,
  "caf_02452": 45,
  "cam_02992": 46

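Looking up a speaker by name is often more convenient than remembering its integer ID. The following is a minimal sketch, not part of the endpoint itself: `SPEAKERS` is an excerpt copied from the list above, and `speaker_id` and `speaker_source` are hypothetical helper names introduced here for illustration.

```python
# Excerpt of the speaker list above; add the remaining entries as needed.
SPEAKERS = {
    "cam_03115": 0,
    "caf_04247": 1,
    "ona": 5,
    "pol": 6,
    "eva": 20,
    "pau": 39,
    "cam_02992": 46,
}


def speaker_id(name):
    """Hypothetical helper: return the integer voice ID for a speaker name."""
    try:
        return SPEAKERS[name]
    except KeyError:
        raise ValueError(f"Unknown speaker: {name!r}") from None


def speaker_source(name):
    """Hypothetical helper: classify a name by the naming convention above.

    Returns a (dataset, gender) tuple; gender is None for the short
    festcat names because the name itself does not encode it.
    """
    if name.startswith("caf_"):
        return ("OpenSLR69", "female")
    if name.startswith("cam_"):
        return ("OpenSLR69", "male")
    return ("festcat", None)


print(speaker_id("eva"))
print(speaker_source("cam_03115"))
```

The value returned by `speaker_id` can be used directly as the `voice` parameter of a request.
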
Code examples

Python

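example.py
import requests

API_URL = "https://x6g02u4lkf25gcjo.us-east-1.aws.endpoints.huggingface.cloud/api/tts"
headers = {
    "Authorization": "Bearer <hf_token>",  # replace <hf_token> with your token
}

def query(text):
    # Send the text and the speaker ID (20 is "eva") as a JSON body
    data = {"text": text, "voice": 20}
    response = requests.post(API_URL, headers=headers, json=data)
    response.raise_for_status()  # fail early instead of saving an error body as audio
    return response

# Save the returned audio to a WAV file
response = query("Bon dia")
with open("output.wav", "wb") as f:
    f.write(response.content)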
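
The example above sends only `text` and `voice`. The sketch below shows one way to add the optional `temperature` and `length_scale` parameters while checking them against the ranges in the parameter table. It rests on two assumptions: that these parameters are sent as top-level JSON keys alongside `text` and `voice`, and that a client-side check is useful (this page does not say how the endpoint treats out-of-range values). `build_payload` is a hypothetical helper name.

```python
# Ranges copied from the parameter table; the client-side check is an assumption.
TEMPERATURE_RANGE = (0.2, 0.67)
LENGTH_SCALE_RANGE = (0.75, 0.9)
NUM_SPEAKERS = 47  # voice IDs run from 0 to 46


def _check_range(name, value, bounds):
    """Raise ValueError if value lies outside the inclusive bounds."""
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside the documented range {low} to {high}")


def build_payload(text, voice, temperature=None, length_scale=None):
    """Hypothetical helper: assemble the JSON body for a TTS request."""
    if not 0 <= voice < NUM_SPEAKERS:
        raise ValueError(f"voice={voice} is not one of the {NUM_SPEAKERS} speaker IDs")
    payload = {"text": text, "voice": voice}
    # Optional parameters are only included when the caller sets them.
    if temperature is not None:
        _check_range("temperature", temperature, TEMPERATURE_RANGE)
        payload["temperature"] = temperature
    if length_scale is not None:
        _check_range("length_scale", length_scale, LENGTH_SCALE_RANGE)
        payload["length_scale"] = length_scale
    return payload


payload = build_payload("Bon dia", voice=20, temperature=0.2, length_scale=0.9)
print(payload)
```

The resulting dictionary can be passed as the `json=` argument of `requests.post` in place of `data`.
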

Curl

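Take care with apostrophes in the input text: inside a single-quoted shell string, each apostrophe must be escaped (written as '\'' in the example below). A convenient approach is to write the text and input parameters to a temporary JSON file and pass that file to curl. The example pipes the returned audio to play, which is part of SoX.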

bash
printf '%s' '{
  "text": "L'\''Aina ha preparat aquest model de síntesi de veu.",
  "voice": 39,
  "type": "text"
}' > data.json

curl -X POST https://x6g02u4lkf25gcjo.us-east-1.aws.endpoints.huggingface.cloud/api/tts -H "Content-Type: application/json" -H "Authorization: Bearer <hf_token>" -d @data.json | play -t wav -

rm data.json
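
As an alternative to escaping apostrophes by hand, the temporary JSON file can be generated with Python's standard `json` module, which handles the quoting. This is only a sketch: the request body is copied from the curl example above, and `write_payload` is a hypothetical helper name.

```python
import json

# Same request body as in the curl example; the apostrophe needs no escaping here.
payload = {
    "text": "L'Aina ha preparat aquest model de síntesi de veu.",
    "voice": 39,
    "type": "text",
}


def write_payload(path, payload):
    """Hypothetical helper: write the request body to a UTF-8 JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps accented characters readable in the file
        json.dump(payload, f, ensure_ascii=False)


write_payload("data.json", payload)
```

The file can then be sent with `-d @data.json`, exactly as in the curl command above.
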

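JavaScript

This example runs on Node.js. Install Node.js together with npm (Node Package Manager), then use npm to install the node-fetch library. Because the example loads the library with require, install version 2 (npm install node-fetch@2); node-fetch version 3 and later is an ES module and cannot be loaded with require.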

example.js
const fetch = require("node-fetch");
const fs = require("fs");

// Define the API URL and headers
const API_URL = "https://x6g02u4lkf25gcjo.us-east-1.aws.endpoints.huggingface.cloud/api/tts";
const headers = {
    "Authorization": "Bearer <hf_token>",
    "Content-Type": "application/json"
};

// Function to send the request
async function query(text) {
    const data = {
        text: text,
        voice: 20,
    };

    try {
        // POST request
        const response = await fetch(API_URL, {
            method: "POST",
            headers: headers,
            body: JSON.stringify(data)
        });

        // Check the response
        if (!response.ok) {
            throw new Error(`Error: ${response.status} ${response.statusText}`);
        }

        // Convert the response to a buffer
        const buffer = await response.buffer();

        // Write the buffer to an output file
        fs.writeFile("output.wav", buffer, (err) => {
            if (err) {
                console.error("Error saving the file:", err);
            } else {
                console.log("File saved as output.wav");
            }
        });
    } catch (error) {
        console.error("Error making request:", error);
    }
}

// Example usage
query("Bon dia.");
