Supernal TTS - Complete LLM Integration Guide
This is a comprehensive guide designed to be copied and pasted into an LLM conversation to give it complete context about integrating Supernal TTS.
Quick Summary
Supernal TTS is a text-to-speech API service with multiple provider support (OpenAI, Cartesia, Azure, Mock). It provides REST API endpoints, a JavaScript SDK, and embeddable web widgets for converting text to speech.
Core Endpoints
Base URL: https://tts.supernal.ai (or http://localhost:3030 for local development)
1. Generate Speech
POST /api/v1/generate
Content-Type: application/json
{
"text": "Text to convert to speech",
"options": {
"provider": "openai", # Options: "openai", "cartesia", "azure", "mock"
"voice": "coral", # Provider-specific voice name
"speed": 1.0, # Range: 0.25 - 4.0
"instructions": "optional tone/style instructions for OpenAI"
}
}
Response:
{
"audioUrl": "https://tts.supernal.ai/api/v1/audio/abc123...",
"hash": "abc123...",
"metadata": {
"provider": "openai",
"voice": "coral",
"duration": 5.2,
"size": 83840
}
}
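The request shape above can be assembled and checked on the client before it is sent. The following is a minimal sketch, not part of the SDK: `buildGenerateRequest` is a hypothetical helper, the provider list and speed range come from the comments in the request example, and the `openai` default is an assumption.

```javascript
// Hypothetical helper (not part of the Supernal SDK): builds the JSON body
// for POST /api/v1/generate and rejects values outside the documented ranges.
const KNOWN_PROVIDERS = ['openai', 'cartesia', 'azure', 'mock'];

function buildGenerateRequest(text, options = {}) {
  if (typeof text !== 'string' || text.trim() === '') {
    throw new Error('text must be a non-empty string');
  }
  // Assumption: default to "openai" when no provider is given.
  const { provider = 'openai', voice, speed = 1.0, instructions } = options;
  if (!KNOWN_PROVIDERS.includes(provider)) {
    throw new Error(`unknown provider: ${provider}`);
  }
  if (typeof speed !== 'number' || speed < 0.25 || speed > 4.0) {
    throw new Error('speed must be between 0.25 and 4.0');
  }
  const body = { text, options: { provider, speed } };
  if (voice) body.options.voice = voice;
  // The guide describes `instructions` as an OpenAI tone/style option,
  // so this sketch only forwards it for that provider (an assumption).
  if (instructions && provider === 'openai') {
    body.options.instructions = instructions;
  }
  return body;
}
```

The returned object can be passed to `JSON.stringify` and used as the body of the POST request.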
2. Retrieve Audio
GET /api/v1/audio/{hash}
Returns the audio file (typically MP3 format).
3. Get Audio Metadata
GET /api/v1/audio/{hash}/metadata
Returns metadata about the audio file without downloading it.
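Both endpoints are addressed by the `hash` returned from the generate call. As a rough sketch, the helpers below build those URLs and also normalize an `audioUrl` value, since examples in this guide treat it sometimes as an absolute URL and sometimes as a path. All three function names are hypothetical; only the paths come from the endpoint list above.

```javascript
// Hypothetical helpers built on the endpoint paths listed above.
function audioEndpoint(baseUrl, hash) {
  return new URL(`/api/v1/audio/${encodeURIComponent(hash)}`, baseUrl).href;
}

function metadataEndpoint(baseUrl, hash) {
  return `${audioEndpoint(baseUrl, hash)}/metadata`;
}

// Accepts either an absolute audioUrl or a path and returns an absolute URL.
// new URL() ignores the base when the first argument is already absolute.
function resolveAudioUrl(baseUrl, audioUrl) {
  return new URL(audioUrl, baseUrl).href;
}
```

For example, `resolveAudioUrl('http://localhost:3030', result.audioUrl)` yields a playable URL whichever form the server returns.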
4. List Available Providers
GET /api/v1/providers
Returns list of configured and available providers.
5. Health Check
GET /health
Returns service status and available providers.
Provider Options
OpenAI
- Voices: alloy, echo, fable, onyx, nova, shimmer, coral
- Best for: High-quality natural speech
- Latency: ~200ms
- Cost: $15-30 per million characters
- Special features: instructions parameter for tone control
Cartesia
- Best for: Real-time applications requiring ultra-low latency
- Latency: Lowest of the supported providers
- Cost: ~$24 per million characters
- Special features: Emotion control
Azure
- Best for: Budget-conscious projects
- Latency: ~300ms
- Cost: $0-16 per million characters (500K free tier)
- Voices: en-US-JennyNeural, en-US-GuyNeural, en-US-AriaNeural, etc.
Mock
- Best for: Testing without API keys
- Latency: ~500ms
- Cost: Free
- Note: Returns test audio, no real TTS generation
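The per-million-character prices above translate directly into a rough cost range for a given text length. The sketch below is only a planning aid: `estimateCostRange` is a hypothetical helper, the rates are copied from the provider list, and it ignores Azure's free tier and any caching.

```javascript
// Per-million-character price ranges copied from the provider list above.
// Real billing may differ; treat these as rough planning numbers only.
const RATES_PER_MILLION = {
  openai: { min: 15, max: 30 },
  cartesia: { min: 24, max: 24 },
  azure: { min: 0, max: 16 },
  mock: { min: 0, max: 0 },
};

// Hypothetical helper: returns a { min, max } cost range in US dollars.
function estimateCostRange(charCount, provider) {
  const rate = RATES_PER_MILLION[provider];
  if (!rate) throw new Error(`unknown provider: ${provider}`);
  const millions = charCount / 1_000_000;
  return { min: rate.min * millions, max: rate.max * millions };
}

// A 4,000-character blog post on OpenAI:
const postRange = estimateCostRange(4000, 'openai');
console.log(`$${postRange.min.toFixed(2)} - $${postRange.max.toFixed(2)}`);
```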
JavaScript SDK Usage
import { TTSClient } from '@supernal-tts/client';
// Initialize client
const client = new TTSClient({
apiUrl: 'https://tts.supernal.ai'
});
// Generate speech
const result = await client.generate({
text: "Hello from Supernal TTS!",
options: {
provider: 'openai',
voice: 'coral'
}
});
// Use the audio URL
console.log(result.audioUrl);
// Get metadata
const metadata = await client.getMetadata(result.hash);
Web Widget Integration
Simple Button Widget
<!DOCTYPE html>
<html>
<head>
<script src="https://cdn.supernal.ai/widget/tts-widget.js"></script>
</head>
<body>
<div class="supernal-tts-widget"
data-text="Your text here"
data-voice="coral"
data-provider="openai">
</div>
<script>
const tts = new SupernalTTS({
apiUrl: 'https://tts.supernal.ai',
provider: 'openai',
voice: 'coral'
});
</script>
</body>
</html>
Advanced Widget with Controls
<div class="supernal-tts-widget"
data-text="Your text here"
data-voice="coral"
data-provider="openai"
data-speed="1.0"
data-controls="true">
</div>
The advanced widget includes voice selection, speed control, position seek bar, and time display.
Environment Setup
When deploying or running locally, set these environment variables:
# Required for specific providers
OPENAI_API_KEY=sk-...
CARTESIA_API_KEY=...
AZURE_API_KEY=...
AZURE_REGION=eastus
# Optional configuration
PORT=3030
NODE_ENV=production
CACHE_DIR=.tts-cache
ENABLE_MOCK_PROVIDER=true
DEFAULT_PROVIDER=openai
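To see how these variables might map onto provider availability, here is an illustrative sketch. It is not the server's actual logic: `configuredProviders` is a hypothetical function, and the rule that Azure needs both a key and a region is an assumption.

```javascript
// Illustrative only: the real server may decide availability differently.
function configuredProviders(env) {
  const providers = [];
  if (env.OPENAI_API_KEY) providers.push('openai');
  if (env.CARTESIA_API_KEY) providers.push('cartesia');
  // Assumption: Azure needs both the key and the region to be usable.
  if (env.AZURE_API_KEY && env.AZURE_REGION) providers.push('azure');
  // Environment variables are strings, so compare against 'true'.
  if (env.ENABLE_MOCK_PROVIDER === 'true') providers.push('mock');
  return providers;
}
```

In Node.js this could be called as `configuredProviders(process.env)` at startup to log which providers have credentials.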
Common Integration Patterns
1. Blog Audio Generation
async function addAudioToBlogPost(postContent) {
const client = new TTSClient({ apiUrl: 'https://tts.supernal.ai' });
const result = await client.generate({
text: postContent,
options: {
provider: 'openai',
voice: 'fable'
}
});
return result.audioUrl;
}
2. Batch Processing
async function convertMultiplePosts(posts) {
const client = new TTSClient({ apiUrl: 'https://tts.supernal.ai' });
const results = await Promise.all(
posts.map(post => client.generate({
text: post.content,
options: { provider: 'openai', voice: 'nova' }
}))
);
return results;
}
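The `Promise.all` version above starts every request at once, which can run into rate limits on large batches. One possible alternative is to cap the number of requests in flight. `mapWithConcurrency` below is a hypothetical helper, not part of the SDK.

```javascript
// Hypothetical helper: runs `worker` over `items` with at most `limit`
// calls in flight at any time, preserving the order of results.
async function mapWithConcurrency(items, limit, worker) {
  if (limit < 1) throw new Error('limit must be at least 1');
  const results = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index.
  async function run() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run());
  await Promise.all(runners);
  return results;
}
```

With the client from the example above, a call might look like `mapWithConcurrency(posts, 3, post => client.generate({ text: post.content, options: { provider: 'openai', voice: 'nova' } }))`.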
3. Real-time Chatbot Response
async function speakChatbotResponse(message) {
const client = new TTSClient({ apiUrl: 'https://tts.supernal.ai' });
// Use Cartesia for lowest latency
const result = await client.generate({
text: message,
options: {
provider: 'cartesia',
voice: 'confident-british-man'
}
});
// Play audio immediately
const audio = new Audio(result.audioUrl);
audio.play();
}
Cost Optimization
- Use caching: The service automatically caches identical text+voice combinations
- Choose appropriate provider:
  - Testing: Use mock provider
  - Budget: Use azure (especially within the 500K char/month free tier)
  - Quality: Use openai
  - Speed: Use cartesia
- Pre-generate static content: For content that doesn't change, generate audio once
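Server-side caching still costs a network round trip, so a small in-memory cache on the client can complement it. The sketch below is one way to do that under stated assumptions: `createCachedGenerator` is a hypothetical wrapper, and its cache key (text, provider, voice, speed) is a guess at what affects the audio, not the service's documented key.

```javascript
// Hypothetical client-side cache keyed on the inputs that affect the audio.
// The service's own cache key is not documented here; this key is an assumption.
function createCachedGenerator(generate) {
  const cache = new Map();
  return function cachedGenerate(text, options = {}) {
    const key = JSON.stringify([
      text,
      options.provider ?? null,
      options.voice ?? null,
      options.speed ?? null,
    ]);
    if (!cache.has(key)) {
      // Store the promise so concurrent calls for the same input share one request.
      const pending = Promise.resolve(generate(text, options)).catch((err) => {
        cache.delete(key); // do not cache failures
        throw err;
      });
      cache.set(key, pending);
    }
    return cache.get(key);
  };
}
```

Here `generate` stands for any function that takes `(text, options)` and returns a promise, for example a thin wrapper around `client.generate`.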
Error Handling
async function generateWithMockFallback(client, text) {
  try {
    const result = await client.generate({
      text,
      options: { provider: 'openai', voice: 'coral' }
    });
    return result.audioUrl;
  } catch (error) {
    if (error.status === 503) {
      // Provider unavailable: fall back to the mock provider
      const fallback = await client.generate({
        text,
        options: { provider: 'mock' }
      });
      // Return the URL so both paths return the same type
      return fallback.audioUrl;
    }
    throw error;
  }
}
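The same idea extends to an ordered list of providers. The sketch below assumes, as the example above does, that an unavailable provider surfaces as an error with `status === 503`; `generateWithProviderChain` is a hypothetical helper, and each attempt carries its own options because voice names are provider-specific.

```javascript
// Hypothetical helper: tries each set of options in order until one succeeds.
// `generate` is any function shaped like client.generate from this guide.
async function generateWithProviderChain(generate, text, attempts) {
  let lastError = new Error('no provider attempts given');
  for (const options of attempts) {
    try {
      return await generate({ text, options });
    } catch (error) {
      // Only move on when the provider is unavailable; surface other errors.
      if (error.status !== 503) throw error;
      lastError = error;
    }
  }
  throw lastError;
}
```

A call might pass `[{ provider: 'openai', voice: 'coral' }, { provider: 'azure', voice: 'en-US-JennyNeural' }, { provider: 'mock' }]` as `attempts`.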
Testing Locally
# 1. Clone repository
git clone https://github.com/supernalintelligence/supernal-nova.git
cd supernal-nova/families/supernal-tts
# 2. Install dependencies
npm install
# 3. Set environment variables
cp .env.example .env
# Edit .env and add your API keys
# 4. Start server
npm run dev
# 5. Test with curl
curl -X POST http://localhost:3030/api/v1/generate \
-H "Content-Type: application/json" \
-d '{"text": "Test", "options": {"provider": "mock"}}'
Deployment
The service can be deployed as:
- Docker container: See docker-compose.yml
- Vercel serverless: See vercel.json
- Node.js application: Run node npm-packages/supernal-tts-server/bin/supernal-tts.js
Support & Documentation
- Full API Docs: https://tts.supernal.ai/docs/API
- Widget Guide: https://tts.supernal.ai/docs/widget-guide
- Examples: https://tts.supernal.ai/docs/examples
- GitHub: https://github.com/supernalintelligence/supernal-nova
- Issues: https://github.com/supernalintelligence/supernal-tts/issues
Quick Copy-Paste Examples
cURL Example
curl -X POST https://tts.supernal.ai/api/v1/generate \
-H "Content-Type: application/json" \
-d '{
"text": "Welcome to Supernal TTS!",
"options": {
"provider": "openai",
"voice": "coral"
}
}'
JavaScript Fetch Example
const response = await fetch('https://tts.supernal.ai/api/v1/generate', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
text: "Welcome to Supernal TTS!",
options: { provider: 'openai', voice: 'coral' }
})
});
const data = await response.json();
console.log(data.audioUrl);
Python Example
import requests
response = requests.post(
'https://tts.supernal.ai/api/v1/generate',
json={
'text': 'Welcome to Supernal TTS!',
'options': {
'provider': 'openai',
'voice': 'coral'
}
}
)
data = response.json()
print(data['audioUrl'])
License: Fair Source License (see LICENSE file for commercial use terms)
Last Updated: October 2025
Complete Code Examples
React Blog Component (Full Implementation)
import React, { useState, useRef, useEffect } from 'react';
const TTS_API_URL = 'https://tts.supernal.ai/api/v1';
function TTSButton({ text, voice = 'nova', speed = 1.0 }) {
const [isLoading, setIsLoading] = useState(false);
const [isPlaying, setIsPlaying] = useState(false);
const [error, setError] = useState(null);
const [progress, setProgress] = useState(0);
const audioRef = useRef(null);
useEffect(() => {
return () => {
if (audioRef.current) {
audioRef.current.pause();
audioRef.current = null;
}
};
}, []);
const handleTimeUpdate = () => {
if (audioRef.current) {
const percent = (audioRef.current.currentTime / audioRef.current.duration) * 100;
setProgress(percent);
}
};
const handlePlayPause = async () => {
if (isPlaying && audioRef.current) {
audioRef.current.pause();
setIsPlaying(false);
return;
}
if (audioRef.current && audioRef.current.src) {
audioRef.current.play();
setIsPlaying(true);
return;
}
setIsLoading(true);
setError(null);
try {
const response = await fetch(`${TTS_API_URL}/generate`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
text,
options: {
provider: 'openai',
voice,
speed,
format: 'mp3'
}
})
});
if (!response.ok) {
throw new Error(`HTTP error! status: ${response.status}`);
}
const data = await response.json();
if (!audioRef.current) {
audioRef.current = new Audio();
audioRef.current.onended = () => {
setIsPlaying(false);
setProgress(0);
};
audioRef.current.ontimeupdate = handleTimeUpdate;
audioRef.current.onerror = () => {
setError('Failed to load audio');
setIsPlaying(false);
setIsLoading(false);
};
}
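// The documented response field is audioUrl; it may be absolute or a path,
// and new URL() resolves either form against the API base
audioRef.current.src = new URL(data.audioUrl, TTS_API_URL).href;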
await audioRef.current.play();
setIsPlaying(true);
} catch (err) {
console.error('TTS Error:', err);
setError(err.message);
} finally {
setIsLoading(false);
}
};
return (
<div style={{ margin: '1rem 0' }}>
<button
onClick={handlePlayPause}
disabled={isLoading}
style={{
padding: '0.75rem 1.5rem',
fontSize: '1rem',
backgroundColor: isPlaying ? '#dc3545' : '#007bff',
color: 'white',
border: 'none',
borderRadius: '4px',
cursor: isLoading ? 'not-allowed' : 'pointer'
}}
>
{isLoading ? '⏳ Generating...' : isPlaying ? '⏸️ Pause' : 'Listen'}
</button>
{(isPlaying || progress > 0) && (
<div style={{
width: '100%',
height: '4px',
backgroundColor: '#e9ecef',
borderRadius: '2px',
marginTop: '0.5rem',
overflow: 'hidden'
}}>
<div style={{
width: `${progress}%`,
height: '100%',
backgroundColor: '#007bff',
transition: 'width 0.1s linear'
}} />
</div>
)}
{error && <div style={{ color: '#dc3545', marginTop: '0.5rem' }}>⚠️ {error}</div>}
</div>
);
}
// Usage in blog post
function BlogPost({ title, author, content }) {
const [selectedVoice, setSelectedVoice] = useState('nova');
const fullText = `${title}. By ${author}. ${content}`;
return (
<article>
<h2>{title}</h2>
<p>By {author}</p>
<select
value={selectedVoice}
onChange={(e) => setSelectedVoice(e.target.value)}
style={{ marginBottom: '1rem' }}
>
<option value="nova">Nova (Warm female)</option>
<option value="fable">Fable (British)</option>
<option value="onyx">Onyx (Deep)</option>
<option value="alloy">Alloy (Neutral)</option>
</select>
<TTSButton text={fullText} voice={selectedVoice} />
<div>{content}</div>
</article>
);
}
export default BlogPost;
Node.js CLI Tool (Complete Implementation)
Save this as tts-cli.js:
#!/usr/bin/env node
const fs = require('fs');
const path = require('path');
const http = require('http');
const https = require('https');
const TTS_API_URL = process.env.TTS_API_URL || 'http://localhost:3030';
// Make HTTP request helper
function makeRequest(url, options = {}) {
return new Promise((resolve, reject) => {
const lib = url.startsWith('https:') ? https : http;
const req = lib.request(url, {
method: options.method || 'GET',
headers: {
'Content-Type': 'application/json',
...options.headers
}
}, (res) => {
if (options.binary) {
const chunks = [];
res.on('data', chunk => chunks.push(chunk));
res.on('end', () => resolve(Buffer.concat(chunks)));
} else {
let data = '';
res.on('data', chunk => data += chunk);
res.on('end', () => {
try {
resolve(JSON.parse(data));
} catch (e) {
resolve(data);
}
});
}
});
req.on('error', reject);
if (options.body) {
req.write(JSON.stringify(options.body));
}
req.end();
});
}
// Generate TTS from text
async function generateTTS(text, options = {}) {
const {
voice = 'fable',
provider = 'openai',
speed = 1.0,
output = 'output.mp3'
} = options;
console.log(`Generating TTS...`);
console.log(` Text: "${text.substring(0, 50)}${text.length > 50 ? '...' : ''}"`);
console.log(` Provider: ${provider}`);
console.log(` Voice: ${voice}`);
try {
const response = await makeRequest(`${TTS_API_URL}/api/v1/generate`, {
method: 'POST',
body: {
text,
options: { provider, voice, speed, format: 'mp3' }
}
});
if (response.error) {
console.error(`❌ Error: ${response.error}`);
process.exit(1);
}
console.log(`✅ Generated successfully!`);
console.log(` Hash: ${response.hash}`);
// Duration lives under metadata in the documented response
console.log(` Duration: ${response.metadata?.duration}s`);
// audioUrl may be absolute or a path; new URL() resolves either form
const audioUrl = new URL(response.audioUrl, TTS_API_URL).href;
console.log(` URL: ${audioUrl}`);
// Download audio file
console.log(`\nDownloading audio to: ${output}`);
const audioData = await makeRequest(audioUrl, {
binary: true
});
fs.writeFileSync(output, audioData);
console.log(`✅ Audio saved: ${output} (${audioData.length} bytes)`);
return { hash: response.hash, file: output };
} catch (error) {
console.error(`❌ Request failed: ${error.message}`);
process.exit(1);
}
}
// Process text file
async function processFile(filePath, options = {}) {
if (!fs.existsSync(filePath)) {
console.error(`❌ File not found: ${filePath}`);
process.exit(1);
}
const text = fs.readFileSync(filePath, 'utf-8').trim();
if (!text) {
console.error(`❌ File is empty: ${filePath}`);
process.exit(1);
}
console.log(`Processing file: ${filePath}`);
console.log(` Length: ${text.length} characters`);
const outputFile = options.output || `${path.basename(filePath, path.extname(filePath))}.mp3`;
return await generateTTS(text, { ...options, output: outputFile });
}
// Show help
function showHelp() {
console.log(`
Supernal TTS CLI Tool
Usage:
node tts-cli.js generate --text "Your text" [options]
node tts-cli.js file --file input.txt [options]
Commands:
generate Generate audio from text
file Process text file
Options:
--text <text> Text to convert to speech
--file <path> Text file to process
--output <path> Output audio file path (default: output.mp3)
--voice <voice> Voice to use (default: fable)
--provider <name> Provider to use (default: openai)
--speed <float> Speech speed 0.25-4.0 (default: 1.0)
Environment Variables:
TTS_API_URL API server URL (default: http://localhost:3030)
Examples:
# Generate from text
node tts-cli.js generate --text "Hello world" --voice nova --output hello.mp3
# Process a file
node tts-cli.js file --file article.txt --voice fable --output article.mp3
`);
}
// Parse command line arguments
function parseArgs() {
const args = process.argv.slice(2);
const options = {
command: args[0],
text: null,
file: null,
output: null,
voice: 'fable',
provider: 'openai',
speed: 1.0
};
for (let i = 1; i < args.length; i += 2) {
const flag = args[i];
const value = args[i + 1];
switch (flag) {
case '--text': options.text = value; break;
case '--file': options.file = value; break;
case '--output': options.output = value; break;
case '--voice': options.voice = value; break;
case '--provider': options.provider = value; break;
case '--speed': options.speed = parseFloat(value); break;
}
}
return options;
}
// Main function
async function main() {
const options = parseArgs();
if (!options.command || options.command === 'help') {
showHelp();
return;
}
try {
switch (options.command) {
case 'generate':
if (!options.text) {
console.error('❌ --text is required for generate command');
process.exit(1);
}
await generateTTS(options.text, options);
break;
case 'file':
if (!options.file) {
console.error('❌ --file is required for file command');
process.exit(1);
}
await processFile(options.file, options);
break;
default:
console.error(`❌ Unknown command: ${options.command}`);
showHelp();
process.exit(1);
}
} catch (error) {
console.error(`❌ Command failed: ${error.message}`);
process.exit(1);
}
}
// Run if called directly
if (require.main === module) {
main().catch(console.error);
}
module.exports = { generateTTS, processFile };
Make it executable:
chmod +x tts-cli.js
# Usage examples
./tts-cli.js generate --text "Hello world" --voice nova --output hello.mp3
./tts-cli.js file --file article.txt --voice fable --output article.mp3
Static HTML Widget Demo
Save this as demo.html:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Supernal TTS Widget Demo</title>
<style>
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
max-width: 800px;
margin: 0 auto;
padding: 2rem;
line-height: 1.6;
}
.tts-button {
background: #007bff;
color: white;
border: none;
padding: 0.75rem 1.5rem;
font-size: 1rem;
border-radius: 4px;
cursor: pointer;
margin: 1rem 0;
}
.tts-button:hover {
background: #0056b3;
}
.tts-button:disabled {
background: #6c757d;
cursor: not-allowed;
}
.progress-bar {
width: 100%;
height: 4px;
background: #e9ecef;
border-radius: 2px;
margin-top: 0.5rem;
overflow: hidden;
}
.progress-fill {
height: 100%;
background: #007bff;
transition: width 0.1s linear;
}
.blog-post {
background: #f8f9fa;
padding: 2rem;
border-radius: 8px;
margin: 2rem 0;
}
</style>
</head>
<body>
<h1>Supernal TTS Demo</h1>
<div class="blog-post">
<h2>Sample Blog Post</h2>
<p id="content">
Welcome to Supernal TTS! This is a demonstration of text-to-speech technology.
With just a click, you can listen to any written content. This makes information
more accessible and allows you to consume content while multitasking.
</p>
<select id="voice-select">
<option value="nova">Nova (Warm female)</option>
<option value="fable">Fable (British)</option>
<option value="onyx">Onyx (Deep)</option>
<option value="alloy">Alloy (Neutral)</option>
</select>
<button id="play-button" class="tts-button">Listen to Article</button>
<div id="progress" class="progress-bar" style="display: none;">
<div id="progress-fill" class="progress-fill"></div>
</div>
<div id="error" style="color: #dc3545; margin-top: 0.5rem;"></div>
</div>
<script>
const API_URL = 'https://tts.supernal.ai/api/v1';
const playButton = document.getElementById('play-button');
const voiceSelect = document.getElementById('voice-select');
const content = document.getElementById('content');
const progressBar = document.getElementById('progress');
const progressFill = document.getElementById('progress-fill');
const errorDiv = document.getElementById('error');
let audio = null;
let isPlaying = false;
playButton.addEventListener('click', async () => {
if (isPlaying && audio) {
audio.pause();
isPlaying = false;
playButton.textContent = 'Listen to Article';
return;
}
if (audio && audio.src) {
audio.play();
isPlaying = true;
playButton.textContent = '⏸️ Pause';
return;
}
playButton.disabled = true;
playButton.textContent = '⏳ Generating...';
errorDiv.textContent = '';
try {
const response = await fetch(`${API_URL}/generate`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
text: content.textContent,
options: {
provider: 'openai',
voice: voiceSelect.value,
speed: 1.0
}
})
});
if (!response.ok) {
throw new Error(`HTTP error! status: ${response.status}`);
}
const data = await response.json();
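// The documented response field is audioUrl; new URL() accepts absolute URLs and paths
audio = new Audio(new URL(data.audioUrl, API_URL).href);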
audio.addEventListener('timeupdate', () => {
const percent = (audio.currentTime / audio.duration) * 100;
progressFill.style.width = `${percent}%`;
});
audio.addEventListener('ended', () => {
isPlaying = false;
playButton.textContent = 'Listen to Article';
progressFill.style.width = '0%';
});
audio.addEventListener('error', () => {
errorDiv.textContent = '⚠️ Failed to load audio';
playButton.disabled = false;
playButton.textContent = 'Listen to Article';
});
await audio.play();
isPlaying = true;
playButton.textContent = '⏸️ Pause';
progressBar.style.display = 'block';
} catch (error) {
console.error('TTS Error:', error);
errorDiv.textContent = `⚠️ ${error.message}`;
} finally {
playButton.disabled = false;
}
});
</script>
</body>
</html>
Python Integration Example
import requests
import json
class SupernalTTS:
def __init__(self, api_url='https://tts.supernal.ai'):
self.api_url = api_url
self.api_v1 = f"{api_url}/api/v1"
def generate(self, text, voice='nova', provider='openai', speed=1.0):
"""Generate speech from text"""
response = requests.post(
f"{self.api_v1}/generate",
json={
'text': text,
'options': {
'provider': provider,
'voice': voice,
'speed': speed
}
}
)
response.raise_for_status()
return response.json()
def download_audio(self, audio_url, output_path):
"""Download audio file"""
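# audio_url may be absolute or a path relative to the API host
url = audio_url if audio_url.startswith('http') else f"{self.api_url}{audio_url}"
response = requests.get(url)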
response.raise_for_status()
with open(output_path, 'wb') as f:
f.write(response.content)
return output_path
def generate_to_file(self, text, output_path, **options):
"""Generate speech and save to file"""
result = self.generate(text, **options)
self.download_audio(result['audioUrl'], output_path)
return result
# Usage
client = SupernalTTS()
# Generate and get URL
result = client.generate("Hello from Python!", voice='nova')
print(f"Audio URL: {result['audioUrl']}")
print(f"Duration: {result['metadata']['duration']}s")
# Generate and save to file
client.generate_to_file(
"Welcome to Supernal TTS!",
"welcome.mp3",
voice='fable',
provider='openai'
)
# Batch processing
texts = [
"First paragraph of content",
"Second paragraph of content",
"Third paragraph of content"
]
for i, text in enumerate(texts):
print(f"Processing paragraph {i+1}...")
client.generate_to_file(text, f"paragraph_{i+1}.mp3", voice='nova')
print(f"Saved paragraph_{i+1}.mp3")
Discord Bot Integration
const { Client, GatewayIntentBits } = require('discord.js');
const { joinVoiceChannel, createAudioPlayer, createAudioResource } = require('@discordjs/voice');
const fetch = require('node-fetch');
const TTS_API_URL = 'https://tts.supernal.ai/api/v1';
const client = new Client({
intents: [
GatewayIntentBits.Guilds,
GatewayIntentBits.GuildMessages,
GatewayIntentBits.MessageContent,
GatewayIntentBits.GuildVoiceStates
]
});
client.on('messageCreate', async (message) => {
if (message.content.startsWith('!tts ')) {
const text = message.content.slice(5);
try {
// Generate audio
const response = await fetch(`${TTS_API_URL}/generate`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
text,
options: {
provider: 'cartesia', // Low latency for real-time
voice: 'confident-british-man'
}
})
});
const data = await response.json();
// Join voice channel and play
const voiceChannel = message.member.voice.channel;
if (voiceChannel) {
const connection = joinVoiceChannel({
channelId: voiceChannel.id,
guildId: message.guild.id,
adapterCreator: message.guild.voiceAdapterCreator,
});
const player = createAudioPlayer();
// audioUrl may be absolute or a path; new URL() resolves either form
const resource = createAudioResource(new URL(data.audioUrl, TTS_API_URL).href);
player.play(resource);
connection.subscribe(player);
message.reply('Playing audio in voice channel!');
} else {
message.reply('You need to be in a voice channel!');
}
} catch (error) {
console.error('TTS Error:', error);
message.reply('Sorry, I couldn\'t generate that audio.');
}
}
});
client.login('YOUR_BOT_TOKEN');
Batch Processing Script
const fs = require('fs');
const path = require('path');
const fetch = require('node-fetch');
const TTS_API_URL = 'https://tts.supernal.ai/api/v1';
class BatchTTSProcessor {
constructor(options = {}) {
this.apiUrl = options.apiUrl || TTS_API_URL;
this.provider = options.provider || 'openai';
this.voice = options.voice || 'fable';
this.outputDir = options.outputDir || './output';
this.processed = 0;
this.errors = 0;
}
async processTextFiles(inputDir) {
// Create output directory if it doesn't exist
if (!fs.existsSync(this.outputDir)) {
fs.mkdirSync(this.outputDir, { recursive: true });
}
const files = fs.readdirSync(inputDir)
.filter(file => file.endsWith('.txt'));
console.log(`Found ${files.length} text files to process\n`);
for (const file of files) {
await this.processFile(path.join(inputDir, file));
}
console.log(`\nProcessing complete:`);
console.log(`- Processed: ${this.processed}`);
console.log(`- Errors: ${this.errors}`);
}
async processFile(filePath) {
const filename = path.basename(filePath, '.txt');
const outputPath = path.join(this.outputDir, `${filename}.mp3`);
// Skip if already processed
if (fs.existsSync(outputPath)) {
console.log(`⏭️ Skipping ${filename} (already exists)`);
return;
}
try {
const text = fs.readFileSync(filePath, 'utf8');
console.log(`Processing ${filename}... (${text.length} chars)`);
const response = await fetch(`${this.apiUrl}/generate`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
text,
options: {
provider: this.provider,
voice: this.voice,
quality: 'high'
}
})
});
const data = await response.json();
// Download audio file
const audioResponse = await fetch(new URL(data.audioUrl, this.apiUrl).href);
const audioBuffer = await audioResponse.buffer();
fs.writeFileSync(outputPath, audioBuffer);
// The documented response carries duration under metadata and has no cost field
console.log(`✅ Generated ${filename}.mp3 (${data.metadata?.duration}s)`);
this.processed++;
} catch (error) {
console.error(`❌ Failed to process ${filename}:`, error.message);
this.errors++;
}
}
}
// Usage
const processor = new BatchTTSProcessor({
apiUrl: 'https://tts.supernal.ai/api/v1',
provider: 'openai',
voice: 'fable',
outputDir: './audio-output'
});
processor.processTextFiles('./text-input');
Run with:
node batch-processor.js
Common Use Case Patterns
Blog Audio Generation with Caching
class BlogAudioManager {
constructor(apiUrl) {
this.apiUrl = apiUrl;
this.cache = new Map();
}
async generateBlogAudio(blogPost) {
const cacheKey = blogPost.id;
// Check cache first
if (this.cache.has(cacheKey)) {
return this.cache.get(cacheKey);
}
// Extract and clean content
const content = this.extractContent(blogPost);
// Split into manageable chunks (API limits)
const chunks = this.splitIntoChunks(content, 4000);
// Generate audio for each chunk
const audioSegments = await Promise.all(
chunks.map((chunk, index) => this.generateChunkAudio(chunk, index))
);
const result = {
blogId: blogPost.id,
totalDuration: audioSegments.reduce((sum, seg) => sum + (seg.metadata?.duration ?? 0), 0),
segments: audioSegments,
generatedAt: new Date().toISOString()
};
this.cache.set(cacheKey, result);
return result;
}
extractContent(blogPost) {
let content = blogPost.content
.replace(/<[^>]*>/g, '') // Remove HTML tags
.replace(/\s+/g, ' ') // Normalize whitespace
.trim();
content = `${blogPost.title}. By ${blogPost.author}. ${content}`;
return content;
}
splitIntoChunks(text, maxLength) {
  const chunks = [];
  // Keep each sentence together with its own terminating punctuation
  const sentences = text.match(/[^.!?]+[.!?]*/g) || [];
  let currentChunk = '';
  for (const sentence of sentences) {
    // Start a new chunk when adding this sentence would exceed the limit
    if (currentChunk && currentChunk.length + sentence.length > maxLength) {
      chunks.push(currentChunk.trim());
      currentChunk = '';
    }
    currentChunk += sentence;
  }
  if (currentChunk.trim()) {
    chunks.push(currentChunk.trim());
  }
  return chunks;
}
async generateChunkAudio(text, index) {
const response = await fetch(`${this.apiUrl}/api/v1/generate`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
text,
options: {
provider: 'openai',
voice: 'fable',
quality: 'high'
}
})
});
return await response.json();
}
}
E-Learning Platform with Pre-loading
class LessonAudioPlayer {
constructor(apiUrl) {
this.apiUrl = apiUrl;
this.audioCache = new Map();
}
async loadLesson(lessonId, segments) {
console.log(`Loading audio for lesson ${lessonId}...`);
// Pre-generate audio for all segments
await this.preGenerateAudio(segments);
console.log(`Lesson ${lessonId} ready to play`);
}
async preGenerateAudio(segments) {
const promises = segments.map(async (segment) => {
if (this.audioCache.has(segment.id)) return;
try {
const response = await fetch(`${this.apiUrl}/api/v1/generate`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
text: segment.content,
options: {
provider: 'openai',
voice: 'fable',
speed: 0.9 // Slightly slower for learning
}
})
});
const data = await response.json();
this.audioCache.set(segment.id, data.audioUrl);
} catch (error) {
console.error(`Failed to generate audio for segment ${segment.id}`);
}
});
await Promise.all(promises);
}
async playSegment(segmentId) {
const audioUrl = this.audioCache.get(segmentId);
if (!audioUrl) {
throw new Error('Audio not available for this segment');
}
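// audioUrl may be absolute or a path; new URL() resolves either form
const audio = new Audio(new URL(audioUrl, this.apiUrl).href);
return new Promise((resolve, reject) => {
audio.onended = resolve;
// Reject on load or playback failure so callers are not left waiting
audio.onerror = () => reject(new Error('Audio playback failed'));
audio.play().catch(reject);
});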
}
async playFullLesson(segments) {
for (const segment of segments) {
await this.playSegment(segment.id);
// Pause between segments
await new Promise(resolve => setTimeout(resolve, 1000));
}
}
}
Quick Start Commands
# Test with mock provider (no API key needed)
curl -X POST https://tts.supernal.ai/api/v1/generate \
-H "Content-Type: application/json" \
-d '{"text": "Hello world!", "options": {"provider": "mock"}}'
# Test with OpenAI (requires API key on server)
curl -X POST https://tts.supernal.ai/api/v1/generate \
-H "Content-Type: application/json" \
-d '{
"text": "Welcome to Supernal TTS!",
"options": {
"provider": "openai",
"voice": "nova"
}
}'
# Get audio file
curl https://tts.supernal.ai/api/v1/audio/{hash} -o output.mp3
# Check API health
curl https://tts.supernal.ai/health
# List available providers
curl https://tts.supernal.ai/api/v1/providers
Local Development Setup
# Clone repository
git clone https://github.com/supernalintelligence/supernal-nova.git
cd supernal-nova/families/supernal-tts
# Install dependencies
npm install
# Set up environment variables
cp .env.example .env
# Edit .env and add your API keys:
# OPENAI_API_KEY=sk-...
# CARTESIA_API_KEY=...
# AZURE_API_KEY=...
# Start server
npm run dev
# Server will be available at:
# http://localhost:3030
# Test the server
curl -X POST http://localhost:3030/api/v1/generate \
-H "Content-Type: application/json" \
-d '{"text": "Test", "options": {"provider": "mock"}}'
Important Notes for LLMs
- Always handle errors gracefully: API calls can fail for various reasons (rate limits, provider issues, network errors)
- Use caching: The API automatically caches identical text+voice combinations, reducing costs
- Provider selection:
  - Use mock for testing without API keys
  - Use openai for best quality (requires API key)
  - Use cartesia for lowest latency (requires API key)
  - Use azure for a budget-friendly option (requires API key)
- Text length limits: Different providers have different limits. For long text, split into chunks
- Voice availability: Not all voices are available for all providers. Check the /api/v1/providers endpoint
- Cost estimation: Use the /api/v1/estimate endpoint before generating to calculate costs
- Rate limiting: Respect rate limits (100 requests per minute by default)
- Audio format: Default is MP3, but WAV and OGG are also supported depending on provider
- Speed range: Valid speed range is 0.25 to 4.0 (provider-dependent)
- Browser compatibility: Audio playback works in all modern browsers. Use the HTML5 <audio> element
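To stay under the rate limit mentioned in the notes above, a client can track its own recent requests. The following is a minimal sliding-window sketch: `createRateLimiter` is a hypothetical helper, the default of 100 requests per 60 seconds mirrors the figure quoted above, and the server's actual accounting may differ.

```javascript
// Hypothetical sliding-window limiter; the 100-per-minute default mirrors
// the rate limit quoted in the notes above.
function createRateLimiter({ limit = 100, windowMs = 60_000, now = Date.now } = {}) {
  const timestamps = [];
  return {
    // Returns 0 if a request may be sent now, otherwise the wait in milliseconds.
    msUntilAllowed() {
      const t = now();
      // Drop requests that have aged out of the window.
      while (timestamps.length && t - timestamps[0] >= windowMs) timestamps.shift();
      return timestamps.length < limit ? 0 : windowMs - (t - timestamps[0]);
    },
    // Records a request that is about to be sent.
    record() {
      timestamps.push(now());
    },
  };
}
```

Before each call to the generate endpoint, a client would check `msUntilAllowed()`, wait that long if it is non-zero, then call `record()` and send the request. The `now` parameter exists only so the clock can be replaced in tests.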