silk muga 1
Ultra-low-latency streaming TTS. Steer tone with a simple tag like [happy] at the start of your text; perfect for real-time voice agents.
silk mulberry 1.5
Expressive instruct-TTS. Describe the voice you want in natural language, or pick a preset studio speaker and tune its pitch.
Start here
Quickstart
Get an API key and synthesize your first audio clip in three steps.
Prompting Guide
Learn how to steer muga with tone tags and mulberry with voice descriptions.
Real-Time Streaming
Stream low-latency PCM audio over WebSocket for live playback.
Pipecat Integration
Drop rumik into a Pipecat voice-agent pipeline in minutes.
Audio format
Every response from silk is 24 kHz, mono, signed 16-bit PCM. The HTTP endpoint wraps it in a WAV container (audio/wav); the WebSocket stream delivers raw PCM chunks so you can start playing before the full audio is ready.
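The format above is enough to work with streamed audio without any extra metadata. As a minimal sketch, the Python below computes the playback duration from a byte count and wraps raw PCM chunks in a WAV container using only the standard library. The function names are hypothetical and not part of any silk SDK, the chunks are assumed to be byte strings you have already received from the stream, and the samples are assumed to be little-endian, as WAV expects.

```python
import io
import wave

# Format stated in this section: 24 kHz, mono, signed 16-bit PCM.
SAMPLE_RATE_HZ = 24_000
CHANNELS = 1
SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample


def pcm_duration_seconds(num_bytes: int) -> float:
    """Return the playback duration of num_bytes of raw PCM in this format."""
    bytes_per_second = SAMPLE_RATE_HZ * CHANNELS * SAMPLE_WIDTH_BYTES
    return num_bytes / bytes_per_second


def pcm_chunks_to_wav(chunks) -> bytes:
    """Wrap an iterable of raw PCM byte chunks in a WAV container.

    Hypothetical helper: assumes the chunks are little-endian samples
    in the format described above, already collected from the stream.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        for chunk in chunks:
            wav_file.writeframes(chunk)
    return buffer.getvalue()
```

At this format, 48,000 bytes of PCM correspond to one second of audio, so a player can estimate how much audio it has buffered from the number of bytes received so far.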