build a voice agent with pipecat

drop rumik into a pipecat pipeline with pipecat-rumik.

building a real-time voice bot? skip the WebSocket plumbing and use pipecat-rumik, our official pipecat TTS service. it drops rumik straight into a pipecat pipeline with streaming audio, metrics, and interruption handling already wired up.

```bash
pip install pipecat-rumik
```

pick a service

pipecat-rumik ships two TTS services. pick the one that matches your transport:

| service | transport | best for |
| --- | --- | --- |
| `RumikTTSService` | WebSocket | interactive voice agents that need interruption-aware, streaming TTS. |
| `RumikHttpTTSService` | HTTP | simpler request/response synthesis and batch-style flows. |

add it to a pipeline

settings map to the same request fields as the REST API.

```python
import os
from pipecat_rumik import RumikTTSService

# two alternative configurations are shown below; keep the one you need.

# muga: low-latency streaming voice
tts = RumikTTSService(
    api_key=os.environ["RUMIK_API_KEY"],
    gateway_url=os.environ["RUMIK_GATEWAY_URL"],
    settings=RumikTTSService.Settings(model="muga"),
)

# mulberry: expressive voice, steered by a description + preset speaker
tts = RumikTTSService(
    api_key=os.environ["RUMIK_API_KEY"],
    gateway_url=os.environ["RUMIK_GATEWAY_URL"],
    settings=RumikTTSService.Settings(
        model="mulberry",
        voice="speaker_1",          # sent to Rumik as "speaker"
        description="warm, friendly Indian narrator with clear Hinglish diction",
        f0_up_key=3,                # pitch shift in semitones
    ),
)

# then drop `tts` into your pipecat Pipeline alongside your STT + LLM services.
```
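to make "drop `tts` into your pipeline" concrete, here is a minimal sketch of the usual processor order for a voice agent. `build_processors` and `build_pipeline` are hypothetical helpers, not part of pipecat-rumik, and `transport`, `stt`, and `llm` stand for whatever pipecat transport and services you already use. a production pipeline typically also includes LLM context aggregators, which are left out here for brevity.

```python
def build_processors(transport, stt, llm, tts):
    """Return processors in the order frames flow through a voice agent."""
    return [
        transport.input(),   # audio frames coming from the user
        stt,                 # speech -> text
        llm,                 # text -> response text
        tts,                 # response text -> audio (the rumik TTS service)
        transport.output(),  # audio frames going back to the user
    ]


def build_pipeline(transport, stt, llm, tts):
    """Wrap the ordered processors in a pipecat Pipeline."""
    # imported lazily so this sketch can be read and imported without pipecat installed
    from pipecat.pipeline.pipeline import Pipeline

    return Pipeline(build_processors(transport, stt, llm, tts))
```

the TTS service sits after the LLM and before the transport output, so synthesized audio is what reaches the user.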

set RUMIK_API_KEY to a key from your dashboard and RUMIK_GATEWAY_URL to https://silk-api.rumik.ai. see the PyPI package for runnable voice-agent examples, settings, and event handlers.
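a minimal sketch of reading both variables up front, so a missing key fails with a readable message instead of a bare `KeyError`. `load_rumik_env` and `RumikEnv` are hypothetical names, not part of pipecat-rumik, and falling back to the gateway URL above when `RUMIK_GATEWAY_URL` is unset is an assumption made for convenience.

```python
import os
from typing import Mapping, NamedTuple

DEFAULT_GATEWAY_URL = "https://silk-api.rumik.ai"  # the gateway URL given above


class RumikEnv(NamedTuple):
    api_key: str
    gateway_url: str


def load_rumik_env(environ: Mapping[str, str] = os.environ) -> RumikEnv:
    """Read rumik credentials from the environment, failing early if the key is missing."""
    api_key = environ.get("RUMIK_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("RUMIK_API_KEY is not set; create a key in your dashboard")
    # assumption: use the documented gateway when the variable is unset or empty
    gateway_url = environ.get("RUMIK_GATEWAY_URL", "").strip() or DEFAULT_GATEWAY_URL
    return RumikEnv(api_key=api_key, gateway_url=gateway_url)
```

the result can then be passed along as `api_key=env.api_key, gateway_url=env.gateway_url` when constructing the service.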

settings

RumikTTSSettings is shared by RumikTTSService.Settings and RumikHttpTTSService.Settings. it maps to rumik request fields:

| setting | request field | notes |
| --- | --- | --- |
| `model` | `model` | `muga` or `mulberry`. |
| `voice` | `speaker` | preset speaker voice, e.g. `speaker_1`. |
| `description` | `description` | expressive voice description for `mulberry`. |
| `f0_up_key` | `f0_up_key` | pitch shift in semitones for preset speaker voices. |
| `temperature` | `temperature` | optional sampling temperature. |
| `top_p` | `top_p` | optional nucleus sampling value. |
| `top_k` | `top_k` | optional top-k sampling value. |
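the only rename in the table is `voice`, which is sent as `speaker`. the sketch below illustrates that mapping with plain dictionaries. it is not the package's internal implementation: `to_request_fields` is a hypothetical helper, and dropping unset (`None`) values is an assumption.

```python
from typing import Any, Dict

# setting name -> request field, copied from the table above
SETTING_TO_REQUEST_FIELD = {
    "model": "model",
    "voice": "speaker",
    "description": "description",
    "f0_up_key": "f0_up_key",
    "temperature": "temperature",
    "top_p": "top_p",
    "top_k": "top_k",
}


def to_request_fields(settings: Dict[str, Any]) -> Dict[str, Any]:
    """Rename settings to request fields, skipping values that are None."""
    unknown = set(settings) - set(SETTING_TO_REQUEST_FIELD)
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    return {
        SETTING_TO_REQUEST_FIELD[name]: value
        for name, value in settings.items()
        if value is not None
    }
```

for example, `to_request_fields({"model": "mulberry", "voice": "speaker_1"})` yields a payload whose speaker is carried under the `speaker` key.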

next: hand your coding agent the rumik TTS skill so it integrates all of this correctly on the first try.