what you’ll build
a real-time loop running inside a livekit room: you speak, the agent transcribes you, an LLM writes a reply, and rumik speaks it back.
before you start
you need:
- python 3.10+ and a terminal.
- a free livekit cloud project from cloud.livekit.io (gives you a URL, API key, and secret).
- three more API keys:
  - rumik for the voice, from your dashboard.
  - deepgram for speech-to-text (deepgram.com).
  - openai for the LLM (platform.openai.com).
you can swap deepgram or openai for any STT / LLM that livekit supports. we use
these two because they’re quick to set up.
step 1 · set up the project
make a folder, create a virtual environment, and install the packages. livekit-plugins-rumik-ai is the official rumik TTS plugin.
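the install commands themselves are not reproduced here. as a rough sketch, the check below confirms the python version and reports any package that is still missing. only livekit-plugins-rumik-ai is named in this guide; the other distribution names, python-dotenv, and the commands in the comments are assumptions based on a typical livekit agents setup.

```python
import sys
from importlib import metadata

# Assumed setup commands (not taken from this guide):
#   python -m venv .venv
#   source .venv/bin/activate
#   pip install livekit-agents livekit-plugins-deepgram \
#       livekit-plugins-openai livekit-plugins-rumik-ai python-dotenv

# Assumed distribution names; only livekit-plugins-rumik-ai is named in the guide.
REQUIRED = [
    "livekit-agents",
    "livekit-plugins-deepgram",
    "livekit-plugins-openai",
    "livekit-plugins-rumik-ai",
    "python-dotenv",
]


def missing_packages(names):
    """Return the distributions from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    if sys.version_info < (3, 10):
        print("python 3.10+ is required")
    for name in missing_packages(REQUIRED):
        print(f"missing: {name}")
```

run it inside the virtual environment. no output means the interpreter and all listed packages were found.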
step 2 · add your keys
create a file called .env:
.env
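the file's contents are not reproduced here. as a sketch, the script below shows the layout the file is assumed to have and checks that every key has a value. RUMIK_API_KEY is the only name this guide confirms; the livekit, deepgram, and openai names are assumptions based on the defaults those libraries usually read, and every value is a placeholder.

```python
from pathlib import Path

# Assumed .env layout with placeholder values.
EXAMPLE_ENV = """\
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your-livekit-api-key
LIVEKIT_API_SECRET=your-livekit-api-secret
RUMIK_API_KEY=your-rumik-api-key
DEEPGRAM_API_KEY=your-deepgram-api-key
OPENAI_API_KEY=your-openai-api-key
"""

REQUIRED_KEYS = [
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "RUMIK_API_KEY",
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
]


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text):
    """Return the required keys that are absent or empty in `text`."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    path = Path(".env")
    text = path.read_text() if path.exists() else ""
    for key in missing_keys(text):
        print(f"missing in .env: {key}")
```

keep .env out of version control, since it holds secrets.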
step 3 · write the agent
create agent.py. livekit wires the four steps together in an AgentSession. the
rumik part is the tts line.
agent.py
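the original listing is not reproduced here, so what follows is a minimal sketch rather than the official example. it assumes the livekit agents 1.x API, python-dotenv for loading .env, the import path livekit.plugins.rumik_ai, and that rumik_ai.TTS() with no arguments selects the default voice. the model name and the prompt text are placeholders.

```python
# agent.py: a minimal sketch under the assumptions stated above.
from dotenv import load_dotenv  # assumption: python-dotenv loads the .env file
from livekit import agents
from livekit.agents import Agent, AgentSession
from livekit.plugins import deepgram, openai
from livekit.plugins import rumik_ai  # assumed import path for livekit-plugins-rumik-ai

load_dotenv()

# Placeholder system prompt. Add the [tone] tag rules from "prompting muga"
# here; they are not reproduced in this sketch.
INSTRUCTIONS = (
    "you are a friendly voice assistant. "
    "keep replies short and conversational."
)


async def entrypoint(ctx: agents.JobContext) -> None:
    # Join the livekit room assigned to this job.
    await ctx.connect()

    session = AgentSession(
        stt=deepgram.STT(),                   # speech-to-text
        llm=openai.LLM(model="gpt-4o-mini"),  # example model name
        tts=rumik_ai.TTS(),                   # the rumik part
    )

    # Start the listen, transcribe, reply, speak loop in the room.
    await session.start(room=ctx.room, agent=Agent(instructions=INSTRUCTIONS))


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
```

with the livekit agents CLI, a file like this is usually started with python agent.py dev. that subcommand comes from livekit's tooling and is not confirmed by this guide.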
rumik_ai.TTS reads RUMIK_API_KEY from the environment automatically. the system
prompt makes the LLM emit muga’s [tone] tags. the full rules are in
prompting muga.
step 4 · run it
start the agent worker by running agent.py from your terminal.
customize it
change the voice
use rumik_ai.TTS(model="mulberry", description="...") to design any voice. see
prompting mulberry.
pin a preset voice
rumik_ai.TTS(model="mulberry", speaker="speaker_1") keeps one fixed voice
across the conversation.
tune the personality
edit the INSTRUCTIONS system prompt. that’s the agent’s character.
ship to phone or web
livekit handles telephony and web/mobile SDKs from the same agent.
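the two voice options above, a designed voice via description and a fixed one via speaker, can be sketched as a small helper that builds the keyword arguments for rumik_ai.TTS. voice_kwargs is a hypothetical name, and treating the two parameters as mutually exclusive is an assumption of this sketch, not a documented rule.

```python
def voice_kwargs(description=None, speaker=None):
    """Build keyword arguments for rumik_ai.TTS(...) using the mulberry model.

    Hypothetical helper: pass a free-text `description` to design a voice,
    or a preset `speaker` id to pin one. Exactly one must be given.
    """
    if (description is None) == (speaker is None):
        raise ValueError("pass exactly one of description or speaker")
    kwargs = {"model": "mulberry"}
    if description is not None:
        kwargs["description"] = description
    else:
        kwargs["speaker"] = speaker
    return kwargs


# A designed voice (the description text is only an example).
designed = voice_kwargs(description="a calm, warm narrator")

# A pinned preset voice, using the speaker id shown above.
pinned = voice_kwargs(speaker="speaker_1")
```

in agent.py the result would be unpacked into the constructor, for example tts=rumik_ai.TTS(**pinned).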
next steps
- livekit integration reference for every constructor option.
- prompting muga and prompting mulberry.
- prefer pipecat? build the same agent with pipecat.