what you’ll build
a real-time loop: you speak, the agent transcribes you, an LLM writes a reply, and rumik speaks it back.
before you start
you need:
- python 3.10+ and a terminal.
- three API keys:
  - rumik for the voice, from your dashboard.
  - deepgram for speech-to-text (deepgram.com).
  - openai for the LLM (platform.openai.com).
you can swap deepgram or openai for any STT / LLM that pipecat supports. we use
these two because they’re quick to set up.
step 1 · set up the project
make a folder, create a virtual environment, and install the packages.
pipecat-rumik is the official rumik TTS service. the rest is pipecat plus the
STT and LLM plugins.
step 2 · add your keys
create a file called .env in the folder:
.env
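a missing or blank key is the most common reason a first run fails, so it can help to check all three before starting the agent. the sketch below is one way to do that. the variable names RUMIK_API_KEY, DEEPGRAM_API_KEY and OPENAI_API_KEY are assumptions, not confirmed by this guide, and missing_keys is a hypothetical helper; use whatever names your .env and agent.py actually share.

```python
import os

# assumed variable names; match them to the names used in your own .env file.
REQUIRED_KEYS = ("RUMIK_API_KEY", "DEEPGRAM_API_KEY", "OPENAI_API_KEY")


def missing_keys(env=None):
    """Return the required key names that are unset or blank in env.

    env defaults to the current process environment; pass a dict to check
    some other mapping instead.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]
```

this only reads the process environment, so call it after the .env file has been loaded (for example with load_dotenv() from python-dotenv) and stop early if the returned list is not empty.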
step 3 · write the agent
create agent.py. this builds the four-step loop. the rumik part is the tts
line.
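to make the order of the loop concrete, here is a toy sketch in plain python. it is not agent.py and uses no pipecat or rumik code: transcribe, write_reply and speak are hypothetical stand-ins for the STT, LLM and TTS services, and the "audio" is just text encoded as bytes.

```python
import asyncio


async def transcribe(audio: bytes) -> str:
    """STT stand-in: pretend the incoming audio bytes are UTF-8 text."""
    return audio.decode("utf-8")


async def write_reply(transcript: str) -> str:
    """LLM stand-in: echo the transcript back as the reply."""
    return f"you said: {transcript}"


async def speak(reply: str) -> bytes:
    """TTS stand-in: pretend the reply text is the synthesized audio."""
    return reply.encode("utf-8")


async def one_turn(audio_in: bytes) -> bytes:
    """Run one pass of the loop: hear, transcribe, reply, speak."""
    transcript = await transcribe(audio_in)  # speech-to-text
    reply = await write_reply(transcript)    # the LLM writes a reply
    return await speak(reply)                # text-to-speech
```

in the real agent each stand-in is a service object, and the last one is where the rumik TTS service goes. running asyncio.run(one_turn(b"hello")) walks through a single turn.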
agent.py
the surrounding pieces (the transport that carries audio, the system prompt, the
context aggregator) come straight from the pipecat quickstart. the
pipecat-rumik examples ship a
complete, runnable agent you can copy.
step 4 · make the LLM speak muga’s language
muga is steered by a [tone] tag at the start of each reply. tell your LLM to add
one. paste this into the LLM’s system prompt:
the LLM then writes replies like [happy] Haan ji, ho gaya! and rumik speaks them
with the right emotion. the full prompt rules are in prompting muga.
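while you tune the prompt, it can be useful to check that the LLM really starts each reply with a tag. the sketch below assumes the format shown in the example above: one tag in square brackets, then the spoken text. split_tone is a hypothetical helper, not part of pipecat or pipecat-rumik, and it does not check that the tone is one muga supports.

```python
import re

# a leading tag such as "[happy]", then the text to be spoken.
_TONE_RE = re.compile(r"^\s*\[([^\[\]]+)\]\s*(.*)$", re.DOTALL)


def split_tone(reply):
    """Return (tone, text); tone is None when the reply has no leading tag."""
    match = _TONE_RE.match(reply)
    if match is None:
        return None, reply.strip()
    return match.group(1).strip(), match.group(2).strip()
```

a reply that comes back with tone set to None is a sign that the system prompt needs tightening.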
step 5 · run it
customize it
change the voice
switch to model="mulberry" and add a description to design any voice. see
prompting mulberry.
tune the personality
edit the LLM system prompt. that’s the agent’s character.
swap STT or LLM
pipecat supports many providers. change the stt or llm line.
let an agent do it
hand the rumik TTS skill to your coding agent.
next steps
- pipecat integration reference for every setting and both transports.
- prompting muga and prompting mulberry.
- prefer livekit? build the same agent with livekit.