Outbound lead qualification means your system places the call, not the customer. You feed a list of contacts to the Sautikit API, an AI voice agent dials each one, asks a short set of qualifying questions in natural language, and records the answers. When a lead is hot, the same call is warm-transferred to a live rep so the human picks up a conversation already in progress.
The whole thing runs over a real-time audio Stream. Sautikit forks the live call audio to your WebSocket, your bridge relays it to an LLM (Google Gemini Live, OpenAI, or a self-hosted model) that runs your script, and the model's spoken replies flow back down the same socket. You own the prompt, the scoring logic, and the transfer rules.
The outbound real-time loop:
1. POST /v1/calls with to and from. Sautikit dials the lead.
2. When the lead answers, Sautikit fetches the routing_url on the from number, or a call-level answer URL, which returns XML <Response> containing a <Stream>.
3. Sautikit opens a WebSocket to the url in your <Stream>. Your server must advertise the audio.drachtio.org subprotocol during the handshake or the connection is rejected.
4. For a hot lead, return <Dial> to transfer the call to a live rep. Cold leads get a polite close and a logged CDR.

Endpoints you call:
- POST /v1/calls: place the outbound call with to, from, and an optional clientRequestId for idempotency and tracking.
- PATCH /v1/numbers/{number_id}: set the routing_url on your from number so Sautikit knows which webhook to fetch on answer.
- GET /v1/calls/{call_sid}: fetch the call detail record (duration, status, timestamps) after the call ends.

Voice actions used:
- Stream: fork live call audio to your WebSocket for real-time AI. Available via the XML form today.
- Dial: warm-transfer a hot lead to a live rep or external number.
- Say: text-to-speech for a fixed intro or fallback message.

curl -X POST "https://api.sautikit.com/v1/calls" \
  -H "Authorization: Bearer $SAUTIKIT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"to":["+254712345678"],"from":"+254711000001","clientRequestId":"lead-42"}'

Sautikit dials +254712345678. When the lead answers, it fetches the routing_url configured on +254711000001.
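The same request can be issued once per lead from Node. The following is a minimal sketch, assuming Node 18 or later (for the global fetch). The helper names buildCallRequest and dialLeads and the { id, phone } lead shape are hypothetical; only the endpoint, the headers, and the to, from, and clientRequestId fields come from the curl example above.

```javascript
const API_BASE = "https://api.sautikit.com/v1";

// Build the fetch() arguments for one outbound call.
// `lead` is a hypothetical shape: { id, phone }.
function buildCallRequest(lead, from, apiKey) {
  return {
    url: `${API_BASE}/calls`,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        to: [lead.phone],
        from,
        // A stable per-lead id lets a retried request be matched to the same lead.
        clientRequestId: `lead-${lead.id}`,
      }),
    },
  };
}

// Dial leads one at a time and record failures instead of aborting the batch.
// `fetchImpl` is injectable so the loop can be exercised without network access.
async function dialLeads(leads, from, apiKey, fetchImpl = fetch) {
  const results = [];
  for (const lead of leads) {
    const { url, options } = buildCallRequest(lead, from, apiKey);
    try {
      const res = await fetchImpl(url, options);
      results.push({ lead: lead.id, ok: res.ok, status: res.status });
    } catch (err) {
      results.push({ lead: lead.id, ok: false, error: String(err) });
    }
  }
  return results;
}
```

Dialing sequentially keeps the sketch simple. A production dialer would likely add pacing and concurrency limits, which are not covered here.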
Your answer webhook responds with an application/xml body that opens the audio stream:
<Response>
  <Stream
    name="lead-qualifier"
    url="wss://your-app.example.com/audio"
    track="both_tracks"
    outputSamplingRate="16000"
    statusCallback="https://your-app.example.com/stream-status"
    statusEvents="stream-started stream-stopped stream-error" />
</Response>

track="both_tracks" forks both the caller and callee legs. outputSamplingRate="16000" is the rate AI models expect. Audio frames are 16-bit little-endian PCM.
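One way to serve that body from Node is sketched below. The function names escapeAttr, buildStreamResponse, and answerWebhook are hypothetical, and the handler ignores the request method because the source does not say whether Sautikit fetches the webhook with GET or POST. The attribute names and values are copied from the XML above.

```javascript
// Escape characters that are not allowed inside an XML attribute value.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render <Response><Stream ... /></Response> from a plain object of attributes.
function buildStreamResponse(attrs) {
  const rendered = Object.entries(attrs)
    .map(([key, value]) => `${key}="${escapeAttr(value)}"`)
    .join(" ");
  return `<Response><Stream ${rendered} /></Response>`;
}

// Request handler with the (req, res) shape used by node:http's createServer.
function answerWebhook(req, res) {
  const xml = buildStreamResponse({
    name: "lead-qualifier",
    url: "wss://your-app.example.com/audio",
    track: "both_tracks",
    outputSamplingRate: 16000,
    statusCallback: "https://your-app.example.com/stream-status",
    statusEvents: "stream-started stream-stopped stream-error",
  });
  res.writeHead(200, { "Content-Type": "application/xml" });
  res.end(xml);
}
```

Passing answerWebhook to http.createServer and pointing the routing_url at that server would complete the loop. Building the XML from an object, rather than concatenating strings by hand, keeps values such as URLs with query strings correctly escaped.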
import { WebSocketServer } from "ws";

// connectToLLM, logAnswer, warmTransfer, and closePolitely are your own
// application helpers; they are not shown here.

// Advertise the required subprotocol or Sautikit rejects the handshake.
const wss = new WebSocketServer({
  port: 8080,
  handleProtocols: () => "audio.drachtio.org",
});

const SCRIPT = [
  "Hi, this is Zawadi from Acme. Is now a good time for a couple of quick questions?",
  "Are you the person who handles buying decisions for your team?",
  "Roughly how many seats would you need?",
  "What is your timeline to get started?",
];

wss.on("connection", (socket) => {
  const llm = connectToLLM(); // Gemini Live / OpenAI / self-hosted
  let step = 0;
  llm.ask(SCRIPT[step]); // speak the first question

  // Live PCM from the call -> your LLM for transcription + turn detection
  socket.on("message", (pcmFrame) => llm.pushAudio(pcmFrame));

  // LLM audio replies -> back down the same socket to the caller
  llm.on("audio", (pcmFrame) => socket.send(pcmFrame));

  // When the model finishes an answer, advance the script or route the lead
  llm.on("answer", ({ text, hot }) => {
    logAnswer(step, text);
    if (hot) return warmTransfer(socket); // return <Dial> to a live rep
    if (++step < SCRIPT.length) llm.ask(SCRIPT[step]);
    else closePolitely(socket);
  });
});

warmTransfer ends the stream and hands the in-progress call to your flow's <Dial> step, so a human rep joins the same live call.
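The hot flag in the bridge above has to come from your own scoring logic. The sketch below shows one possible rule set, assuming the answers have already been parsed into structured fields. The scoreLead name, the answers shape, and the thresholds in HOT_RULES are all hypothetical and only illustrate the idea.

```javascript
// Hypothetical thresholds; tune them to your own sales process.
const HOT_RULES = { minSeats: 10, maxTimelineDays: 90 };

// `answers` is an assumed, already-parsed shape:
// { decisionMaker: boolean, seats: number, timelineDays: number }
function scoreLead(answers, rules = HOT_RULES) {
  const reasons = [];
  if (!answers.decisionMaker) reasons.push("not the decision maker");
  // The negated comparisons also catch missing (undefined) answers.
  if (!(answers.seats >= rules.minSeats)) reasons.push("too few seats");
  if (!(answers.timelineDays <= rules.maxTimelineDays)) reasons.push("timeline too long");
  return { hot: reasons.length === 0, reasons };
}
```

Returning the reasons alongside the flag makes it easy to log why a lead was closed politely instead of transferred.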
Outbound calls are billed per second in KES on the connected leg — you pay for the time the lead is actually on the line, from answer to hangup. A 90-second qualification call bills 90 seconds; there is no separate charge for running the Stream or for the WebSocket round-trips.
LLM inference runs on your own provider (Gemini, OpenAI, or self-hosted), so those tokens or minutes are billed by that provider, not by Sautikit. When a hot lead is warm-transferred, the connected rep leg continues to bill per second for its duration.
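The per-second billing described above can be estimated with simple arithmetic. This is an illustrative sketch only: estimateCallCostKes is a hypothetical helper, the rates are placeholders to be replaced with figures from your own price list, and it assumes the rep leg may be priced differently from the lead leg. LLM provider charges are excluded because they are billed separately.

```javascript
// Estimate the Sautikit charge for one call, in KES.
// leadSeconds: connected time on the lead leg, from answer to hangup.
// repSeconds: connected time on the transferred rep leg (0 for cold leads).
// Rates are placeholders; repRateKes defaults to the lead rate.
function estimateCallCostKes({
  leadSeconds,
  repSeconds = 0,
  leadRateKes,
  repRateKes = leadRateKes,
}) {
  const leadCostKes = leadSeconds * leadRateKes;
  const repCostKes = repSeconds * repRateKes;
  return {
    billedSeconds: leadSeconds + repSeconds,
    leadCostKes,
    repCostKes,
    totalKes: leadCostKes + repCostKes,
  };
}
```

For a cold lead that stays on the line for 90 seconds, billedSeconds is 90 and repCostKes is 0. For a transferred lead, both legs contribute.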
Because these are outbound calls, responsible use matters: only dial opted-in contacts, disclose recording and AI assistance where required, and keep to lawful calling hours in each market.
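A dialer can enforce the opt-in and calling-hours points before a number is ever queued. The sketch below is one way to do that, using the standard Intl.DateTimeFormat API to read the local hour in the contact's time zone. The mayDial and localHour names, the contact shape, and the 09:00 to 18:00 window are placeholders, not legal guidance; the actual permitted hours depend on each market.

```javascript
// Return the local hour (0-23) for `date` in an IANA time zone.
function localHour(date, timeZone) {
  const parts = new Intl.DateTimeFormat("en-GB", {
    timeZone,
    hour: "2-digit",
    hourCycle: "h23",
  }).formatToParts(date);
  return Number(parts.find((p) => p.type === "hour").value);
}

// Placeholder policy: opted-in contacts only, dialed from startHour (inclusive)
// to endHour (exclusive) in the contact's local time.
// `contact` is an assumed shape: { optedIn: boolean, timeZone: string }.
function mayDial(contact, now, policy = { startHour: 9, endHour: 18 }) {
  if (!contact.optedIn) return false;
  const hour = localHour(now, contact.timeZone);
  return hour >= policy.startHour && hour < policy.endHour;
}
```

Filtering the contact list with mayDial before calling POST /v1/calls keeps the compliance check in one place.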
See also: <Stream> attribute reference and status events.