Stay compliant with call recording: capture every call to your own storage

Call recording on Sautikit is triggered by the Record voice action. When a call leg hits a Record action, the PBX captures audio for the duration specified (or until the caller hangs up or presses a finish key). When recording stops, Sautikit POSTs the recording metadata, including a URL where the audio file can be downloaded, to your action callback URL. From there, your server fetches the file and stores it wherever your compliance or review workflow requires.

Financial services companies that must record customer calls for regulatory compliance (e.g., under local financial services rules or MiFID-equivalent frameworks).
Support and sales teams that record calls for quality assurance and agent coaching.
Applications that offer call recordings as a product feature for their end users.
Legal and dispute-resolution workflows where a verbatim record of a conversation is required.
Any team building audit trails for voice interactions.

Recording laws vary by jurisdiction. Before enabling recording, confirm that your use case complies with applicable regulations in both the caller's and recipient's countries, and that you obtain any required consent, commonly via an announcement at the start of the call.

Recording lifecycle

Pattern A: Record the full call without agent interaction. Use Dial with the record parameter set to "record-from-answer". This records the bridged two-party conversation from the moment the callee answers, without needing a separate Record action.

Pattern B: Record a voicemail or one-party monologue. Use the standalone Record action after a Say prompt. This is useful for voicemail boxes, recorded consent captures, or customer feedback.

Set transcribe: true on the Record action to request speech-to-text. When the transcript is ready, Sautikit POSTs it to your transcribeCallback URL as a separate event. Transcription runs asynchronously and typically arrives within seconds to a few minutes depending on recording length.
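Because the transcript arrives as a separate event, the recording callback and the transcription callback can reach your server in either order. A minimal sketch of merging both into one record, assuming both payloads carry the same CallId (as the handlers in this guide do). The in-memory Map and the function names are illustrative, not part of Sautikit; a real service would use a database.

```javascript
// In-memory store keyed by call ID (illustrative; use a database in production).
const voicemails = new Map();

// Merge a partial update into the record for this call, creating it if needed.
function upsertVoicemail(callId, fields) {
  const existing = voicemails.get(callId) ?? { callId };
  const merged = { ...existing, ...fields };
  voicemails.set(callId, merged);
  return merged;
}

// Call this from the recording callback handler with req.body.
function onRecording({ CallId, RecordingSid, RecordingDuration }) {
  return upsertVoicemail(CallId, {
    recordingSid: RecordingSid,
    durationSeconds: Number(RecordingDuration),
  });
}

// Call this from the transcribeCallback handler with req.body.
function onTranscript({ CallId, TranscriptionStatus, TranscriptionText }) {
  return upsertVoicemail(CallId, {
    transcriptStatus: TranscriptionStatus,
    transcript: TranscriptionStatus === "completed" ? TranscriptionText : null,
  });
}
```

Since each handler only writes its own fields, the record ends up complete regardless of which event arrives first.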

Endpoints you call:

POST /v1/calls: initiate a call where you will trigger recording.
GET /v1/calls/{call_sid}: retrieve call metadata after the session ends.
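As a rough illustration of the second endpoint, the sketch below fetches call metadata from Node. It assumes the same base URL and Bearer authentication as the curl example in this guide. The function names are hypothetical, and the shape of the response body is not specified here, so the parsed JSON is returned as-is.

```javascript
const SAUTIKIT_BASE_URL = "https://api.sautikit.com";

// Build the URL and fetch options for GET /v1/calls/{call_sid}.
function buildGetCallRequest(callSid, apiKey) {
  return {
    url: `${SAUTIKIT_BASE_URL}/v1/calls/${encodeURIComponent(callSid)}`,
    options: {
      method: "GET",
      headers: { Authorization: `Bearer ${apiKey}` },
    },
  };
}

// Fetch call metadata; throws on any non-2xx response.
// fetchImpl defaults to the global fetch available in Node 18+.
async function getCall(callSid, apiKey, fetchImpl = fetch) {
  const { url, options } = buildGetCallRequest(callSid, apiKey);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`GET ${url} failed with status ${response.status}`);
  }
  return response.json();
}
```

A call such as `await getCall(callSid, process.env.SAUTIKIT_API_KEY)` would typically run after the session ends, for example from the recording callback.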

Voice actions used:

Record: capture audio for the current call leg.
Say: play a consent or notification message before recording starts.
Dial: connect to another party; includes a record field for bridged recording.
Conference: conferences support record: true for whole-room recording.
Hangup: end the call after recording, e.g., voicemail flow.

import express from "express";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
 
const app = express();
app.use(express.json());
app.use(express.urlencoded({ extended: true }));
 
const s3 = new S3Client({ region: process.env.AWS_REGION });
 
// Sautikit calls this when the caller reaches the voicemail box
app.post("/voicemail/prompt", (req, res) => {
  res.json({
    actions: [
      {
        say: {
          text: "You have reached the voicemail of Acme Corp. Please leave a message after the tone. Press the hash key when done.",
          language: "en-US",
        },
      },
      {
        record: {
          action: "https://yourapp.example.com/voicemail/recording",
          method: "POST",
          timeout: 10,          // stop after 10 s of silence
          maxLength: 120,       // cap at 2 minutes
          finishOnKey: "#",
          transcribe: true,
          transcribeCallback: "https://yourapp.example.com/voicemail/transcript",
        },
      },
    ],
  });
});
 
// Sautikit posts recording metadata here when recording ends
app.post("/voicemail/recording", async (req, res) => {
  const { RecordingUrl, RecordingDuration, CallId, RecordingSid } = req.body;

  try {
    // Download the recording from Sautikit before the short-lived link expires
    const audioResponse = await fetch(RecordingUrl);
    if (!audioResponse.ok) {
      throw new Error(`download failed with HTTP ${audioResponse.status}`);
    }
    const audioBuffer = Buffer.from(await audioResponse.arrayBuffer());

    // Upload to S3
    await s3.send(new PutObjectCommand({
      Bucket: process.env.S3_BUCKET,
      Key: `voicemails/${CallId}/${RecordingSid}.wav`,
      Body: audioBuffer,
      ContentType: "audio/wav",
      Metadata: {
        callId: String(CallId),
        duration: String(RecordingDuration), // S3 metadata values must be strings
      },
    }));

    // Save reference in your database
    console.log(`Stored voicemail for call ${CallId}, duration ${RecordingDuration}s`);
  } catch (err) {
    // Log the failure so the recording can be retried; the call still ends below
    console.error(`Failed to store voicemail for call ${CallId}:`, err);
  }

  // Hang up; the voicemail flow ends after recording
  res.json({ actions: [{ hangup: {} }] });
});
 
// Transcription delivered asynchronously
app.post("/voicemail/transcript", (req, res) => {
  const { TranscriptionText, TranscriptionStatus, CallId } = req.body;
  if (TranscriptionStatus === "completed") {
    console.log(`Transcript for ${CallId}: ${TranscriptionText}`);
    // Save transcript to DB alongside the recording record
  }
  res.sendStatus(204);
});
 
app.listen(3000);

# Place an outbound call; action_url will return a Dial with record enabled
curl -X POST "https://api.sautikit.com/v1/calls" \
  -H "Authorization: Bearer $SAUTIKIT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "to":         "+254722999000",
    "from":       "+254700000001",
    "action_url": "https://yourapp.example.com/calls/connect-and-record"
  }'

// action_url handler: returns Dial with recording
app.post("/calls/connect-and-record", (req, res) => {
  res.json({
    actions: [
      { say: { text: "This call will be recorded. Please hold while we connect you." } },
      {
        dial: {
          number: "+254720000001",
          record: "record-from-answer",
          callerId: req.body.To,
        },
      },
    ],
  });
});

Recording costs have two components:

1. Recording generation: there is no separate per-recording fee. The call is billed at the standard per-minute rate, and the Record action does not add a per-minute surcharge on top of it.

2. Recording storage and retrieval: Sautikit does not store recordings permanently on your behalf. The RecordingUrl in the callback is a short-lived link. Your server must download and store the file in your own object storage (S3, GCS, Azure Blob, or similar) within the retrieval window. Once you have copied the file, your storage costs are those of your chosen object store.
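Because the link is short-lived, a transient network failure during download can mean a lost recording. One way to reduce that risk is to retry the download with exponential backoff, as in the sketch below. The function names, attempt count, and delays are assumptions chosen for illustration; the length of the retrieval window is not stated here, so keep the total retry time well inside it.

```javascript
// Delay before retry number `attempt` (0-based): 500 ms, 1 s, 2 s, ...
function backoffDelayMs(attempt, baseMs = 500) {
  return baseMs * 2 ** attempt;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Download a recording, retrying network errors and 5xx responses.
// fetchImpl defaults to the global fetch available in Node 18+.
async function downloadRecording(url, { maxAttempts = 4, fetchImpl = fetch } = {}) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      const response = await fetchImpl(url);
      if (response.ok) {
        return Buffer.from(await response.arrayBuffer());
      }
      lastError = new Error(`download failed with status ${response.status}`);
      // A 4xx response (for example, an expired link) will not succeed on retry.
      if (response.status >= 400 && response.status < 500) break;
    } catch (err) {
      lastError = err; // network error: worth retrying
    }
    // Wait before the next attempt, but not after the final one.
    if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
  }
  throw lastError;
}
```

In the recording callback, `await downloadRecording(RecordingUrl)` would replace a bare `fetch` call, and the returned Buffer can be passed straight to the object-store upload.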

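For high-volume recording use cases, estimate daily storage as: (calls per day) × (average duration in minutes) × (WAV file size per minute). Uncompressed mono WAV takes about 960 KB per minute at 8 bits per sample, or about 1.9 MB per minute at 16 bits per sample; these figures correspond to a 16 kHz sample rate, so halve them for 8 kHz audio. Re-encoding to MP3 on your server can reduce storage by 70–80%.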
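The estimate above can be worked through in a few lines. The sketch below derives the per-minute WAV size from the audio format, which shows that the 960 KB and 1.9 MB figures match 16 kHz sampling, and then applies the formula. The call volume, average duration, and 75% MP3 saving are example inputs, not measurements.

```javascript
// Uncompressed PCM WAV size per minute, ignoring the small file header.
function wavBytesPerMinute({ sampleRateHz, bitsPerSample, channels = 1 }) {
  return sampleRateHz * (bitsPerSample / 8) * channels * 60;
}

// (calls per day) x (average duration in minutes) x (bytes per minute)
function dailyStorageBytes({ callsPerDay, avgMinutes, bytesPerMinute }) {
  return callsPerDay * avgMinutes * bytesPerMinute;
}

const perMinute8Bit = wavBytesPerMinute({ sampleRateHz: 16000, bitsPerSample: 8 });
const perMinute16Bit = wavBytesPerMinute({ sampleRateHz: 16000, bitsPerSample: 16 });
console.log(perMinute8Bit);  // 960000 bytes, about 960 KB
console.log(perMinute16Bit); // 1920000 bytes, about 1.9 MB

// Example inputs: 1,000 calls per day averaging 3 minutes, 8-bit mono WAV.
const dailyWav = dailyStorageBytes({
  callsPerDay: 1000,
  avgMinutes: 3,
  bytesPerMinute: perMinute8Bit,
});
console.log(dailyWav); // 2880000000 bytes, about 2.88 GB per day

// Assuming MP3 re-encoding saves 75% (the middle of the 70-80% range).
const dailyMp3 = dailyWav * (1 - 0.75);
console.log(dailyMp3); // 720000000 bytes, about 0.72 GB per day
```

Multiply the daily figure by your retention period in days to size the bucket.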

Transcription, when enabled, is billed separately per second of audio transcribed. Check the pricing page for the current transcription rate. Transcription latency is not guaranteed; do not use it for real-time routing decisions.

Record voice action reference: full parameter table, silence timeout, finishOnKey.
Dial voice action reference: record field for bridged call recording.
Conference voice action reference: record: true for whole-room recording.
Calls concept: call lifecycle and recording delivery timing.
Conference calling use case: recording multi-party sessions.
