Sessions vs Calls

Every integration begins with a Session. POST /v1/sessions returns a session ID (sess_…), a stream URL and token, and the resolved routing that lets Kallglot process audio for translation, transcripts, recordings, etc.

What is a Session?

A session is simply the identifier you correlate with every API call thereafter (GET session, transcripts, recordings, wss stream, webhook handlers). Plug in telephony whichever way fits your deployment:

| Path | Typical next step |
| --- | --- |
| Programmatic SIP | Publish sip:{session_id}@sip.kallglot.com from your PBX |
| Vendor trunk | Tie numbers or routing.connection_id per the "Choose telephony integration" guide |
| Raw streaming | Point your client at the stream.url plus token |
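
For the Programmatic SIP path, the URI can be derived directly from the session ID. The sketch below is only an illustration: the helper name is hypothetical, and the `sess_` prefix check simply mirrors the ID format mentioned at the top of this page.

```javascript
// Build the SIP URI for a session, following the pattern
// sip:{session_id}@sip.kallglot.com from the table above.
// `sessionId` is the sess_… value returned by POST /v1/sessions.
function sipUriForSession(sessionId) {
  if (typeof sessionId !== 'string' || !sessionId.startsWith('sess_')) {
    throw new TypeError('expected a session ID beginning with "sess_"');
  }
  return `sip:${sessionId}@sip.kallglot.com`;
}

// Example with a made-up session ID:
console.log(sipUriForSession('sess_example123'));
```

Your PBX configuration would then dial or publish the returned URI; how that is done depends on the PBX.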

What is a Call?

A Call is a telephony connection managed by a provider (Twilio, Telnyx, or SIP). Calls handle:
  • Phone number dialing
  • PSTN connectivity
  • Call routing
  • Telephony features (hold, transfer, etc.)
A single session can involve multiple calls:
  • Incoming call from a customer
  • Outgoing call to an agent
  • Conference call with multiple participants

Practical flow

  1. Call POST /v1/sessions with mode, optional languages, routing (or defaults from the Developer Portal).
  2. Send media on the stream WebSocket (audio.input), or attach telephony streams per vendor guide (/guides/twilio, /guides/telnyx, /guides/sip).
  3. Tear down with POST /v1/sessions/{id}/end once your call leg finishes.
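
Steps 1 and 3 above can be sketched as plain HTTP requests. This is a minimal sketch, not the official client: the helper names, the base URL you pass in, and the Bearer authorization scheme are all assumptions, so check them against the API reference. Step 2 (media) is not covered here because it depends on the telephony path you choose.

```javascript
// Hypothetical helpers for steps 1 and 3. The base URL and the Bearer
// auth scheme are assumptions, not confirmed by this page.

// Step 1: describe the POST /v1/sessions request.
function buildCreateSessionRequest(baseUrl, apiKey, params) {
  return {
    url: `${baseUrl}/v1/sessions`,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(params),
    },
  };
}

// Step 3: describe the POST /v1/sessions/{id}/end request.
function buildEndSessionRequest(baseUrl, apiKey, sessionId) {
  return {
    url: `${baseUrl}/v1/sessions/${encodeURIComponent(sessionId)}/end`,
    options: {
      method: 'POST',
      headers: { Authorization: `Bearer ${apiKey}` },
    },
  };
}

// Send a prepared request and return the parsed JSON body.
// `fetchImpl` defaults to the global fetch (Node 18+ and browsers).
async function send(request, fetchImpl = fetch) {
  const response = await fetchImpl(request.url, request.options);
  if (!response.ok) {
    throw new Error(`Kallglot API request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

A wrapper such as the `createKallglotSession` function used in the examples on this page could be built on these helpers, for instance by calling `send(buildCreateSessionRequest(baseUrl, apiKey, params))` with your own configuration. Keeping request construction separate from sending makes the request shape easy to unit test without network access.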

Example: Inbound call (pseudo-code)

When your HTTP webhook fires for inbound telephony:
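// Assumes Express and the twilio Node helper library are installed.
// createKallglotSession is your own wrapper around POST /v1/sessions.
const express = require('express');
const twilio = require('twilio');

const app = express();
// Twilio sends webhook payloads as application/x-www-form-urlencoded
app.use(express.urlencoded({ extended: false }));

// 1. Webhook receives payload from Twilio (example keys)
app.post('/twilio/incoming', async (req, res) => {
  const { CallSid, From, To } = req.body;

  // 2. Create session: telephony uses `routing` (E.164 of your Kallglot-managed number = the called number To)
  const session = await createKallglotSession({
    mode: 'bidirectional_translation',
    source_language: 'de',
    target_language: 'en',
    routing: {
      phone_number: To
    },
    metadata: {
      twilio_call_sid: CallSid,
      from_number: From,
      to_number: To
    }
  });

  // 3. Connect the call to Kallglot's media stream
  const response = new twilio.twiml.VoiceResponse();
  const stream = response.connect().stream({ url: session.stream.url });
  // Custom values travel as <Parameter> elements nested inside <Stream>
  stream.parameter({ name: 'token', value: session.stream.token });

  res.type('text/xml').send(response.toString());
});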

Example: WebRTC Session

// 1. Create session for browser-based voice
const session = await createKallglotSession({
  mode: 'ai_agent',
  source_language: 'en',
  target_language: 'en'
  // No `routing` — browser sends audio on stream WebSocket only
});

// 2. Connect from browser
const ws = new WebSocket(`${session.stream.url}?token=${session.stream.token}`);

// 3. Stream microphone audio
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
// ... stream audio to WebSocket

Billing Model

Billing is based on session duration, not call duration:

| What’s Billed | Description |
| --- | --- |
| Active session time | Time from first audio received to session end |
| Translation minutes | Minutes where translation was active |
| AI agent minutes | Minutes where AI agent was processing |

This means:
  • A 5-minute call with a 5-minute session = 5 minutes billed
  • A 5-minute call with a 10-minute session (including hold) = 10 minutes billed
  • Two back-to-back 3-minute calls in one session = 6 minutes billed total
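
The arithmetic behind the examples above can be sketched as follows. The function name is hypothetical, and the sketch returns fractional minutes because this page does not say how billed time is rounded or in what increments it is charged.

```javascript
// Active session time: minutes between the first audio received and the
// session end. Accepts Date objects or millisecond timestamps.
function activeSessionMinutes(firstAudioAt, endedAt) {
  const elapsedMs = Number(endedAt) - Number(firstAudioAt);
  if (!Number.isFinite(elapsedMs) || elapsedMs < 0) {
    throw new RangeError('endedAt must be a valid time at or after firstAudioAt');
  }
  return elapsedMs / 60000;
}

// A 5-minute call inside a session that stayed open for 10 minutes
// (for example, because of hold time) is measured on the session:
const firstAudio = new Date('2024-01-01T10:00:00Z');
const sessionEnd = new Date('2024-01-01T10:10:00Z');
console.log(activeSessionMinutes(firstAudio, sessionEnd)); // 10
```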

Session Lifecycle

Status Descriptions

| Status | Description |
| --- | --- |
| created | Session created, waiting for audio connection |
| connecting | Audio source connecting |
| active | Processing audio in real-time |
| ending | Finalizing recording and transcript |
| ended | Session complete, resources available |
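
Client code often needs to know whether a session has reached a given stage, for example before fetching a recording. The sketch below assumes that statuses progress linearly in the order listed in the table; that ordering is an assumption of this example, and the helper names are hypothetical.

```javascript
// Statuses in the order they appear in the table above.
// Assumption: a session moves through them linearly, never backwards.
const SESSION_STATUSES = ['created', 'connecting', 'active', 'ending', 'ended'];

// True once `current` is at or past `target` in the assumed order.
function hasReached(current, target) {
  const currentIndex = SESSION_STATUSES.indexOf(current);
  const targetIndex = SESSION_STATUSES.indexOf(target);
  if (currentIndex === -1 || targetIndex === -1) {
    throw new RangeError(`unknown session status: ${currentIndex === -1 ? current : target}`);
  }
  return currentIndex >= targetIndex;
}

// Recording and transcript are still being finalized during `ending`,
// so only `ended` is treated as "resources available".
function resourcesAvailable(status) {
  return hasReached(status, 'ended');
}

console.log(resourcesAvailable('ending')); // false
console.log(resourcesAvailable('ended'));  // true
```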

Multiple Participants

A single session can handle multiple audio sources:
// Conference session with multiple participants
const session = await createKallglotSession({
  mode: 'bidirectional_translation',
  source_language: 'de',
  target_language: 'en',
  participants: [
    { type: 'agent', language: 'en' },
    { type: 'customer', language: 'de' }
  ]
});

// Both participants connect to the same session
// Agent connects via WebRTC
// Customer connects via phone call

Best Practices

Create the session before the call connects. This ensures the processing pipeline is ready when audio starts flowing.

End sessions as soon as the conversation is complete. This stops billing and triggers recording/transcript finalization.

Attach custom metadata to sessions for easy correlation with your CRM, ticketing, or routing systems:
{
  "metadata": {
    "customer_id": "cust_123",
    "ticket_id": "ticket_456",
    "agent_id": "agent_789"
  }
}

If a call disconnects but may reconnect, keep the session active. You can reconnect new audio to the same session within 5 minutes.
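
The reconnect window described above can be tracked on your side with a small check. This is a sketch under stated assumptions: the helper name is hypothetical, the window is measured from the moment your application observed the disconnect, and the 5-minute boundary is treated as inclusive, none of which is specified on this page.

```javascript
// 5-minute reconnect window from the guidance above, in milliseconds.
const RECONNECT_WINDOW_MS = 5 * 60 * 1000;

// True while new audio can still be attached to the same session.
// Accepts Date objects or millisecond timestamps.
function canReconnect(disconnectedAt, now = Date.now()) {
  const elapsedMs = Number(now) - Number(disconnectedAt);
  return elapsedMs >= 0 && elapsedMs <= RECONNECT_WINDOW_MS;
}

// Example: a disconnect observed four minutes ago is still inside the window.
const fourMinutesAgo = Date.now() - 4 * 60 * 1000;
console.log(canReconnect(fourMinutesAgo)); // true
```

Once the check returns false, ending the session explicitly stops billing and starts recording and transcript finalization.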