
Rate Limits

Kallglot uses rate limiting to ensure fair usage and maintain service quality. This page explains the limits and how to work with them.

Rate Limit Tiers

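| Plan | API Requests | Concurrent Sessions | WebSocket Messages |
|------|--------------|---------------------|--------------------|
| Free/Test | 60/min | 2 | 100/sec |
| Starter | 120/min | 10 | 500/sec |
| Pro | 600/min | 50 | 1000/sec |
| Enterprise | Custom | Custom | Custom |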

Types of Limits

API Request Limits

Limits on HTTP API calls per minute:
| Endpoint Group | Starter | Pro |
|----------------|---------|-----|
| Sessions (create, end) | 60/min | 300/min |
| Sessions (read) | 120/min | 600/min |
| Recordings | 60/min | 300/min |
| Analysis | 30/min | 150/min |
| Webhooks | 60/min | 300/min |

Concurrent Session Limits

Maximum number of active sessions at any time:
| Plan | Concurrent Sessions |
|------|---------------------|
| Free/Test | 2 |
| Starter | 10 |
| Pro | 50 |
| Enterprise | Unlimited* |

*Subject to fair use policy
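
One way to avoid running into the concurrent session cap is to count active sessions on the client before creating a new one. The following is a minimal sketch, not part of any Kallglot SDK: `SessionSlots` is a hypothetical helper, and the limit of 10 is the Starter value from the table above.

```javascript
// Hypothetical client-side guard; not part of any Kallglot SDK.
class SessionSlots {
  constructor(limit) {
    this.limit = limit; // the plan's concurrent session cap
    this.inUse = 0;     // sessions we currently believe are active
  }

  // Reserve a slot. Returns false when the cap has been reached.
  tryAcquire() {
    if (this.inUse >= this.limit) return false;
    this.inUse++;
    return true;
  }

  // Free a slot once a session has ended.
  release() {
    if (this.inUse > 0) this.inUse--;
  }
}

const slots = new SessionSlots(10); // Starter plan
if (slots.tryAcquire()) {
  // Safe to create a session here; call slots.release() when it ends.
}
```

The local count is only an approximation of the server's view, so a request can still be rejected if sessions are created from more than one process.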

WebSocket Message Limits

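Limits on messages sent over a WebSocket connection per second:

| Plan | Messages/Second | Audio Data/Second |
|------|-----------------|-------------------|
| Free/Test | 100 | 32KB |
| Starter | 500 | 64KB |
| Pro | 1000 | 128KB |
| Enterprise | Custom | Custom |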
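
To judge whether an audio format fits a plan's audio data limit, it can help to work out its raw byte rate. The sketch below assumes uncompressed PCM audio, assumes that "KB" in the table means 1024 bytes, and assumes the limit is measured on raw audio bytes; none of these is confirmed on this page, and the function names are illustrative.

```javascript
// Bytes per second produced by uncompressed PCM audio.
function pcmBytesPerSecond(sampleRateHz, bitsPerSample, channels) {
  return sampleRateHz * (bitsPerSample / 8) * channels;
}

// Assumption: "KB" in the limits table means 1024 bytes.
function fitsAudioBudget(bytesPerSecond, limitKB) {
  return bytesPerSecond <= limitKB * 1024;
}

// 16 kHz, 16-bit, mono: 16000 * 2 * 1 = 32000 bytes/sec
const narrowband = pcmBytesPerSecond(16000, 16, 1);
console.log(fitsAudioBudget(narrowband, 32)); // true (32000 <= 32768)

// 48 kHz, 16-bit, mono: 96000 bytes/sec
const wideband = pcmBytesPerSecond(48000, 16, 1);
console.log(fitsAudioBudget(wideband, 64)); // false (96000 > 65536)
```

If audio is base64-encoded inside JSON messages, the bytes on the wire are roughly a third larger than the raw audio, so leave headroom.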

Rate Limit Headers

API responses include headers indicating your current rate limit status:
HTTP/1.1 200 OK
X-RateLimit-Limit: 120
X-RateLimit-Remaining: 118
X-RateLimit-Reset: 1711454460
| Header | Description |
|--------|-------------|
| X-RateLimit-Limit | Maximum requests allowed in the window |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp (in seconds) when the limit resets |
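
Because X-RateLimit-Reset is an absolute timestamp, it is usually converted to a wait time before use. A minimal sketch, assuming your HTTP client exposes headers as a plain object with lowercase names; `secondsUntilReset` is an illustrative helper, not a Kallglot function.

```javascript
// Returns whole seconds until the window resets, never negative.
// `headers` is assumed to be a plain object with lowercase header names.
// `nowMs` is injectable so the function can be tested with a fixed clock.
function secondsUntilReset(headers, nowMs = Date.now()) {
  const reset = Number.parseInt(headers['x-ratelimit-reset'], 10);
  if (Number.isNaN(reset)) return 0; // header missing or malformed
  return Math.max(0, Math.ceil(reset - nowMs / 1000));
}

// With the example response above, 30 seconds before the reset time:
console.log(
  secondsUntilReset({ 'x-ratelimit-reset': '1711454460' }, 1711454430000)
); // 30
```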

Handling Rate Limits

When you exceed a rate limit, the API returns a 429 Too Many Requests response:
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "You have exceeded the rate limit. Please retry after 30 seconds.",
    "type": "rate_limit_error"
  }
}
The response includes a Retry-After header:
HTTP/1.1 429 Too Many Requests
Retry-After: 30
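
The example above shows Retry-After as a number of seconds. The HTTP specification also allows this header to carry an HTTP date, so a defensive client can accept both forms. The sketch below is one way to do that; `parseRetryAfter` and its 30-second fallback are assumptions for illustration, and this page does not say whether Kallglot ever sends the date form.

```javascript
// Convert a Retry-After header value to a wait time in seconds.
// Accepts delay-seconds ("30") or an HTTP date; falls back otherwise.
function parseRetryAfter(value, nowMs = Date.now(), fallbackSeconds = 30) {
  if (value === undefined || value === null || value === '') {
    return fallbackSeconds;
  }
  const text = String(value).trim();
  if (/^\d+$/.test(text)) {
    return Number(text); // delay-seconds form
  }
  const dateMs = Date.parse(text); // HTTP-date form
  if (Number.isNaN(dateMs)) {
    return fallbackSeconds;
  }
  return Math.max(0, Math.ceil((dateMs - nowMs) / 1000));
}

console.log(parseRetryAfter('30')); // 30
```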

Retry Logic

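// Resolve after the given number of milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function makeRequestWithRetry(request, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await request();
    } catch (error) {
      // Rethrow anything that is not a rate limit error
      if (error.status !== 429) {
        throw error;
      }
      // Do not wait after the final attempt
      if (attempt === maxRetries - 1) {
        break;
      }
      // Retry-After is in seconds; fall back to 30 if it is missing or not a positive number
      const retryAfter = Number(error.headers?.['retry-after']) || 30;
      console.log(`Rate limited. Retrying in ${retryAfter}s`);
      await sleep(retryAfter * 1000);
    }
  }
  throw new Error('Max retries exceeded');
}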

Proactive Rate Limiting

Check headers before hitting limits:
class RateLimiter {
  constructor() {
    this.remaining = Infinity;
    this.resetTime = 0;
  }

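  updateFromResponse(response) {
    const remaining = parseInt(response.headers['x-ratelimit-remaining'], 10);
    const resetTime = parseInt(response.headers['x-ratelimit-reset'], 10);
    // Keep the previous values if a header is missing or malformed
    if (!Number.isNaN(remaining)) this.remaining = remaining;
    if (!Number.isNaN(resetTime)) this.resetTime = resetTime;
  }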

  async waitIfNeeded() {
    if (this.remaining <= 5) {
      const waitTime = (this.resetTime * 1000) - Date.now();
      if (waitTime > 0) {
        console.log(`Approaching rate limit. Waiting ${waitTime}ms`);
        await sleep(waitTime);
      }
    }
  }
}

Best Practices

When rate limited, use exponential backoff to avoid hammering the API:
// attempt is the zero-based retry count; baseDelay and maxDelay are in milliseconds
const delay = Math.min(
  baseDelay * Math.pow(2, attempt),
  maxDelay
);
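
The backoff formula above can be wrapped in a small helper. The sketch below also adds random jitter, which is a common addition and not something this page prescribes, so that many clients do not retry at the same instant. The defaults (1 second base, 30 second cap) and the name `backoffDelay` are assumptions.

```javascript
// Delay in milliseconds before retry number `attempt` (zero-based).
// `random` is injectable so the jitter can be tested deterministically.
function backoffDelay(
  attempt,
  { baseDelay = 1000, maxDelay = 30000, random = Math.random } = {}
) {
  // Exponential growth, capped at maxDelay
  const capped = Math.min(baseDelay * Math.pow(2, attempt), maxDelay);
  // "Equal jitter": wait at least half the capped delay, plus a random
  // share of the other half
  return Math.floor(capped / 2 + random() * (capped / 2));
}

// Without randomness (random returns 0) the delays are 500, 1000, 2000, ...
for (let attempt = 0; attempt < 3; attempt++) {
  console.log(backoffDelay(attempt, { random: () => 0 }));
}
```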
Cache session info, transcripts, and recordings locally instead of fetching repeatedly:
const sessionCache = new Map();

async function getSession(id) {
  if (sessionCache.has(id)) {
    return sessionCache.get(id);
  }
  const session = await retrieveKallglotSession(id);
  sessionCache.set(id, session);
  return session;
}
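
A plain Map never expires its entries, so a cached session can keep reporting an old status. One option is to give each entry a time to live. This is a minimal sketch: `TtlCache` is a hypothetical helper, and the 30-second lifetime is an arbitrary choice.

```javascript
// Hypothetical helper: a Map whose entries expire after ttlMs milliseconds.
class TtlCache {
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, useful for tests
    this.entries = new Map();
  }

  // Returns the cached value, or undefined if absent or expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

const ttlSessionCache = new TtlCache(30000); // entries live for 30 seconds
```

Immutable data such as finished recordings can be cached for much longer than the status of a live session.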
Use batch endpoints or reduce redundant calls:
// Instead of this:
for (const id of sessionIds) {
  await retrieveKallglotSession(id);
}

// Do this:
const sessions = await listKallglotSessions({
  ids: sessionIds
});
Subscribe to webhook events instead of polling for changes:
// Don't poll
setInterval(async () => {
  const session = await retrieveKallglotSession(id);
  if (session.status === 'ended') { ... }
}, 1000);

// Use webhooks
app.post('/webhooks', (req, res) => {
  if (req.body.type === 'session.ended') { ... }
  // Acknowledge receipt of the event
  res.sendStatus(200);
});
Track API usage to identify optimization opportunities:
const apiMetrics = {
  requests: 0,
  rateLimits: 0,
  errors: 0
};

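function recordApiResponse(response) {
  apiMetrics.requests++;
  if (response.status === 429) {
    apiMetrics.rateLimits++;
  } else if (response.status >= 400) {
    // Count all other client and server errors
    apiMetrics.errors++;
  }
}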

Increasing Limits

Upgrading Your Plan

Upgrade to a higher plan for increased limits:
  1. Go to Settings > Billing
  2. Select a new plan
  3. Limits increase immediately

Enterprise Custom Limits

Enterprise customers can request custom limits:
  • Higher API request limits
  • More concurrent sessions
  • Dedicated infrastructure
  • SLA guarantees
Contact sales@kallglot.com to discuss your needs.

Limit-Specific Considerations

Session Creation Limits

Session creation has stricter limits to prevent abuse, so reuse sessions when possible:
// If a call drops and reconnects within 5 minutes, reconnect to the
// same session instead of creating a new one.
// findRecentSession is your own lookup (for example, by caller ID).
async function getOrCreateSession(callerId) {
  const existingSession = await findRecentSession(callerId);
  if (existingSession && existingSession.status === 'active') {
    return existingSession;
  }

  return createKallglotSession({
    mode: 'bidirectional_translation',
    // ...
  });
}

Analysis Request Limits

Analysis is resource-intensive and has lower limits:
// Queue analysis requests
const analysisQueue = [];

async function processAnalysisQueue() {
  while (analysisQueue.length > 0) {
    const sessionId = analysisQueue.shift();
    try {
      await requestKallglotAnalysis(sessionId, {
        analyses: ['sentiment', 'summary']
      });
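    } catch (error) {
      if (error.code === 'rate_limit_exceeded') {
        // Put back at the front of the queue and wait for the window to reset
        analysisQueue.unshift(sessionId);
        await sleep(60000);
        continue;
      }
      // Report other errors instead of silently dropping them
      console.error(`Analysis request failed for session ${sessionId}`, error);
    }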
    // Small delay between requests
    await sleep(2000);
  }
}

WebSocket Throttling

If you exceed WebSocket message limits, the server sends a throttle message:
{
  "type": "throttle",
  "message": "Message rate exceeded. Slow down.",
  "wait_ms": 100
}
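Handle throttling by pausing, buffering audio, and flushing the buffer once the wait is over:
let pauseSending = false;
const audioBuffer = [];

ws.on('message', (data) => {
  const message = JSON.parse(data);

  if (message.type === 'throttle') {
    // Pause sending for the period the server asks for
    pauseSending = true;
    setTimeout(() => {
      pauseSending = false;
      // Send the chunks buffered during the pause, oldest first
      while (audioBuffer.length > 0) {
        sendAudio(audioBuffer.shift());
      }
    }, message.wait_ms);
  }
});

function sendAudio(chunk) {
  if (pauseSending) {
    // Hold the chunk until the pause ends
    audioBuffer.push(chunk);
    return;
  }
  ws.send(JSON.stringify({ type: 'audio', data: chunk }));
}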
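
Reacting to throttle messages works after the limit has been exceeded. To stay under the per-second message limit in the first place, a client can pace its own sends. The sketch below uses a token bucket, a general rate limiting technique and not something this page prescribes; `TokenBucket` is a hypothetical helper, and the rate of 100 is the Free/Test messages-per-second limit listed earlier.

```javascript
// Hypothetical client-side limiter: allows up to `ratePerSec` sends per
// second on average, with bursts of at most `ratePerSec` messages.
class TokenBucket {
  constructor(ratePerSec, now = Date.now) {
    this.ratePerSec = ratePerSec;
    this.capacity = ratePerSec;
    this.tokens = ratePerSec; // start with a full bucket
    this.now = now;           // injectable clock, useful for tests
    this.last = now();
  }

  // Returns true and consumes a token if a message may be sent now.
  tryRemove() {
    const current = this.now();
    const elapsedSec = (current - this.last) / 1000;
    this.last = current;
    // Refill in proportion to elapsed time, up to capacity
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSec * this.ratePerSec
    );
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

const bucket = new TokenBucket(100); // Free/Test: 100 messages/sec
```

Before each `ws.send`, call `bucket.tryRemove()` and buffer the chunk when it returns false.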