API Reference

Welcome to the SAGEA API reference documentation. Our APIs provide access to VORA™, our ultra-efficient voice synthesis engine that generates hyper-realistic, emotionally expressive speech from text in real time.

Quick Start with VORA

Get started with VORA voice synthesis in minutes. Generate high-fidelity speech with real-time emotion control.

curl -X POST https://api.sagea.space/v1/vora/synthesize \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello from VORA!", "model": "vora-v1"}'

Base URL

All API requests should be made to:

https://api.sagea.space/v1

Authentication

SAGEA uses API keys for authentication. Include your API key in the Authorization header:

curl -H "Authorization: Bearer YOUR_API_KEY" \
     https://api.sagea.space/v1/vora/synthesize
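
The same header can be attached from Python's standard library. The following is a minimal sketch, not an official client: the helper name build_synthesize_request and the SAGEA_API_KEY environment variable are assumptions made for illustration, and the request is only constructed here, not sent.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.sagea.space/v1"  # base URL given in this reference


def build_synthesize_request(api_key: str, text: str, model: str = "vora-v1") -> urllib.request.Request:
    """Build (but do not send) an authenticated POST to /vora/synthesize."""
    body = json.dumps({"text": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/vora/synthesize",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # SAGEA_API_KEY is a hypothetical variable name chosen for this sketch.
    key = os.environ.get("SAGEA_API_KEY", "YOUR_API_KEY")
    req = build_synthesize_request(key, "Hello from VORA!")
    print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; reading the key from the environment keeps it out of source control.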

VORA Voice Synthesis API

VORA leverages a highly optimized hybrid attention-convolutional architecture that dramatically reduces inference latency while preserving voice fidelity, expressiveness, and personality.

Synthesize Speech

Generate hyper-realistic speech with emotional expressiveness using VORA.

POST /vora/synthesize

Request Body

{
  "text": "Hello, this is VORA voice synthesis with emotional control.",
  "model": "vora-v1",
  "language": "en-US",
  "voice": "default",
  "options": {
    "emotion": "friendly",
    "speaking_rate": 1.0,
    "pitch_shift": 0.0,
    "format": "wav",
    "sample_rate": 22050,
    "enable_watermark": true
  }
}

Parameters

Parameter                | Type    | Required | Description
text                     | string  | Yes      | Text to synthesize (max 5000 characters)
model                    | string  | Yes      | VORA model (vora-v1, vora-l1, vora-l2)
language                 | string  | No       | Language code (default: en-US)
voice                    | string  | No       | Voice variant (default, alt1, alt2)
options.emotion          | string  | No       | Emotion control (neutral, friendly, professional, excited, calm, warm)
options.speaking_rate    | float   | No       | Speaking speed (0.5 to 2.0, default: 1.0)
options.pitch_shift      | float   | No       | Pitch adjustment (-1.0 to 1.0, default: 0.0)
options.format           | string  | No       | Audio format (wav, mp3, ogg; default: wav)
options.sample_rate      | integer | No       | Sample rate (16000, 22050, 44100; default: 22050)
options.enable_watermark | boolean | No       | Enable audio watermarking (default: true)
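
The documented limits can be checked on the client before a request is sent. The sketch below assumes the nested "options" layout shown in the request body above; the function name validate_synthesis_request is hypothetical, and the server remains the authority on what is accepted. The voice field is deliberately not checked, since cloned voices may add values beyond the three listed.

```python
MODELS = {"vora-v1", "vora-l1", "vora-l2"}
EMOTIONS = {"neutral", "friendly", "professional", "excited", "calm", "warm"}
FORMATS = {"wav", "mp3", "ogg"}
SAMPLE_RATES = {16000, 22050, 44100}
MAX_TEXT_CHARS = 5000


def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_synthesis_request(payload: dict) -> list:
    """Return a list of problems; an empty list means no documented limit is violated."""
    errors = []
    text = payload.get("text")
    if not isinstance(text, str) or not text:
        errors.append("text is required")
    elif len(text) > MAX_TEXT_CHARS:
        errors.append(f"text exceeds {MAX_TEXT_CHARS} characters")
    if payload.get("model") not in MODELS:
        errors.append("model must be one of " + ", ".join(sorted(MODELS)))

    options = payload.get("options", {})
    if "emotion" in options and options["emotion"] not in EMOTIONS:
        errors.append("options.emotion is not a documented value")
    rate = options.get("speaking_rate", 1.0)
    if not _is_number(rate) or not 0.5 <= rate <= 2.0:
        errors.append("options.speaking_rate must be between 0.5 and 2.0")
    pitch = options.get("pitch_shift", 0.0)
    if not _is_number(pitch) or not -1.0 <= pitch <= 1.0:
        errors.append("options.pitch_shift must be between -1.0 and 1.0")
    if options.get("format", "wav") not in FORMATS:
        errors.append("options.format must be wav, mp3, or ogg")
    if options.get("sample_rate", 22050) not in SAMPLE_RATES:
        errors.append("options.sample_rate must be 16000, 22050, or 44100")
    if not isinstance(options.get("enable_watermark", True), bool):
        errors.append("options.enable_watermark must be a boolean")
    return errors
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.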

Supported Languages

VORA supports 30+ languages, including:

  • English (en-US, en-GB, en-AU)
  • Nepali (ne-NP)
  • Hindi (hi-IN)
  • French (fr-FR, fr-CA)
  • Swahili (sw-KE, sw-TZ)
  • Japanese (ja-JP)
  • Spanish (es-ES, es-MX)
  • German (de-DE)
  • Portuguese (pt-BR, pt-PT)
  • Chinese (zh-CN, zh-TW)
  • Arabic (ar-SA)
  • Russian (ru-RU)

VORA Models

Model   | Description                                    | Use Case
vora-v1 | High-fidelity multilingual voice model         | Production applications, media, accessibility
vora-l1 | Lightweight edge-deployable TTS                | Mobile apps, IoT devices
vora-l2 | Ultra-lightweight for constrained environments | Embedded systems, real-time applications

Response

{
  "audio_url": "https://api.sagea.space/audio/vora_abc123.wav",
  "duration": 3.45,
  "format": "wav",
  "sample_rate": 22050,
  "model_used": "vora-v1",
  "language": "en-US",
  "watermarked": true
}
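
A response of this shape can be turned into a typed object on the client. This is an illustrative sketch only: the SynthesisResult class and parse_synthesis_response function are hypothetical names, and the field list simply mirrors the example above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthesisResult:
    """Fields mirror the example response body in this reference."""
    audio_url: str
    duration: float
    format: str
    sample_rate: int
    model_used: str
    language: str
    watermarked: bool


def parse_synthesis_response(raw: str) -> SynthesisResult:
    """Parse a JSON response body; raises KeyError if a documented field is missing."""
    data = json.loads(raw)
    return SynthesisResult(
        audio_url=data["audio_url"],
        duration=float(data["duration"]),
        format=data["format"],
        sample_rate=int(data["sample_rate"]),
        model_used=data["model_used"],
        language=data["language"],
        watermarked=bool(data["watermarked"]),
    )


EXAMPLE = """{
  "audio_url": "https://api.sagea.space/audio/vora_abc123.wav",
  "duration": 3.45,
  "format": "wav",
  "sample_rate": 22050,
  "model_used": "vora-v1",
  "language": "en-US",
  "watermarked": true
}"""

if __name__ == "__main__":
    result = parse_synthesis_response(EXAMPLE)
    print(result.audio_url, result.format, result.sample_rate)
```

Failing loudly on a missing field is a design choice of this sketch; a more tolerant client might use defaults instead.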

Voice Cloning

Create custom voice models with VORA's voice cloning capabilities.

POST /vora/clone

Request Body

{
  "name": "custom_voice",
  "audio_samples": ["base64_audio_1", "base64_audio_2"],
  "text_transcripts": ["Sample text one", "Sample text two"],
  "language": "en-US",
  "options": {
    "training_steps": 1000,
    "quality": "high"
  }
}
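
One way to assemble this body from local recordings is sketched below. It rests on two assumptions that this reference does not state: that each entry in audio_samples is the standard Base64 encoding of a sample's raw bytes, and that text_transcripts is matched to audio_samples by position. The function name build_clone_payload is hypothetical.

```python
import base64
import json


def build_clone_payload(name, samples, language="en-US", training_steps=1000, quality="high"):
    """Build a /vora/clone request body.

    `samples` is a list of (audio_bytes, transcript) pairs; keeping each pair
    together guarantees the two output lists stay the same length and order.
    """
    if not samples:
        raise ValueError("at least one (audio_bytes, transcript) pair is required")
    return {
        "name": name,
        # Assumption: samples are sent as standard Base64 text.
        "audio_samples": [base64.b64encode(audio).decode("ascii") for audio, _ in samples],
        "text_transcripts": [transcript for _, transcript in samples],
        "language": language,
        "options": {"training_steps": training_steps, "quality": quality},
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for real audio file contents.
    payload = build_clone_payload(
        "custom_voice",
        [(b"fake-audio-1", "Sample text one"), (b"fake-audio-2", "Sample text two")],
    )
    print(json.dumps(payload, indent=2))
```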

Real-time Streaming

Stream audio synthesis for real-time applications.

POST /vora/stream

This endpoint supports WebSocket connections for continuous audio streaming with dynamic emotion shifting and adaptive speaking styles.

Enterprise API

Custom Model Deployment

Deploy custom VORA models in your infrastructure.

POST /enterprise/deploy

Request Body

{
  "model_id": "custom-vora-model",
  "deployment_type": "cloud",
  "configuration": {
    "auto_scaling": true,
    "max_instances": 10,
    "regions": ["us-east-1", "eu-west-1"]
  }
}

Error Handling

SAGEA uses standard HTTP status codes and returns detailed error information:

{
  "error": {
    "type": "invalid_request",
    "code": "unsupported_language",
    "message": "The specified language 'xx-XX' is not supported by VORA.",
    "supported_languages": ["en-US", "hi-IN", "ne-NP", "..."]
  }
}

Common Error Codes

Status | Code                | Description
400    | invalid_request     | Invalid request parameters
401    | unauthorized        | Invalid or missing API key
403    | forbidden           | Insufficient permissions for requested operation
429    | rate_limit_exceeded | Rate limit exceeded
500    | internal_error      | Internal server error
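
On the client, an error body of the shape shown above can be wrapped in an exception. The following is a sketch under stated assumptions: SageaAPIError and parse_error are hypothetical names, and treating only 429 and 5xx responses as retryable is a choice made here, not documented SAGEA behavior.

```python
import json


class SageaAPIError(Exception):
    """Hypothetical client-side exception wrapping a SAGEA error body."""

    def __init__(self, status, error_type, code, message, details):
        super().__init__(f"HTTP {status} {code}: {message}")
        self.status = status
        self.error_type = error_type
        self.code = code
        self.message = message
        self.details = details  # extra fields such as supported_languages

    @property
    def retryable(self) -> bool:
        # Assumption of this sketch: only 429 and 5xx are worth retrying.
        return self.status == 429 or self.status >= 500


def parse_error(status: int, body: str) -> SageaAPIError:
    """Build an exception from a status code and response body, tolerating non-JSON bodies."""
    try:
        parsed = json.loads(body)
    except ValueError:
        parsed = None
    err = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(err, dict):
        err = {}
    details = {k: v for k, v in err.items() if k not in ("type", "code", "message")}
    return SageaAPIError(
        status,
        err.get("type", "unknown"),
        err.get("code", "unknown"),
        err.get("message", body),
        details,
    )
```

A caller would typically raise the returned exception and inspect its code and details fields to decide how to recover.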

Rate Limits

API rate limits vary by plan and model:

  • VORA-V1: 50 requests/minute (Starter), 500 requests/minute (Pro), Custom (Enterprise)
  • VORA-L1/L2: 100 requests/minute (Starter), 1000 requests/minute (Pro), Custom (Enterprise)
  • Voice Cloning: 5 requests/day (Pro), Custom (Enterprise)
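
A client can pace itself to stay under these limits instead of waiting for a 429 response. The sliding-window limiter below is a client-side sketch only; this reference does not say how the server measures its window, so the class name, the 60-second window, and the pacing strategy are all assumptions. The Starter figures in STARTER_LIMITS come from the list above.

```python
import time
from collections import deque

# Requests per minute on the Starter plan, per the list above.
STARTER_LIMITS = {"vora-v1": 50, "vora-l1": 100, "vora-l2": 100}


class SlidingWindowLimiter:
    """Allow at most `max_requests` recorded requests in any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._sent = deque()  # timestamps of recorded requests, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before another request fits (0.0 if it fits now)."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self._sent.append(self._clock())


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(STARTER_LIMITS["vora-v1"])
    delay = limiter.wait_time()
    if delay > 0:
        time.sleep(delay)
    limiter.record()  # call right after sending each request
```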

Performance Optimizations

VORA achieves up to 6× faster inference and a 50% smaller memory footprint than traditional TTS models. Related optimizations include:

  • Edge Deployment: VORA-L models run efficiently on mobile and IoT devices
  • Real-time Processing: Sub-100ms latency for streaming applications
  • Batch Processing: Optimize costs with batch synthesis for multiple texts
  • Caching: Automatic audio caching for repeated requests

Enterprise Features

  • Custom Voice Cloning: Create branded voices for your organization
  • Audio Watermarking: Built-in audio watermarking for content protection
  • Guardrails: Content filtering and safety controls
  • On-premise Deployment: Deploy VORA in your own infrastructure
  • 24/7 Support: Dedicated enterprise support team

Support

For enterprise deployments and custom integrations, contact our team at [email protected].

Development in Progress

The docs are under active development. If you have questions or feedback, please reach out to us.