API Reference

Welcome to the SAGEA API reference documentation. Our APIs provide access to VORA™, our ultra-efficient voice synthesis engine that generates hyper-realistic, emotionally expressive speech from text in real time.

Quick Start with VORA

Get started with VORA voice synthesis in minutes. Generate high-fidelity speech with real-time emotion control.

curl -X POST https://api.sagea.space/v1/vora/synthesize \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello from VORA!", "model": "vora-v1"}'

Base URL

All API requests should be made to:

https://api.sagea.space/v1

Authentication

SAGEA authenticates requests with API keys. Send your API key as a Bearer token in the Authorization header:

curl -H "Authorization: Bearer YOUR_API_KEY" \
     https://api.sagea.space/v1/vora/synthesize
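
The same authenticated call can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper name `build_synthesize_request` is hypothetical, and the request is built but not sent, since sending it requires a valid key.

```python
import json
import urllib.request

BASE_URL = "https://api.sagea.space/v1"  # from the Base URL section


def build_synthesize_request(api_key: str, text: str,
                             model: str = "vora-v1") -> urllib.request.Request:
    """Build (but do not send) an authenticated POST to /vora/synthesize."""
    body = json.dumps({"text": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/vora/synthesize",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_synthesize_request("YOUR_API_KEY", "Hello from VORA!")
    # Sending needs a real key and network access:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
    print(req.get_method(), req.full_url)
```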

VORA Voice Synthesis API

VORA uses a hybrid attention-convolutional architecture optimized to reduce inference latency while preserving voice fidelity, expressiveness, and personality.

Synthesize Speech

Generate hyper-realistic speech with emotional expressiveness using VORA.

POST /vora/synthesize

Request Body

{
  "text": "Hello, this is VORA voice synthesis with emotional control.",
  "model": "vora-v1",
  "language": "en-US",
  "voice": "default",
  "options": {
    "emotion": "friendly",
    "speaking_rate": 1.0,
    "pitch_shift": 0.0,
    "format": "wav",
    "sample_rate": 22050,
    "enable_watermark": true
  }
}

Parameters

The parameters from `emotion` through `enable_watermark` belong inside the `options` object, as in the request body above; the rest are top-level fields.

Parameter	Type	Required	Description
`text`	string	Yes	Text to synthesize (max 5000 characters)
`model`	string	Yes	VORA model (vora-v1, vora-l1, vora-l2)
`language`	string	No	Language code (default: en-US)
`voice`	string	No	Voice variant (default, alt1, alt2)
`emotion`	string	No	Emotion control (neutral, friendly, professional, excited, calm, warm)
`speaking_rate`	float	No	Speaking speed (0.5 to 2.0; default: 1.0)
`pitch_shift`	float	No	Pitch adjustment (-1.0 to 1.0; default: 0.0)
`format`	string	No	Audio format (wav, mp3, ogg; default: wav)
`sample_rate`	integer	No	Sample rate in Hz (16000, 22050, 44100; default: 22050)
`enable_watermark`	boolean	No	Enable audio watermarking (default: true)
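
Checking these limits on the client before sending can save a round trip. The sketch below is one possible way to do that; `build_synthesis_payload` is a hypothetical helper, the range bounds are assumed to be inclusive, and `voice` is left unchecked because the table does not say whether other voice names are accepted.

```python
MAX_TEXT_CHARS = 5000
ALLOWED_MODELS = {"vora-v1", "vora-l1", "vora-l2"}
ALLOWED_EMOTIONS = {"neutral", "friendly", "professional", "excited", "calm", "warm"}
ALLOWED_FORMATS = {"wav", "mp3", "ogg"}
ALLOWED_SAMPLE_RATES = {16000, 22050, 44100}
KNOWN_OPTIONS = {"emotion", "speaking_rate", "pitch_shift",
                 "format", "sample_rate", "enable_watermark"}


def build_synthesis_payload(text: str, model: str, language: str = "en-US",
                            voice: str = "default", **options) -> dict:
    """Validate arguments against the documented limits and return the request body."""
    if not text or len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text must be 1 to {MAX_TEXT_CHARS} characters")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    unknown = set(options) - KNOWN_OPTIONS
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    if "emotion" in options and options["emotion"] not in ALLOWED_EMOTIONS:
        raise ValueError(f"unknown emotion: {options['emotion']!r}")
    # Assumption: both ends of each documented range are allowed.
    if not 0.5 <= options.get("speaking_rate", 1.0) <= 2.0:
        raise ValueError("speaking_rate must be between 0.5 and 2.0")
    if not -1.0 <= options.get("pitch_shift", 0.0) <= 1.0:
        raise ValueError("pitch_shift must be between -1.0 and 1.0")
    if options.get("format", "wav") not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {options['format']!r}")
    if options.get("sample_rate", 22050) not in ALLOWED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {options['sample_rate']!r}")

    payload = {"text": text, "model": model, "language": language, "voice": voice}
    if options:
        # Per the request body example, tuning parameters nest under "options".
        payload["options"] = dict(options)
    return payload
```

For example, `build_synthesis_payload("Hello", "vora-v1", emotion="calm")` returns a body with `"options": {"emotion": "calm"}`, while `sample_rate=8000` raises `ValueError` before any request is made.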

Supported Languages

VORA supports more than 30 languages, including:

English (en-US, en-GB, en-AU)
Nepali (ne-NP)
Hindi (hi-IN)
French (fr-FR, fr-CA)
Swahili (sw-KE, sw-TZ)
Japanese (ja-JP)
Spanish (es-ES, es-MX)
German (de-DE)
Portuguese (pt-BR, pt-PT)
Chinese (zh-CN, zh-TW)
Arabic (ar-SA)
Russian (ru-RU)

VORA Models

Model	Description	Use Case
`vora-v1`	High-fidelity multilingual voice model	Production applications, media, accessibility
`vora-l1`	Lightweight edge-deployable TTS	Mobile apps, IoT devices
`vora-l2`	Ultra-lightweight for constrained environments	Embedded systems, real-time applications

Response

{
  "audio_url": "https://api.sagea.space/audio/vora_abc123.wav",
  "duration": 3.45,
  "format": "wav",
  "sample_rate": 22050,
  "model_used": "vora-v1",
  "language": "en-US",
  "watermarked": true
}
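
A response of this shape can be parsed into a typed record as sketched below. The `SynthesisResult` class and `parse_synthesis_response` function are hypothetical names, and the sketch assumes every field in the sample is always present. The unit of `duration` is not stated in this reference, so the code does not interpret it.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SynthesisResult:
    """Mirrors the fields of the sample response above."""
    audio_url: str
    duration: float
    format: str
    sample_rate: int
    model_used: str
    language: str
    watermarked: bool


def parse_synthesis_response(raw: str) -> SynthesisResult:
    """Parse a JSON response body; raises KeyError if a sample field is missing."""
    data = json.loads(raw)
    # Pick only the known fields so extra keys in the response are ignored.
    return SynthesisResult(**{f.name: data[f.name] for f in fields(SynthesisResult)})
```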

Voice Cloning

Create custom voice models with VORA's voice cloning capabilities.

POST /vora/clone

Request Body

{
  "name": "custom_voice",
  "audio_samples": ["base64_audio_1", "base64_audio_2"],
  "text_transcripts": ["Sample text one", "Sample text two"],
  "language": "en-US",
  "options": {
    "training_steps": 1000,
    "quality": "high"
  }
}
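
Building this body from audio already in memory might look like the following sketch. It rests on two assumptions that the reference does not confirm: that `audio_samples` holds standard base64 encodings of the raw audio file bytes, and that each transcript pairs with the sample at the same index. `build_clone_payload` is a hypothetical helper name.

```python
import base64


def build_clone_payload(name: str, samples: list[tuple[bytes, str]],
                        language: str = "en-US", training_steps: int = 1000,
                        quality: str = "high") -> dict:
    """Build a /vora/clone request body from (audio_bytes, transcript) pairs."""
    if not samples:
        raise ValueError("at least one (audio, transcript) pair is required")
    return {
        "name": name,
        # Assumption: standard base64 of the raw audio bytes.
        "audio_samples": [base64.b64encode(audio).decode("ascii") for audio, _ in samples],
        # Assumption: transcripts align with audio_samples by index.
        "text_transcripts": [transcript for _, transcript in samples],
        "language": language,
        "options": {"training_steps": training_steps, "quality": quality},
    }
```

Passing pairs rather than two separate lists keeps each recording next to its transcript, so the two arrays cannot drift out of step.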

Real-time Streaming

Stream audio synthesis for real-time applications.

POST /vora/stream

Supports WebSocket connections for continuous audio streaming with dynamic emotion shifting and adaptive speaking styles.

Enterprise API

Custom Model Deployment

Request Body

{
  "model_id": "custom-vora-model",
  "deployment_type": "cloud",
  "configuration": {
    "auto_scaling": true,
    "max_instances": 10,
    "regions": ["us-east-1", "eu-west-1"]
  }
}

Error Handling

SAGEA uses standard HTTP status codes and returns error details in a JSON body:

{
  "error": {
    "type": "invalid_request",
    "code": "unsupported_language",
    "message": "The specified language 'xx-XX' is not supported by VORA.",
    "supported_languages": ["en-US", "hi-IN", "ne-NP", "..."]
  }
}

Common Error Codes

Status	Code	Description
400	`invalid_request`	Invalid request parameters
401	`unauthorized`	Invalid or missing API key
403	`forbidden`	Insufficient permissions for requested operation
429	`rate_limit_exceeded`	Rate limit exceeded
500	`internal_error`	Internal server error
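
One way to surface these errors in client code is to wrap the JSON body in an exception, as sketched below. `SageaAPIError` and `parse_error` are hypothetical names, and treating 429 and 500 as retryable is a choice made for this sketch, not documented behavior. The sketch keeps both the `type` and `code` fields from the sample body, because the reference does not say which of them the Code column above corresponds to.

```python
import json


class SageaAPIError(Exception):
    """Hypothetical client-side exception wrapping a SAGEA error body."""

    RETRYABLE_STATUSES = frozenset({429, 500})  # assumption: worth retrying

    def __init__(self, status: int, type_: str, code: str, message: str):
        super().__init__(f"HTTP {status} {type_}/{code}: {message}")
        self.status = status
        self.type = type_
        self.code = code
        self.message = message

    @property
    def retryable(self) -> bool:
        return self.status in self.RETRYABLE_STATUSES


def parse_error(status: int, body: str) -> SageaAPIError:
    """Turn an HTTP status and response body into a SageaAPIError."""
    try:
        err = json.loads(body)["error"]
    except (ValueError, KeyError, TypeError):
        err = {}  # body was not JSON or had no "error" object
    if not isinstance(err, dict):
        err = {}
    return SageaAPIError(
        status,
        err.get("type", "unknown"),
        err.get("code", "unknown"),
        err.get("message", body),
    )
```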

Rate Limits

API rate limits vary by plan and model:

VORA-V1: 50 requests/minute (Starter), 500 requests/minute (Pro), Custom (Enterprise)
VORA-L1/L2: 100 requests/minute (Starter), 1000 requests/minute (Pro), Custom (Enterprise)
Voice Cloning: 5 requests/day (Pro), Custom (Enterprise)
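
A client can stay under a per-minute limit by spacing its own requests. The sketch below uses a sliding 60-second window; whether the server counts requests in a sliding or a fixed window is not stated here, so this is an assumption, and `MinuteRateLimiter` is a hypothetical class name. The clock and sleep functions are injectable so the behavior can be checked without waiting.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Block until another request fits in a sliding 60-second window."""

    WINDOW_SECONDS = 60.0

    def __init__(self, max_per_minute: int, clock=time.monotonic, sleep=time.sleep):
        if max_per_minute < 1:
            raise ValueError("max_per_minute must be at least 1")
        self._max = max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of requests inside the window

    def _prune(self, now: float) -> None:
        # Forget requests that have left the window.
        while self._sent and now - self._sent[0] >= self.WINDOW_SECONDS:
            self._sent.popleft()

    def acquire(self) -> None:
        """Call once before each request; sleeps if the window is full."""
        now = self._clock()
        self._prune(now)
        if len(self._sent) >= self._max:
            # Wait until the oldest recorded request leaves the window.
            self._sleep(self.WINDOW_SECONDS - (now - self._sent[0]))
            now = self._clock()
            self._prune(now)
        self._sent.append(now)
```

With the Starter limit for VORA-V1, a client would create `MinuteRateLimiter(50)` and call `acquire()` before each synthesis request.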

Performance Optimizations

VORA achieves up to 6× faster inference and a 50% smaller memory footprint than traditional TTS models. Related capabilities:

Edge Deployment: VORA-L models run efficiently on mobile and IoT devices
Real-time Processing: Sub-100ms latency for streaming applications
Batch Processing: Optimize costs with batch synthesis for multiple texts
Caching: Automatic audio caching for repeated requests

Enterprise Features

Custom Voice Cloning: Create branded voices for your organization
Audio Watermarking: Built-in audio watermarking for content protection
Guardrails: Content filtering and safety controls
On-premise Deployment: Deploy VORA in your own infrastructure
24/7 Support: Dedicated enterprise support team

Support

For enterprise deployments and custom integrations, contact our team at [email protected].

Development in Progress

The docs are under active development. If you have questions or feedback, please reach out to us.
