
SAGE Models

SAGE Language Models

SAGE represents SAGEA's advanced language understanding capabilities, providing sophisticated reasoning, contextual memory, and multilingual support for natural conversations and complex problem-solving.

🧠 Advanced Reasoning Engine

SAGE models excel at complex logical reasoning, contextual understanding, and multimodal integration.

Model Overview

Model     | Context Window | Reasoning | Speed    | Cost   | Use Cases
SAGE      | 200K tokens    | Advanced  | Standard | Higher | Research, Analysis, Complex Tasks
SAGE-mini | 32K tokens     | Standard  | Fast     | Lower  | Chat, Q&A, High-Volume

SAGE (Flagship Model)

Our most capable language model designed for complex reasoning and comprehensive understanding.

Key Capabilities

  • Advanced Reasoning: Multi-step logical reasoning and problem-solving
  • Extended Context: 200K token context window for long documents
  • Multimodal Integration: Seamless combination with vision and voice
  • Cultural Understanding: Deep knowledge of diverse languages and cultures
  • Memory Management: Sophisticated contextual memory across conversations

Core Features

import sagea
 
client = sagea.ChatClient(api_key="your-api-key")
 
# Complex reasoning task
response = client.chat(
    messages=[{
        "role": "user",
        "content": """Analyze the economic implications of renewable energy 
                     adoption on developing nations, considering infrastructure 
                     costs, job market changes, and long-term sustainability."""
    }],
    model="sage",
    reasoning_mode="analytical"
)
 
print(response.content)

Advanced Reasoning Modes

SAGE supports different reasoning approaches for optimal results:

🔍 Analytical

Step-by-step analysis with detailed reasoning chains

💡 Creative

Innovative solutions and creative problem-solving

⚖️ Balanced

Optimal mix of accuracy and creativity

📊 Factual

Precise, fact-based responses with citations
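The mode names above map directly to the `reasoning_mode` parameter shown in the earlier example. As an illustration of how an application might choose between them, here is a small keyword-based selector; the helper itself is not part of the sagea SDK, only the four mode strings come from this page:

```python
# Illustrative helper (not part of the sagea SDK): pick a reasoning_mode
# value from the four modes above based on simple task keywords.
def pick_reasoning_mode(task: str) -> str:
    task = task.lower()
    if any(word in task for word in ("prove", "analyze", "step-by-step", "debug")):
        return "analytical"   # detailed reasoning chains
    if any(word in task for word in ("brainstorm", "story", "design", "invent")):
        return "creative"     # innovative problem-solving
    if any(word in task for word in ("cite", "fact", "statistic", "reference")):
        return "factual"      # precise, citation-backed responses
    return "balanced"         # default mix of accuracy and creativity

mode = pick_reasoning_mode("Analyze the quarterly revenue trend")  # → "analytical"
# The result can then be passed as reasoning_mode=mode to client.chat(...)
```

A rules-based selector like this is a starting point; in practice you might let SAGE itself classify the task, or expose the choice to the user.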

Long-Context Understanding

# Process long documents
with open("research_paper.txt", "r") as f:
    document = f.read()
 
response = client.chat(
    messages=[
        {
            "role": "user", 
            "content": f"""Here's a research paper: {document}
                          
                          Please provide a comprehensive summary including:
                          1. Main findings
                          2. Methodology critique
                          3. Implications for future research
                          4. Potential applications"""
        }
    ],
    model="sage",
    context_management="extended"
)

Performance Metrics

  • Context Window: 200,000 tokens (~150,000 words)
  • Response Quality: 4.9/5.0 (expert evaluation)
  • Reasoning Accuracy: 94% on complex logic tasks
  • Multilingual Performance: Native-level in 50+ languages

SAGE-mini (Efficient Model)

Optimized for speed and cost-effectiveness while maintaining strong language understanding.

Key Features

  • Fast Response Times: 2-3x faster than SAGE
  • Cost Effective: 70% lower cost per token
  • Efficient Processing: Optimized for high-volume applications
  • Good Reasoning: Solid performance on standard tasks
  • Streamlined Architecture: Focused on essential capabilities

Optimal Use Cases

# Customer support chatbot
response = client.chat(
    messages=[
        {"role": "user", "content": "How do I reset my password?"}
    ],
    model="sage-mini",
    response_style="helpful"
)
 
# Quick content generation
response = client.chat(
    messages=[{
        "role": "user",
        "content": "Write a brief product description for wireless headphones"
    }],
    model="sage-mini",
    max_tokens=150
)
 
# Real-time chat assistance (chat_stream yields incoming user messages)
def stream_replies(chat_stream):
    for user_message in chat_stream:
        response = client.chat(
            messages=[{"role": "user", "content": user_message}],
            model="sage-mini",
            stream=True
        )
        for chunk in response:
            yield chunk.content

Performance Metrics

  • Context Window: 32,000 tokens (~24,000 words)
  • Response Speed: 500-800ms average
  • Quality Score: 4.3/5.0 (user evaluation)
  • Cost Efficiency: 70% reduction vs SAGE

Conversation Management

Both SAGE models excel at maintaining context across long conversations.

Memory and Context

# Start a persistent conversation
conversation = client.start_conversation(
    model="sage",
    memory_type="long_term"
)
 
# Multiple interactions with maintained context
response1 = conversation.add_message(
    "I'm planning a trip to Japan. What should I know about the culture?"
)
 
response2 = conversation.add_message(
    "What about the best time to visit for cherry blossoms?"
)
 
response3 = conversation.add_message(
    "Can you recommend some traditional foods to try?"
)
 
# SAGE remembers the entire conversation context
response4 = conversation.add_message(
    "Based on our conversation, create a 7-day itinerary"
)

Conversation Features

  • Contextual Memory: Remember key details across messages
  • Personality Consistency: Maintain consistent tone and style
  • Topic Tracking: Follow conversation threads naturally
  • Reference Resolution: Understand pronouns and implicit references

Function Calling and Tools

SAGE models can interact with external tools and APIs for enhanced capabilities.

Function Integration

# Define available functions
functions = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            }
        }
    },
    {
        "name": "search_web",
        "description": "Search the internet for current information",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string"}
            }
        }
    }
]
 
# Chat with function calling enabled
response = client.chat(
    messages=[{
        "role": "user",
        "content": "What's the weather like in Tokyo, and find recent news about AI development there?"
    }],
    model="sage",
    functions=functions,
    function_call="auto"
)
 
# SAGE will automatically call appropriate functions
if response.function_calls:
    for call in response.function_calls:
        print(f"Function: {call.name}")
        print(f"Arguments: {call.arguments}")

Available Tool Categories

  • Information Retrieval: Web search, database queries
  • Data Processing: File analysis, calculations
  • External APIs: Third-party service integration
  • Multimodal Tools: Image analysis, voice synthesis
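On the application side, each function call SAGE returns still has to be executed locally. A minimal dispatcher sketch, using the `get_weather` and `search_web` names from the example above (the handlers here are stubs, and calls are represented as plain dicts rather than the SDK's response objects):

```python
# Stub implementations -- a real application would call actual services here.
def get_weather(location, units="celsius"):
    return {"location": location, "units": units, "temp": None}

def search_web(query):
    return {"query": query, "results": []}

HANDLERS = {"get_weather": get_weather, "search_web": search_web}

def dispatch(function_calls):
    """Run each requested function call and collect the results."""
    results = []
    for call in function_calls:
        handler = HANDLERS.get(call["name"])
        if handler is None:
            raise ValueError(f"Unknown function: {call['name']}")
        results.append(handler(**call["arguments"]))
    return results

results = dispatch([{"name": "get_weather", "arguments": {"location": "Tokyo"}}])
```

The results are typically appended back to the message list so SAGE can compose its final answer from them.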

Multimodal Capabilities

SAGE models seamlessly integrate with VORA voice synthesis and vision capabilities.

Voice + Language Integration

# Combined voice and language processing
multimodal_client = sagea.MultimodalClient(api_key="your-api-key")
 
# Process audio input and respond with voice
response = multimodal_client.process(
    audio_input="user_question.wav",
    response_mode="voice",
    voice_model="vora-v1",
    language_model="sage",
    emotion="helpful"
)
 
# Get both text and audio response
print(response.text)  # Text response
response.audio.save("answer.wav")  # Voice response

Vision + Language Integration

# Analyze image and discuss findings
response = client.chat(
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Analyze this chart and explain the trends"},
            {"type": "image", "image_url": "https://example.com/chart.png"}
        ]
    }],
    model="sage"
)

Multilingual Excellence

SAGE models provide native-level understanding across 50+ languages.

Language Switching

# Seamless multilingual conversation
response = client.chat(
    messages=[
        {"role": "user", "content": "Hello, I speak multiple languages"},
        {"role": "assistant", "content": "Great! I can communicate in many languages too."},
        {"role": "user", "content": "Parfait! Parlez-vous français?"},
        {"role": "assistant", "content": "Oui, je parle français couramment!"},
        {"role": "user", "content": "¿Y español también?"},
        {"role": "assistant", "content": "¡Por supuesto! Hablo español también."},
        {"role": "user", "content": "Now explain quantum physics in Hindi"}
    ],
    model="sage",
    auto_language_detection=True
)

Cultural Context

SAGE understands cultural nuances and context:

  • Regional Expressions: Local idioms and phrases
  • Cultural References: Historical and cultural knowledge
  • Social Norms: Appropriate communication styles
  • Business Customs: Professional interaction patterns

Customization and Fine-tuning

Domain Adaptation

# Create domain-specific assistant
response = client.chat(
    messages=[{
        "role": "system",
        "content": """You are a medical research assistant with deep knowledge 
                     of oncology, cardiology, and clinical trials. Provide 
                     evidence-based responses with appropriate citations."""
    }, {
        "role": "user",
        "content": "Explain the latest developments in CAR-T cell therapy"
    }],
    model="sage",
    domain="medical"
)

Custom Instructions

  • System Prompts: Define behavior and expertise
  • Response Format: Structure outputs consistently
  • Tone and Style: Match brand or user preferences
  • Safety Guidelines: Implement content filtering
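The four customization points above can be combined into a single system prompt. A hypothetical helper (not an SDK function) that assembles one:

```python
# Hypothetical helper that builds a system prompt from the four
# customization points: behavior/expertise, response format, tone, safety.
def build_system_prompt(role, response_format=None, tone=None, safety=None):
    parts = [f"You are {role}."]
    if response_format:
        parts.append(f"Always respond as {response_format}.")
    if tone:
        parts.append(f"Use a {tone} tone.")
    if safety:
        parts.append(f"Safety guidelines: {safety}")
    return " ".join(parts)

system_prompt = build_system_prompt(
    "a medical research assistant",
    response_format="a bulleted summary with citations",
    tone="formal",
)
# Pass as {"role": "system", "content": system_prompt} in the messages list
```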

Performance Optimization

Efficient Prompting

# Optimized prompt structure
response = client.chat(
    messages=[{
        "role": "user",
        "content": """Task: Summarize the key points
                     
                     Context: [Your context here]
                     
                     Requirements:
                     - 3-5 bullet points
                     - Focus on actionable insights
                     - Include data where relevant
                     
                     Text to summarize: [Your text here]"""
    }],
    model="sage-mini",  # Use efficient model for structured tasks
    temperature=0.3     # Lower temperature for consistency
)

Caching and Optimization

  • Response Caching: Store common query results
  • Prompt Templates: Reuse effective prompt patterns
  • Batch Processing: Process multiple queries efficiently
  • Streaming: Improve perceived response times
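Response caching, the first point above, can be as simple as memoizing answers keyed on model and prompt so identical queries skip the API round trip. A minimal sketch (the `ask` callable stands in for `client.chat`; nothing here is SDK-provided):

```python
import hashlib

class ResponseCache:
    """Memoize responses so repeated identical prompts skip the API call."""

    def __init__(self, ask):
        self._ask = ask        # callable(model, prompt) -> response
        self._store = {}
        self.hits = 0

    def chat(self, model, prompt):
        key = hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        result = self._ask(model, prompt)
        self._store[key] = result
        return result

# Demo with a fake backend that records how often it is actually called
calls = []
def fake_ask(model, prompt):
    calls.append(prompt)
    return f"answer to: {prompt}"

cache = ResponseCache(fake_ask)
cache.chat("sage-mini", "How do I reset my password?")
cache.chat("sage-mini", "How do I reset my password?")  # served from cache
```

A production cache would also bound its size and expire stale entries, since model behavior and underlying data change over time.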

Enterprise Features

Security and Compliance

  • Data Encryption: End-to-end encryption for all communications
  • Privacy Controls: Configure data retention and processing
  • Audit Logging: Comprehensive activity tracking
  • Compliance: SOC 2, GDPR, and industry standards

Custom Deployment

  • Private Models: Dedicated model instances
  • On-Premises: Local deployment for data sovereignty
  • Hybrid Architecture: Combine cloud and on-premises
  • Custom Training: Fine-tune models for specific domains

Best Practices

Model Selection

  1. Complex Tasks: Use SAGE for research, analysis, creative work
  2. Simple Queries: Use SAGE-mini for basic Q&A, chat
  3. Long Documents: Use SAGE for extended context needs
  4. High Volume: Use SAGE-mini for cost efficiency
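The four guidelines above can be condensed into a simple routing rule: complex or long-input tasks go to SAGE, everything else to SAGE-mini. A sketch, where the 32K threshold mirrors SAGE-mini's context window and token counting uses a rough 4-characters-per-token estimate rather than the real tokenizer:

```python
# Illustrative model router. Token estimate is approximate (4 chars/token);
# use the actual tokenizer for precise counts.
def choose_model(prompt: str, complex_task: bool = False) -> str:
    estimated_tokens = len(prompt) // 4
    if complex_task or estimated_tokens > 32_000:
        return "sage"        # complex reasoning or extended context
    return "sage-mini"       # fast, cost-efficient default
```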

Prompt Engineering

  1. Be Specific: Clear, detailed instructions work best
  2. Provide Context: Include relevant background information
  3. Structure Requests: Use clear formatting and organization
  4. Iterate and Refine: Test and improve prompts over time

Error Handling

import time
from sagea.exceptions import RateLimitError, ModelError
 
def robust_chat(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat(
                messages=messages,
                model="sage"
            )
        except RateLimitError:
            # Exponential backoff before retrying
            time.sleep(2 ** attempt)
        except ModelError:
            # Fall back to the lighter model if SAGE itself fails
            return client.chat(
                messages=messages,
                model="sage-mini"
            )
    raise RuntimeError("Max retries exceeded")

Pricing and Limits

Usage-Based Pricing

  • Input Tokens: Cost per token processed
  • Output Tokens: Cost per token generated
  • Function Calls: Additional cost for tool usage
  • Context Storage: Cost for maintaining conversation memory
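Since input and output tokens are billed separately, cost estimation is a simple weighted sum. The per-million-token rates below are placeholders, not SAGEA's published pricing (substitute the rates from your plan); they are chosen only to reflect the 70%-lower-cost relationship between the two models noted earlier:

```python
# PLACEHOLDER rates per million tokens (input, output) -- not real pricing.
RATES_PER_M_TOKENS = {
    "sage":      (10.00, 30.00),
    "sage-mini": (3.00, 9.00),   # 70% lower than sage
}

def estimate_cost(model, input_tokens, output_tokens):
    """Estimated cost in dollars for one request."""
    rate_in, rate_out = RATES_PER_M_TOKENS[model]
    return (input_tokens * rate_in + output_tokens * rate_out) / 1_000_000
```

Function calls and context storage would add further line items on top of this base token cost.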

Rate Limits

  • SAGE: 60 requests/minute (Pro), Custom (Enterprise)
  • SAGE-mini: 100 requests/minute (Pro), Custom (Enterprise)
  • Context Usage: No additional charge beyond standard token pricing
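To stay under these per-minute limits client-side, a sliding-window limiter can refuse requests once the window is full. A sketch with an injected clock so the behavior is deterministic (this is application code, not part of the sagea SDK):

```python
from collections import deque

class RateLimiter:
    """Sliding-window limiter: at most max_per_minute requests per 60s."""

    def __init__(self, max_per_minute, clock):
        self.max_per_minute = max_per_minute
        self.clock = clock           # callable returning seconds, e.g. time.monotonic
        self._stamps = deque()

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have fallen out of the 60-second window
        while self._stamps and now - self._stamps[0] >= 60:
            self._stamps.popleft()
        if len(self._stamps) < self.max_per_minute:
            self._stamps.append(now)
            return True
        return False

# Example with a fake clock: 60 req/min, as for SAGE on the Pro tier
t = [0.0]
limiter = RateLimiter(60, clock=lambda: t[0])
granted = sum(limiter.allow() for _ in range(61))  # 61st request is refused
```

In production this would typically wrap the retry logic shown under Error Handling, sleeping instead of refusing outright.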

Next Steps

SAGE models represent the cutting edge of language understanding, providing the reasoning capabilities needed for sophisticated AI applications. Choose the right model for your use case and explore the possibilities of advanced AI conversation.