Hippo: RAG & Retrieval 🦛

Build intelligent Q&A systems over documents with flagship accuracy at mini model cost. Hippo delivers precision retrieval with 70% smaller context windows, achieving 80% cost reduction while maintaining 99.5% accuracy match to flagship models.

80% cost reduction: retrieve only the relevant chunks, not entire documents.

99.5% accuracy match: flagship-model quality through intelligent retrieval.

70% smaller context: precision RAG eliminates noise and keeps what matters.

What is Hippo?

Hippo is Cerevox’s RAG (Retrieval-Augmented Generation) API that enables AI agents to search and query document collections with natural language. Instead of sending entire documents to your LLM (expensive, slow, noisy), Hippo:

Indexes your documents with semantic understanding
Retrieves only the most relevant chunks (70% smaller context)
Generates AI answers with source citations
Saves you 80% on LLM costs while matching flagship accuracy
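
To make "retrieves only the most relevant chunks" concrete, here is a toy sketch of top-k chunk retrieval. It uses bag-of-words cosine similarity as a crude stand-in for semantic embeddings; Hippo's actual indexing and ranking are not described on this page, and `tokenize`, `cosine`, and `top_k_chunks` are hypothetical helpers written for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase word counts; a crude stand-in for a semantic embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_chunks(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (hypothetical helper)."""
    q = tokenize(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, tokenize(c)), reverse=True)
    return ranked[:k]


chunks = [
    "To authenticate users, send the API key in the request header.",
    "Our office is closed on public holidays.",
    "Password reset links expire after 24 hours.",
]
selected = top_k_chunks("How do I authenticate users?", chunks, k=1)
print(selected)
```

Only the selected chunk would be passed to the LLM as context, which is where the smaller context window comes from.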

Perfect for: Customer support bots, internal knowledge bases, document Q&A, research assistants, and any AI system that needs to “know” information from documents.

Core Concepts

Folders - Organize Documents

Folders are collections of documents that form a searchable knowledge base.

Each folder is an isolated knowledge domain
Upload PDFs, DOCX, PPTX, and more
Automatically indexed for semantic search
Support 1 to 10,000+ documents per folder

Use cases: Product docs, customer records, research papers, legal cases

Files - Your Data Sources

Files are the documents you upload to folders.

Support 12+ formats: PDF, DOCX, PPTX, XLSX, TXT, HTML, CSV, etc.
Upload from local files or URLs
Automatic processing and indexing
Rich metadata extraction

Processing: Files are automatically parsed, chunked, and indexed for retrieval
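
As a rough picture of the chunking step, the sketch below splits text into overlapping word windows. This is one common strategy, not necessarily the one Hippo uses; `chunk_words` and its `size` and `overlap` parameters are assumptions made for illustration.

```python
def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words; each window shares
    `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reaches the end of the text
    return chunks


text = " ".join(f"w{i}" for i in range(12))
pieces = chunk_words(text, size=5, overlap=2)
for piece in pieces:
    print(piece)
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of indexing some words twice.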

Chats - Conversation Context

Chat sessions maintain conversation context for Q&A.

Each chat is connected to a folder
Maintains conversation history
Supports follow-up questions
Multiple chats per folder

Use cases: Support conversations, research sessions, document analysis
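
A follow-up question only has context if it goes to the same chat as the question before it. The sketch below sends a sequence of questions to one chat through `submit_ask(chat_id, question)`, the call used in the Quick Example. `ask_in_sequence`, `FakeHippo`, and `FakeAnswer` are hypothetical: the fake client is a stand-in so the snippet runs without an API key, and it says nothing about how the real service stores history.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAnswer:
    response: str


@dataclass
class FakeHippo:
    """Offline stand-in for the client; it only records what each chat was asked."""
    history: dict = field(default_factory=dict)

    def submit_ask(self, chat_id: str, question: str) -> FakeAnswer:
        asked = self.history.setdefault(chat_id, [])
        asked.append(question)
        return FakeAnswer(response=f"(answer {len(asked)} in chat {chat_id})")


def ask_in_sequence(client, chat_id: str, questions: list[str]) -> list[str]:
    """Send every question to the same chat so later ones can build on earlier ones."""
    return [client.submit_ask(chat_id, q).response for q in questions]


client = FakeHippo()
replies = ask_in_sequence(client, "chat-1", [
    "How do I authenticate users?",
    "Does that also work for service accounts?",
])
print(replies)
```

With the real client, `client` would be the `Hippo` instance and `chat_id` the `chat.id` returned by `create_chat`.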

Asks - Questions & Answers

Asks are questions submitted to a chat that generate AI-powered answers.

Natural language questions
AI-generated answers with source citations
Confidence scores for each answer
Full conversation history accessible

Returns: Answer text + source documents + page numbers + confidence scores
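
One way to picture that return value is the sketch below. The `response`, `sources`, and `file_name` attributes appear in the Quick Example; `page_number`, `confidence`, the 0-1 confidence scale, and the `format_answer` helper are assumptions for illustration, so check the API reference for the real field names.

```python
from dataclasses import dataclass


@dataclass
class Source:
    file_name: str      # attribute used in the Quick Example
    page_number: int    # assumed field name


@dataclass
class Answer:
    response: str       # attribute used in the Quick Example
    sources: list       # attribute used in the Quick Example
    confidence: float   # assumed field name, assumed 0-1 scale


def format_answer(answer: Answer, min_confidence: float = 0.5) -> str:
    """Render an answer with its citations, flagging low-confidence results."""
    cites = "; ".join(f"{s.file_name} p.{s.page_number}" for s in answer.sources)
    flag = "" if answer.confidence >= min_confidence else " [low confidence]"
    return f"{answer.response}{flag} (sources: {cites})"


answer = Answer(
    response="Use an API key in the request header.",
    sources=[Source("user-guide.pdf", 12), Source("api-docs.pdf", 3)],
    confidence=0.91,
)
print(format_answer(answer))
```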

How It Works

Create a Folder

Organize documents into a knowledge base

Upload Files

Add documents from local files or URLs

Create Chat

Start a conversation session linked to the folder

Ask Questions

Submit natural language questions and get AI answers with sources

Quick Example

from cerevox import Hippo

# Initialize
hippo = Hippo(api_key="your-api-key")

# 1. Create knowledge base
folder = hippo.create_folder("Product Documentation")

# 2. Upload documents
hippo.upload_file(folder.id, "user-guide.pdf")
hippo.upload_file_from_url(folder.id, "https://example.com/api-docs.pdf")

# 3. Create chat
chat = hippo.create_chat(folder.id, "Support Q&A")

# 4. Ask questions
answer = hippo.submit_ask(
    chat.id,
    "How do I authenticate users?"
)

print(f"Answer: {answer.response}")
print(f"Sources: {[s.file_name for s in answer.sources]}")
# 80% cost reduction vs. full document retrieval!

Key Features

Semantic Search

AI-powered understanding

Finds relevant content by meaning, not just keywords
Handles synonyms and context
Multi-language support

Source Citations

Verify every answer

Exact source documents
Page numbers included
Confidence scores

Conversation Memory

Contextual follow-ups

Chats remember previous questions
Support clarifying questions
Full history accessible

Multi-format Support

12+ file formats

PDF, DOCX, PPTX, XLSX
TXT, HTML, CSV, and more
Automatic format detection

Async Operations

High performance

Full async/await support
Concurrent uploads
Batch processing
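
The async client's import name and method signatures are not shown on this page, so the sketch below does not use it. Instead it illustrates the idea of concurrent uploads by running the synchronous `upload_file(folder_id, path)` call from the Quick Example in worker threads with `asyncio.to_thread`. `FakeHippo`, `upload_all`, and the `limit` parameter are hypothetical, and the fake client's return value is invented so the snippet runs offline.

```python
import asyncio


class FakeHippo:
    """Offline stand-in for the synchronous client (no API key needed)."""

    def upload_file(self, folder_id: str, path: str) -> str:
        return f"{folder_id}:{path}"


async def upload_all(client, folder_id: str, paths: list[str], limit: int = 4) -> list[str]:
    """Upload files concurrently, with at most `limit` uploads in flight."""
    gate = asyncio.Semaphore(limit)

    async def upload_one(path: str) -> str:
        async with gate:
            # Run the blocking call in a worker thread so uploads can overlap.
            return await asyncio.to_thread(client.upload_file, folder_id, path)

    # gather() returns results in the same order as `paths`.
    return await asyncio.gather(*(upload_one(p) for p in paths))


results = asyncio.run(upload_all(FakeHippo(), "folder-1", ["a.pdf", "b.pdf", "c.pdf"]))
print(results)
```

The semaphore caps concurrency so that a large batch does not open hundreds of connections at once; a suitable limit depends on your rate limits.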

Enterprise Ready

Production proven

Automatic retries
Error handling
Usage tracking

The Cost Savings Advantage

The two snippets below contrast sending whole documents to an LLM (first) with a Hippo query (second).

import openai  # assumes the openai>=1.0 package with OPENAI_API_KEY set

# Traditional approach: Send entire documents
# load_all_documents() is a placeholder for your own document loader
question = "How do I authenticate users?"
documents = load_all_documents()  # Large context
context = "\n\n".join([doc.content for doc in documents])

# Send to LLM - EXPENSIVE
llm_response = openai.chat.completions.create(
    model="gpt-4",  # Flagship model required
    messages=[{
        "role": "user",
        "content": f"Context: {context}\n\nQuestion: {question}"
    }],
)
# High token costs: 10,000+ tokens per query
# Slow: Large context = slower processing
# Noisy: Irrelevant content confuses the model

# Hippo approach: Precision retrieval
answer = hippo.submit_ask(chat_id, question)

# BEHIND THE SCENES:
# 1. Semantic search finds relevant chunks only
# 2. 70% smaller context (3,000 tokens vs 10,000)
# 3. Same accuracy as flagship models
# 4. Source citations included

print(f"Answer: {answer.response}")
print(f"Sources: {answer.sources}")

# 80% COST REDUCTION
# Faster: Smaller context = faster responses
# Cleaner: Only relevant content
# Verified: Source citations for every answer

Use Cases

Customer Support Automation

Build AI support agents that answer customer questions

Upload help docs, FAQs, and knowledge base
Customers ask questions in natural language
Get instant answers with source citations
80% reduction in support costs

Example: “How do I reset my password?” → Answer + link to help article

Internal Knowledge Bases

Make company knowledge searchable

Upload policies, procedures, onboarding docs
Employees ask questions, get instant answers
Reduce time spent searching for information
Keep knowledge always accessible

Example: “What’s our remote work policy?” → Answer from HR handbook

Legal & Compliance

Search contracts and legal documents

Upload contracts, agreements, legal cases
Ask questions about terms, clauses, precedents
Get answers with exact citations
Verify every response with sources

Example: “What are the termination clauses?” → Answer with contract references

Research Assistants

Query research papers and technical docs

Upload papers, reports, technical documentation
Ask research questions
Get synthesized answers from multiple sources
Citations to original papers

Example: “What methods did Smith et al. use?” → Answer from relevant papers

Financial Analysis

Query financial reports and filings

Upload 10-Ks, earnings reports, analyst notes
Ask about metrics, trends, risks
Get answers with exact page references
Compare across multiple documents

Example: “What were Q3 revenue drivers?” → Answer from earnings call

Hippo vs. Traditional RAG

Feature	Traditional RAG	Hippo RAG
Context Size	Full documents (10,000+ tokens)	Relevant chunks only (3,000 tokens)
Cost per Query	$0.10-$0.50	$0.02-$0.10 (80% reduction)
Accuracy	Good (with flagship models)	99.5% match (with mini models)
Response Time	Slow (large context)	Fast (smaller context)
Source Citations	Manual implementation	Built-in with confidence scores
Setup Complexity	High (vector DB, embeddings, retrieval logic)	Low (API-only, no infrastructure)
Maintenance	Ongoing (infrastructure, tuning)	None (managed service)
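
The percentages quoted on this page can be checked against the table's own numbers. The short calculation below uses only the token counts and cost ranges from the table; `percent_reduction` is a helper written for this check.

```python
def percent_reduction(before: float, after: float) -> int:
    """Percentage saved when a quantity drops from `before` to `after`."""
    return round(100 * (1 - after / before))


context_saving = percent_reduction(10_000, 3_000)   # tokens per query
cost_saving_low = percent_reduction(0.10, 0.02)     # low end of the cost range, USD
cost_saving_high = percent_reduction(0.50, 0.10)    # high end of the cost range, USD

print(context_saving, cost_saving_low, cost_saving_high)
```

Both ends of the cost range give the same 80% figure, and the token counts give the 70% context figure. The cost saving is larger than the context saving, presumably because the smaller context is paired with a cheaper mini model, as the introduction describes.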

API Clients

Hippo provides both synchronous and asynchronous clients. The synchronous client is shown here:

from cerevox import Hippo

# Best for: Simple scripts, notebooks, learning
hippo = Hippo(api_key="your-api-key")

folder = hippo.create_folder("Docs")
chat = hippo.create_chat(folder.id)
answer = hippo.submit_ask(chat.id, "Question?")

Next Steps

Quickstart Guide

Build your first Q&A system in 5 minutes

Folder Management

Organize documents effectively

File Operations

Upload and manage documents

Chat Sessions

Create conversation contexts

Q&A System

Ask questions and get answers

Best Practices

Optimize retrieval quality and costs

Ready to save 80%? Check out the quickstart guide or explore RAG examples.
