Getting Started
What makes Lexa different?
Lexa is Cerevox’s enterprise-grade document parsing API that delivers 10x better performance and accuracy compared to traditional solutions. Unlike other APIs that struggle with complex layouts and structured data, Lexa uses state-of-the-art AI models to extract content with 99.9% accuracy while maintaining native async support and vector database optimization.
Key differentiators:
- SOTA accuracy with advanced ML models
- 10x faster processing with sub-second response times
- Vector DB ready chunks optimized for RAG applications
- 12+ file formats with consistent results
- Enterprise-grade reliability with 99.9% SLA
How quickly can I integrate?
You can start parsing documents in under 5 minutes. Our Python SDK handles authentication, retries, and error handling automatically. We also provide comprehensive examples for Django, Flask, FastAPI, and async applications.
What formats does Lexa support?
Lexa supports 12+ file formats with consistent, high-accuracy parsing:
- Documents: PDF, DOCX, PPTX, TXT, HTML, RTF
- Spreadsheets: XLSX, CSV, TSV
- Google Workspace: Google Docs, Sheets, Slides
- Data: JSON, Parquet
All formats support advanced table extraction, image detection, and metadata preservation. File size limits range from 100MB for complex documents to 1GB+ for simple text files.
Technical Implementation
How does async processing work?
Lexa provides native async support with the AsyncLexa client. This enables concurrent processing of multiple documents, significantly improving throughput for batch operations.
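The concurrency pattern described above can be sketched with the standard library alone. This FAQ does not show the AsyncLexa method signatures, so `parse_document` below is a hypothetical stand-in for whatever awaitable parse call the client exposes, and `max_concurrency` is an illustrative parameter rather than an SDK setting.

```python
import asyncio


async def parse_document(path: str) -> dict:
    """Hypothetical stand-in for an awaitable parse call; swap in the real client."""
    await asyncio.sleep(0.01)  # simulates network and processing latency
    return {"source": path, "text": f"parsed contents of {path}"}


async def parse_many(paths: list[str], max_concurrency: int = 5) -> list[dict]:
    """Parse all paths concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(path: str) -> dict:
        async with semaphore:
            return await parse_document(path)

    # gather returns results in the same order as the input paths
    return await asyncio.gather(*(bounded(p) for p in paths))


results = asyncio.run(parse_many(["a.pdf", "b.docx", "c.xlsx"]))
```

Bounding concurrency with a semaphore keeps a large batch from exceeding the plan's request rate while still overlapping the waiting time of individual documents.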
How do I optimize for RAG applications?
Lexa is designed specifically for RAG workflows with built-in vector database optimization. We provide pre-built integration examples for Pinecone, Weaviate, ChromaDB, and Qdrant.
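To make "vector DB ready chunks" concrete, here is a minimal sketch of the general idea: parsed text is packed into size-bounded chunks that carry source metadata. It is not Lexa's own chunker. The `Chunk` type, the `chunk_text` function, and the 500-character default are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str  # originating document, kept as metadata for retrieval
    index: int   # position of the chunk within the document


def chunk_text(text: str, source: str, max_chars: int = 500) -> list[Chunk]:
    """Greedily pack paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole rather than split.
    """
    chunks: list[Chunk] = []
    current = ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(Chunk(current, source, len(chunks)))
            current = paragraph
    if current:
        chunks.append(Chunk(current, source, len(chunks)))
    return chunks
```

Each chunk would then be embedded and upserted into the vector store, with `source` and `index` stored alongside the vector so retrieved passages can be traced back to their document.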
What cloud storage integrations are available?
Lexa integrates with 7+ major cloud storage platforms:
- Amazon S3: Direct parsing from S3 buckets and folders
- Microsoft SharePoint: Sites, drives, and document libraries
- Google Drive: Files and folders with permission management
- Box: Enterprise file storage with advanced metadata
- Dropbox: Personal and business accounts
- Salesforce: Document attachments and files
- Coming Soon: Azure Blob, OneDrive, Notion
Performance and Scaling
What are Lexa's performance benchmarks?
Lexa delivers industry-leading performance across all metrics:
Speed:
- Simple PDFs: < 1 second
- Complex documents (100+ pages): 15-45 seconds
- Batch processing: 10-50 documents/minute
- Concurrent async: 100+ documents/minute
Accuracy:
- Text extraction: 99.9%
- Table structure: 92.5%
- Metadata extraction: 99.2%
- Multi-format consistency: 99.7%
Reliability:
- API uptime: 99.9% SLA
- Auto-retry on failures: 3 attempts with exponential backoff
- Rate limiting: 1000 requests/minute (enterprise plans)
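The retry policy listed above (3 attempts with exponential backoff) can be sketched as follows for callers who wrap their own requests. The helper name `call_with_retry` and the 0.5 second base delay are assumptions. The source states only the attempt count and the backoff strategy.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,  # assumed value; not specified in this FAQ
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on any exception with delays of base_delay * 2**n."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

With the defaults, a call that keeps failing waits 0.5 seconds, then 1 second, and raises on the third failure. The `sleep` parameter is injectable so the policy can be tested without real delays.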
How does Lexa handle large-scale document processing?
Lexa is built for enterprise scale with several optimization strategies:
- Horizontal Scaling: Our API automatically scales to handle spikes in demand
- Batch Processing: Process up to 100 documents per API call
- Async Processing: Non-blocking operations with progress callbacks
- Caching: Intelligent caching reduces processing time for similar documents
- Load Balancing: Global infrastructure ensures low latency worldwide
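Because each API call accepts up to 100 documents, a larger job has to be split client-side. A minimal sketch of that splitting, with `batched` and `MAX_BATCH_SIZE` as illustrative names rather than SDK identifiers:

```python
from typing import Iterator, Sequence

MAX_BATCH_SIZE = 100  # per-call document limit stated above


def batched(items: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[list[str]]:
    """Yield consecutive slices of items, each with at most size elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

For 250 documents this yields batches of 100, 100, and 50, each of which can be sent as one call.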
What are the API rate limits and pricing?
See pricing.
Free Plan (Free):
- 1000 Documents Parsed
- Community support
- Start with 100 pages
- $0.05 per additional page
- 100 requests/minute
- Email support
- Vector DB integrations
- Start with 10,000 pages
- $0.01 per additional page
- 3x cost for advanced processing
- 100 requests/minute
- Email support
- Vector DB integrations
Enterprise Plan:
- Unlimited pages
- 1000+ requests/minute
- Dedicated support
- On-premise deployment
- Custom integrations
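The per-page figures above lend themselves to a quick cost estimate. The sketch below computes only the overage charge, assuming the included allowance is exhausted first. Any base subscription fee is not listed here and is left out. Treating "3x cost for advanced processing" as a multiplier on the overage rate is one possible reading, not a confirmed billing rule.

```python
from decimal import Decimal


def overage_cost(
    pages: int, included_pages: int, per_page: str, multiplier: int = 1
) -> Decimal:
    """Cost of pages beyond the included allowance.

    multiplier=3 models advanced processing under the assumption stated above.
    """
    extra = max(0, pages - included_pages)
    return extra * Decimal(per_page) * multiplier


# 250 pages on the 100-page tier at $0.05 per additional page: 150 * 0.05
small_tier = overage_cost(250, 100, "0.05")
# 12,000 pages on the 10,000-page tier at $0.01, with the assumed 3x multiplier
large_tier_advanced = overage_cost(12_000, 10_000, "0.01", multiplier=3)
```

`Decimal` is used instead of `float` so that currency amounts are not distorted by binary rounding.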
Advanced Features
How does table extraction work in complex documents?
Lexa uses advanced computer vision and ML models to extract tables with high fidelity.
Features include:
- Structure preservation: Maintains cell relationships and formatting
- Multi-page tables: Automatically combines split tables
- Header detection: Identifies and preserves table headers
- Data type inference: Automatically detects numbers, dates, etc.
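To illustrate what combining a multi-page table involves, here is a deliberately simple heuristic that merges consecutive table fragments when their headers match. It is not how Lexa does it, since the models described above work on the document layout itself. The dictionary shape (`headers`, `rows`) is a hypothetical representation chosen for the example.

```python
def merge_split_tables(fragments: list[dict]) -> list[dict]:
    """Merge consecutive fragments whose headers are identical into one table.

    Each fragment is a dict with "headers" (list of str) and "rows" (list of lists).
    The input fragments are not modified.
    """
    merged: list[dict] = []
    for fragment in fragments:
        if merged and merged[-1]["headers"] == fragment["headers"]:
            # Same header row as the previous table: treat as a continuation
            merged[-1]["rows"].extend(fragment["rows"])
        else:
            merged.append(
                {"headers": list(fragment["headers"]), "rows": list(fragment["rows"])}
            )
    return merged
```

A header-matching rule like this fails when a continuation page omits the header row, which is one reason layout-aware models are used for this task in practice.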
Can I customize the parsing behavior for my use case?
Lexa offers several processing modes and customization options. Contact our team for specialized processing modes for specific document types or industries.
What security and compliance features does Lexa provide?
Security is built into every aspect of Lexa:
Data Security:
- TLS 1.3 encryption for all API communications
- Documents processed in isolated environments
- No document storage - processed and deleted immediately
- SOC 2 Type II certified infrastructure
Access Control:
- API key authentication with rotation support
- Role-based access control (enterprise plans)
- IP whitelisting and VPC connectivity options
- Audit logging for all API operations
Compliance:
- GDPR compliant data processing
- HIPAA compliance available (enterprise)
- Regional data processing options (US, EU, Asia)
- On-premise deployment for maximum security
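On the client side, two of the points above (TLS 1.3 for API traffic and rotatable API keys) can be mirrored with the Python standard library. This is a sketch under assumptions: the environment variable name `LEXA_API_KEY` is hypothetical, and the official SDK may already handle both concerns internally.

```python
import os
import ssl


def build_tls_context() -> ssl.SSLContext:
    """Return a certificate-verifying context that refuses anything below TLS 1.3."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    return context


def load_api_key(env_var: str = "LEXA_API_KEY") -> str:
    """Read the API key from the environment so it can be rotated without code changes."""
    key = os.environ.get(env_var, "")
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key
```

The context can be passed to any HTTP client that accepts an `ssl.SSLContext`. Keeping the key out of source code means rotation only requires updating the deployment environment.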
Support and Community
What support options are available?
We provide comprehensive support across multiple channels:
Community Support:
- Discord Community - Real-time chat with developers
- GitHub Discussions - Technical discussions
- Stack Overflow - Q&A with the community
Pro and Enterprise Support:
- Email support for Pro and Enterprise customers
- Video calls for Enterprise customers
- Dedicated Slack channels for large deployments
- 24/7 support for mission-critical applications
How can I stay updated on new features and improvements?
Stay connected with the Cerevox developer community:
- GitHub Repository: Star for updates and releases
- Discord Community: Join 1000+ developers
- Documentation: Always up-to-date guides and examples