Reference
Complete technical specifications for developers and power users.
Language Model Configuration
Lumi uses Groq's high-performance inference infrastructure for all LLM processing. The system maintains four API keys and automatically rotates between them to ensure high availability and distribute load. This rotation is handled at the system level—users cannot configure API keys directly.
Default Model
llama-3.3-70b-versatile is the recommended and default model. It provides the best balance of capability, speed, and multilingual support for general chatbot use cases.
Alternative Models
llama-3.1-70b-versatile offers extended context windows for longer conversations or documents. mixtral-8x7b-32768 prioritizes speed and excels at multilingual tasks. gemma2-9b-it is the fastest option, suitable for simple, high-volume interactions.
Generation Parameters
Temperature defaults to 0.7, which balances coherent responses with enough variability to feel natural. Increase toward 1.0 for more creative responses, decrease toward 0.0 for more deterministic output.
Top-P is set to 0.9, using nucleus sampling to maintain response quality while allowing some diversity.
Max tokens caps at 1000, roughly equivalent to 750 words. This is sufficient for most chatbot responses without allowing runaway generation.
RAG and Embeddings
The retrieval system uses Google AI's text-embedding-004 model to convert both user questions and knowledge base memories into 768-dimensional vectors. These embeddings are compared using cosine similarity to find the most relevant memories.
Embedding Specifications
Each embedding is a 768-dimensional dense vector. The model supports inputs up to 2048 tokens and handles over 100 languages with strong performance in Thai, English, Chinese, Japanese, and major European languages.
Similarity Threshold
The default threshold is 0.6. Memories with similarity scores above this threshold are sent to the LLM as context. Below 0.6, the bot responds "I don't have that information."
RAG Settings
Attach Source URL controls whether responses include "Read more" links to source documentation. Link confidence threshold (0.3 to 0.8, default 0.5) determines when these links appear based on similarity scores.
Voice Transcription
Voice messages use Groq's Whisper large-v3 model for speech-to-text conversion. The system accepts audio files up to 25MB in MP3, WAV, M4A, or WebM format.
Audio is stored in the make-36a84ccb-chat-audio bucket for 3 days, then automatically purged by a cleanup process that runs in the background after each upload.
Rate Limiting
Rate limits are enforced per IP address and vary by plan tier.
Free plan allows 10 messages per minute. Starter increases this to 30 per minute. Pro allows 60 per minute. Enterprise raises the limit to 120 per minute.
When you hit the rate limit, users see a "Please wait" message and can retry after 60 seconds. The limit resets on a rolling window, not at fixed intervals.
Groq API has its own rate limits, but the four-key rotation system makes hitting these limits extremely rare in production use.
Data Storage Architecture
All persistent data is stored in Supabase PostgreSQL with AES-256 encryption at rest. Backups run daily with 30-day retention.
Key-Value Store
The application uses a key-value abstraction over PostgreSQL:
Widgets:
widget:{id}Memories:
memory:{widgetId}:{id}Integrations:
integration:{widgetId}:{id}Conversations:
conversation:{conversationId}Workspaces:
workspace:{id}Audio metadata:
audio-metadata:{filename}
File Storage
Supabase Storage manages three buckets:
make-36a84ccb-chat-audio(private, 3-day retention)make-36a84ccb-chat-images(private, manual deletion)make-36a84ccb-avatars(public)
Files in private buckets are accessed via signed URLs with 24-hour expiration.
Data Retention
Active accounts retain data indefinitely. Cancelled subscriptions keep data for 90 days to allow reactivation. After 90 days, or 30 days after explicit deletion, all data is permanently purged.
Security Model
Authentication
Supabase Auth issues JWT tokens valid for 24 hours. Passwords are hashed using bcrypt with appropriate work factor. Session tokens are stored in httpOnly cookies to prevent XSS attacks.
API Security
CORS headers whitelist only the website URL specified in widget settings. Rate limiting protects against abuse. All input is validated and sanitized before processing.
System-level API keys for Groq and Google AI are never exposed to the frontend or visible to users.
Integration Credentials
Platform credentials (LINE tokens, Facebook secrets, etc.) are stored encrypted in the key-value store. They're never sent to the frontend. Webhook URLs include unique integration IDs to prevent cross-account access.
Plan Limits and Quotas
Free Plan
100 messages per month, 1 bot, 1 workspace, 0 team members, 100 memories, no integrations.
Starter Plan
4,000 messages per month, 3 bots, 1 workspace, 3 team members, 1,000 memories per bot, Gitbook integration only.
Pro Plan
6,000 messages per month, 5 bots, 3 workspaces, 10 team members, 5,000 memories per bot, Gitbook, LINE, and Facebook integrations.
Enterprise Plan
Unlimited messages, bots, workspaces, team members, and memories. All integrations available: Gitbook, LINE, Facebook, WhatsApp, Telegram.
File Upload Limits
Avatar images: 2MB maximum, 40x40 pixels recommended. Chat images: 10MB maximum. Audio files: 25MB maximum. Accepted formats: JPG, PNG, SVG, WebP for images; MP3, WAV, M4A, WebM for audio.
Network Requirements
Required Domains
The widget and backend communicate with several domains that must be accessible:
*.supabase.cofor API, database, and storageai.google.devfor embeddings (server-side only)api.groq.comfor LLM inference (server-side only)api.line.mefor LINE integration (server-side only)graph.facebook.comfor Facebook integration (server-side only)api.telegram.orgfor Telegram integration (server-side only)
Ports
HTTPS traffic on port 443 is required. WebSocket connections (WSS) are used for real-time features.
Performance Expectations
Response Times
Widget initial load: under 1 second Message send/receive latency: under 100ms LLM response without RAG: 0.5-2 seconds LLM response with RAG: 1-3 seconds Voice transcription: 2-5 seconds Gitbook page import: 5-30 seconds per page
Service Level Agreements
Free and Starter plans: 99% uptime target Pro plan: 99.5% uptime target Enterprise plan: 99.9% uptime target
API Endpoints
All endpoints use the base URL:
https://[project-id].supabase.co/functions/v1/make-server-36a84ccb
Widget Management
GET /widgets- List all widgetsPOST /widgets- Create new widgetGET /widgets/:id- Get widget detailsPATCH /widgets/:id- Update widget settingsDELETE /widgets/:id- Delete widget
Memory Management
GET /widgets/:id/memories- List memoriesPOST /widgets/:id/memories- Add memoryDELETE /widgets/:id/memories/:memoryId- Delete memory
Integration Management
GET /widgets/:id/integrations- List integrationsPOST /widgets/:id/integrations- Create integrationDELETE /widgets/:id/integrations/:integrationId- Delete integration
Conversation Management
GET /conversations- List conversationsGET /conversations/:id- Get conversation detailsPOST /conversations/:id/messages- Send message
File Uploads
POST /upload-audio- Upload voice messagePOST /upload-avatar- Upload avatar image
Gitbook Integration
POST /widgets/:id/gitbook/import- Import GitbookPOST /widgets/:id/gitbook/sources/:id/sync- Resync sourceDELETE /widgets/:id/gitbook/sources/:id- Delete source
Testing and Debugging
GET /test/google-ai- Test embedding service
Webhooks
POST /webhooks/line/:integrationIdPOST /webhooks/facebook/:integrationIdPOST /webhooks/whatsapp/:integrationIdPOST /webhooks/telegram/:integrationId
Language Support
The dashboard interface is English only. Bot responses support 100+ languages through the multilingual LLM.
Common languages with strong support include Thai, English, Chinese (Simplified and Traditional), Japanese, Korean, Spanish, French, German, Italian, Portuguese, Arabic, Hindi, Indonesian, Vietnamese, and Russian.
Support Channels
Response Time by Plan
Free and Starter: 24-48 hours Pro: 12-24 hours Enterprise: Under 4 hours (under 1 hour for urgent issues)
Business Hours
Monday through Friday, 9am to 6pm Indochina Time (UTC+7)
Last updated