Reference

Complete technical specifications for developers and power users.

Language Model Configuration

Lumi uses Groq's high-performance inference infrastructure for all LLM processing. The system maintains four API keys and automatically rotates between them to ensure high availability and distribute load. This rotation is handled at the system level—users cannot configure API keys directly.

Default Model

llama-3.3-70b-versatile is the recommended and default model. It provides the best balance of capability, speed, and multilingual support for general chatbot use cases.

Alternative Models

llama-3.1-70b-versatile offers extended context windows for longer conversations or documents. mixtral-8x7b-32768 prioritizes speed and excels at multilingual tasks. gemma2-9b-it is the fastest option, suitable for simple, high-volume interactions.

Generation Parameters

Temperature defaults to 0.7, which balances coherent responses with enough variability to feel natural. Increase toward 1.0 for more creative responses, decrease toward 0.0 for more deterministic output.

Top-P is set to 0.9, using nucleus sampling to maintain response quality while allowing some diversity.

Max tokens caps at 1000, roughly equivalent to 750 words. This is sufficient for most chatbot responses without allowing runaway generation.

RAG and Embeddings

The retrieval system uses Google AI's text-embedding-004 model to convert both user questions and knowledge base memories into 768-dimensional vectors. These embeddings are compared using cosine similarity to find the most relevant memories.

Embedding Specifications

Each embedding is a 768-dimensional dense vector. The model supports inputs up to 2048 tokens and handles over 100 languages with strong performance in Thai, English, Chinese, Japanese, and major European languages.

Similarity Threshold

The default threshold is 0.6. Memories with similarity scores above this threshold are sent to the LLM as context. Below 0.6, the bot responds "I don't have that information."

RAG Settings

Attach Source URL controls whether responses include "Read more" links to source documentation. Link confidence threshold (0.3 to 0.8, default 0.5) determines when these links appear based on similarity scores.

Voice Transcription

Voice messages use Groq's Whisper large-v3 model for speech-to-text conversion. The system accepts audio files up to 25MB in MP3, WAV, M4A, or WebM format.

Audio is stored in the make-36a84ccb-chat-audio bucket for 3 days, then automatically purged by a cleanup process that runs in the background after each upload.

Rate Limiting

Rate limits are enforced per IP address and vary by plan tier.

Free plan allows 10 messages per minute. Starter increases this to 30 per minute. Pro allows 60 per minute. Enterprise raises the limit to 120 per minute.

When you hit the rate limit, users see a "Please wait" message and can retry after 60 seconds. The limit resets on a rolling window, not at fixed intervals.

Groq API has its own rate limits, but the four-key rotation system makes hitting these limits extremely rare in production use.

Data Storage Architecture

All persistent data is stored in Supabase PostgreSQL with AES-256 encryption at rest. Backups run daily with 30-day retention.

Key-Value Store

The application uses a key-value abstraction over PostgreSQL:

Widgets: widget:{id}
Memories: memory:{widgetId}:{id}
Integrations: integration:{widgetId}:{id}
Conversations: conversation:{conversationId}
Workspaces: workspace:{id}
Audio metadata: audio-metadata:{filename}

File Storage

Supabase Storage manages three buckets:

make-36a84ccb-chat-audio (private, 3-day retention)
make-36a84ccb-chat-images (private, manual deletion)
make-36a84ccb-avatars (public)

Files in private buckets are accessed via signed URLs with 24-hour expiration.

Data Retention

Active accounts retain data indefinitely. Cancelled subscriptions keep data for 90 days to allow reactivation. After 90 days, or 30 days after explicit deletion, all data is permanently purged.

Security Model

Authentication

Supabase Auth issues JWT tokens valid for 24 hours. Passwords are hashed using bcrypt with appropriate work factor. Session tokens are stored in httpOnly cookies to prevent XSS attacks.

API Security

CORS headers whitelist only the website URL specified in widget settings. Rate limiting protects against abuse. All input is validated and sanitized before processing.

System-level API keys for Groq and Google AI are never exposed to the frontend or visible to users.

Integration Credentials

Platform credentials (LINE tokens, Facebook secrets, etc.) are stored encrypted in the key-value store. They're never sent to the frontend. Webhook URLs include unique integration IDs to prevent cross-account access.

Plan Limits and Quotas

Free Plan

100 messages per month, 1 bot, 1 workspace, 0 team members, 100 memories, no integrations.

Starter Plan

4,000 messages per month, 3 bots, 1 workspace, 3 team members, 1,000 memories per bot, Gitbook integration only.

Pro Plan

6,000 messages per month, 5 bots, 3 workspaces, 10 team members, 5,000 memories per bot, Gitbook, LINE, and Facebook integrations.

Enterprise Plan

Unlimited messages, bots, workspaces, team members, and memories. All integrations available: Gitbook, LINE, Facebook, WhatsApp, Telegram.

File Upload Limits

Avatar images: 2MB maximum, 40x40 pixels recommended. Chat images: 10MB maximum. Audio files: 25MB maximum. Accepted formats: JPG, PNG, SVG, WebP for images; MP3, WAV, M4A, WebM for audio.

Network Requirements

Required Domains

The widget and backend communicate with several domains that must be accessible:

*.supabase.co for API, database, and storage
ai.google.dev for embeddings (server-side only)
api.groq.com for LLM inference (server-side only)
api.line.me for LINE integration (server-side only)
graph.facebook.com for Facebook integration (server-side only)
api.telegram.org for Telegram integration (server-side only)

Ports

HTTPS traffic on port 443 is required. WebSocket connections (WSS) are used for real-time features.

Performance Expectations

Response Times

Widget initial load: under 1 second Message send/receive latency: under 100ms LLM response without RAG: 0.5-2 seconds LLM response with RAG: 1-3 seconds Voice transcription: 2-5 seconds Gitbook page import: 5-30 seconds per page

Service Level Agreements

Free and Starter plans: 99% uptime target Pro plan: 99.5% uptime target Enterprise plan: 99.9% uptime target

API Endpoints

All endpoints use the base URL: https://[project-id].supabase.co/functions/v1/make-server-36a84ccb

Widget Management

GET /widgets - List all widgets
POST /widgets - Create new widget
GET /widgets/:id - Get widget details
PATCH /widgets/:id - Update widget settings
DELETE /widgets/:id - Delete widget

Memory Management

GET /widgets/:id/memories - List memories
POST /widgets/:id/memories - Add memory
DELETE /widgets/:id/memories/:memoryId - Delete memory

Integration Management

GET /widgets/:id/integrations - List integrations
POST /widgets/:id/integrations - Create integration
DELETE /widgets/:id/integrations/:integrationId - Delete integration

Conversation Management

GET /conversations - List conversations
GET /conversations/:id - Get conversation details
POST /conversations/:id/messages - Send message

File Uploads

POST /upload-audio - Upload voice message
POST /upload-avatar - Upload avatar image

Gitbook Integration

POST /widgets/:id/gitbook/import - Import Gitbook
POST /widgets/:id/gitbook/sources/:id/sync - Resync source
DELETE /widgets/:id/gitbook/sources/:id - Delete source

Testing and Debugging

GET /test/google-ai - Test embedding service

Webhooks

POST /webhooks/line/:integrationId
POST /webhooks/facebook/:integrationId
POST /webhooks/whatsapp/:integrationId
POST /webhooks/telegram/:integrationId

Language Support

The dashboard interface is English only. Bot responses support 100+ languages through the multilingual LLM.

Common languages with strong support include Thai, English, Chinese (Simplified and Traditional), Japanese, Korean, Spanish, French, German, Italian, Portuguese, Arabic, Hindi, Indonesian, Vietnamese, and Russian.

Support Channels

Email

[email protected]

Response Time by Plan

Free and Starter: 24-48 hours Pro: 12-24 hours Enterprise: Under 4 hours (under 1 hour for urgent issues)

Business Hours

Monday through Friday, 9am to 6pm Indochina Time (UTC+7)

PreviousTroubleshooting NextPrivacy Policy

Last updated 4 months ago

hashtagLanguage Model Configuration

hashtagRAG and Embeddings

hashtagVoice Transcription

hashtagRate Limiting

hashtagData Storage Architecture

hashtagSecurity Model

hashtagPlan Limits and Quotas

hashtagNetwork Requirements

hashtagPerformance Expectations

hashtagAPI Endpoints

hashtagLanguage Support

hashtagSupport Channels