Memory & RAG
RAG (Retrieval-Augmented Generation) grounds your bot's answers in your own content. Instead of relying only on the LLM's general knowledge, the system searches your curated knowledge base before generating a response.
How RAG Works
When a user asks a question, their message is converted into a mathematical representation called an embedding using Google AI's text-embedding-004 model. This embedding is then compared against all the memories in your knowledge base to find the most similar content.
The top matching memories are sent to the LLM as context, along with the user's question. The LLM then crafts a response grounded in your knowledge base rather than guessing or hallucinating information.
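The retrieval step described above can be sketched in a few lines of Python. This is a minimal illustration, not Lumi's actual implementation. The three-dimensional vectors are toy stand-ins for real embeddings, the function names `cosine_similarity` and `top_matches` are hypothetical, and cosine similarity is assumed as the comparison measure because the text does not name one.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)

def top_matches(query_vec, memories, k=3):
    """Return the k (score, text) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy vectors standing in for real embeddings (assumption for illustration).
memories = [
    ("Pro plan pricing", [0.9, 0.1, 0.0]),
    ("Refund policy", [0.1, 0.9, 0.1]),
    ("LINE integration setup", [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.1]  # pretend embedding of a pricing question

for score, text in top_matches(query, memories, k=2):
    print(f"{score:.2f}  {text}")
```

In a real deployment the vectors would come from the embedding model and have hundreds of dimensions, but the ranking logic stays the same: score every memory against the query, then keep the best few as context for the LLM.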
Strict Mode Behavior
Lumi operates in strict mode by default. If the similarity score between the user's question and your knowledge base is below 0.6, the bot will honestly say "I don't have that information" rather than making something up. This prevents misinformation but requires a well-populated knowledge base.
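As a rough sketch of the strict mode rule, the check below declines to answer when the best similarity score falls below 0.6. The function name `passes_strict_mode` is hypothetical, and treating a score of exactly 0.6 as passing is an assumption based on the wording "below 0.6".

```python
STRICT_THRESHOLD = 0.6  # threshold stated in the text above
FALLBACK_MESSAGE = "I don't have that information"

def passes_strict_mode(scores, threshold=STRICT_THRESHOLD):
    """True if the best match is strong enough to answer from the knowledge base.

    An empty knowledge base (no scores) never passes.
    """
    return max(scores, default=0.0) >= threshold

# Example: similarity scores of the top matching memories for one question.
scores = [0.42, 0.55, 0.31]
if passes_strict_mode(scores):
    print("Answer using the matched memories as context")
else:
    print(FALLBACK_MESSAGE)
```

Because every score in the example is under the threshold, the sketch takes the fallback branch. Adding a memory that covers the question is the way to turn such a miss into an answer.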
Building Your Knowledge Base
Navigate to the Memory tab in your widget settings. Click Add Memory to create a new entry.
Each memory should be self-contained and focused on a single topic. For example:
The Pro plan costs ฿599 per user per month. It includes 6,000 messages
per month, up to 5 bots, 3 workspaces, and integrations with LINE and
Facebook Messenger. Team size is limited to 10 members.
Notice how this memory is complete, specific, and written in natural language. This is much more effective than fragments like "Pro plan is good" or overly technical bullet lists.
Memory Types
When creating memories, you'll select a type. The default "general" type works for most use cases. Types are simply organizational labels and don't affect how the RAG system processes content.
Optimal Memory Structure
Each memory should contain 100-500 words focused on one concept. If you're documenting pricing, create separate memories for each plan. For product features, one memory per feature. For policies, one per policy.
The goal is granularity—specific, searchable chunks that the RAG system can precisely match to user questions.
Importing from Gitbook
If you maintain documentation on Gitbook, you can import entire sites automatically rather than manually entering memories.
Navigate to the Integrations section and find Gitbook Integration. Enter your Gitbook URL—this should be the public URL like https://your-org.gitbook.io/docs or https://docs.example.com. Click Import.
The system will scrape all pages, convert them to memories, and show progress as "Processing X/Y pages". This happens in the background and typically takes 5-30 seconds per page depending on content length. The interface auto-refreshes every 5 seconds to show current status.
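To get a feel for how long an import might take, the figures above can be turned into a quick estimate. The helper below is a hypothetical sketch: it parses a status string in the "Processing X/Y pages" form and multiplies the remaining pages by the 5 and 30 second bounds. It assumes pages are processed one at a time, which the text does not confirm.

```python
import re

def remaining_time_range(status, min_per_page=5, max_per_page=30):
    """Estimate (low, high) seconds left from a 'Processing X/Y pages' status."""
    match = re.fullmatch(r"Processing (\d+)/(\d+) pages", status.strip())
    if match is None:
        raise ValueError(f"Unrecognized status: {status!r}")
    done, total = int(match.group(1)), int(match.group(2))
    remaining = max(total - done, 0)  # guard against done > total
    return remaining * min_per_page, remaining * max_per_page

low, high = remaining_time_range("Processing 12/40 pages")
print(f"Roughly {low} to {high} seconds remaining")
```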
RAG Settings for Gitbook
Click the RAG Settings button to configure two important options:
Attach Source URL controls whether the bot includes "Read more" links to the original Gitbook pages in responses. Enable this if you want users to dive deeper into your documentation. Disable it for cleaner, standalone answers.
Link Confidence Threshold determines when to show these links. At 30% (Liberal), links appear frequently. At 50% (Balanced, default), links only appear when the match is reasonably strong. At 80% (Strict), links only appear for very confident matches. This prevents the bot from linking to tangentially related pages.
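The interaction between the two settings can be sketched as a single decision. This is an illustrative guess at the logic, not the product's code: the name `should_attach_link` is hypothetical, match scores are assumed to be on a 0 to 1 scale, and a score equal to the threshold is assumed to qualify.

```python
# Preset thresholds from the text above, expressed as fractions.
LINK_PRESETS = {"liberal": 0.30, "balanced": 0.50, "strict": 0.80}

def should_attach_link(match_score, attach_source_url=True, preset="balanced"):
    """Decide whether a 'Read more' link accompanies the answer."""
    if not attach_source_url:
        return False  # the master switch overrides any score
    return match_score >= LINK_PRESETS[preset]

# The same 0.55 match gets a link under Balanced but not under Strict.
print(should_attach_link(0.55, preset="balanced"))
print(should_attach_link(0.55, preset="strict"))
```

A higher preset trades link coverage for relevance: fewer answers carry a link, but the links that do appear point to pages that closely match the question.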
Managing Gitbook Sources
Once imported, you can resync to pull fresh content when your Gitbook is updated. Click the resync icon on any imported source—this will replace all existing memories with the current content.
If a source gets stuck in "Processing" status, use the Force Complete button to manually mark it as completed. This is safe to use and won't corrupt data.
Delete a source by clicking the trash icon. This removes all associated memories immediately and cannot be undone.
Export and Import
Export your entire knowledge base to JSON for backup or transfer between widgets. Click Export and save the file.
To import memories, click Import and select a JSON file. All memories will be added to the current widget—duplicates are not automatically removed, so clean your export file first if needed.
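Since duplicates are not removed on import, one option is to clean the export file with a short script beforehand. The sketch below is hypothetical. It assumes the export is a JSON array of objects with a `content` field, which may not match the real export format, so inspect your file and adjust the `key` argument before using anything like this.

```python
import json

def dedupe_memories(memories, key="content"):
    """Drop entries whose text matches an earlier one, ignoring case and spacing."""
    seen = set()
    unique = []
    for memory in memories:
        # Normalize whitespace and case so trivial variations count as duplicates.
        fingerprint = " ".join(str(memory.get(key, "")).split()).lower()
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        unique.append(memory)
    return unique

# Made-up sample data in the ASSUMED export shape.
export_text = """[
  {"content": "First sample memory."},
  {"content": "first  sample memory."},
  {"content": "Second sample memory."}
]"""

cleaned = dedupe_memories(json.loads(export_text))
print(json.dumps(cleaned, indent=2, ensure_ascii=False))
```

The first occurrence of each memory is kept and order is preserved, so the cleaned file imports in the same sequence as the original.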
Debugging and Testing
Use the Debug Memories button to see a breakdown of your knowledge base in the browser console. It shows total memory count, breakdown by source (manual, Gitbook, etc.), and other useful metadata.
The Test Google AI button verifies that embeddings are working correctly. If this fails, RAG won't function—contact support if you see errors here.