Memory & RAG

RAG (Retrieval-Augmented Generation) is what grounds your bot's answers in your own content. Instead of relying only on the LLM's general knowledge, the system searches your curated knowledge base before generating each response.

How RAG Works

When a user asks a question, their message is converted into a mathematical representation called an embedding using Google AI's text-embedding-004 model. This embedding is then compared against all the memories in your knowledge base to find the most similar content.

The top matching memories are sent to the LLM as context, along with the user's question. The LLM then crafts a response grounded in your knowledge base rather than guessing or hallucinating information.
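The retrieval step described above can be sketched roughly as follows. This is a minimal illustration, not Lumi's actual implementation: it assumes cosine similarity as the comparison measure (the text only says "most similar"), and it uses toy 3-dimensional vectors in place of real text-embedding-004 embeddings. The function names and the value of k are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)

def top_matches(query_embedding, memories, k=3):
    """Return the k (score, text) pairs most similar to the query, best first.

    `memories` is a list of (text, embedding) pairs.
    """
    scored = [(cosine_similarity(query_embedding, emb), text) for text, emb in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy vectors standing in for real embeddings.
memories = [
    ("Pro plan pricing", [0.9, 0.1, 0.0]),
    ("Refund policy", [0.0, 1.0, 0.0]),
    ("LINE integration setup", [0.1, 0.0, 0.9]),
]
query = [1.0, 0.0, 0.0]
results = top_matches(query, memories, k=2)
```

Here `results` holds the two best-scoring memories, which would then be passed to the LLM as context together with the user's question.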

Strict Mode Behavior

Lumi operates in strict mode by default. If the similarity score between the user's question and your knowledge base is below 0.6, the bot will honestly say "I don't have that information" rather than making something up. This prevents misinformation but requires a well-populated knowledge base.
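As a rough sketch of the strict-mode check, the logic might look like the following. The 0.6 threshold and the fallback message come from the description above; everything else (the function name, the `generate` callback, and the choice to compare the best score against the threshold) is an assumption made for illustration.

```python
STRICT_THRESHOLD = 0.6  # threshold stated in the text above
FALLBACK = "I don't have that information"

def respond(scored_memories, generate, threshold=STRICT_THRESHOLD):
    """Answer from the knowledge base, or decline if nothing matches well enough.

    `scored_memories` is a list of (score, text) pairs.
    `generate` is a hypothetical callback that turns context texts into a reply.
    """
    best = max((score for score, _ in scored_memories), default=0.0)
    if best < threshold:
        # No memory is similar enough: decline instead of guessing.
        return FALLBACK
    context = [text for _, text in scored_memories]
    return generate(context)
```

With an empty or poorly matching knowledge base, `respond` always returns the fallback message, which is why strict mode depends on good coverage.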

Building Your Knowledge Base

Navigate to the Memory tab in your widget settings. Click Add Memory to create a new entry.

Each memory should be self-contained and focused on a single topic. For example:

The Pro plan costs ฿599 per user per month. It includes 6,000 messages 
per month, up to 5 bots, 3 workspaces, and integrations with LINE and 
Facebook Messenger. Team size is limited to 10 members.

Notice how this memory is complete, specific, and written in natural language. This is much more effective than fragments like "Pro plan is good" or overly technical bullet lists.

Memory Types

When creating memories, you'll select a type. The default "general" type works for most use cases. Types are simply organizational labels and don't affect how the RAG system processes content.

Optimal Memory Structure

Each memory should contain 100-500 words focused on one concept. If you're documenting pricing, create separate memories for each plan. For product features, one memory per feature. For policies, one per policy.

The goal is granularity—specific, searchable chunks that the RAG system can precisely match to user questions.
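If you are starting from a long document, a small helper like the one below can give you a first pass at chunking before you paste entries into the Memory tab. It is only a sketch under stated assumptions: paragraphs are separated by blank lines, the 500-word ceiling comes from the guideline above, and the function name is hypothetical. It splits by size only, so you should still review each chunk to make sure it covers a single topic and is not too short.

```python
def split_into_memories(text, max_words=500):
    """Greedily pack blank-line-separated paragraphs into chunks of at most
    `max_words` words. A single paragraph longer than `max_words` is kept
    whole as its own chunk rather than being cut mid-paragraph.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current, current_words = [], [], 0
    for paragraph in paragraphs:
        words = len(paragraph.split())
        # Start a new chunk if adding this paragraph would exceed the limit.
        if current and current_words + words > max_words:
            chunks.append("\n\n".join(current))
            current, current_words = [], 0
        current.append(paragraph)
        current_words += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```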

Importing from Gitbook

If you maintain documentation on Gitbook, you can import entire sites automatically rather than manually entering memories.

Navigate to the Integrations section and find Gitbook Integration. Enter your Gitbook URL—this should be the public URL like https://your-org.gitbook.io/docs or https://docs.example.com. Click Import.

The system will scrape all pages, convert them to memories, and show progress as "Processing X/Y pages". This happens in the background and typically takes 5-30 seconds per page depending on content length. The interface auto-refreshes every 5 seconds to show current status.
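To get a feel for how long an import might take, you can turn the per-page figure above into a rough range. The sketch below assumes pages are processed one at a time, which the text does not confirm, so treat the result as a loose estimate. The function name is hypothetical.

```python
def estimate_import_seconds(page_count, min_per_page=5, max_per_page=30):
    """Return a (low, high) estimate in seconds for importing `page_count` pages,
    assuming sequential processing at 5-30 seconds per page."""
    if page_count < 0:
        raise ValueError("page_count must not be negative")
    return page_count * min_per_page, page_count * max_per_page

low, high = estimate_import_seconds(40)  # a 40-page site: 200 to 1200 seconds
```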

RAG Settings for Gitbook

Click the RAG Settings button to configure two important options:

Attach Source URL controls whether the bot includes "Read more" links to the original Gitbook pages in responses. Enable this if you want users to dive deeper into your documentation. Disable it for cleaner, standalone answers.

Link Confidence Threshold determines when to show these links. At 30% (Liberal), links appear frequently. At 50% (Balanced, default), links only appear when the match is reasonably strong. At 80% (Strict), links only appear for very confident matches. This prevents the bot from linking to tangentially related pages.
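The interaction between the two settings can be pictured with a short sketch. The preset percentages are the ones listed above; the function name, the 0-to-1 score scale, the "at or above" comparison, and the exact "Read more" formatting are assumptions for illustration only.

```python
# Preset percentages from the text, expressed as fractions of a 0-1 match score.
LINK_THRESHOLDS = {"liberal": 0.30, "balanced": 0.50, "strict": 0.80}

def maybe_attach_link(answer, source_url, score, attach_source_url=True, preset="balanced"):
    """Append a "Read more" link only when linking is enabled, a source URL
    exists, and the match score reaches the chosen threshold."""
    if attach_source_url and source_url and score >= LINK_THRESHOLDS[preset]:
        return f"{answer}\n\nRead more: {source_url}"
    return answer
```

For example, a match scoring 0.55 would get a link under the Liberal and Balanced presets but not under Strict.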

Managing Gitbook Sources

Once imported, you can resync to pull fresh content when your Gitbook is updated. Click the resync icon on any imported source. This replaces that source's existing memories with the current content.

If a source gets stuck in "Processing" status, use the Force Complete button to manually mark it as completed. This is safe to use and won't corrupt data.

Delete a source by clicking the trash icon. This removes all associated memories immediately and cannot be undone.

Export and Import

Export your entire knowledge base to JSON for backup or transfer between widgets. Click Export and save the file.

To import memories, click Import and select a JSON file. All memories will be added to the current widget—duplicates are not automatically removed, so clean your export file first if needed.
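One way to clean an export file before importing is a small script like the sketch below. The export format is not documented here, so this assumes the file is a JSON array of objects and that each object stores its text in a field named "content"; both are hypothetical, so check your own export and adjust the `key` argument. Two entries count as duplicates when their text matches after collapsing whitespace and ignoring case.

```python
import json

def dedupe_memories(memories, key="content"):
    """Keep the first occurrence of each memory, comparing the `key` field
    after collapsing whitespace and lowercasing. `key` is an assumed field name."""
    seen = set()
    unique = []
    for memory in memories:
        fingerprint = " ".join(str(memory.get(key, "")).split()).lower()
        if fingerprint in seen:
            continue  # same text already kept, so skip this copy
        seen.add(fingerprint)
        unique.append(memory)
    return unique

def clean_export_file(src_path, dst_path, key="content"):
    """Read an export file, drop duplicates, write the result to `dst_path`,
    and return how many entries were removed."""
    with open(src_path, encoding="utf-8") as f:
        memories = json.load(f)
    unique = dedupe_memories(memories, key)
    with open(dst_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps characters such as the baht sign readable.
        json.dump(unique, f, ensure_ascii=False, indent=2)
    return len(memories) - len(unique)
```

Writing to a separate output path keeps the original export untouched as a backup.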

Debugging and Testing

Use the Debug Memories button to see a breakdown of your knowledge base in the browser console. It shows total memory count, breakdown by source (manual, Gitbook, etc.), and other useful metadata.

The Test Google AI button verifies that embeddings are working correctly. If this fails, RAG won't function—contact support if you see errors here.
