# Memory System

## Overview

Mirror Mate includes a memory system that enables persistent user context through RAG (Retrieval-Augmented Generation). The system stores user information, extracts memories from conversations, and provides relevant context to the AI.
## Configuration

Memory settings are configured in:

- `config/providers.yaml` - Provider and RAG settings
- `config/locales/[lang]/memory.yaml` - Extraction prompts (locale-specific)

The database is also locale-specific: `data/mirrormate.[lang].db`
```yaml
providers:
  embedding:
    enabled: true
    provider: ollama  # PLaMo server provides Ollama-compatible API
    ollama:
      model: plamo-embedding-1b
      baseUrl: "http://studio:8000"  # PLaMo embedding server

memory:
  enabled: true
  # RAG settings
  rag:
    topK: 8        # Max memories to retrieve
    threshold: 0.3 # Minimum similarity score
  # Memory extraction settings
  extraction:
    autoExtract: true  # Auto-extract from conversations
    minConfidence: 0.5 # Minimum confidence threshold
```

Note: PLaMo-Embedding-1B is recommended for Japanese. See Recommended Setup for details. You can also use `bge-m3` via Ollama as an alternative.
## Memory Types

| Type | Description | Example |
|---|---|---|
| `profile` | User preferences and traits | "Favorite color: blue" |
| `episode` | Recent interactions and events | "Asked about weather on 2024-01-01" |
| `knowledge` | Facts and learned information | "User works at ACME Corp" |
### Profile Memories
Profile memories store persistent user information:
- User preferences (language, style)
- Personality traits
- Communication preferences
- Recurring topics of interest
Profile memories are always included in the RAG context.
### Episode Memories
Episode memories capture recent interactions:
- Recent conversations
- Events and activities
- Time-sensitive information
Episodes have a recency factor that prioritizes recent memories.
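The decay function itself isn't exposed in configuration. As a rough illustration of the idea (not Mirror Mate's actual formula), an exponential half-life decay could weight episode scores like this:

```typescript
// Illustration only: Mirror Mate's actual recency weighting is internal and
// may differ. Exponential decay halves a memory's weight every `halfLifeDays`.
function recencyWeight(createdAt: Date, halfLifeDays = 30): number {
  const ageDays = (Date.now() - createdAt.getTime()) / 86_400_000;
  return Math.pow(0.5, ageDays / halfLifeDays); // 1.0 when new, 0.5 after one half-life
}

// Combined score for an episode: semantic similarity scaled by recency.
function episodeScore(similarity: number, createdAt: Date): number {
  return similarity * recencyWeight(createdAt);
}
```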
### Knowledge Memories
Knowledge memories store factual information:
- User's work, hobbies, relationships
- Learned facts from conversations
- Important dates and information
## RAG (Retrieval-Augmented Generation)
The RAG system retrieves relevant memories to provide context-aware responses.
### How It Works

1. **Embed Query**: Convert user input to a vector using the Ollama embedding API
2. **Semantic Search**: Find similar memories using cosine similarity (see the sketch below)
3. **Rank Results**: Sort by similarity score and filter by threshold
4. **Format Context**: Combine profiles and relevant memories into a prompt
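A minimal TypeScript sketch of steps 2-3, assuming memories are loaded with their stored embedding vectors; the names and structure here are illustrative, not Mirror Mate's actual internals:

```typescript
// Illustrative only: names and structure are assumptions, not the real code.
interface ScoredMemory { id: string; content: string; score: number }

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Score every stored embedding against the query vector, then apply
// the configured threshold and topK limit.
function retrieve(
  query: number[],
  memories: { id: string; content: string; embedding: number[] }[],
  topK = 8,
  threshold = 0.3,
): ScoredMemory[] {
  return memories
    .map((m) => ({ id: m.id, content: m.content, score: cosineSimilarity(query, m.embedding) }))
    .filter((m) => m.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

With the defaults above, anything scoring below 0.3 is dropped and at most 8 memories survive into the prompt context.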
### Configuration Options

| Option | Type | Description | Default |
|---|---|---|---|
| `topK` | number | Maximum memories to retrieve | 8 |
| `threshold` | number | Minimum similarity score (0.0-1.0) | 0.3 |
### Example Context Output

```
[User Profile]
- Preferred language: Japanese
- Interests: programming, music

[Related Information]
- [Important] (Note) User works at a tech company
- (Recent) Asked about weather forecast yesterday
```

## Memory Extraction
The system automatically extracts memories from conversations using the LLM.
### How It Works

1. **Analyze Conversation**: Send recent messages to the LLM for analysis
2. **Extract Information**: The LLM identifies memorable facts and updates
3. **Validate Results**: Filter by confidence score (see the sketch below)
4. **Store Memories**: Save to the database with embeddings
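A sketch of the validation step (step 3), assuming the LLM returns candidates with a self-assessed confidence score; the types here are hypothetical:

```typescript
// Hypothetical candidate shape; Mirror Mate's actual extraction types may differ.
interface MemoryCandidate {
  kind: "profile" | "episode" | "knowledge";
  title: string;
  content: string;
  confidence: number; // 0.0-1.0, assigned by the LLM
}

// Keep only candidates that meet the configured minConfidence before
// embedding and storing them (step 4).
function filterByConfidence(
  candidates: MemoryCandidate[],
  minConfidence = 0.5,
): MemoryCandidate[] {
  return candidates.filter((c) => c.confidence >= minConfidence);
}
```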
### Configuration Options

| Option | Type | Description | Default |
|---|---|---|---|
| `autoExtract` | boolean | Enable automatic extraction | true |
| `minConfidence` | number | Minimum confidence for saving (0.0-1.0) | 0.5 |
### Prompt Configuration

Extraction prompts are configured in `config/locales/[lang]/memory.yaml`:

```yaml
memory:
  extraction:
    # System prompt for the LLM
    # ("You are an expert at extracting important information from conversations.")
    systemPrompt: |
      あなたは会話から重要な情報を抽出する専門家です。
      ...
    # Labels for the user prompt ("user", "assistant", "## Conversation History",
    # "## Existing Profiles", "## Related Existing Memories")
    labels:
      user: ユーザー
      assistant: アシスタント
      conversationHistory: "## 会話履歴"
      existingProfiles: "## 既存の Profile"
      relatedMemories: "## 関連する既存の記憶"
    # Task instruction ("## Task / Extract information that should be saved
    # as memories from the conversation above.")
    task: |
      ## タスク
      上記の会話から、記憶として保存すべき情報を抽出してください。
      ...
```

This allows customizing the extraction behavior without modifying code.
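As a sketch of what a loader for these prompts could look like (assuming Bun's file API and the `yaml` npm package; this is not Mirror Mate's actual loader):

```typescript
// Sketch: read the locale-specific extraction prompts from YAML.
import { parse } from "yaml";

async function loadExtractionPrompts(lang: string) {
  const text = await Bun.file(`config/locales/${lang}/memory.yaml`).text();
  const config = parse(text);
  return config.memory.extraction; // { systemPrompt, labels, task, ... }
}
```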
### Extraction Process

The LLM is prompted to extract three kinds of updates (a possible result shape is sketched below):

- **Profile Updates**: Changes to user preferences or traits
- **New Memories**: Facts worth remembering
- **Archive Candidates**: Outdated or superseded information
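One way these categories could map to a result type (hypothetical; the actual wire format is defined by the prompts above):

```typescript
// Hypothetical result shape for one extraction pass; names are illustrative.
interface ExtractionResult {
  // Changes to existing user preferences or traits
  profileUpdates: { title: string; content: string }[];
  // Facts worth remembering, with an LLM-assigned confidence score
  newMemories: {
    kind: "episode" | "knowledge";
    title: string;
    content: string;
    confidence: number;
  }[];
  // IDs of memories that are now outdated or superseded
  archiveCandidates: string[];
}
```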
## Database Schema

Mirror Mate uses SQLite with Drizzle ORM for persistence.

### Tables

| Table | Description |
|---|---|
| `users` | User accounts |
| `sessions` | Conversation sessions |
| `messages` | Chat messages |
| `memories` | Stored memories |
| `memory_embeddings` | Vector embeddings for semantic search |
### Memory Fields

| Field | Type | Description |
|---|---|---|
| `id` | string | Unique identifier |
| `userId` | string | Owner user ID |
| `kind` | enum | `profile`, `episode`, or `knowledge` |
| `title` | string | Memory title/key |
| `content` | string | Memory content |
| `tags` | string[] | Categorization tags |
| `importance` | number | Importance score (0.0-1.0) |
| `status` | enum | `active`, `archived`, or `deleted` |
| `source` | enum | `manual` or `extracted` |
| `createdAt` | datetime | Creation timestamp |
| `updatedAt` | datetime | Last update timestamp |
| `lastUsedAt` | datetime | Last retrieval timestamp |
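As an illustration of how these fields could map to a Drizzle table definition (the column names and enum encodings here are assumptions, not the project's actual schema file):

```typescript
// Sketch of the memories table using drizzle-orm/sqlite-core;
// Mirror Mate's real schema may differ in naming and column types.
import { sqliteTable, text, integer, real } from "drizzle-orm/sqlite-core";

export const memories = sqliteTable("memories", {
  id: text("id").primaryKey(),
  userId: text("user_id").notNull(),
  kind: text("kind", { enum: ["profile", "episode", "knowledge"] }).notNull(),
  title: text("title").notNull(),
  content: text("content").notNull(),
  tags: text("tags", { mode: "json" }).$type<string[]>(),
  importance: real("importance"),
  status: text("status", { enum: ["active", "archived", "deleted"] }).notNull(),
  source: text("source", { enum: ["manual", "extracted"] }).notNull(),
  createdAt: integer("created_at", { mode: "timestamp" }).notNull(),
  updatedAt: integer("updated_at", { mode: "timestamp" }).notNull(),
  lastUsedAt: integer("last_used_at", { mode: "timestamp" }),
});
```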
## Memory Management UI

Access the memory management interface at `/control/memory`.

### Features

- **View Memories**: List all memories with filtering
- **Create Memory**: Manually add new memories
- **Edit Memory**: Update existing memories
- **Delete Memory**: Soft delete or permanently remove
- **Filter**: By type (profile/episode/knowledge) and status
## API Endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/memories` | List memories |
| POST | `/api/memories` | Create memory |
| GET | `/api/memories/[id]` | Get memory details |
| PUT | `/api/memories/[id]` | Update memory |
| DELETE | `/api/memories/[id]` | Delete memory |
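For example, creating a memory with `fetch`; the accepted body fields are assumed to mirror the Memory Fields table above:

```typescript
// Create a knowledge memory via the API (body fields are an assumption).
const res = await fetch("http://localhost:3000/api/memories", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    userId: "user-1",
    kind: "knowledge",
    title: "Workplace",
    content: "User works at ACME Corp",
    tags: ["work"],
  }),
});
const memory = await res.json();
```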
### Query Parameters

**GET /api/memories**

| Parameter | Type | Description |
|---|---|---|
| `userId` | string | Filter by user ID |
| `kind` | string | Filter by type (profile/episode/knowledge) |
| `status` | string | Filter by status (active/archived/deleted) |

**DELETE /api/memories/[id]**

| Parameter | Type | Description |
|---|---|---|
| `hard` | boolean | If `true`, permanently delete |
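Putting the parameters together (IDs and values are placeholders, and the list response is assumed to be a JSON array):

```typescript
// List a user's active episode memories using the query parameters above.
const list = await fetch(
  "http://localhost:3000/api/memories?userId=user-1&kind=episode&status=active",
).then((r) => r.json());

// Permanently remove a memory instead of soft-deleting it.
await fetch("http://localhost:3000/api/memories/mem-123?hard=true", {
  method: "DELETE",
});
```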
## Setup

### 1. Set Up Embedding Service

**Option A: PLaMo-Embedding-1B (Recommended for Japanese)**

See Recommended Setup for PLaMo server setup on Mac Studio.

**Option B: Ollama with bge-m3 (Alternative)**

```bash
# Start Ollama
ollama serve

# Pull the embedding model
ollama pull bge-m3
```

### 2. Initialize Database

```bash
# Create data directory
mkdir -p data

# Run database migration
bun run db:push
```

### 3. Configure Providers
Edit `config/providers.yaml`:

```yaml
providers:
  embedding:
    enabled: true
    provider: ollama  # PLaMo server provides Ollama-compatible API
    ollama:
      model: plamo-embedding-1b
      baseUrl: "http://studio:8000"  # PLaMo (or http://localhost:11434 for Ollama)

memory:
  enabled: true
  rag:
    topK: 8
    threshold: 0.3
  extraction:
    autoExtract: true
    minConfidence: 0.5
```

### 4. Verify Setup
```bash
# Start the development server
bun run dev

# Open memory management
open http://localhost:3000/control/memory
```

## Docker Setup
When running in Docker, the database is persisted in a volume:

```yaml
# compose.yaml
services:
  mirrormate:
    volumes:
      - mirrormate-data:/app/data

volumes:
  mirrormate-data:
```

Configure embedding to use the PLaMo server:
```yaml
# config/providers.yaml
providers:
  embedding:
    enabled: true
    provider: ollama  # PLaMo server provides Ollama-compatible API
    ollama:
      model: plamo-embedding-1b
      baseUrl: "http://studio:8000"  # PLaMo embedding server
```

See Docker Documentation and Recommended Setup for details.
## Troubleshooting

### Embedding Service Not Available

Error: `Ollama embed API error: 404` or connection refused

Solution (PLaMo):

- Check the PLaMo server is running: `curl http://studio:8000/health`
- View logs: `docker compose -f compose.studio.yaml logs plamo-embedding`

Solution (Ollama/bge-m3):

- Ensure Ollama is running: `ollama serve`
- Pull the model: `ollama pull bge-m3`
- Verify the model exists: `ollama list`, or test the endpoint directly as sketched below
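A quick programmatic check against the embeddings endpoint (`/api/embeddings` is Ollama's standard endpoint; whether the PLaMo server implements it identically is an assumption, so adjust the base URL to match your `providers.yaml`):

```typescript
// Connectivity test: request one embedding and print the status and vector size.
const res = await fetch("http://localhost:11434/api/embeddings", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model: "bge-m3", prompt: "connectivity test" }),
});
console.log(res.status, (await res.json()).embedding?.length);
```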
### Database Not Found

Error: `SQLITE_CANTOPEN`

Solution:

- Create the data directory: `mkdir -p data`
- Run the migration: `bun run db:push`
### Memory Not Being Extracted

Solution:

- Check `memory.enabled` is `true` in config
- Check `extraction.autoExtract` is `true`
- Verify the LLM provider is working
- Check console logs for extraction errors
### Low Quality Retrieval

Solution:

- Lower the `threshold` value (e.g., 0.2)
- Increase the `topK` value
- Add more profile memories for better context
- Use a higher-quality embedding model
## Best Practices

### Memory Organization

- **Use profile memories for persistent info**: Things that rarely change
- **Use episode memories for recent events**: Time-sensitive information
- **Use knowledge memories for facts**: Learned information
### Performance Tips

- **Set appropriate thresholds**: Too low returns irrelevant results; too high misses context
- **Keep topK reasonable**: 5-10 is usually sufficient
- **Periodic cleanup**: Archive or delete outdated memories (see the sketch below)
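A sketch of what a cleanup pass could look like using the API endpoints above; the GET response shape and PUT payload here are assumptions:

```typescript
// Archive episode memories older than 90 days via the management API.
const cutoff = Date.now() - 90 * 86_400_000;
const episodes: { id: string; createdAt: string }[] = await fetch(
  "http://localhost:3000/api/memories?userId=user-1&kind=episode&status=active",
).then((r) => r.json());

for (const m of episodes) {
  if (new Date(m.createdAt).getTime() < cutoff) {
    await fetch(`http://localhost:3000/api/memories/${m.id}`, {
      method: "PUT",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ status: "archived" }),
    });
  }
}
```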
### Privacy Considerations

- **Review extracted memories**: Check what the LLM is storing
- **Manual cleanup**: Remove sensitive information if needed
- **User-specific memories**: Memories are scoped to user IDs
