Memory System

Mirror Mate includes a memory system that enables persistent user context through RAG (Retrieval-Augmented Generation). The system stores user information, extracts memories from conversations, and provides relevant context to the AI.

Configuration

Memory settings are configured in:

  • config/providers.yaml - Provider and RAG settings
  • config/locales/[lang]/memory.yaml - Extraction prompts (locale-specific)

The database is also locale-specific: data/mirrormate.[lang].db

```yaml
providers:
  embedding:
    enabled: true
    provider: ollama  # PLaMo server provides an Ollama-compatible API
    ollama:
      model: plamo-embedding-1b
      baseUrl: "http://studio:8000"  # PLaMo embedding server

  memory:
    enabled: true
    # RAG settings
    rag:
      topK: 8           # Max memories to retrieve
      threshold: 0.3    # Minimum similarity score
    # Memory extraction settings
    extraction:
      autoExtract: true      # Auto-extract from conversations
      minConfidence: 0.5     # Minimum confidence threshold
```

Note: PLaMo-Embedding-1B is recommended for Japanese. See Recommended Setup for details. You can also use bge-m3 via Ollama as an alternative.

Memory Types

| Type | Description | Example |
| --- | --- | --- |
| profile | User preferences and traits | "Favorite color: blue" |
| episode | Recent interactions and events | "Asked about weather on 2024-01-01" |
| knowledge | Facts and learned information | "User works at ACME Corp" |

Profile Memories

Profile memories store persistent user information:

  • User preferences (language, style)
  • Personality traits
  • Communication preferences
  • Recurring topics of interest

Profile memories are always included in the RAG context.

Episode Memories

Episode memories capture recent interactions:

  • Recent conversations
  • Events and activities
  • Time-sensitive information

Episodes have a recency factor that prioritizes recent memories.
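
The exact decay is an implementation detail, but conceptually the similarity score is weighted down as an episode ages. A minimal TypeScript sketch, assuming a simple exponential half-life (the constant and the multiplicative combination are illustrative assumptions, not the actual implementation):

```ts
// Illustrative recency weighting for episode memories; the half-life
// and the multiplicative combination are assumptions for this sketch.
const HALF_LIFE_DAYS = 7; // hypothetical: an episode's weight halves weekly

function recencyWeight(createdAt: Date, now: Date = new Date()): number {
  const ageDays = (now.getTime() - createdAt.getTime()) / 86_400_000;
  return Math.pow(0.5, ageDays / HALF_LIFE_DAYS); // 1.0 today, 0.5 after a week
}

// Combined score: semantic similarity scaled by recency.
const score = (similarity: number, createdAt: Date) =>
  similarity * recencyWeight(createdAt);
```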

Knowledge Memories

Knowledge memories store factual information:

  • User's work, hobbies, relationships
  • Learned facts from conversations
  • Important dates and information

RAG (Retrieval-Augmented Generation)

The RAG system retrieves relevant memories to provide context-aware responses.

How It Works

  1. Embed Query: Convert the user input to a vector via the Ollama-compatible embeddings API
  2. Semantic Search: Find similar memories using cosine similarity
  3. Rank Results: Sort by similarity score and filter by threshold
  4. Format Context: Combine profiles and relevant memories into a prompt
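
In code, these four steps reduce to a few lines. A minimal sketch, assuming the Ollama-compatible /api/embeddings route and an in-memory list of embedded memories (the real system persists vectors in the memory_embeddings table):

```ts
// Minimal RAG retrieval sketch. The endpoint and payload follow Ollama's
// embeddings API; the Memory/Embedded shapes are assumptions for illustration.
type Memory = { id: string; kind: "profile" | "episode" | "knowledge"; content: string };
type Embedded = { memory: Memory; vector: number[] };

async function embed(text: string): Promise<number[]> {
  const res = await fetch("http://studio:8000/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "plamo-embedding-1b", prompt: text }),
  });
  const { embedding } = await res.json(); // Ollama returns { embedding: number[] }
  return embedding;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] ** 2;
    nb += b[i] ** 2;
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

async function retrieve(query: string, all: Embedded[], topK = 8, threshold = 0.3) {
  const q = await embed(query); // 1. embed the query
  return all
    .map((e) => ({ memory: e.memory, score: cosine(q, e.vector) })) // 2. similarity
    .filter((r) => r.score >= threshold) // 3a. drop weak matches
    .sort((a, b) => b.score - a.score)   // 3b. rank by score
    .slice(0, topK); // cap at topK; profile memories are added to the context separately
}
```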

Configuration Options

| Option | Type | Description | Default |
| --- | --- | --- | --- |
| topK | number | Maximum memories to retrieve | 8 |
| threshold | number | Minimum similarity score (0.0-1.0) | 0.3 |

Example Context Output

```
[User Profile]
- Preferred language: Japanese
- Interests: programming, music

[Related Information]
- [Important] (Note) User works at a tech company
- (Recent) Asked about weather forecast yesterday
```

Memory Extraction

The system automatically extracts memories from conversations using the LLM.

How It Works

  1. Analyze Conversation: Send recent messages to LLM for analysis
  2. Extract Information: LLM identifies memorable facts and updates
  3. Validate Results: Filter by confidence score
  4. Store Memories: Save to database with embeddings
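
Put together, the flow is a short loop. A rough sketch, where chat, embed, and saveMemory are hypothetical stand-ins for the real LLM provider and persistence layer, and the JSON reply shape is an assumption:

```ts
// Hypothetical extraction loop; chat(), embed(), and saveMemory() stand in
// for the real LLM provider and persistence layer.
type Extracted = {
  kind: "profile" | "episode" | "knowledge";
  title: string;
  content: string;
  confidence: number;
};

declare function chat(system: string, user: string): Promise<string>;
declare function embed(text: string): Promise<number[]>;
declare function saveMemory(m: Extracted & { embedding: number[] }): Promise<void>;

async function extractMemories(
  systemPrompt: string, // loaded from config/locales/[lang]/memory.yaml
  conversation: string, // recent messages, formatted with the configured labels
  minConfidence = 0.5,
): Promise<void> {
  const reply = await chat(systemPrompt, conversation); // 1-2. analyze and extract
  const candidates: Extracted[] = JSON.parse(reply);    // assumes a JSON array reply
  for (const c of candidates) {
    if (c.confidence < minConfidence) continue;         // 3. validate by confidence
    await saveMemory({ ...c, embedding: await embed(c.content) }); // 4. store
  }
}
```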

Configuration Options

| Option | Type | Description | Default |
| --- | --- | --- | --- |
| autoExtract | boolean | Enable automatic extraction | true |
| minConfidence | number | Minimum confidence for saving (0.0-1.0) | 0.5 |

Prompt Configuration

Extraction prompts are configured in config/locales/[lang]/memory.yaml:

```yaml
memory:
  extraction:
    # System prompt for the LLM. The Japanese reads: "You are an expert
    # at extracting important information from conversations. ..."
    systemPrompt: |
      あなたは会話から重要な情報を抽出する専門家です。
      ...

    # Labels for the user prompt
    labels:
      user: ユーザー                            # "User"
      assistant: アシスタント                  # "Assistant"
      conversationHistory: "## 会話履歴"        # "Conversation history"
      existingProfiles: "## 既存の Profile"     # "Existing profiles"
      relatedMemories: "## 関連する既存の記憶"  # "Related existing memories"
      # The task block reads: "## Task — From the conversation above,
      # extract the information that should be saved as memories. ..."
      task: |
        ## タスク
        上記の会話から、記憶として保存すべき情報を抽出してください。
        ...
```

This allows customizing the extraction behavior without modifying code.

Extraction Process

The LLM is prompted to extract:

  • Profile Updates: Changes to user preferences or traits
  • New Memories: Facts worth remembering
  • Archive Candidates: Outdated or superseded information
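
The exact reply format isn't documented here, but a result type covering those three categories might look like this (all field names are assumptions for illustration):

```ts
// Assumed shape of an extraction result; field names are illustrative,
// not the actual wire format used by Mirror Mate.
interface ExtractionResult {
  profileUpdates: { title: string; content: string; confidence: number }[];
  newMemories: {
    kind: "episode" | "knowledge";
    title: string;
    content: string;
    confidence: number;
  }[];
  archiveCandidates: { id: string; reason: string }[]; // outdated or superseded
}
```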

Database Schema

Mirror Mate uses SQLite with Drizzle ORM for persistence.

Tables

| Table | Description |
| --- | --- |
| users | User accounts |
| sessions | Conversation sessions |
| messages | Chat messages |
| memories | Stored memories |
| memory_embeddings | Vector embeddings for semantic search |

Memory Fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique identifier |
| userId | string | Owner user ID |
| kind | enum | profile, episode, or knowledge |
| title | string | Memory title/key |
| content | string | Memory content |
| tags | string[] | Categorization tags |
| importance | number | Importance score (0.0-1.0) |
| status | enum | active, archived, or deleted |
| source | enum | manual or extracted |
| createdAt | datetime | Creation timestamp |
| updatedAt | datetime | Last update timestamp |
| lastUsedAt | datetime | Last retrieval timestamp |
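
As a rough Drizzle sketch of the memories table (column names follow the fields above; the actual schema in the repo may differ):

```ts
// Hypothetical Drizzle schema matching the fields above; the actual
// schema in the repo may differ in names and column types.
import { sqliteTable, text, integer, real } from "drizzle-orm/sqlite-core";

export const memories = sqliteTable("memories", {
  id: text("id").primaryKey(),
  userId: text("user_id").notNull(),
  kind: text("kind", { enum: ["profile", "episode", "knowledge"] }).notNull(),
  title: text("title").notNull(),
  content: text("content").notNull(),
  tags: text("tags", { mode: "json" }).$type<string[]>(),
  importance: real("importance"), // 0.0-1.0
  status: text("status", { enum: ["active", "archived", "deleted"] }).notNull(),
  source: text("source", { enum: ["manual", "extracted"] }).notNull(),
  createdAt: integer("created_at", { mode: "timestamp" }).notNull(),
  updatedAt: integer("updated_at", { mode: "timestamp" }).notNull(),
  lastUsedAt: integer("last_used_at", { mode: "timestamp" }),
});
```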

Memory Management UI

Access the memory management interface at /control/memory.

Features

  • View Memories: List all memories with filtering
  • Create Memory: Manually add new memories
  • Edit Memory: Update existing memories
  • Delete Memory: Soft delete or permanently remove
  • Filter: By type (profile/episode/knowledge) and status

API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /api/memories | List memories |
| POST | /api/memories | Create memory |
| GET | /api/memories/[id] | Get memory details |
| PUT | /api/memories/[id] | Update memory |
| DELETE | /api/memories/[id] | Delete memory |

Query Parameters

GET /api/memories

| Parameter | Type | Description |
| --- | --- | --- |
| userId | string | Filter by user ID |
| kind | string | Filter by type (profile/episode/knowledge) |
| status | string | Filter by status (active/archived/deleted) |

DELETE /api/memories/[id]

| Parameter | Type | Description |
| --- | --- | --- |
| hard | boolean | If true, permanently delete |
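
For example, listing a user's active profile memories and then deleting one could look like this (the base URL is the default dev address from the setup below; "alice" and "mem_123" are placeholders):

```ts
// Example calls against the endpoints above; the base URL is the default
// dev server address, and the IDs are placeholders.
const BASE = "http://localhost:3000";

// List active profile memories for a user.
const memories = await fetch(
  `${BASE}/api/memories?userId=alice&kind=profile&status=active`,
).then((r) => r.json());

// Soft-delete a memory (omit hard=true to keep it recoverable).
await fetch(`${BASE}/api/memories/mem_123`, { method: "DELETE" });

// Permanently delete.
await fetch(`${BASE}/api/memories/mem_123?hard=true`, { method: "DELETE" });
```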

Setup

1. Set Up Embedding Service

Option A: PLaMo-Embedding-1B (Recommended for Japanese)

See Recommended Setup for PLaMo server setup on Mac Studio.

Option B: Ollama with bge-m3 (Alternative)

```bash
# Start Ollama
ollama serve

# Pull the embedding model
ollama pull bge-m3
```
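
To confirm the model responds before wiring it into Mirror Mate, a quick smoke test against the Ollama-compatible embeddings route (for PLaMo, swap in http://studio:8000 and plamo-embedding-1b):

```ts
// Smoke test for the embedding endpoint via Ollama's /api/embeddings API.
const res = await fetch("http://localhost:11434/api/embeddings", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model: "bge-m3", prompt: "hello" }),
});
const { embedding } = await res.json();
console.log(`embedding dimension: ${embedding.length}`);
```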

2. Initialize Database

```bash
# Create data directory
mkdir -p data

# Run database migration
bun run db:push
```

3. Configure Providers

Edit config/providers.yaml:

```yaml
providers:
  embedding:
    enabled: true
    provider: ollama  # PLaMo server provides an Ollama-compatible API
    ollama:
      model: plamo-embedding-1b
      baseUrl: "http://studio:8000"  # PLaMo (or http://localhost:11434 for Ollama)

  memory:
    enabled: true
    rag:
      topK: 8
      threshold: 0.3
    extraction:
      autoExtract: true
      minConfidence: 0.5
```

4. Verify Setup

```bash
# Start the development server
bun run dev

# Open memory management
open http://localhost:3000/control/memory
```

Docker Setup

When running in Docker, the database is persisted in a volume:

```yaml
# compose.yaml
services:
  mirrormate:
    volumes:
      - mirrormate-data:/app/data

volumes:
  mirrormate-data:
```

Configure embedding to use PLaMo server:

```yaml
# config/providers.yaml
providers:
  embedding:
    enabled: true
    provider: ollama  # PLaMo server provides an Ollama-compatible API
    ollama:
      model: plamo-embedding-1b
      baseUrl: "http://studio:8000"  # PLaMo embedding server
```

See Docker Documentation and Recommended Setup for details.


Troubleshooting

Embedding Service Not Available

Error: Ollama embed API error: 404 or connection refused

Solution (PLaMo):

  1. Check that the PLaMo server is running: curl http://studio:8000/health
  2. View logs: docker compose -f compose.studio.yaml logs plamo-embedding

Solution (Ollama/bge-m3):

  1. Ensure Ollama is running: ollama serve
  2. Pull the model: ollama pull bge-m3
  3. Verify the model exists: ollama list

Database Not Found

Error: SQLITE_CANTOPEN

Solution:

  1. Create data directory: mkdir -p data
  2. Run migration: bun run db:push

Memory Not Being Extracted

Solution:

  1. Check memory.enabled is true in config
  2. Check extraction.autoExtract is true
  3. Verify LLM provider is working
  4. Check console logs for extraction errors

Low Quality Retrieval

Solution:

  1. Lower the threshold value (e.g., 0.2)
  2. Increase the topK value
  3. Add more profile memories for better context
  4. Use a higher quality embedding model

Best Practices

Memory Organization

  1. Use profile memories for persistent info: Things that rarely change
  2. Use episode memories for recent events: Time-sensitive information
  3. Use knowledge memories for facts: Learned information

Performance Tips

  1. Set appropriate thresholds: Too low = irrelevant results, too high = missing context
  2. Keep topK reasonable: 5-10 is usually sufficient
  3. Periodic cleanup: Archive or delete outdated memories

Privacy Considerations

  1. Review extracted memories: Check what the LLM is storing
  2. Manual cleanup: Remove sensitive information if needed
  3. User-specific memories: Memories are scoped to user IDs
