# Docker Setup
Mirror Mate can be deployed using Docker with a recommended split architecture.
## Recommended Architecture
Run heavy services on a powerful server (e.g., Mac Studio) and the UI on a lightweight device (e.g., Raspberry Pi), connected via Tailscale:
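Roughly (ports match the compose files below; `studio` is an illustrative Tailscale hostname):

```
Raspberry Pi ────────── Tailscale ──────────▶ Mac Studio
  mirrormate UI (:3000)                         Ollama (:11434, native)
                                                faster-whisper (:8080)
                                                VOICEVOX (:50021)
                                                PLaMo-Embedding-1B (:8000)
```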
## Quick Start
### Mac Studio Setup
- **Install Ollama** (native)

  ```bash
  brew install ollama
  brew services start ollama
  ollama pull gpt-oss:20b
  ```

- **Start Docker Services**

  ```bash
  docker compose -f compose.studio.yaml up -d
  ```

  This starts:
  - faster-whisper (`:8080`) - speech-to-text (Whisper)
  - VOICEVOX (`:50021`) - text-to-speech
  - PLaMo-Embedding-1B (`:8000`) - Japanese-optimized embeddings
- **Verify**

  ```bash
  curl http://localhost:11434/api/tags          # Ollama
  curl http://localhost:8080/health             # Whisper
  curl http://localhost:50021/speakers | head   # VOICEVOX
  curl http://localhost:8000/health             # PLaMo
  ```
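For an end-to-end speech-to-text check, the fedirz/faster-whisper-server image exposes an OpenAI-compatible transcription endpoint; a sketch, assuming a short `sample.wav` in the current directory:

```bash
# Send a short WAV to the Whisper server; expect a JSON body with the transcript.
# Depending on the server version, the model field may need the full
# Hugging Face id (e.g. Systran/faster-whisper-large-v3).
curl http://localhost:8080/v1/audio/transcriptions \
  -F file=@sample.wav \
  -F model=large-v3 \
  -F language=ja
```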
### Raspberry Pi Setup

- **Create Environment File**

  ```bash
  cp .env.example .env
  # Edit .env with your API keys
  ```

- **Configure providers.yaml**
  Ensure `config/providers.yaml` points to your Mac Studio:
  ```yaml
  providers:
    llm:
      provider: ollama
      ollama:
        baseUrl: "http://studio:11434"  # Tailscale hostname
    tts:
      provider: voicevox
      voicevox:
        baseUrl: "http://studio:50021"  # Tailscale hostname
    stt:
      provider: local  # Use local Whisper instead of Web Speech API
      local:
        baseUrl: "http://studio:8080"  # Tailscale hostname
        model: large-v3
        language: ja
    embedding:
      provider: ollama
      ollama:
        baseUrl: "http://studio:8000"  # PLaMo server
  ```

- **Start UI**
  ```bash
  docker compose up -d
  ```

- **Access**

  Open the UI in a browser on port 3000 (the container publishes `3000:3000`; over Tailscale this is `http://<pi-hostname>:3000`).
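Before opening the UI, it can help to confirm the Pi actually reaches the Studio services over Tailscale; a small loop over the same endpoints used in the Mac Studio verification step:

```bash
# Run on the Raspberry Pi; each line should print 200.
for url in http://studio:11434/api/tags \
           http://studio:8080/health \
           http://studio:50021/speakers \
           http://studio:8000/health; do
  curl -fsS -o /dev/null -w "%{http_code}  $url\n" "$url"
done
```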
## Compose Files
### compose.yaml (UI - Raspberry Pi)

```yaml
services:
  mirrormate:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "3000:3000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - GOOGLE_SERVICE_ACCOUNT_EMAIL=${GOOGLE_SERVICE_ACCOUNT_EMAIL}
      - GOOGLE_PRIVATE_KEY=${GOOGLE_PRIVATE_KEY}
      - GOOGLE_CALENDAR_ID=${GOOGLE_CALENDAR_ID}
      - OLLAMA_API_KEY=${OLLAMA_API_KEY}
      - DISCORD_WEBHOOK_URL=${DISCORD_WEBHOOK_URL}
    volumes:
      - ./config:/app/config:ro
      - mirrormate-data:/app/data
    restart: unless-stopped

volumes:
  mirrormate-data:
```

### compose.studio.yaml (Services - Mac Studio)

```yaml
services:
  # Speech-to-text (Whisper)
  faster-whisper:
    image: fedirz/faster-whisper-server:latest-cpu
    ports:
      - "8080:8000"
    environment:
      - WHISPER__MODEL=large-v3
      - WHISPER__INFERENCE_DEVICE=cpu
    restart: unless-stopped

  # Text-to-speech
  voicevox:
    image: voicevox/voicevox_engine:cpu-ubuntu20.04-latest
    ports:
      - "50021:50021"
    restart: unless-stopped

  # Embedding
  plamo-embedding:
    build:
      context: ./plamo-server
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    restart: unless-stopped
```
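Since `providers.yaml` points the `ollama` embedding provider at the PLaMo server, `plamo-server` presumably mirrors Ollama's embeddings API; a hypothetical smoke test (the endpoint, payload shape, and model name are assumptions, so adjust to whatever `./plamo-server` actually implements):

```bash
# Hypothetical: Ollama-style embeddings request against the PLaMo server.
curl http://localhost:8000/api/embeddings \
  -d '{"model": "plamo-embedding-1b", "prompt": "こんにちは"}'
```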
## Data Persistence

The SQLite database (memories, users, sessions) is stored in the `mirrormate-data` Docker volume:
```bash
# View volume location
docker volume inspect mirrormate-data

# Backup the database
docker run --rm -v mirrormate-data:/data -v $(pwd):/backup alpine \
  cp /data/mirrormate.db /backup/mirrormate-backup.db

# Restore from backup
docker run --rm -v mirrormate-data:/data -v $(pwd):/backup alpine \
  cp /backup/mirrormate-backup.db /data/mirrormate.db
```
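To confirm a backup is actually usable, `sqlite3` can run an integrity check against the copied file (assumes `sqlite3` is installed on the host; any SQLite client works):

```bash
# Should print "ok" for a healthy backup file.
sqlite3 mirrormate-backup.db "PRAGMA integrity_check;"
```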
## Environment Variables

Create a `.env` file:
```bash
# Required for OpenAI LLM/TTS (if not using Ollama/VOICEVOX)
OPENAI_API_KEY=sk-...

# Optional: Google Calendar
GOOGLE_SERVICE_ACCOUNT_EMAIL=...
GOOGLE_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
GOOGLE_CALENDAR_ID=...

# Optional: Web Search (Ollama)
# Get API key from: https://ollama.com/settings/keys
OLLAMA_API_KEY=your-ollama-api-key

# Optional: Discord Integration
DISCORD_WEBHOOK_URL=https://discord.com/api/webhooks/...
```
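To verify that Compose picks the values up, `docker compose config` renders the file with `.env` interpolation applied (standard Compose behavior; note that secrets appear in plain text in the output):

```bash
# Show only the rendered environment block, to avoid dumping all secrets.
docker compose config | grep -A7 "environment:"
```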
## GPU Support

### NVIDIA GPU (Linux)

For faster inference with an NVIDIA GPU, switch to the CUDA images and reserve the GPU in `compose.studio.yaml`:
```yaml
services:
  # Whisper with CUDA
  faster-whisper:
    image: fedirz/faster-whisper-server:latest-cuda
    ports:
      - "8080:8000"
    environment:
      - WHISPER__MODEL=large-v3
      - WHISPER__INFERENCE_DEVICE=cuda
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

  # VOICEVOX with CUDA
  voicevox:
    image: voicevox/voicevox_engine:nvidia-ubuntu20.04-latest
    ports:
      - "50021:50021"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```
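The `deploy.resources.reservations.devices` block requires the NVIDIA Container Toolkit on the host; a quick sanity check (standard toolkit verification, independent of this project):

```bash
# Should print the usual nvidia-smi GPU table if the toolkit is working.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```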
### Apple Silicon (Mac Studio)

Use the CPU images; Apple Silicon is fast enough for real-time inference:

- faster-whisper: the `large-v3` model processes 30s of audio in ~15s
- VOICEVOX: the CPU version works well on M1/M2 Ultra

**Note:** Metal GPU acceleration is not available inside Docker containers.
## Troubleshooting
### Cannot connect to Ollama

**Error:** `Connection refused to studio:11434`

**Solution:**
- Ensure Ollama is running on the Mac Studio: `ollama serve` or `brew services start ollama`
- Verify the Tailscale connection: `ping studio`
- Check that Ollama is listening: `curl http://studio:11434/api/tags`
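If Ollama answers on the Studio itself but refuses connections from the Pi, it is likely bound to `127.0.0.1` only. Ollama reads the `OLLAMA_HOST` environment variable; one way to set it for launchd-managed services on macOS (a common fix, verify against your Ollama install):

```bash
# On the Mac Studio: bind Ollama to all interfaces, then restart the service.
launchctl setenv OLLAMA_HOST "0.0.0.0"
brew services restart ollama
```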
### VOICEVOX not responding

**Error:** `Connection refused` on port 50021

**Solution:**
- Check the VOICEVOX container: `docker compose -f compose.studio.yaml ps`
- Check the logs: `docker compose -f compose.studio.yaml logs voicevox`
- Wait for startup (the first launch takes time to load models); if it still fails, try the round-trip below
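A direct round-trip through the VOICEVOX engine's standard `/audio_query` and `/synthesis` endpoints isolates the engine from the UI:

```bash
# Build an audio query for a short phrase (-G url-encodes the params,
# -X POST keeps the method), then synthesize it to a WAV file.
curl -s -X POST -G --data-urlencode "text=こんにちは" --data-urlencode "speaker=1" \
  "http://studio:50021/audio_query" > query.json
curl -s -X POST "http://studio:50021/synthesis?speaker=1" \
  -H "Content-Type: application/json" -d @query.json -o hello.wav
```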
### Config changes not applied

**Solution:** The `config` directory is mounted read-only into the container, so edits on the host take effect immediately; the UI may need a page refresh to pick them up.
### Whisper not transcribing

**Error:** No transcription returned, or an empty response

**Solution:**
- Check the faster-whisper container: `docker compose -f compose.studio.yaml ps`
- Check the logs: `docker compose -f compose.studio.yaml logs faster-whisper`
- The first request may be slow (model loading)
- Verify the endpoint: `curl http://studio:8080/health`
- Check that `providers.yaml` has `provider: local` (not `web`); see the check after this list
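A quick way to confirm the active STT provider from the Pi (a plain grep over the mounted config; adjust the path to your checkout):

```bash
# Should print "provider: local" under the stt block.
grep -A1 "stt:" config/providers.yaml
```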
## Commands
### Mac Studio
```bash
# Start all services
docker compose -f compose.studio.yaml up -d

# View logs
docker compose -f compose.studio.yaml logs -f

# Stop
docker compose -f compose.studio.yaml down
```

### Raspberry Pi
```bash
# Start UI
docker compose up -d

# View logs
docker compose logs -f

# Stop
docker compose down

# Rebuild after code changes
docker compose build --no-cache
docker compose up -d
```
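To pick up newer upstream service images on the Studio (the standard Compose update flow, nothing project-specific):

```bash
# Pull updated images and recreate only the containers that changed.
docker compose -f compose.studio.yaml pull
docker compose -f compose.studio.yaml up -d
```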