RPG API Architecture
Overview
The RPG API is a Pydantic AI-powered Dungeon Master agent for solo TTRPG campaigns. It provides persistent world state, tool calling for game mechanics, and chat responses through a synchronous endpoint (an SSE streaming endpoint also exists but is deprecated).
System Architecture
┌─────────────────────────────────────────────────────────────────┐
│ rpg-web (Next.js) │
│ ┌─────────────┐ ┌─────────────────────────────────────────┐ │
│ │ Campaign │ │ Chat Interface (assistant-ui) │ │
│ │ Selector │ │ - Sync + typewriter animation │ │
│ │ │ │ - Tool call visibility │ │
│ │ │ │ - Session persistence (localStorage) │ │
│ └─────────────┘ └─────────────────────────────────────────┘ │
│ │ │
│ Next.js API Proxy Routes │
│ /api/campaigns/[id]/chat │
└──────────────────────────────┬──────────────────────────────────┘
│ HTTP (internal K8s)
▼
┌─────────────────────────────────────────────────────────────────┐
│ palimpsest (FastAPI) │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ DM Agent (Pydantic AI) │ │
│ │ - System prompt with response format rules │ │
│ │ - Multi-tool calls per turn │ │
│ │ - DeepSeek via OpenRouter │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────┴───────────────────────────┐ │
│ │ Tools │ │
│ │ roll_dice - Dice rolling with notation parsing │ │
│ │ update_pc - HP, location, conditions, inventory │ │
│ │ get_entity - Retrieve NPC/location/item details │ │
│ │ search_entities- Find entities by keyword │ │
│ │ add_thread - Create plot threads │ │
│ │ resolve_thread - Mark threads complete │ │
│ │ add_timeline - Record events to campaign timeline │ │
│ └───────────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────┴───────────────────────────┐ │
│ │ Context Injection Layer │ │
│ │ 1. Campaign premise (from seed.md) │ │
│ │ 2. PC state (HP, location, conditions) │ │
│ │ 3. Active threads (high/medium priority) │ │
│ │ 4. Smart context (auto-retrieve mentioned entities) │ │
│ └───────────────────────────────────────────────────────┘ │
└──────────────────────────────┬──────────────────────────────────┘
│
┌────────────────────┼────────────────────┐
│ │ │
▼ ▼ ▼
┌─────────────────────┐ ┌──────────────────┐ ┌─────────────────────┐
│ Campaign Vault │ │ pgvector │ │ Supabase (prod) │
│ (PVC Mount) │ │ (dedicated) │ │ │
│ │ │ │ │ palimpsest.campaigns│
│ /vault/{campaign} │ │ palimpsest_ │ │ palimpsest.sessions │
│ ├── canon/ │ │ vectors db │ │ palimpsest.messages │
│ │ ├── pcs/ │ │ │ │ │
│ │ ├── npcs/ │ │ - embeddings │ │ Chat history │
│ │ ├── locations/│ │ - rag_config │ │ Session management │
│ │ └── ... │ │ │ │ Vault state │
│ ├── _gm/ │ │ RAG vector │ │ │
│ └── sessions/ │ │ similarity search│ │ │
└─────────────────────┘ └──────────────────┘ └─────────────────────┘
Database Architecture
Palimpsest uses two PostgreSQL databases for different purposes:
Supabase (Shared Infrastructure)
- Host: postgres.supabase.svc.cluster.local (prod Supabase)
- Schema: palimpsest
- Purpose: Relational data storage
- Tables:
  - campaigns - Campaign metadata
  - sessions - Play sessions
  - chat_messages - Conversation history
pgvector (Dedicated Instance)
- Host: pgvector.palimpsest-api.svc.cluster.local
- Database: palimpsest_vectors
- Purpose: RAG vector embeddings for AI similarity search
- Tables:
  - embeddings - Vector embeddings for entity content
  - rag_config - RAG configuration

Why two databases?
- Vector similarity queries can be resource-intensive
- Isolating embeddings prevents impact on other Supabase apps
- Allows independent tuning of pgvector for AI workloads
Connection config: See manifests/apps/palimpsest-api/configmap.yaml
Key Components
1. DM Agent (api/agent/dm.py)
The core agent uses Pydantic AI with a curated system prompt:
- Response format: 500-3000 characters, 5 action options
- Secrets system: NPCs have secrets that require effort to discover
- Tool usage: Dice rolls, PC updates, entity lookups
- Multi-tool support: Can make multiple tool calls per turn
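The dice rolls mentioned above go through the roll_dice tool, which parses dice notation such as "1d20+5". The following is a minimal sketch of that kind of parser, not the project's actual implementation: the supported grammar (count, sides, optional +/- modifier), the return shape, and the optional rng parameter are all assumptions made for illustration.

```python
import random
import re
from typing import Optional

# Assumed notation: <count>d<sides> with an optional +/- modifier, e.g. "1d20+5".
_DICE_RE = re.compile(r"^\s*(\d*)d(\d+)\s*([+-]\s*\d+)?\s*$", re.IGNORECASE)


def roll_dice(notation: str, rng: Optional[random.Random] = None) -> dict:
    """Parse dice notation, roll it, and return the rolls, modifier, and total."""
    match = _DICE_RE.match(notation)
    if match is None:
        raise ValueError(f"unsupported dice notation: {notation!r}")

    count = int(match.group(1) or 1)  # "d20" is treated as "1d20"
    sides = int(match.group(2))
    modifier = int(match.group(3).replace(" ", "")) if match.group(3) else 0
    if count < 1 or sides < 1:
        raise ValueError("dice count and sides must be positive")

    rng = rng or random.Random()  # pass a seeded Random for reproducible rolls
    rolls = [rng.randint(1, sides) for _ in range(count)]
    return {
        "notation": notation,
        "rolls": rolls,
        "modifier": modifier,
        "total": sum(rolls) + modifier,
    }
```

Returning the individual rolls alongside the total lets a frontend show tool-call details (for example, the attack roll and the damage roll separately) rather than only a final number.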
2. Context Injection (api/routes/chat.py)
Before each agent run, context is automatically injected:
- Campaign premise - Setting and premise from seed.md
- Session state - Current day, time, session number
- PC status - HP, location, conditions
- Active threads - High and medium priority plot hooks
- Smart context - Entities mentioned in player message (RAG-like)
Because this context is assembled directly from campaign files and stored state, each turn is context-aware without depending on vector-embedding retrieval.
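As an illustration of the injection step, the sketch below assembles the five context items into a single text block. It is a hedged example only: the TurnState and PlotThread classes, their field names, and the output layout are hypothetical and are not taken from api/routes/chat.py.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PlotThread:
    title: str
    priority: str  # "high", "medium", or "low"


@dataclass
class TurnState:
    """Hypothetical container for everything injected before an agent run."""
    premise: str
    day: int
    time_of_day: str
    session_number: int
    pc_hp: int
    pc_max_hp: int
    pc_location: str
    pc_conditions: List[str] = field(default_factory=list)
    threads: List[PlotThread] = field(default_factory=list)
    mentioned: Dict[str, str] = field(default_factory=dict)  # entity name -> summary


def build_context(state: TurnState) -> str:
    """Render the per-turn context block in the order listed above."""
    conditions = ", ".join(state.pc_conditions) or "none"
    sections = [
        f"Campaign premise: {state.premise}",
        f"Session state: day {state.day}, {state.time_of_day}, "
        f"session {state.session_number}",
        f"PC status: HP {state.pc_hp}/{state.pc_max_hp}, "
        f"at {state.pc_location}, conditions: {conditions}",
    ]
    # Only high and medium priority threads are injected; low priority is skipped.
    active = [t for t in state.threads if t.priority in ("high", "medium")]
    if active:
        sections.append(
            "Active threads: " + "; ".join(f"{t.title} ({t.priority})" for t in active)
        )
    for name, summary in state.mentioned.items():
        sections.append(f"Mentioned entity: {name}: {summary}")
    return "\n".join(sections)
```

Filtering threads by priority before rendering keeps the injected block small, which is in line with the targeted token cost the comparison table describes.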
3. Campaign Spawning (POST /campaigns)
Creates a complete campaign from minimal seed data:
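# RPG_API_URL is a placeholder for the API base URL; it is not defined in this document.
curl -X POST "$RPG_API_URL/campaigns" \
  -H "Content-Type: application/json" \
  -d '{
    "campaign_id": "mojave",
    "name": "Mojave Wasteland",
    "setting": "post-apocalyptic",
    "premise": "A courier shot in the head...",
    "pc_name": "Six",
    "pc_class": "Courier",
    "pc_concept": "Survivor seeking revenge"
  }'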
This creates:
- canon/seed.md - Campaign configuration
- canon/pcs/{name}.md - Player character file
- canon/current-state.md - Campaign state
- canon/temporal-index.json - Time tracking
- canon/open-threads.md - Plot thread index
- canon/timeline.md - Event history
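The file layout above can be sketched as a small scaffolding function. This is an assumption-laden illustration rather than the real spawning code: the function name spawn_campaign, the file contents, the lowercase PC file name, and the keys in temporal-index.json are all hypothetical placeholders.

```python
import json
from pathlib import Path
from typing import Dict, List


def spawn_campaign(vault_root: Path, seed: Dict[str, str]) -> List[str]:
    """Create the canon/ skeleton for a new campaign and list the files written."""
    campaign_dir = vault_root / seed["campaign_id"]
    if campaign_dir.exists():
        # Refuse to overwrite an existing campaign vault.
        raise FileExistsError(f"campaign already exists: {campaign_dir}")

    canon = campaign_dir / "canon"
    (canon / "pcs").mkdir(parents=True)

    # Assumed naming: lowercase PC name with spaces replaced by hyphens.
    pc_file = seed["pc_name"].strip().lower().replace(" ", "-")
    files = {
        canon / "seed.md": (
            f"# {seed['name']}\n\nSetting: {seed['setting']}\n\n"
            f"Premise: {seed['premise']}\n"
        ),
        canon / "pcs" / f"{pc_file}.md": (
            f"# {seed['pc_name']}\n\nClass: {seed['pc_class']}\n\n"
            f"Concept: {seed['pc_concept']}\n"
        ),
        canon / "current-state.md": "# Current State\n",
        # Placeholder time-tracking keys; the real schema is not documented here.
        canon / "temporal-index.json": json.dumps({"day": 1, "session": 1}, indent=2),
        canon / "open-threads.md": "# Open Threads\n",
        canon / "timeline.md": "# Timeline\n",
    }
    for path, content in files.items():
        path.write_text(content, encoding="utf-8")
    return sorted(p.relative_to(campaign_dir).as_posix() for p in files)
```

Called with the seed from the curl example and a vault root, this returns the six relative paths listed above, with the PC file at canon/pcs/six.md.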
4. Smart Context Extraction
When a player mentions an entity name (e.g., "I talk to Marlena"), the system automatically:
- Extracts potential entity names from the message
- Searches the vault for matching NPCs/locations/items
- Injects a summary into the context
This provides RAG-like ambient context without vector embeddings.
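The three steps above can be sketched with plain keyword matching against file names in the vault. This is one possible approach under stated assumptions, not the actual extraction logic: the items/ folder name, the stopword list, matching on file-name words, and using the first line of a file as its summary are all hypothetical choices.

```python
import re
from pathlib import Path
from typing import Dict, Set

ENTITY_DIRS = ("npcs", "locations", "items")  # assumed canon/ subfolders
STOPWORDS = {"the", "and", "into", "with", "from", "for", "that", "this"}


def extract_candidates(message: str) -> Set[str]:
    """Step 1: collect lowercase words of 3+ letters as candidate entity names."""
    words = re.findall(r"[A-Za-z][A-Za-z'-]{2,}", message)
    return {w.lower() for w in words} - STOPWORDS


def find_mentioned_entities(vault: Path, campaign: str, message: str) -> Dict[str, str]:
    """Steps 2 and 3: match candidates to entity files and return short summaries."""
    candidates = extract_candidates(message)
    found: Dict[str, str] = {}
    for kind in ENTITY_DIRS:
        folder = vault / campaign / "canon" / kind
        if not folder.is_dir():
            continue
        for path in sorted(folder.glob("*.md")):
            # "dusty-saloon.md" matches if the message contains "dusty" or "saloon".
            name_parts = set(re.split(r"[-_\s]+", path.stem.lower()))
            if name_parts & candidates:
                lines = path.read_text(encoding="utf-8").strip().splitlines()
                found[f"{kind}/{path.stem}"] = lines[0] if lines else ""
    return found
```

Matching on file names keeps the lookup cheap, at the cost of missing entities that are referred to by an alias or a description; that trade-off is part of why this is only RAG-like.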
Request Flow
Player: "I attack the goblin"
│
▼
┌─────────────────────────────────────┐
│ 1. Extract mentioned entities │
│ - "goblin" → search vault │
│ - Found: goblin NPC file │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ 2. Build context injection │
│ - Campaign premise │
│ - PC state (HP: 10/10) │
│ - Mentioned entity: goblin │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ 3. Run DM agent │
│ - Agent decides to call tools │
│ - roll_dice("1d20+5") → 18 │
│ - roll_dice("1d8+3") → 7 │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ 4. Stream response │
│ "Your blade arcs through..." │
│ Tool events: attack=18, dmg=7 │
│ 5 action options │
└─────────────────────────────────────┘
Comparison: AnythingLLM vs Pydantic AI
| Feature | AnythingLLM | Pydantic AI Agent |
|---|---|---|
| Tools per turn | 1 (macro workaround) | Multiple (native) |
| Context source | RAG vector embeddings | Injection + smart context |
| Token cost | High (8-16k/request) | Low (targeted) |
| State persistence | File-based only | Supabase + files |
| Retrieval | Automatic (similarity) | Agent-controlled + auto |
| Response quality | Good | Good (with proper prompt) |
API Endpoints
Chat
- POST /campaigns/{id}/chat - Streaming SSE chat (deprecated)
- POST /campaigns/{id}/chat/sync - Synchronous chat (recommended)
Streaming vs Sync Endpoints
The frontend uses the sync endpoint with client-side typewriter animation. This was chosen over SSE streaming due to a Pydantic AI limitation:
| Endpoint | Tool Execution | UX | Use Case |
|---|---|---|---|
| /chat (SSE) | Unreliable - tools after text may not execute | Real-time text | Not recommended |
| /chat/sync | 100% reliable - all tools execute | Typewriter animation | Production use |
Why sync is preferred:
- State persistence reliability: Streaming has a Pydantic AI limitation where tools called after text is emitted may not execute. This means update_pc() calls for HP, location, or inventory changes can be silently dropped.
- Typewriter provides similar UX: The frontend animates the sync response character-by-character at ~180 chars/sec, providing a similar "streaming" feel while guaranteeing all tool calls complete.
- Simpler error handling: Sync responses are atomic - either the full response succeeds or fails. No partial state from interrupted streams.
Implementation (rpg-web/src/components/rpg/runtime-provider.tsx):
// Fetch complete response from sync endpoint
const response = await fetch(`/api/campaigns/${campaignId}/chat/sync`, {...});
const { response: text, session_id } = await response.json();
// Animate with typewriter effect (~60fps, 3 chars per frame)
animateTypewriter(text, messageId, () => setIsRunning(false));
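The ~180 chars/sec figure follows from the frame rate and step size in the comment above (60 frames per second times 3 characters per frame). As a quick check of what that means for the 500-3000 character responses the system prompt asks for, the arithmetic can be written out as below; the function name is hypothetical and the calculation assumes the animation holds a steady 60 fps.

```python
import math

FPS = 60              # frames per second, from the typewriter comment
CHARS_PER_FRAME = 3   # characters revealed per frame


def typewriter_seconds(text_length: int) -> float:
    """Time to reveal a response of the given length at a steady frame rate."""
    frames = math.ceil(text_length / CHARS_PER_FRAME)
    return frames / FPS
```

Under these assumptions a 500-character reply takes about 2.8 seconds to reveal and a 3000-character reply about 16.7 seconds, on top of the time spent waiting for the sync endpoint to return.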
Campaign Management
- GET /campaigns - List campaigns
- POST /campaigns - Create campaign from seed
Entity Operations
- GET /campaigns/{id}/entities/{type} - List entities
- GET /campaigns/{id}/entities/{type}/{id} - Get entity
- POST /campaigns/{id}/entities/{type} - Create entity
PC Management
- GET /campaigns/{id}/pc - Get PC state
- PATCH /campaigns/{id}/pc - Update PC state
Dice
- POST /campaigns/{id}/dice/roll - Roll dice
Configuration
Environment variables:
- RPG_CAMPAIGNS_PATH - Path to campaign vault mount
- LLM_PROVIDER - "openrouter" or "openai"
- OPENROUTER_API_KEY - API key for OpenRouter
- OPENROUTER_MODEL - Model ID (default: deepseek/deepseek-chat)
- POSTGRES_* - Supabase connection details
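A hedged sketch of how these variables might be read at startup is shown below. The Settings class, the load_settings function, the validation rules, and the "openrouter" default for LLM_PROVIDER are assumptions for illustration; only the variable names and the default model ID come from the list above. The POSTGRES_* variables are omitted because their individual names are not listed here.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    campaigns_path: str
    llm_provider: str
    openrouter_api_key: Optional[str]
    openrouter_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from an environment mapping and validate it."""
    provider = env.get("LLM_PROVIDER", "openrouter")  # default is an assumption
    if provider not in ("openrouter", "openai"):
        raise ValueError(f"unsupported LLM_PROVIDER: {provider!r}")

    api_key = env.get("OPENROUTER_API_KEY")
    if provider == "openrouter" and not api_key:
        raise ValueError("OPENROUTER_API_KEY is required for the openrouter provider")

    return Settings(
        campaigns_path=env["RPG_CAMPAIGNS_PATH"],  # raises KeyError if unset
        llm_provider=provider,
        openrouter_api_key=api_key,
        openrouter_model=env.get("OPENROUTER_MODEL", "deepseek/deepseek-chat"),
    )
```

Taking the environment as a parameter makes the loader easy to exercise with a plain dict in tests, without touching the real process environment.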
Deprecated Features
Macro Tools
The macro tools (introduce_npc_bundle, quick_npc, apply_batch) were
created to work around AnythingLLM's one-tool-per-turn limitation. With
Pydantic AI supporting multiple tool calls natively, these are deprecated.
The routes remain for backwards compatibility but should not be used for new development.