DM Agent Quality Analysis
Date: 2025-12-31
Model: DeepSeek (via OpenRouter)
Campaign: Seagate
Sessions Analyzed: Day 5 morning scenes
Executive Summary
The DM agent successfully uses injected context to hit plot beats but consistently fails to capture character texture, relationship nuance, and setting accuracy. The model hallucinates locations, invents NPCs, mischaracterizes established relationships, and prioritizes action over emotional payoff.
1. Context Injection Analysis
What Context Is Injected
Full context (on session start/resume): ~2700 chars
## SESSION CONTEXT: Seagate
**Day 5** - morning
### Player Character
**Jake** (Wizard)
- HP: 13/13
- Location: the-salty-sigil
- Gold: 50 gp
### Current Location
**The Salty Sigil**
### NPCs Present
- **marlena** [active] - intimate-ally
- **gareth** [rescued] - ally
### Active Storylines
- **[URGENT]** The Fading Muffle (URGENT)
The stolen Vintner's Guild token has a scrying anchor...
- **[URGENT]** Silent Circle Salon (TONIGHT)
...
- **[URGENT]** Gareth & Jorah Recovery
Both miners are traumatized...
Light context (on follow-up): ~67 chars
What Context Is Missing
| Missing Element | Impact |
|---|---|
| NPC personality descriptions | Model defaults to tropes (Elara → bartender) |
| Location descriptions | Model invents ("Salinmoor" instead of Seagate) |
| Relationship history | "intimate-ally" doesn't convey HOW they became intimate |
| Recent events for Jake | No mention that Jake rode out last night, is exhausted |
| NPC voice/speech patterns | Marlena's "slow drawl," Elara's formality not conveyed |
2. Failure Cases
Failure 1: Location Hallucination (Salinmoor)
Context injected: ## SESSION CONTEXT: Seagate
User message: "Let's all go to the gilded quill, I'll finish preparing there"
DM response:
"The cobbled streets of Salinmoor's merchant district bustle with morning activity..."
Analysis:
- Salinmoor appears in open-threads.md as a historical tragedy from Lorcan's backstory
- The model confused a lore reference with the current setting
- Context says "Seagate" but model substituted a more "interesting" name it found in training or prior context
Root cause: The location name appears only in the context header and is never reinforced, so the model is weakly grounded on the setting name.
Failure 2: Elara Mischaracterization
Canon (from npcs/elara.md):
type: npc
status: active
disposition: ally
location: gilded-quill
---
# Elara
## Role
Half-elf proprietor of the Gilded Quill bookstore.
Secretly a **Lorekeeper of Oghma** with access to Conclave vault procedures.
## Personality
Stern but principled. Initially adversarial over rent; became ally when faced with greater threat.
Context injected: Elara not in NPCs Present (correct - she's at Gilded Quill, not Salty Sigil)
DM response:
"Behind the bar, Elara, the Quill's proprietor, polishes glasses with a cloth that glows faintly blue."
"Clock's ticking, kid. What's your play?"
Analysis:
- Elara runs a BOOKSTORE, not a tavern - no bar, no glasses
- "kid" is casual/sassy; Elara is stern and scholarly
- The "Gilded Quill" name apparently triggered "quill = pub" association
- Dragonfly-wing cloth, iridescent quill artifact - pure invention
Root cause: When player moves to a new location, no location/NPC context is injected. Model invents based on name association.
Failure 3: Invented Mechanics and NPCs
DM response:
"Try not to wake the other guest—she's been hex-hunting in the Feywild and came back… twitchy."
"But the Conclave has a backdoor. Three locks: blood, breath, and a lie."
Canon: No Feywild hex-hunter exists. No "blood, breath, lie" Conclave backdoor exists.
Analysis:
- Model fills narrative gaps with creative invention
- These inventions are tonally appropriate but canonically wrong
- Creates continuity problems if player engages with invented elements
Root cause: No grounding mechanism. Model doesn't know what it CAN'T say.
Failure 4: Relationship Texture Missing
Canon (from npcs/marlena.md):
## Current Relationship
**Intimate alliance** - evolved from transactional accord
Day 3 Night: Jake arrived with a nautilus shell gift, using dramatic magic...
Extended intimate encounter followed. Post-coital, she shared strategic intel...
## Personality
- Has a vulnerable side that emerged during intimate moments
- Speaks in a slow, measured drawl (drops when emotional)
Context injected: - **marlena** [active] - intimate-ally
DM response:
Marlena glances up, her silver-ringed fingers pausing mid-air. "Morning, Jake. Sleep well? Or did the weight of tonight's… performance keep you tossing?" She smirks...
Analysis:
- Tone is businesslike with light flirtation
- Missing: physical intimacy (touch), genuine concern, vulnerability
- "intimate-ally" disposition is too abstract - doesn't convey their history
- She should be worried about HIM (he rode out last night) not jumping to mission briefing
What it should be:
Marlena emerges from the back room, pausing in the doorway. Her hair is loose, uncombed—she hasn't slept either. She crosses silently and puts a hand on your shoulder, her thumb tracing a small circle. "You came back," she says quietly. No drawl. No smirk.
Root cause: Disposition label doesn't convey relationship texture or recent history.
Failure 5: Emotional Payoff Offscreen
Canon (from npcs/gareth.md):
## Connections
- **Jorah** - Partner, lover(?), still at Gilded Quill
## Story Hooks
- Reuniting with Jorah (emotional scene)
User action: Goes to Gilded Quill specifically to reunite them
DM response:
Gareth doesn't wait. He bolts for the stairs... Upstairs, a door creaks open, and you hear Gareth's voice, raw with emotion: "J-Jorah? Stars, your hand—"
Analysis:
- The reunion - the emotional POINT of going to the Quill - happens offscreen
- Player is left with Marlena doing stitching while the real scene happens upstairs
- "Stars, your hand" implies an injury that doesn't exist in canon
- Model prioritized maintaining player agency (choices) over delivering emotional payoff
Root cause: No guidance on when to "zoom in" on NPC moments vs keep player central.
Failure 6: Jake's State Ignored
Canon (from timeline.md):
### Night (Day 4)
- Finds lightning-blasted pine landmark
- Discovers Gareth in gully, entranced, holding glowing fragment
- Environment warped: rocks leaning, shadows moving, psychic scar on rock face
...
- Jake rescues exhausted Gareth, rides back to Seagate
### Late Night
- Returns through northern gate
- Takes Gareth to The Salty Sigil
Context injected: No mention of Jake's exhaustion or last night's ordeal
DM response: Treats morning as fresh start, no acknowledgment of Jake's state
Analysis:
- Jake rode through the night into supernatural horror and back
- He probably hasn't slept
- NPCs should react to his exhaustion, show concern
- This affects tactical choices (rest vs. push through)
Root cause: PC state in context is mechanical (HP, location) not narrative (exhausted, shaken).
3. System Prompt Analysis
Current system prompt excerpt:
You're RPG-Bot, an impartial Dungeon Master crafting captivating, limitless experiences.
## TONE & ATMOSPHERE
...
What's missing:
- Canon grounding instruction:
  "Never invent locations, NPCs, or mechanics not established in context. When uncertain, ask the player or keep descriptions vague."
- Relationship guidance:
  "NPC dispositions indicate relationship depth:
  - intimate-ally: Physical affection, genuine concern, vulnerability
  - ally: Trust and cooperation, professional warmth
  - neutral: Transactional, cautious"
- Emotional payoff instruction:
  "When NPCs have pending emotional moments (reunions, confrontations, revelations), zoom in. Let these scenes breathe before returning to player choices."
- Location change handling:
  "When the player moves to a new location, describe it based on established canon. If no description exists, use minimal sensory details and focus on NPCs present."
4. Holistic Remediation Plan
Phase 1: Context Enrichment (Immediate)
1.1 Add NPC Personality Snippets to Context
Current:
### NPCs Present
- **marlena** [active] - intimate-ally
Proposed:
### NPCs Present
- **marlena** [active] - intimate-ally
Voice: slow measured drawl, drops when emotional. Shows vulnerability with Jake.
Recent: Stayed up nursing Gareth. Worried about Jake's night ride.
Implementation:
- Extend ContextBuilder.format_full() to include 1-2 line NPC summaries
- Pull from personality or voice fields in NPC frontmatter
- Add recent_context field to NPCs for session-relevant state
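As a rough illustration of the formatting step, the sketch below renders one NPC entry with the optional Voice and Recent lines. The source names `ContextBuilder.format_full()` but does not show its signature, so `format_npc_entry` and the dictionary keys (`id`, `status`, `disposition`, `voice`, `recent_context`) are hypothetical stand-ins modeled on the proposed block above.

```python
from typing import Mapping


def format_npc_entry(npc: Mapping[str, str]) -> str:
    """Render one NPC as its existing header line plus optional detail lines."""
    lines = [f"- **{npc['id']}** [{npc['status']}] - {npc['disposition']}"]
    # Emit Voice/Recent only when the frontmatter actually provides them,
    # so NPCs without these fields render exactly as they do today.
    if npc.get("voice"):
        lines.append(f"  Voice: {npc['voice']}")
    if npc.get("recent_context"):
        lines.append(f"  Recent: {npc['recent_context']}")
    return "\n".join(lines)


# Hypothetical frontmatter for Marlena, using the wording proposed above.
marlena = {
    "id": "marlena",
    "status": "active",
    "disposition": "intimate-ally",
    "voice": "slow measured drawl, drops when emotional. Shows vulnerability with Jake.",
    "recent_context": "Stayed up nursing Gareth. Worried about Jake's night ride.",
}
print(format_npc_entry(marlena))
```

Keeping the extra lines optional means the change can roll out before every NPC file has been backfilled with the new fields.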
1.2 Add Location Context on Movement
Current: No location injection when player moves
Proposed: When player moves to new location, inject location description:
### Current Location: Gilded Quill
A three-story bookshop with ivy-covered blue facade. Smells of old paper and ink.
Proprietor: Elara (half-elf Lorekeeper, stern and scholarly).
Current guests: Jorah (upstairs, deteriorating mentally)
Implementation:
- Track location changes in chat handler
- On location change, fetch location entity and inject summary
- Include NPCs at that location with brief descriptors
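The injection step might look something like the sketch below. The `LocationEntity` shape and `format_location_context` are assumptions made for illustration; the vault's real location schema is not shown in this analysis. The fallback branch covers a location with no canon description, in line with the location change handling guidance in section 3.

```python
from dataclasses import dataclass, field


@dataclass
class LocationEntity:
    """Hypothetical shape of a location record loaded from the vault."""
    name: str
    description: str = ""
    # (npc name, short descriptor) pairs for NPCs currently at this location
    npcs: list[tuple[str, str]] = field(default_factory=list)


def format_location_context(location: LocationEntity) -> str:
    """Build the block injected when the player arrives somewhere new."""
    lines = [f"### Current Location: {location.name}"]
    if location.description:
        lines.append(location.description)
    else:
        # No canon description: say so explicitly rather than leaving a gap
        # the model may fill with invention.
        lines.append("(No canon description. Keep sensory detail minimal.)")
    if location.npcs:
        lines.append("NPCs here:")
        lines.extend(f"- {name} ({descriptor})" for name, descriptor in location.npcs)
    return "\n".join(lines)


quill = LocationEntity(
    name="Gilded Quill",
    description="A three-story bookshop with ivy-covered blue facade. Smells of old paper and ink.",
    npcs=[
        ("Elara", "half-elf Lorekeeper, stern and scholarly"),
        ("Jorah", "upstairs, deteriorating mentally"),
    ],
)
print(format_location_context(quill))
```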
1.3 Add PC Narrative State
Current:
**Jake** (Wizard)
- HP: 13/13
- Location: the-salty-sigil
- Gold: 50 gp
Proposed:
**Jake** (Wizard)
- HP: 13/13
- Location: the-salty-sigil
- Condition: Exhausted (rode through the night, no sleep)
- Last major event: Rescued Gareth from psychic nightmare in Sunfall Hills
Implementation:
- Add narrative_state field to PC frontmatter
- Update after significant events
- Include in context injection
Phase 2: System Prompt Refinement
2.1 Add Canon Grounding Rules
## CANON DISCIPLINE
You must ONLY reference:
- Locations named in SESSION CONTEXT or established in prior conversation
- NPCs listed in context or previously introduced
- Mechanics and items from the player's inventory or quest progress
If you need an NPC, location, or item not in context:
- Use a generic descriptor ("a merchant," "a side street")
- Or ask the player: "Do you know anyone in this district who might help?"
NEVER invent:
- Named locations not in context (especially: do not confuse SEAGATE with historical locations like Salinmoor)
- Named NPCs with specific backstories
- Magical items, artifacts, or puzzle mechanics
2.2 Add Relationship Texture Guidance
## NPC RELATIONSHIPS
Disposition labels indicate how NPCs should BEHAVE:
**intimate-ally**: Physical affection (touch, proximity). Genuine emotional concern. Vulnerability and honesty. Prioritize their wellbeing in dialogue.
**ally**: Warm professionalism. Trust. Will help without needing persuasion. May show concern but maintains some formality.
**neutral**: Transactional. Guarded. Requires persuasion or payment for assistance.
**hostile**: Active opposition. Deception. May attack or betray.
When an NPC has a VOICE or SPEECH PATTERN noted in context, use it consistently.
2.3 Add Emotional Payoff Guidance
## EMOTIONAL BEATS
Some moments are more important than player choices. When context indicates:
- A reunion between separated loved ones
- A revelation about a mystery
- A confrontation with a personal antagonist
- A death or loss
ZOOM IN. Let the scene breathe. Describe the emotion, the physical reactions, the words exchanged. Don't rush to the next choice menu.
The player came to this moment for a reason. Deliver it.
Phase 3: Structural Improvements
3.1 RAG-Enhanced NPC Retrieval
When an NPC is mentioned by name in player message or appears in context:
1. Retrieve full NPC document from vault
2. Extract personality, voice, and recent interactions
3. Inject as "NPC Reference" block
Example injection:
### NPC Reference: Elara
Role: Proprietor of Gilded Quill bookstore. Lorekeeper of Oghma.
Personality: Stern but principled. Scholarly. Formal speech.
Recent: Performed divination to locate Gareth. Watching over deteriorating Jorah.
Relationship: Ally. Granted Jake 6 months free rent after Lorcan crisis.
Voice: Precise, measured. Does not use slang or nicknames.
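The extract-and-inject steps could be sketched as follows, assuming NPC notes follow the layout of the `npcs/elara.md` excerpt in Failure 2: `key: value` frontmatter closed by `---`, then `## Heading` sections. The function names are hypothetical, and this deliberately uses a stdlib-only line parser; a real implementation would more likely use a proper frontmatter/YAML parser.

```python
def parse_npc_document(text: str) -> dict[str, str]:
    """Collect frontmatter keys and '## Heading' sections into one flat dict."""
    lines = text.strip().splitlines()
    if lines and lines[0].strip() == "---":
        lines = lines[1:]  # tolerate an opening frontmatter fence
    fields: dict[str, str] = {}
    in_frontmatter, section = True, None
    for raw in lines:
        line = raw.strip()
        if in_frontmatter:
            if line == "---":
                in_frontmatter = False
            elif ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip()
        elif line.startswith("## "):
            section = line[3:].strip().lower()
        elif line.startswith("# "):
            fields["name"] = line[2:].strip()
        elif line and section:
            # Join multi-line section bodies into a single line of text.
            fields[section] = (fields.get(section, "") + " " + line).strip()
    return fields


def format_npc_reference(fields: dict[str, str],
                         wanted: tuple[str, ...] = ("role", "personality", "voice")) -> str:
    """Emit only the sections that exist; never pad missing ones."""
    out = [f"### NPC Reference: {fields.get('name', 'unknown')}"]
    out.extend(f"{key.capitalize()}: {fields[key]}" for key in wanted if key in fields)
    return "\n".join(out)


ELARA_DOC = """\
---
type: npc
status: active
disposition: ally
location: gilded-quill
---
# Elara
## Role
Half-elf proprietor of the Gilded Quill bookstore.
Secretly a **Lorekeeper of Oghma** with access to Conclave vault procedures.
## Personality
Stern but principled. Initially adversarial over rent; became ally when faced with greater threat.
"""
print(format_npc_reference(parse_npc_document(ELARA_DOC)))
```

Skipping absent sections (here, Voice) keeps the reference block from asserting anything the vault does not actually say.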
3.2 Location Change Detection
Implement location tracking in chat handler:
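import re
from typing import Optional
def detect_location_change(message: str, current_location: str, known_locations: set[str]) -> Optional[str]:
    """Detect if player is moving to a new location. known_locations (vault location IDs) is an assumed extra parameter."""
    movement_patterns = [
        r"go to (?:the )?([\w' -]+)",
        r"head to (?:the )?([\w' -]+)",
        r"visit (?:the )?([\w' -]+)",
        r"let's .* to (?:the )?([\w' -]+)",
    ]
    for pattern in movement_patterns:
        for match in re.finditer(pattern, message, re.IGNORECASE):
            # Slugify the captured phrase, e.g. "gilded quill" -> "gilded-quill"
            slug = re.sub(r"[^a-z0-9]+", "-", match.group(1).lower()).strip("-")
            # Match against known locations in vault, ignoring a leading "the-" in the ID
            for location_id in known_locations:
                if slug.startswith(location_id.removeprefix("the-")) and location_id != current_location:
                    return location_id
    return None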
When location change detected:
1. Update PC location in vault
2. Inject new location context
3. Inject NPCs at new location
3.3 Scene Type Classification
Classify incoming messages to adjust response style:
| Scene Type | Indicators | Response Style |
|---|---|---|
| Reunion | NPCs with relationship meeting | Zoom in, emotional detail |
| Preparation | "prep," "ready," "plan" | Practical, choice-focused |
| Exploration | "look around," "search" | Descriptive, discovery |
| Combat | Attack verbs, initiative | Tactical, rapid |
| Social | NPC names, "talk to," "ask" | Dialogue-heavy, NPC voice |
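A first pass at the keyword-driven rows of this table could be as simple as the sketch below. The rule list, its ordering, and the specific attack verbs are assumptions chosen for illustration. The Reunion row is left out on purpose: it depends on NPC relationship data rather than message keywords, and matching on NPC names would need the vault's name list.

```python
import re
from typing import Optional

# Ordered rules: the first matching pattern wins, so combat outranks the rest.
# The attack verbs are illustrative guesses, not a list from the campaign.
SCENE_RULES = [
    ("combat", r"\b(attack|strike|stab|shoot|initiative)\b"),
    ("preparation", r"\b(prep\w*|ready|plan)\b"),
    ("exploration", r"\b(look around|search)\b"),
    ("social", r"\b(talk to|ask)\b"),
]


def classify_scene(message: str) -> Optional[str]:
    """Return the first scene type whose indicators appear, else None."""
    text = message.lower()
    for scene_type, pattern in SCENE_RULES:
        if re.search(pattern, text):
            return scene_type
    return None


print(classify_scene("Let's all go to the gilded quill, I'll finish preparing there"))
```

Returning `None` for unmatched messages lets the caller fall back to the default response style instead of forcing a label.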
Phase 4: Evaluation Framework
4.1 Canon Compliance Checklist
After each session, evaluate:
- [ ] All locations named are in vault
- [ ] All NPCs named are in vault or clearly marked as new
- [ ] No invented mechanics or artifacts
- [ ] NPC voices match documented patterns
- [ ] Relationships reflected in NPC behavior
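The first two checklist items can be partly pre-screened with a crude heuristic like the sketch below, which flags capitalized words that do not belong to any known vault name. This is an assumption-heavy illustration, not a proper entity recognizer: `find_unknown_names` and `VAULT_NAMES` are hypothetical, it skips sentence-initial words, and it will still produce false positives that need manual review.

```python
import re


def find_unknown_names(response: str, vault_names: set[str]) -> list[str]:
    """Flag capitalized words that are not part of any known vault name."""
    known_words = {word.lower() for name in vault_names for word in name.split()}
    unknown: list[str] = []
    # The lookbehind skips sentence-initial words, which are capitalized
    # regardless of whether they are names.
    for match in re.finditer(r"(?<=[a-z,;] )[A-Z][a-z]+", response):
        word = match.group(0)
        if word.lower() not in known_words and word not in unknown:
            unknown.append(word)
    return unknown


# Illustrative subset of vault names taken from this analysis.
VAULT_NAMES = {"Seagate", "Jake", "Marlena", "Gareth", "Jorah", "Elara",
               "The Salty Sigil", "Gilded Quill"}
dm_response = ("The cobbled streets of Salinmoor's merchant district "
               "bustle with morning activity...")
print(find_unknown_names(dm_response, VAULT_NAMES))
```

Anything the function returns is a candidate for review, not a confirmed violation; the remaining checklist items still need a human read.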
4.2 Emotional Delivery Checklist
- [ ] Pending reunions/confrontations given screen time
- [ ] Player character state acknowledged by NPCs
- [ ] Intimate relationships show appropriate warmth
- [ ] Stakes are felt, not just stated
4.3 Automated Regression Tests
Create test cases:
- name: "Elara should not be a bartender"
  message: "I go to the Gilded Quill"
  must_contain: ["book", "scroll", "paper", "Lorekeeper"]
  must_not_contain: ["bar", "glasses", "drink", "tavern"]
- name: "Setting is Seagate not Salinmoor"
  message: "start session"
  must_contain: ["Seagate"]
  must_not_contain: ["Salinmoor"]
- name: "Marlena intimate behavior"
  context_includes: "marlena.*intimate-ally"
  must_contain_pattern: "(touch|hand|close|tender|worried)"
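One possible evaluator for these cases is sketched below, taking each case as a plain dict (as it would look once loaded from the YAML). Two semantics are assumed here because the cases do not state them: `must_contain` passes if any listed term appears as a substring, and `must_not_contain` matches whole words only, so "bar" does not trip on "barely". The `context_includes` precondition is not handled, and `check_response` is a hypothetical name.

```python
import re
from typing import Any


def check_response(case: dict[str, Any], response: str) -> list[str]:
    """Return failure messages for one case; an empty list means it passed."""
    failures: list[str] = []
    text = response.lower()
    # Assumed semantics: at least one required term must appear (substring).
    required = [term.lower() for term in case.get("must_contain", [])]
    if required and not any(term in text for term in required):
        failures.append(f"none of {required} found")
    # Assumed semantics: forbidden terms are matched as whole words.
    for term in case.get("must_not_contain", []):
        if re.search(rf"\b{re.escape(term.lower())}\b", text):
            failures.append(f"forbidden term {term!r} found")
    pattern = case.get("must_contain_pattern")
    if pattern and not re.search(pattern, response, re.IGNORECASE):
        failures.append(f"pattern {pattern!r} not matched")
    return failures


elara_case = {
    "name": "Elara should not be a bartender",
    "must_contain": ["book", "scroll", "paper", "Lorekeeper"],
    "must_not_contain": ["bar", "glasses", "drink", "tavern"],
}
bad = ("Behind the bar, Elara, the Quill's proprietor, polishes glasses "
       "with a cloth that glows faintly blue.")
for failure in check_response(elara_case, bad):
    print(failure)
```

Run against the Failure 2 response, this reports the missing bookstore vocabulary plus the forbidden "bar" and "glasses".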
5. Implementation Priority
| Priority | Task | Effort | Impact |
|---|---|---|---|
| P0 | Add canon grounding rules to system prompt | Low | High |
| P0 | Fix location name reinforcement in context | Low | High |
| P1 | Add NPC personality snippets to context | Medium | High |
| P1 | Add relationship texture to system prompt | Low | Medium |
| P1 | Inject location context on movement | Medium | High |
| P2 | Add PC narrative state | Low | Medium |
| P2 | RAG-enhanced NPC retrieval | High | High |
| P2 | Scene type classification | Medium | Medium |
| P3 | Automated regression tests | Medium | Medium |
6. Success Metrics
After implementation, measure:
- Canon Compliance Rate: % of responses with no invented locations/NPCs/mechanics
- Relationship Accuracy: Manual scoring of NPC behavior vs disposition
- Emotional Delivery: Player satisfaction with key scenes (reunion, revelation)
- Location Accuracy: Zero Salinmoor-type errors
Target: 95% canon compliance, 4/5 relationship accuracy, 4/5 emotional delivery within 2 iterations.
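For concreteness, the targets can be checked with arithmetic like the sketch below. It assumes each response is scored with a count of canon violations and that the 4/5 targets refer to the mean of manual 1-5 scores; neither aggregation rule is specified above, and the function names are hypothetical.

```python
from statistics import mean


def canon_compliance_rate(violation_counts: list[int]) -> float:
    """Percentage of responses with zero canon violations."""
    if not violation_counts:
        raise ValueError("need at least one scored response")
    clean = sum(1 for count in violation_counts if count == 0)
    return 100.0 * clean / len(violation_counts)


def meets_targets(violation_counts: list[int],
                  relationship_scores: list[int],
                  emotional_scores: list[int]) -> bool:
    """Check the 95% / 4 of 5 / 4 of 5 targets (mean aggregation is assumed)."""
    return (canon_compliance_rate(violation_counts) >= 95.0
            and mean(relationship_scores) >= 4.0
            and mean(emotional_scores) >= 4.0)


# Illustrative numbers only: 20 responses, one of which had 2 violations.
counts = [0] * 19 + [2]
print(canon_compliance_rate(counts))
```

With 19 clean responses out of 20 the rate is exactly 95%, which just meets the canon target.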