DM Agent Quality Analysis

Date: 2025-12-31
Model: DeepSeek (via OpenRouter)
Campaign: Seagate
Sessions Analyzed: Day 5 morning scenes


Executive Summary

The DM agent successfully uses injected context to hit plot beats but consistently fails to capture character texture, relationship nuance, and setting accuracy. The model hallucinates locations, invents NPCs, mischaracterizes established relationships, and prioritizes action over emotional payoff.


1. Context Injection Analysis

What Context Is Injected

Full context (on session start/resume): ~2700 chars

## SESSION CONTEXT: Seagate

**Day 5** - morning

### Player Character
**Jake** (Wizard)
- HP: 13/13
- Location: the-salty-sigil
- Gold: 50 gp

### Current Location
**The Salty Sigil**

### NPCs Present
- **marlena** [active] - intimate-ally
- **gareth** [rescued] - ally

### Active Storylines
- **[URGENT]** The Fading Muffle (URGENT)
  The stolen Vintner's Guild token has a scrying anchor...
- **[URGENT]** Silent Circle Salon (TONIGHT)
  ...
- **[URGENT]** Gareth & Jorah Recovery
  Both miners are traumatized...

Light context (on follow-up): ~67 chars

[Day 5 | Jake HP 13/13 | at the-salty-sigil | with marlena, gareth]
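As a rough illustration of how small the follow-up context is, the line above could be produced by a formatter like the following. This is a minimal sketch, not the project's actual code: the function name `format_light_context` and its parameters are hypothetical.

```python
from typing import Sequence


def format_light_context(day: int, pc_name: str, hp: int, max_hp: int,
                         location_id: str, npcs_present: Sequence[str]) -> str:
    """Build the one-line follow-up context in the format shown above (hypothetical helper)."""
    parts = [
        f"Day {day}",
        f"{pc_name} HP {hp}/{max_hp}",
        f"at {location_id}",
    ]
    if npcs_present:
        parts.append("with " + ", ".join(npcs_present))
    return "[" + " | ".join(parts) + "]"


line = format_light_context(5, "Jake", 13, 13, "the-salty-sigil", ["marlena", "gareth"])
print(line)       # [Day 5 | Jake HP 13/13 | at the-salty-sigil | with marlena, gareth]
print(len(line))  # 67
```

Note that nothing in this line names the setting, personalities, or recent events, which is consistent with the gaps listed below.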

What Context Is Missing

| Missing Element | Impact |
|---|---|
| NPC personality descriptions | Model defaults to tropes (Elara → bartender) |
| Location descriptions | Model invents ("Salinmoor" instead of Seagate) |
| Relationship history | "intimate-ally" doesn't convey HOW they became intimate |
| Recent events for Jake | No mention that Jake rode out last night, is exhausted |
| NPC voice/speech patterns | Marlena's "slow drawl," Elara's formality not conveyed |

2. Failure Cases

Failure 1: Location Hallucination (Salinmoor)

Context injected: ## SESSION CONTEXT: Seagate

User message: "Let's all go to the gilded quill, I'll finish preparing there"

DM response:

"The cobbled streets of Salinmoor's merchant district bustle with morning activity..."

Analysis:
- Salinmoor appears in open-threads.md as a historical tragedy from Lorcan's backstory
- The model confused a lore reference with the current setting
- Context says "Seagate" but model substituted a more "interesting" name it found in training or prior context

Root cause: The setting name appears only once, in the full-context header, and is never reinforced; the light follow-up context omits it entirely. The model is therefore weakly grounded on the setting name.


Failure 2: Elara Mischaracterization

Canon (from npcs/elara.md):

type: npc
status: active
disposition: ally
location: gilded-quill
---
# Elara
## Role
Half-elf proprietor of the Gilded Quill bookstore.
Secretly a **Lorekeeper of Oghma** with access to Conclave vault procedures.

## Personality
Stern but principled. Initially adversarial over rent; became ally when faced with greater threat.

Context injected: Elara not in NPCs Present (correct - she's at Gilded Quill, not Salty Sigil)

DM response:

"Behind the bar, Elara, the Quill's proprietor, polishes glasses with a cloth that glows faintly blue." "Clock's ticking, kid. What's your play?"

Analysis:
- Elara runs a BOOKSTORE, not a tavern: no bar, no glasses
- "kid" is casual/sassy; Elara is stern and scholarly
- The "Gilded Quill" name apparently triggered "quill = pub" association
- Dragonfly-wing cloth, iridescent quill artifact: pure invention

Root cause: When player moves to a new location, no location/NPC context is injected. Model invents based on name association.


Failure 3: Invented Mechanics and NPCs

DM response:

"Try not to wake the other guest—she's been hex-hunting in the Feywild and came back… twitchy."

"But the Conclave has a backdoor. Three locks: blood, breath, and a lie."

Canon: No Feywild hex-hunter exists. No "blood, breath, lie" Conclave backdoor exists.

Analysis:
- Model fills narrative gaps with creative invention
- These inventions are tonally appropriate but canonically wrong
- Creates continuity problems if player engages with invented elements

Root cause: No grounding mechanism. Model doesn't know what it CAN'T say.


Failure 4: Relationship Texture Missing

Canon (from npcs/marlena.md):

## Current Relationship
**Intimate alliance** - evolved from transactional accord

Day 3 Night: Jake arrived with a nautilus shell gift, using dramatic magic...
Extended intimate encounter followed. Post-coital, she shared strategic intel...

## Personality
- Has a vulnerable side that emerged during intimate moments
- Speaks in a slow, measured drawl (drops when emotional)

Context injected: - **marlena** [active] - intimate-ally

DM response:

Marlena glances up, her silver-ringed fingers pausing mid-air. "Morning, Jake. Sleep well? Or did the weight of tonight's… performance keep you tossing?" She smirks...

Analysis:
- Tone is businesslike with light flirtation
- Missing: physical intimacy (touch), genuine concern, vulnerability
- "intimate-ally" disposition is too abstract; it doesn't convey their history
- She should be worried about HIM (he rode out last night), not jumping to mission briefing

What it should be:

Marlena emerges from the back room, pausing in the doorway. Her hair is loose, uncombed—she hasn't slept either. She crosses silently and puts a hand on your shoulder, her thumb tracing a small circle. "You came back," she says quietly. No drawl. No smirk.

Root cause: Disposition label doesn't convey relationship texture or recent history.


Failure 5: Emotional Payoff Offscreen

Canon (from npcs/gareth.md):

## Connections
- **Jorah** - Partner, lover(?), still at Gilded Quill

## Story Hooks
- Reuniting with Jorah (emotional scene)

User action: Goes to Gilded Quill specifically to reunite them

DM response:

Gareth doesn't wait. He bolts for the stairs... Upstairs, a door creaks open, and you hear Gareth's voice, raw with emotion: "J-Jorah? Stars, your hand—"

Analysis:
- The reunion (the emotional POINT of going to the Quill) happens offscreen
- Player is left with Marlena doing stitching while the real scene happens upstairs
- "Stars, your hand" implies an injury that doesn't exist in canon
- Model prioritized maintaining player agency (choices) over delivering emotional payoff

Root cause: No guidance on when to "zoom in" on NPC moments vs keep player central.


Failure 6: Jake's State Ignored

Canon (from timeline.md):

### Night (Day 4)
- Finds lightning-blasted pine landmark
- Discovers Gareth in gully, entranced, holding glowing fragment
- Environment warped: rocks leaning, shadows moving, psychic scar on rock face
...
- Jake rescues exhausted Gareth, rides back to Seagate

### Late Night
- Returns through northern gate
- Takes Gareth to The Salty Sigil

Context injected: No mention of Jake's exhaustion or last night's ordeal

DM response: Treats morning as fresh start, no acknowledgment of Jake's state

Analysis:
- Jake rode through the night into supernatural horror and back
- He probably hasn't slept
- NPCs should react to his exhaustion, show concern
- This affects tactical choices (rest vs. push through)

Root cause: PC state in context is mechanical (HP, location) not narrative (exhausted, shaken).


3. System Prompt Analysis

Current system prompt excerpt:

You're RPG-Bot, an impartial Dungeon Master crafting captivating, limitless experiences.

## TONE & ATMOSPHERE
...

What's missing:

  1. Canon grounding instruction:

    "Never invent locations, NPCs, or mechanics not established in context. When uncertain, ask the player or keep descriptions vague."

  2. Relationship guidance:

    "NPC dispositions indicate relationship depth:
    - intimate-ally: Physical affection, genuine concern, vulnerability
    - ally: Trust and cooperation, professional warmth
    - neutral: Transactional, cautious"

  3. Emotional payoff instruction:

    "When NPCs have pending emotional moments (reunions, confrontations, revelations), zoom in. Let these scenes breathe before returning to player choices."

  4. Location change handling:

    "When the player moves to a new location, describe it based on established canon. If no description exists, use minimal sensory details and focus on NPCs present."


4. Holistic Remediation Plan

Phase 1: Context Enrichment (Immediate)

1.1 Add NPC Personality Snippets to Context

Current:

### NPCs Present
- **marlena** [active] - intimate-ally

Proposed:

### NPCs Present
- **marlena** [active] - intimate-ally
  Voice: slow measured drawl, drops when emotional. Shows vulnerability with Jake.
  Recent: Stayed up nursing Gareth. Worried about Jake's night ride.

Implementation:
- Extend ContextBuilder.format_full() to include 1-2 line NPC summaries
- Pull from personality or voice fields in NPC frontmatter
- Add recent_context field to NPCs for session-relevant state
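The enriched NPC entry could be rendered along these lines. This is a sketch only: `format_npc_entry` is a hypothetical helper, the frontmatter is assumed to arrive as a plain mapping, and the `voice` and `recent_context` keys are the fields proposed above rather than fields known to exist in the vault today.

```python
from typing import Mapping


def format_npc_entry(npc_id: str, frontmatter: Mapping[str, str]) -> str:
    """Render one '### NPCs Present' entry, adding Voice/Recent lines when available."""
    status = frontmatter.get("status", "unknown")            # fallback value is an assumption
    disposition = frontmatter.get("disposition", "neutral")  # fallback value is an assumption
    lines = [f"- **{npc_id}** [{status}] - {disposition}"]
    # Optional enrichment fields; entries without them render exactly as they do today.
    if frontmatter.get("voice"):
        lines.append(f"  Voice: {frontmatter['voice']}")
    if frontmatter.get("recent_context"):
        lines.append(f"  Recent: {frontmatter['recent_context']}")
    return "\n".join(lines)


marlena = {
    "status": "active",
    "disposition": "intimate-ally",
    "voice": "slow measured drawl, drops when emotional. Shows vulnerability with Jake.",
    "recent_context": "Stayed up nursing Gareth. Worried about Jake's night ride.",
}
print(format_npc_entry("marlena", marlena))
```

Keeping the extra lines optional means NPCs that have not been annotated yet fall back to the current one-line format instead of breaking the context block.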

1.2 Add Location Context on Movement

Current: No location injection when player moves

Proposed: When player moves to new location, inject location description:

### Current Location: Gilded Quill
A three-story bookshop with ivy-covered blue facade. Smells of old paper and ink.
Proprietor: Elara (half-elf Lorekeeper, stern and scholarly).
Current guests: Jorah (upstairs, deteriorating mentally)

Implementation:
- Track location changes in chat handler
- On location change, fetch location entity and inject summary
- Include NPCs at that location with brief descriptors
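One way to approach the tracking step is a small stateful helper that emits a location block only when the location actually changes. The sketch below is illustrative: `LocationTracker`, its `on_move` method, and the `summaries` mapping are hypothetical names, and the fallback wording for unknown locations is an assumption.

```python
from typing import Mapping, Optional


class LocationTracker:
    """Remembers the PC's location and returns a context block only when it changes."""

    def __init__(self, start_location: str) -> None:
        self.current = start_location

    def on_move(self, new_location: str, summaries: Mapping[str, str]) -> Optional[str]:
        if new_location == self.current:
            return None  # no movement, nothing to inject
        self.current = new_location
        summary = summaries.get(new_location)
        if summary is None:
            # No canon description: tell the model to stay vague rather than invent detail.
            summary = "(no canon description; use minimal sensory detail)"
        return f"### Current Location: {new_location}\n{summary}"


summaries = {
    "gilded-quill": "A three-story bookshop with ivy-covered blue facade. "
                    "Proprietor: Elara (half-elf Lorekeeper, stern and scholarly).",
}
tracker = LocationTracker("the-salty-sigil")
print(tracker.on_move("gilded-quill", summaries))
```

The explicit fallback is there so that a location missing from the vault produces a "stay vague" instruction instead of silence, since silence is what led to the invented bar in Failure 2.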

1.3 Add PC Narrative State

Current:

**Jake** (Wizard)
- HP: 13/13
- Location: the-salty-sigil

Proposed:

**Jake** (Wizard)
- HP: 13/13
- Location: the-salty-sigil
- Condition: Exhausted (rode through the night, no sleep)
- Last major event: Rescued Gareth from psychic nightmare in Sunfall Hills

Implementation:
- Add narrative_state field to PC frontmatter
- Update after significant events
- Include in context injection


Phase 2: System Prompt Refinement

2.1 Add Canon Grounding Rules

## CANON DISCIPLINE

You must ONLY reference:
- Locations named in SESSION CONTEXT or established in prior conversation
- NPCs listed in context or previously introduced
- Mechanics and items from the player's inventory or quest progress

If you need an NPC, location, or item not in context:
- Use a generic descriptor ("a merchant," "a side street")
- Or ask the player: "Do you know anyone in this district who might help?"

NEVER invent:
- Named locations not in context (especially: do not confuse SEAGATE with historical locations like Salinmoor)
- Named NPCs with specific backstories
- Magical items, artifacts, or puzzle mechanics

2.2 Add Relationship Texture Guidance

## NPC RELATIONSHIPS

Disposition labels indicate how NPCs should BEHAVE:

**intimate-ally**: Physical affection (touch, proximity). Genuine emotional concern. Vulnerability and honesty. Prioritize their wellbeing in dialogue.

**ally**: Warm professionalism. Trust. Will help without needing persuasion. May show concern but maintains some formality.

**neutral**: Transactional. Guarded. Requires persuasion or payment for assistance.

**hostile**: Active opposition. Deception. May attack or betray.

When an NPC has a VOICE or SPEECH PATTERN noted in context, use it consistently.

2.3 Add Emotional Payoff Guidance

## EMOTIONAL BEATS

Some moments are more important than player choices. When context indicates:
- A reunion between separated loved ones
- A revelation about a mystery
- A confrontation with a personal antagonist
- A death or loss

ZOOM IN. Let the scene breathe. Describe the emotion, the physical reactions, the words exchanged. Don't rush to the next choice menu.

The player came to this moment for a reason. Deliver it.

Phase 3: Structural Improvements

3.1 RAG-Enhanced NPC Retrieval

When an NPC is mentioned by name in player message or appears in context:

  1. Retrieve full NPC document from vault
  2. Extract personality, voice, and recent interactions
  3. Inject as "NPC Reference" block
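The extraction step might look like the sketch below, which pulls `## ` sections out of an NPC document and renders them as a compact reference block. It assumes NPC files follow the layout shown for npcs/elara.md (frontmatter, then `## Role`, `## Personality`, and similar headings); `parse_sections` and `build_npc_reference` are hypothetical helper names, and actual retrieval from the vault is out of scope here.

```python
import re
from typing import Dict, Optional, Sequence


def parse_sections(markdown: str) -> Dict[str, str]:
    """Map each '## Heading' in an NPC document to the text beneath it."""
    sections: Dict[str, str] = {}
    current: Optional[str] = None
    for line in markdown.splitlines():
        heading = re.match(r"^##\s+(.+)$", line)
        if heading:
            current = heading.group(1).strip()
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return {name: body.strip() for name, body in sections.items()}


def build_npc_reference(name: str, markdown: str,
                        wanted: Sequence[str] = ("Role", "Personality", "Voice")) -> str:
    """Render the requested sections as an 'NPC Reference' block, skipping absent ones."""
    sections = parse_sections(markdown)
    lines = [f"### NPC Reference: {name}"]
    for key in wanted:
        if sections.get(key):
            # Collapse whitespace so each section fits on one line of injected context.
            lines.append(f"{key}: {' '.join(sections[key].split())}")
    return "\n".join(lines)


ELARA_DOC = """type: npc
status: active
---
# Elara
## Role
Half-elf proprietor of the Gilded Quill bookstore.
Secretly a **Lorekeeper of Oghma** with access to Conclave vault procedures.

## Personality
Stern but principled. Initially adversarial over rent; became ally when faced with greater threat.
"""
print(build_npc_reference("Elara", ELARA_DOC))
```

Sections that are missing from the document (here, Voice) are simply omitted, so the block never contains invented filler.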

Example injection:

### NPC Reference: Elara
Role: Proprietor of Gilded Quill bookstore. Lorekeeper of Oghma.
Personality: Stern but principled. Scholarly. Formal speech.
Recent: Performed divination to locate Gareth. Watching over deteriorating Jorah.
Relationship: Ally. Granted Jake 6 months free rent after Lorcan crisis.
Voice: Precise, measured. Does not use slang or nicknames.

3.2 Location Change Detection

Implement location tracking in chat handler:

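import re
from typing import Iterable, Optional

def detect_location_change(message: str, current_location: str,
                           known_locations: Iterable[str] = ()) -> Optional[str]:
    """Detect if player is moving to a new location (known_locations: vault IDs, an assumed argument)."""
    movement_patterns = [
        r"go to (?:the )?([\w -]+)",
        r"head to (?:the )?([\w -]+)",
        r"visit (?:the )?([\w -]+)",
        r"let's .* to (?:the )?([\w -]+)",
    ]
    for pattern in movement_patterns:
        if match := re.search(pattern, message.lower()):
            # Slugify the captured phrase ("gilded quill" -> "gilded-quill"), then match against known locations
            slug = re.sub(r"[^a-z0-9]+", "-", match.group(1)).strip("-")
            for loc_id in known_locations:
                if loc_id != current_location and slug.startswith(loc_id.removeprefix("the-")):
                    return loc_id  # new location ID
    return None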

When location change detected:

  1. Update PC location in vault
  2. Inject new location context
  3. Inject NPCs at new location

3.3 Scene Type Classification

Classify incoming messages to adjust response style:

| Scene Type | Indicators | Response Style |
|---|---|---|
| Reunion | NPCs with relationship meeting | Zoom in, emotional detail |
| Preparation | "prep," "ready," "plan" | Practical, choice-focused |
| Exploration | "look around," "search" | Descriptive, discovery |
| Combat | Attack verbs, initiative | Tactical, rapid |
| Social | NPC names, "talk to," "ask" | Dialogue-heavy, NPC voice |
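A first pass at this classification could be purely keyword-based, as in the sketch below. Several things here are assumptions rather than part of the plan: the specific attack verbs, the rule ordering (combat checked first), and the `reunion_pending` flag, which stands in for whatever upstream logic knows that two related NPCs are about to meet, since that cannot be read from keywords alone. NPC-name matching for the Social row is omitted.

```python
import re
from typing import List, Tuple

# (scene type, indicator regex, response style); verbs under Combat are illustrative guesses.
SCENE_RULES: List[Tuple[str, str, str]] = [
    ("Combat", r"\b(attack|strike|stab|shoot|initiative)\b", "Tactical, rapid"),
    ("Preparation", r"\b(prep\w*|ready|plan)\b", "Practical, choice-focused"),
    ("Exploration", r"\b(look around|search)\b", "Descriptive, discovery"),
    ("Social", r"\b(talk to|ask)\b", "Dialogue-heavy, NPC voice"),
]


def classify_scene(message: str, reunion_pending: bool = False) -> Tuple[str, str]:
    """Return (scene type, response style) for a player message."""
    if reunion_pending:
        # Emotional beats take priority over whatever the message keywords suggest.
        return ("Reunion", "Zoom in, emotional detail")
    text = message.lower()
    for scene, pattern, style in SCENE_RULES:
        if re.search(pattern, text):
            return (scene, style)
    return ("Unclassified", "")


print(classify_scene("Let's all go to the gilded quill, I'll finish preparing there"))
print(classify_scene("I ask Elara about the vault"))
print(classify_scene("We head upstairs", reunion_pending=True))
```

The first example is the message from Failure 1: on keywords alone it reads as Preparation, which suggests why a reunion signal has to come from campaign state and not from the player's wording.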

Phase 4: Evaluation Framework

4.1 Canon Compliance Checklist

After each session, evaluate:

  • [ ] All locations named are in vault
  • [ ] All NPCs named are in vault or clearly marked as new
  • [ ] No invented mechanics or artifacts
  • [ ] NPC voices match documented patterns
  • [ ] Relationships reflected in NPC behavior

4.2 Emotional Delivery Checklist

  • [ ] Pending reunions/confrontations given screen time
  • [ ] Player character state acknowledged by NPCs
  • [ ] Intimate relationships show appropriate warmth
  • [ ] Stakes are felt, not just stated

4.3 Automated Regression Tests

Create test cases:

- name: "Elara should not be a bartender"
  message: "I go to the Gilded Quill"
  must_contain: ["book", "scroll", "paper", "Lorekeeper"]
  must_not_contain: ["bar", "glasses", "drink", "tavern"]

- name: "Setting is Seagate not Salinmoor"
  message: "start session"
  must_contain: ["Seagate"]
  must_not_contain: ["Salinmoor"]

- name: "Marlena intimate behavior"
  context_includes: "marlena.*intimate-ally"
  must_contain_pattern: "(touch|hand|close|tender|worried)"
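A checker for cases shaped like the ones above might look like this sketch. The semantics are assumptions, since the test cases do not define them: `must_contain` is read as "at least one of these terms appears" (substring match, so "book" accepts "bookshop"), `must_not_contain` uses whole-word matching (so "bar" does not fire on "barely"), and cases are passed as plain dicts to avoid depending on a YAML loader. `check_response` is a hypothetical name.

```python
import re
from typing import Any, Dict, List


def check_response(case: Dict[str, Any], response: str) -> List[str]:
    """Return a list of failure messages; an empty list means the case passed."""
    lowered = response.lower()
    failures: List[str] = []

    required = case.get("must_contain", [])
    # Assumption: any one required term is enough (substring, case-insensitive).
    if required and not any(term.lower() in lowered for term in required):
        failures.append(f"none of {required} found")

    for word in case.get("must_not_contain", []):
        # Whole-word match to avoid false alarms such as "bar" inside "barely".
        if re.search(rf"\b{re.escape(word)}\b", response, re.IGNORECASE):
            failures.append(f"forbidden word: {word}")

    pattern = case.get("must_contain_pattern")
    if pattern and not re.search(pattern, response, re.IGNORECASE):
        failures.append(f"pattern not matched: {pattern}")
    return failures


elara_case = {
    "name": "Elara should not be a bartender",
    "must_contain": ["book", "scroll", "paper", "Lorekeeper"],
    "must_not_contain": ["bar", "glasses", "drink", "tavern"],
}
bad = "Behind the bar, Elara, the Quill's proprietor, polishes glasses with a cloth that glows faintly blue."
print(check_response(elara_case, bad))
```

Run against the actual response from Failure 2, this reports three failures (no bookshop vocabulary, plus "bar" and "glasses"). Whole-word matching trades recall for precision: it would miss "drinks", so forbidden lists may need to spell out plurals.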


5. Implementation Priority

| Priority | Task | Effort | Impact |
|---|---|---|---|
| P0 | Add canon grounding rules to system prompt | Low | High |
| P0 | Fix location name reinforcement in context | Low | High |
| P1 | Add NPC personality snippets to context | Medium | High |
| P1 | Add relationship texture to system prompt | Low | Medium |
| P1 | Inject location context on movement | Medium | High |
| P2 | Add PC narrative state | Low | Medium |
| P2 | RAG-enhanced NPC retrieval | High | High |
| P2 | Scene type classification | Medium | Medium |
| P3 | Automated regression tests | Medium | Medium |

6. Success Metrics

After implementation, measure:

  1. Canon Compliance Rate: % of responses with no invented locations/NPCs/mechanics
  2. Relationship Accuracy: Manual scoring of NPC behavior vs disposition
  3. Emotional Delivery: Player satisfaction with key scenes (reunion, revelation)
  4. Location Accuracy: Zero Salinmoor-type errors

Target: 95% canon compliance, 4/5 relationship accuracy, 4/5 emotional delivery within 2 iterations.
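The targets can be checked mechanically once responses have been scored. The sketch below assumes one boolean per response for canon compliance and interprets "4/5" as a mean manual score of at least 4 on a 1 to 5 scale; both the interpretation and the function names are assumptions.

```python
from statistics import mean
from typing import Sequence


def canon_compliance_rate(flags: Sequence[bool]) -> float:
    """Percentage of responses with no invented locations, NPCs, or mechanics."""
    return 100.0 * sum(flags) / len(flags) if flags else 0.0


def meets_targets(flags: Sequence[bool],
                  relationship_scores: Sequence[float],
                  emotional_scores: Sequence[float]) -> bool:
    """True when all three targets stated above are met (scores assumed to be on a 1-5 scale)."""
    return (canon_compliance_rate(flags) >= 95.0
            and mean(relationship_scores) >= 4.0
            and mean(emotional_scores) >= 4.0)


flags = [True] * 19 + [False]        # 19 of 20 responses were canon-clean
print(canon_compliance_rate(flags))  # 95.0
print(meets_targets(flags, [4, 5, 4], [4, 4, 5]))
```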