Problem
Board game recommendations are genuinely hard. Player constraints are complex and contextual ("something for 5 people, not too long, my group hates trading"). LLMs hallucinate on incomplete metadata. And users don't trust suggestions that feel mechanically correct but miss the actual situation. Generic AI recommendations weren't solving this.
The Hard Call
After shipping V1, I discovered two wrong assumptions at once. The engine was matching on keywords but not reasoning — it couldn't interpret the intent behind a contextual request, only filter on surface constraints. And my entire feedback loop was built on power users, not the casual players I was designing for, so I'd been optimizing for the wrong problems. Rather than patch either one, I rebuilt both: the architecture and the validation methodology.
What I Did
Architecture
- V1: Built a 5-node hybrid pipeline combining deterministic constraint filters with LLM result formatting. Shipped and tested.
- Identified that V1 matched on surface keywords without reasoning: it couldn't weigh the tradeoffs behind a complex, contextual request
- Ran a structured 6-model evaluation before committing to V2 — selected on failure mode severity, not benchmark scores
- V2: Added a dedicated Intent Interpreter node that reasons through each user request before passing it to the matching pipeline, separating understanding from retrieval
- Built a multi-model metadata enrichment pipeline to clean inconsistent fields across the catalog, reducing hallucinations at the reasoning stage
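To make the "separating understanding from retrieval" idea concrete, here is a minimal sketch of that split, not the production pipeline. Everything in it is an assumption for illustration: the `Intent` fields, the `interpret_intent` stub (plain rules standing in for the LLM-backed Intent Interpreter), the 60-minute reading of "not too long", and the made-up catalog entries.

```python
import re
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class Game:
    name: str
    min_players: int
    max_players: int
    minutes: int
    mechanics: FrozenSet[str]


@dataclass
class Intent:
    """Structured constraints produced by the interpretation step."""
    players: Optional[int] = None
    max_minutes: Optional[int] = None
    avoid_mechanics: Set[str] = field(default_factory=set)


def interpret_intent(request: str) -> Intent:
    """Hypothetical stand-in for the Intent Interpreter node.

    The real node would call an LLM; simple rules are used here so the
    sketch runs offline and deterministically.
    """
    intent = Intent()
    players = re.search(r"(\d+)\s+(?:people|players)", request)
    if players:
        intent.players = int(players.group(1))
    if "not too long" in request:
        intent.max_minutes = 60  # assumed threshold, not from the source
    disliked = re.search(r"hates\s+(\w+)", request)
    if disliked:
        intent.avoid_mechanics.add(disliked.group(1).lower())
    return intent


def filter_games(catalog: List[Game], intent: Intent) -> List[Game]:
    """Deterministic constraint filter: no model involved at this stage."""
    matches = []
    for game in catalog:
        if intent.players is not None and not (
            game.min_players <= intent.players <= game.max_players
        ):
            continue
        if intent.max_minutes is not None and game.minutes > intent.max_minutes:
            continue
        if intent.avoid_mechanics & game.mechanics:
            continue
        matches.append(game)
    return matches


# Made-up catalog entries, for illustration only.
CATALOG = [
    Game("Harbor Barons", 3, 5, 45, frozenset({"trading"})),
    Game("Night Relay", 2, 6, 30, frozenset({"cooperative"})),
    Game("Long March", 2, 5, 180, frozenset({"area control"})),
    Game("Pair Up", 2, 2, 20, frozenset({"cooperative"})),
]

request = "something for 5 people, not too long, my group hates trading"
intent = interpret_intent(request)
recommended = [game.name for game in filter_games(CATALOG, intent)]
```

The point of the split is that the filter only ever sees a structured `Intent`, so the interpretation step can be swapped or re-evaluated without touching the matching logic.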
Validation
- Discovered that all early beta testers were power users — not the casual players I was building for — and their feedback was pushing me to solve the wrong problems
- Rebuilt validation around the actual target audience; redirected effort away from the ~40% of planned bug fixes that only affected power-user edge cases
- First round of casual-player testing returned immediate positive signal and surfaced real gaps the power-user feedback had masked
Outcome
78% latency reduction
65% cost reduction
100% recommendation consistency
Immediate positive signal from target casual players
Skills:
AI System Design · Hybrid Architecture · Reasoning Pipeline Design · User Segmentation · Iterative Product Development