Transforming vague feedback into structured, actionable insights.
4 specialist agents · AOFG framework · CU Anschutz DBMI
The Problem
Asynchronous Online Focus Groups (AOFGs) are widely used to collect feedback at scale while overcoming geographic barriers and participant time constraints. However, the feedback they collect is often too vague or ambiguous to act on.
"Q: Was this helpful? A: Kind of" gives a designer nothing to act on.
ClarifAI intercepts vague feedback before it becomes unusable. A four-module LLM pipeline filters out irrelevant responses, flags vague or ambiguous comments, and conducts a targeted follow-up dialogue to turn a thin comment into richer, more granular data for design and evaluation teams.
[Illustration: a short clarification dialogue alternating between AI and human turns]
Problem Clarified
Raw response: "somewhat useful"
Granular data captured: useful for orientation, but missing stopped/new medications and rationale needed for medication reconciliation.
Methodology
ClarifAI's four-module pipeline is the result of an iterative design and validation programme combining contextual inquiry, multi-stage prompt engineering, and controlled empirical testing. Each module operates as a specialised agent with a distinct task, scoring rubric, and prompt template, developed, tested, and refined independently before being chained into the full pipeline.
ClarifAI is designed to operate inside an AOFG-based feedback workflow. Per-module evaluation is built into the pipeline, so individual agents can be re-tested, swapped, or extended without rewriting the orchestrator. New modules can be added by authoring a single prompt template alongside an existing one.
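The swap-and-extend property described above can be pictured as a small registry keyed by module name. This is a minimal sketch under stated assumptions, not ClarifAI's actual code: `ModuleSpec`, `REGISTRY`, `register_module`, `render_prompt`, and the template wording are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class ModuleSpec:
    """Hypothetical description of one pipeline module."""
    name: str
    prompt: Template  # template with $question and $response slots


# Hypothetical registry; an orchestrator would look modules up by name.
REGISTRY: dict[str, ModuleSpec] = {}


def register_module(name: str, prompt_text: str) -> ModuleSpec:
    """Add or replace a module by supplying only its prompt template."""
    spec = ModuleSpec(name=name, prompt=Template(prompt_text))
    REGISTRY[name] = spec
    return spec


def render_prompt(name: str, question: str, response: str) -> str:
    """Fill the named module's template for one piece of feedback."""
    return REGISTRY[name].prompt.substitute(question=question, response=response)


# Illustrative template text, not the project's real prompt.
register_module(
    "telemetry",
    "Question: $question\nResponse: $response\nDoes the response address the question?",
)
rendered = render_prompt("telemetry", "Was this helpful?", "Kind of")
```

Because the orchestrator in this sketch only ever touches `REGISTRY`, re-registering a name swaps that module's prompt without any other code changing.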
Engineering skills
Prompt engineering, structured-output scoring rubrics, tool-use orchestration, model routing.
Research skills
Contextual inquiry, validation experiment design, ground-truth labelling protocols.
Evaluation skills
Per-module accuracy, composite pipeline scoring, held-out validation sets.
Product skills
Information architecture, prompt-as-interface design, integration into the AOFG flow.
At a glance
Each agent is specialised, evaluated, and replaceable. The orchestrator routes between modules based on intermediate results so only feedback that needs clarification is escalated.
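The routing behaviour described above might look roughly like the following. It is a sketch, assuming each module can be treated as a plain callable; the function name `run_pipeline`, the status strings, and the toy stand-in modules are hypothetical, and a real module would wrap an LLM call.

```python
from typing import Callable, Dict, List


def run_pipeline(
    question: str,
    response: str,
    telemetry: Callable[[str, str], bool],
    flight: Callable[[str, str], List[str]],
    capcom: Callable[[str, List[str]], List[str]],
    payload: Callable[[str, List[str]], Dict[str, object]],
) -> Dict[str, object]:
    """Route one response through the four modules, stopping early when possible."""
    # A: drop responses that do not address the question.
    if not telemetry(question, response):
        return {"status": "filtered", "response": response}
    # B: find vague or ambiguous topics; clear responses skip the dialogue.
    topics = flight(question, response)
    if not topics:
        return {"status": "clear", "response": response}
    # C: run the follow-up dialogue, then D: build the structured insight.
    transcript = capcom(response, topics)
    result: Dict[str, object] = {"status": "clarified", "response": response}
    result.update(payload(response, transcript))
    return result


# Toy stand-ins for the LLM-backed modules (illustrative only).
def toy_telemetry(question: str, response: str) -> bool:
    return "lunch" not in response.lower()


def toy_flight(question: str, response: str) -> List[str]:
    vague = {"kind of", "somewhat useful"}
    return ["helpfulness"] if response.lower() in vague else []


def toy_capcom(response: str, topics: List[str]) -> List[str]:
    return [f"AI: what about {topic} was unclear?" for topic in topics]


def toy_payload(response: str, transcript: List[str]) -> Dict[str, object]:
    return {"turns": len(transcript)}


outcome = run_pipeline(
    "Was this helpful?", "Kind of",
    toy_telemetry, toy_flight, toy_capcom, toy_payload,
)
```

Only the third branch reaches the dialogue, which is the point of routing on intermediate results: filtered and already-clear responses never cost a follow-up conversation.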
4 agents · AOFG framework · Tool-use orchestration
01 Observe
Observed how research teams process AOFG transcripts at scale, identifying the moments where vague responses bottleneck downstream analysis.
02 Specify
Decomposed the disambiguation task into four discrete agent responsibilities, each with a single owning prompt and a clear input/output contract.
03 Prompt
Each module's prompt was refined against a held-out validation set, with structured scoring rubrics replacing free-text output to guarantee machine-readable consistency between agents.
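To illustrate what replacing free-text output with a structured rubric can mean in practice, here is a minimal sketch of a parser that rejects any reply that is off-contract. The field names (`label`, `score`, `rationale`), the label set, and the 1 to 5 scale are assumptions made for the example, not the project's published rubric.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RubricScore:
    """Hypothetical structured output for one module."""
    label: str      # assumed label set: "vague" or "clear"
    score: int      # assumed rubric scale: 1 to 5
    rationale: str  # short free-text justification


ALLOWED_LABELS = {"vague", "clear"}  # assumption for this sketch


def parse_rubric(raw: str) -> RubricScore:
    """Parse a model's JSON reply, raising ValueError if it breaks the contract."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    label = data.get("label")
    score = data.get("score")
    rationale = data.get("rationale", "")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 1 <= score <= 5:
        raise ValueError(f"score out of range: {score!r}")
    if not isinstance(rationale, str):
        raise ValueError("rationale must be a string")
    return RubricScore(label=label, score=score, rationale=rationale)


parsed = parse_rubric('{"label": "vague", "score": 2, "rationale": "no specifics"}')
```

Validating at the boundary like this means a downstream module only ever receives a `RubricScore`, never a half-formed string.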
Validation
04 Validate
Pipeline evaluated end-to-end against ground-truth labels from clinical researchers, with per-module accuracy reported alongside a cross-pipeline composite score.
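As a rough illustration of per-module accuracy reported alongside a composite, the sketch below scores predictions against ground-truth labels. The page does not define the composite, so the unweighted mean used here is purely an assumption, and the labels are toy values rather than study data.

```python
from typing import Dict, Sequence


def accuracy(predicted: Sequence[object], truth: Sequence[object]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not truth:
        raise ValueError("cannot score an empty label set")
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def composite(per_module: Dict[str, float]) -> float:
    """Assumed composite: unweighted mean of the per-module accuracies."""
    if not per_module:
        raise ValueError("no module scores supplied")
    return sum(per_module.values()) / len(per_module)


# Toy labels for illustration only; these are not results from the project.
scores = {
    "telemetry": accuracy([True, True, False, True], [True, False, False, True]),
    "flight": accuracy(["vague", "clear", "clear", "vague"],
                       ["vague", "clear", "clear", "vague"]),
}
overall = composite(scores)
```

Keeping the per-module numbers next to the composite makes it visible when one weak module is hidden by a healthy average.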
Live pipeline
05 Integrate
Designed to surface clarification dialogues to participants in real time inside an AOFG flow and write structured results back to the research dataset.
The Pipeline
A · Telemetry
The Telemetry module classifies whether each piece of feedback (e.g., an individual answer to a feedback question or prompt) addresses the informational intent of that question or prompt. Off-topic or tangential responses are filtered out before they reach the downstream modules.
Doing this classification first means downstream modules only see feedback that genuinely belongs to the question being asked, reducing both LLM cost and false-positive escalations into the clarification dialogue.
B · Flight
The Flight module screens the relevant feedback for responses that are not specific enough to act on, i.e., that contain vagueness or ambiguity. Only those responses are escalated to the clarification dialogue, keeping the experience lightweight for participants who already gave clear and relevant responses.
This routing decision is the load-bearing piece of the pipeline. Over-escalation creates participant fatigue; under-escalation leaves vague responses unclarified. Flight's prompt was iteratively tuned to balance the two.
C · CapCom
The CapCom module engages the user in a short, targeted follow-up conversation for each vague or ambiguous topic identified by Flight. The LLM interviewer asks the questions needed to resolve the vagueness or ambiguity, then sends the transcript of the conversation to the Payload module for final processing.
The dialogue is bounded: CapCom stops as soon as the original vagueness signal has been resolved, so participants experience a focused exchange rather than a generic AI chat.
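A bounded dialogue of this kind can be sketched as a loop that exits once a resolution check passes. The hard `max_turns` cap is an added assumption (the page only states that CapCom stops on resolution), and `ask`, `answer`, and `is_resolved` are hypothetical callables standing in for the LLM interviewer, the participant, and the resolution check.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, text)


def clarify(
    topic: str,
    ask: Callable[[str, List[Turn]], str],
    answer: Callable[[str], str],
    is_resolved: Callable[[str, List[Turn]], bool],
    max_turns: int = 3,  # assumed safety cap, not stated in the source
) -> List[Turn]:
    """Ask follow-up questions about one topic until it is resolved."""
    transcript: List[Turn] = []
    for _ in range(max_turns):
        question = ask(topic, transcript)
        transcript.append(("ai", question))
        transcript.append(("human", answer(question)))
        if is_resolved(topic, transcript):
            break  # stop as soon as the vagueness is resolved
    return transcript


# Toy stand-ins (illustrative only): scripted replies and a word-count check.
replies = iter(["Not sure", "It lacked the list of stopped medications"])


def toy_ask(topic: str, transcript: List[Turn]) -> str:
    return f"Could you say more about {topic}?"


def toy_answer(question: str) -> str:
    return next(replies)


def toy_resolved(topic: str, transcript: List[Turn]) -> bool:
    return len(transcript[-1][1].split()) >= 5


dialogue = clarify("helpfulness", toy_ask, toy_answer, toy_resolved)
```

In this toy run the first reply is too thin to count as resolved, so one more question is asked and the loop then exits before reaching the cap.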
D · Payload
The Payload module extracts the relevant, granular information that CapCom elicited from the user and injects it into the original feedback response. The result is a structured, machine-readable, and specific insight that designers can act on directly.
The output is structured rather than free-text: it preserves the original phrasing while attaching the elicited specifics, so designers and downstream analysis tools can both read the human version and query the structured fields.
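One possible shape for such a record is sketched below, reusing the "somewhat useful" example from earlier on this page. The class name `ClarifiedInsight` and its field names are hypothetical; the actual output schema is not given here.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ClarifiedInsight:
    """Hypothetical Payload output: original wording plus elicited specifics."""
    original_response: str
    elicited_details: List[str] = field(default_factory=list)

    def readable(self) -> str:
        """Human-readable version that keeps the participant's own phrasing."""
        if not self.elicited_details:
            return self.original_response
        details = "; ".join(self.elicited_details)
        return f"{self.original_response} ({details})"

    def to_json(self) -> str:
        """Machine-readable version for downstream analysis tools."""
        return json.dumps(asdict(self))


insight = ClarifiedInsight(
    original_response="somewhat useful",
    elicited_details=[
        "useful for orientation",
        "missing stopped/new medications",
        "missing rationale needed for medication reconciliation",
    ],
)
human_version = insight.readable()
structured_version = json.loads(insight.to_json())
```

Keeping the original phrasing as its own field, rather than overwriting it, lets analysts compare what the participant first said with what the dialogue later surfaced.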
System Architecture
The complete ClarifAI pipeline, from AOFG collection through four LLM modules to structured, actionable output.
AOFG stages: Stage 1 Prerequisite Task → Stage 2 Project & Tasks → Stage 3 Discussion Board
LLM modules: Contextual Relevance → Requires Clarification → Clarification Dialogue → Summarisation & Refactor
Why It Matters
[Chart: Actionability of Feedback by Condition]
4 LLM pipeline modules · AOFG feedback context · Tool-use orchestration model · Active project status