DoctorsWhoCode · OpenMFM
AI for Clinicians — DoctorsWhoCode.blog
You Are Still Prompting.
You Should Be Building Agents.
Prompting has a ceiling. Most clinicians have hit it. Here is what actually changes when you stop coordinating AI manually and start building systems that coordinate themselves.
Chukwuma Onyeije, MD, FACOG
Maternal-Fetal Medicine · DoctorsWhoCode.blog
01 / 22
The problem with prompting alone
You Have Hit the Ceiling.
Here Is What It Feels Like.
→ You prompt Claude for an APSO note. You read the output.
→ You copy it into AthenaHealth. You notice the codes are missing.
→ You go back, ask for the codes, paste them in separately.
→ You catch a factual error in the objective. You fix it manually.
→ You repeat this for every patient, every day.
The AI is doing individual tasks well. You are doing all the coordination manually. That is not automation. That is assisted manual labor. There is a meaningful difference.
02 / 22
The definition
What a Prompt Actually Is
A prompt is a single instruction handed to a language model that produces a single output.
It is stateless. The model has no memory of what it was trying to accomplish beyond the current window. It cannot notice it was wrong and try again.
input → model → output
· · · nothing more · · ·
You Are Currently Playing All These Roles
🗺
Planner
You decide what to prompt, in what order, and what each step depends on.
🔧
Error Handler
You notice when the output is wrong and decide what to do about it.
✅
Quality Reviewer
You check the final output against your standards before it goes anywhere.
⏭
Coordinator
You manually move outputs from one step to the next, every time.
03 / 22
🔴 What you have now
Assisted Manual Labor
You are still the coordinator between every step
The AI executes tasks. You manage the process.
Does not scale beyond 2–3 steps without constant attention
Every coordination minute is a minute not spent on judgment
🟢 What agents give you
Actual Automation
The system manages the process. You manage the outcome.
Multi-step workflows run without manual connection
Scales because coordination is encoded, not rebuilt each time
Your attention reserved for decisions only a physician can make
04 / 22
The shift in category
What an Agent Actually Is
An agent is not a smarter prompt. It is a fundamentally different category of thing.
Chatbot / Prompt
💬
Answers Questions
Waits for you to tell it what to do. Stateless. Knows nothing about what came before or what comes next. You do the coordination.
→
Agent
🚗
Gets You There
Takes a goal, figures out the route, handles the unexpected, and delivers the result — without a status update every five minutes.
Definition: An agent takes a goal, determines the steps required to reach it, executes those steps, monitors what is happening, and adapts when something does not go according to plan.
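The definition above maps onto a small control loop. What follows is a minimal Python sketch under stated assumptions, not a production design: `run_agent`, `plan`, `execute`, and `check` are hypothetical names, and the toy functions stand in for real model calls.

```python
from typing import Callable, List


def run_agent(
    goal: str,
    plan: Callable[[str], List[str]],
    execute: Callable[[str, int], str],
    check: Callable[[str, str], bool],
    max_attempts: int = 3,
) -> List[str]:
    """Plan steps for a goal, run each one, and retry a step whose check fails."""
    outputs: List[str] = []
    for step in plan(goal):                          # determine the steps
        for attempt in range(1, max_attempts + 1):
            output = execute(step, attempt)          # execute the step
            if check(step, output):                  # monitor the result
                outputs.append(output)
                break                                # accepted, move on
            # otherwise adapt: loop around and try the step again
        else:
            # every attempt was rejected, so stop and hand back to a human
            raise RuntimeError(f"{step!r} failed after {max_attempts} attempts")
    return outputs


# Toy stand-ins (hypothetical): the second step fails once, then succeeds.
def toy_plan(goal: str) -> List[str]:
    return ["draft note", "add codes"]


def toy_execute(step: str, attempt: int) -> str:
    if step == "add codes" and attempt == 1:
        return ""                                    # simulated bad output
    return f"{step}: done on attempt {attempt}"


def toy_check(step: str, output: str) -> bool:
    return bool(output)


results = run_agent("complete APSO note", toy_plan, toy_execute, toy_check)
```

The retry that a prompt user performs by hand (notice the missing codes, ask again) happens here inside the loop. The `RuntimeError` is the point where the sketch gives up and returns control to the physician.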
05 / 22
Inside every well-designed agent
The Four Workers Operating Simultaneously
Together, they do the coordination you are currently doing yourself.
🔍
Analyst
Examines the inputs and identifies what actually matters. Flags ambiguity before work begins.
🗺
Planner
Decides the sequence of steps and selects the right tools and approaches before execution starts.
⚙️
Operator
Carries out the work — writes, formats, routes, commits, stages. The executor in the system.
✅
Auditor
Reviews output before it reaches you — checks for errors, contradictions, and failure modes.
These four workers replace the coordination you are currently doing manually between every prompt.
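One way to picture the four workers is as functions that pass a shared record along. The sketch below is illustrative only: the `Job` record, the function names, and the toy logic are assumptions, and the workers run in sequence here for simplicity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    raw_input: str
    key_points: List[str] = field(default_factory=list)
    flags: List[str] = field(default_factory=list)
    plan: List[str] = field(default_factory=list)
    draft: str = ""
    issues: List[str] = field(default_factory=list)


def analyst(job: Job) -> Job:
    """Pick out what matters and flag ambiguity before work begins."""
    for line in job.raw_input.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith("?"):
            job.flags.append(f"ambiguous: {line}")
        else:
            job.key_points.append(line)
    return job


def planner(job: Job) -> Job:
    """Decide the order of work before anything is written."""
    job.plan = [f"write section {i}" for i, _ in enumerate(job.key_points, start=1)]
    return job


def operator(job: Job) -> Job:
    """Carry out the plan; in this toy version, assemble a bulleted draft."""
    job.draft = "\n".join(f"- {point}" for point in job.key_points)
    return job


def auditor(job: Job) -> Job:
    """Review the draft before it reaches the human."""
    if not job.draft:
        job.issues.append("empty draft")
    job.issues.extend(job.flags)   # unresolved ambiguity is surfaced, not hidden
    return job


def run_pipeline(raw_input: str) -> Job:
    job = Job(raw_input)
    for worker in (analyst, planner, operator, auditor):
        job = worker(job)
    return job


job = run_pipeline("BP elevated at visit\nGrowth scan due\nAspirin dose?")
```

The human sees `job.draft` together with `job.issues`, so the open question about the aspirin dose arrives flagged rather than silently dropped.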
06 / 22
ARR
Not Every Task Should Be an Agent
Use the ARR test before you build anything. If the answer to any one of these is no, you need a better-defined process before you need an agent.
A
Autonomous
The system should perform much of the work without constant supervision. If a human must approve every micro-step, it is not ready.
R
Recurring
It must happen repeatedly. Automation pays off through repetition — the investment amortizes over every future run.
R
Reviewable
There must be a clear definition of what good looks like. Vague goals produce consistent failure.
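The ARR test can be written down as a three-question gate. This is a small sketch, assuming you answer each question yourself as a plain yes or no; `ArrAnswers` and the two example tasks are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ArrAnswers:
    autonomous: bool   # runs without approval of every micro-step
    recurring: bool    # happens repeatedly
    reviewable: bool   # there is a clear definition of good output


def arr_failures(answers: ArrAnswers) -> List[str]:
    """Return the names of the ARR questions that were answered no."""
    checks = {
        "Autonomous": answers.autonomous,
        "Recurring": answers.recurring,
        "Reviewable": answers.reviewable,
    }
    return [name for name, passed in checks.items() if not passed]


def ready_for_agent(answers: ArrAnswers) -> bool:
    """A task passes only when all three answers are yes."""
    return not arr_failures(answers)


# Hypothetical examples: a one-off letter versus a per-visit note.
one_off_letter = ArrAnswers(autonomous=True, recurring=False, reviewable=True)
post_visit_note = ArrAnswers(autonomous=True, recurring=True, reviewable=True)
```

Returning the failed questions by name, rather than a bare yes or no, tells you which part of the process to define better before building anything.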
07 / 22
Before a single line of automation runs
The GPS Check
Answer these three questions in writing. If you cannot answer all three clearly, stop and do that work first. The agent will not infer your intent.
G
Goal
"Can you state the objective in one clear sentence?"
If it takes more than one sentence, the goal is not clear enough to automate.
P
Proof
"What does good actually look like, and how will you know when it arrives?"
"A good note" is not a proof. Specific, auditable criteria are.
S
Steps
"Can you describe every step precisely, without ambiguity?"
The agent executes bad instructions as efficiently as good ones.
The bottleneck is rarely the technology. Physicians who try to automate tasks they do not fully understand build agents around vague goals, get inconsistent outputs, and blame the tech. The real bottleneck is the clarity of the human behind the agent.
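The GPS Check can be captured as a small record that reports what is still missing. A minimal sketch follows, assuming a rough one-sentence heuristic (splitting on sentence-ending punctuation, which abbreviations will fool); `GpsCheck` and the sample values are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class GpsCheck:
    goal: str
    proof: List[str] = field(default_factory=list)   # auditable criteria
    steps: List[str] = field(default_factory=list)   # ordered, unambiguous steps

    def gaps(self) -> List[str]:
        """List the GPS questions that are not yet answered."""
        problems = []
        # Rough heuristic: split after ., ! or ? when followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", self.goal.strip()) if s]
        if len(sentences) != 1:
            problems.append("Goal: state the objective in exactly one sentence")
        if not any(item.strip() for item in self.proof):
            problems.append("Proof: list at least one specific, auditable criterion")
        if not any(item.strip() for item in self.steps):
            problems.append("Steps: describe every step")
        return problems


vague = GpsCheck(goal="Write better notes. Maybe post them too.")
clear = GpsCheck(
    goal="Stage a complete APSO draft in AthenaHealth within 10 minutes of visit end.",
    proof=["All four APSO sections present", "ICD-10 and CPT codes attached"],
    steps=["Clean transcript", "Generate note", "Append codes", "Stage draft"],
)
```

The check only confirms that each answer exists in writing. Whether a criterion such as "a good note" is specific enough still needs a human read.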
08 / 22
Workflow 01 · Post-Visit Documentation — GPS Check + ARR
GOAL
Convert every MFM encounter transcript into a complete, signature-ready APSO note in AthenaHealth within 10 minutes of visit end.
PROOF
Note contains all required sections (Assessment, Plan, Subjective, Objective), carries correct ICD-10 and CPT codes, includes no hallucinated clinical data, and passes attending review in under 90 seconds.
STEPS
Ambient scribe captures audio → transcript cleaned → APSO generator applies MFM consultation skill → coding layer appends ICD-10/CPT → note staged in AthenaHealth draft queue.
Autonomous — no manual formatting
Recurring — every patient encounter
Reviewable — note quality is auditable
10 / 22
Workflow 01 · Post-Visit Documentation — The Four Workers
🔍
Analyst
Parses the raw transcript — identifies chief complaint, gestational age, ultrasound findings, risk factors, and any ambiguous clinical language flagged for review.
🗺
Planner
Decides note structure: which APSO sections need expansion, whether a high-complexity modifier applies, and which SMFM guideline references to cite in the plan.
⚙️
Operator
Writes the full APSO note using the MFM consultation skill, maps ICD-10 and CPT codes, and pushes the draft to AthenaHealth via API.
✅
Auditor
Checks for: missing required fields, contradictions between subjective and objective, hallucinated medications or labs not in the transcript, and coding accuracy.
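Two of the Auditor checks above, missing sections and medications that never appear in the transcript, can be sketched in a few lines. This is an illustration under assumptions: the note is taken to be a dictionary keyed by section name, the substring match is a crude stand-in for real clinical entity matching, and the sample encounter is invented.

```python
from typing import Dict, List

REQUIRED_SECTIONS = ("Assessment", "Plan", "Subjective", "Objective")


def audit_note(
    sections: Dict[str, str],
    medications: List[str],
    transcript: str,
) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    issues: List[str] = []
    for name in REQUIRED_SECTIONS:
        if not sections.get(name, "").strip():
            issues.append(f"missing section: {name}")
    spoken = transcript.lower()
    for med in medications:
        # Crude check: the drug name must literally appear in the transcript.
        if med.lower() not in spoken:
            issues.append(f"not in transcript: {med}")
    return issues


# Invented example data, for illustration only.
transcript = "Patient at 28 weeks, currently taking labetalol twice daily."
draft_sections = {
    "Assessment": "Chronic hypertension in pregnancy.",
    "Plan": "Continue current regimen; growth scan in 4 weeks.",
    "Subjective": "No headache or visual changes.",
    "Objective": "",
}
issues = audit_note(draft_sections, ["labetalol", "nifedipine"], transcript)
```

A non-empty `issues` list would hold the draft back or attach the flags to it, so the attending sees exactly what the Auditor could not verify.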
11 / 22
Human Checkpoint · Workflow 01
🧑‍⚕️
The Draft Is Waiting When You Walk Out of the Room
The agent handles formatting, coding, and routing. What arrives in your queue is a complete draft — not a blank page, not a prompt to respond to.
Your job at the checkpoint: Does the plan reflect your actual clinical reasoning? Does the impression capture the nuance of this patient? Sign or revise — never rubber-stamp. The agent cannot replace clinical judgment. It was never asked to.
12 / 22
Workflow 02 · Idea-to-Published-Post — GPS Check + ARR
GOAL
Transform a raw idea (voice note, Telegram message, Obsidian stub) into a publish-ready MDX draft in the DoctorsWhoCode repo within 2 hours of capture.
PROOF
Draft matches voice profile, has correct frontmatter (title, tags, description, hero image prompt), passes editorial checklist, and requires no structural rewriting — only refinement.
STEPS
Idea captured in Telegram → Obsidian vault → Agent classifies domain → Expands to full draft using voice/style skill → Generates hero image prompt + Facebook caption → Commits MDX to repo → Notifies for review.
Autonomous — pipeline runs on capture
Recurring — weekly publishing cadence
Reviewable — voice consistency is measurable
14 / 22
Workflow 02 · Idea-to-Published-Post — The Four Workers
🔍
Analyst
Reads the raw capture, classifies the content domain, identifies the core argument or insight, and surfaces related past posts to avoid duplication.
🗺
Planner
Selects the appropriate post template and determines target length and argument structure before a word is written. Maps the arc.
⚙️
Operator
Writes full MDX draft with frontmatter, hero image prompt, Facebook caption, and inline citations. Commits to the correct GitHub branch with a conventional commit message.
✅
Auditor
Checks voice consistency against the DWC style profile, verifies internal links, flags unsubstantiated claims, and confirms SEO metadata is complete.
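The metadata part of this audit is easy to automate. The sketch below checks an MDX draft for the frontmatter fields named in the proof. It is a simplified illustration: it parses only flat `key: value` lines rather than full YAML, and the field name `heroImagePrompt` is an assumption about how the repo spells it.

```python
from typing import Dict, List

# Field names are assumptions; adjust to the repo's actual frontmatter keys.
REQUIRED_FIELDS = ("title", "tags", "description", "heroImagePrompt")


def parse_frontmatter(mdx: str) -> Dict[str, str]:
    """Read flat key: value pairs between the opening and closing --- lines."""
    lines = mdx.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields                 # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}                             # no closing delimiter: treat as absent


def missing_fields(mdx: str) -> List[str]:
    """Return required fields that are absent or empty."""
    fields = parse_frontmatter(mdx)
    return [name for name in REQUIRED_FIELDS if not fields.get(name)]


SAMPLE = """---
title: Build the Agent
tags: [ai, workflow]
description: Why coordination belongs in the system.
---

Body text.
"""

missing = missing_fields(SAMPLE)
```

Voice consistency and unsubstantiated claims need a model or a human. Field presence does not, so it makes sense to run this check first and fail fast.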
15 / 22
Human Checkpoint · Workflow 02
✍️
The Draft Arrives Needing Your Editorial Judgment, Not Your Labor
The agent handles capture, classification, drafting, and commit. You arrive at a complete draft — not a blank page, not a voice memo waiting to be expanded.
Your job at the checkpoint: Does this paragraph earn its place? Does the theological argument hold? Is this the right week for this post? Your standards are what the agent is scaling — not replacing. Your voice is the product.
16 / 22
Workflow 03 · Daily Readiness — GPS Check + ARR
GOAL
Generate a personalized daily training recommendation — including zone, duration, and nutrition adjustments — based on the previous 24h of PGIS biomarker data.
PROOF
Recommendation correctly applies the Type 1/Type 2 Red physiological distinction, references the Stress-Glucose Index trend, aligns with the current training block phase, and is delivered before 6 AM.
STEPS
Garmin API pulls overnight HRV/sleep/load → CGM pulls nocturnal glucose → PGIS Readiness Engine scores the day → Agent classifies readiness tier → Generates training recommendation + nutrition note → Pushes to PGIS dashboard.
Autonomous — runs nightly on sensor data
Recurring — daily readiness is the loop
Reviewable — training response is measurable
18 / 22
Workflow 03 · Daily Readiness — The Four Workers
🔍
Analyst
Ingests last 24h: HRV, resting HR, sleep stages, step load, nocturnal glucose baseline, and any flagged Type 1 or Type 2 Red events. Identifies the dominant readiness signal.
🗺
Planner
Cross-references the readiness score against the current training block phase (base / build / peak / taper) and applies WFPB nutrition protocol parameters to determine the appropriate training stimulus.
⚙️
Operator
Writes the daily readiness report: readiness tier, session type and duration, target HR zones, pre-workout meal timing. Updates Chart.js dashboard on Railway. Sends Telegram notification.
✅
Auditor
Verifies the recommendation is internally consistent — no hard intervals on a Red day. Checks for data gaps or sensor anomalies. Flags 3-day trends warranting a recovery week override.
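The three Auditor checks above can be sketched as one function. Everything named here is an assumption made for illustration: the tier and session labels, the metric keys, and the reading of "3-day trend" as three red days in a row. None of it is taken from the actual PGIS implementation.

```python
from typing import Dict, List, Optional

HARD_SESSIONS = {"intervals", "tempo"}     # hypothetical session labels
REQUIRED_METRICS = ("hrv", "resting_hr", "sleep_hours", "nocturnal_glucose")


def audit_recommendation(
    tier: str,
    session_type: str,
    metrics: Dict[str, Optional[float]],
    recent_tiers: List[str],
) -> List[str]:
    """Return consistency, data-gap, and trend flags for one daily report."""
    issues: List[str] = []
    # 1. Internal consistency: no hard session on a red day.
    if tier == "red" and session_type in HARD_SESSIONS:
        issues.append("inconsistent: hard session on a red day")
    # 2. Data gaps: any required metric that is missing or None.
    gaps = [name for name in REQUIRED_METRICS if metrics.get(name) is None]
    if gaps:
        issues.append("data gap: " + ", ".join(gaps))
    # 3. Trend: the three most recent days were all red.
    if len(recent_tiers) >= 3 and all(t == "red" for t in recent_tiers[-3:]):
        issues.append("trend: three red days in a row, consider a recovery week")
    return issues


# Invented sample values, for illustration only.
flags = audit_recommendation(
    tier="red",
    session_type="intervals",
    metrics={"hrv": 38.0, "resting_hr": 52.0, "sleep_hours": 5.5,
             "nocturnal_glucose": None},
    recent_tiers=["green", "red", "red", "red"],
)
```

The function only flags. Whether a flag blocks the report or rides along with it is left to the human checkpoint.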
19 / 22
Human Checkpoint · Workflow 03
🏃
The Agent Sees Your Numbers.
You Feel Your Legs.
The readiness report is a starting point, not a directive. The agent knows your HRV and glucose. You know about the overnight delivery, the cold coming on, the disrupted night on call.
Your job at the checkpoint: Override the recommendation with body intelligence the sensors cannot capture. The agent applies your framework — you apply your embodied knowledge. That layer of judgment is yours, and it matters.
20 / 22
The actual shift
What Changes When You Make the Shift
When you were prompting, you were producing individual outputs faster. That was the value.
When you build agents, something different happens.
1
You stop trading time for labor. You start scaling your standards. Your note format, your voice, your readiness framework — they run without you rebuilding them each time.
2
Your judgment is the scarce thing. The agent is the infrastructure that makes that judgment available at every iteration — not just when you have time to prompt.
3
The physician who will matter most is the one who has encoded their standards into systems that run without them — and who knows exactly when to step back in.
21 / 22
The Takeaway
Build the agent.
Keep the judgment.
The agent handles coordination. You handle the decisions that only a physician — and only you — can make. That division of labor is the design.
DoctorsWhoCode.blog · OpenMFM · 2026
22 / 22