DoctorsWhoCode · OpenMFM

AI for Clinicians — DoctorsWhoCode.blog

You Are Still Prompting. You Should Be Building Agents.

Prompting has a ceiling. Most clinicians have hit it. Here is what actually changes when you stop coordinating AI manually and start building systems that coordinate themselves.

Chukwuma Onyeije, MD, FACOG · Maternal-Fetal Medicine · DoctorsWhoCode.blog

01 / 22

The problem with prompting alone

You Have Hit the Ceiling.
Here Is What It Feels Like.

→

You prompt Claude for an APSO note. You read the output.

→

You copy it into AthenaHealth. You notice the codes are missing.

→

You go back, ask for the codes, paste them in separately.

→

You catch a factual error in the objective. You fix it manually.

→

You repeat this for every patient, every day.

The AI is doing individual tasks well. You are doing all the coordination manually. That is not automation. That is assisted manual labor. There is a meaningful difference.

02 / 22

The definition

What a Prompt Actually Is

A prompt is a single instruction handed to a language model that produces a single output.

It is stateless. The model has no memory of what it was trying to accomplish beyond the current window. It cannot notice it was wrong and try again.

input → model → output
· · · nothing more · · ·

You Are Currently Playing All These Roles

🗺

Planner

You decide what to prompt, in what order, and what each step depends on.

🔧

Error Handler

You notice when the output is wrong and decide what to do about it.

✅

Quality Reviewer

You check the final output against your standards before it goes anywhere.

⏭

Coordinator

You manually move outputs from one step to the next, every time.

03 / 22

🔴 What you have now

Assisted Manual Labor

You are still the coordinator between every step

The AI executes tasks. You manage the process.

Does not scale beyond 2–3 steps without constant attention

Every coordination minute is a minute not spent on judgment

🟢 What agents give you

Actual Automation

The system manages the process. You manage the outcome.

Multi-step workflows run without manual connection

Scales because coordination is encoded, not rebuilt each time

Your attention reserved for decisions only a physician can make

04 / 22

The shift in category

What an Agent Actually Is

An agent is not a smarter prompt. It is a fundamentally different category of thing.

Chatbot / Prompt

💬

Answers Questions

Waits for you to tell it what to do. Stateless. Knows nothing about what came before or what comes next. You do the coordination.

→

Agent

🚗

Gets You There

Takes a goal, figures out the route, handles the unexpected, and delivers the result — without a status update every five minutes.

Definition: An agent takes a goal, determines the steps required to reach it, executes those steps, monitors what is happening, and adapts when something does not go according to plan.
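
The definition above can be sketched as a small control loop. This is a minimal illustration, not a production framework: `run_agent`, `AgentRun`, and the stub `plan`, `execute`, and `check` functions are hypothetical names, and the stubs stand in for real model calls.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AgentRun:
    goal: str
    log: list[str] = field(default_factory=list)
    result: Optional[str] = None  # stays None if the agent has to escalate


def run_agent(goal, plan, execute, check, max_attempts=3):
    """Plan the steps, execute each one, check the output, retry on failure."""
    run = AgentRun(goal=goal)
    outputs = []
    for step in plan(goal):
        for attempt in range(1, max_attempts + 1):
            output = execute(step, attempt)
            if check(step, output):
                run.log.append(f"{step}: ok on attempt {attempt}")
                outputs.append(output)
                break
            run.log.append(f"{step}: failed attempt {attempt}")
        else:
            # No attempt passed the check: stop and hand back to the human.
            run.log.append(f"{step}: escalated to human")
            return run
    run.result = "\n".join(outputs)
    return run


# Stub workers (assumptions for illustration only).
def plan(goal):
    return ["draft note", "add codes"]


def execute(step, attempt):
    # Simulate the coding step returning nothing on its first try.
    if step == "add codes" and attempt == 1:
        return ""
    return f"[{step} done]"


def check(step, output):
    return bool(output)


run = run_agent("Produce a coded note", plan, execute, check)
print(run.log)
```

The retry inside the loop is the part a single prompt cannot do: the system notices the empty output and tries again without you in the middle.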

05 / 22

Inside every well-designed agent

The Four Workers Operating Simultaneously

Together, they do the coordination you are currently doing yourself.

🔍

Analyst

Examines the inputs and identifies what actually matters. Flags ambiguity before work begins.

🗺

Planner

Decides the sequence of steps and selects the right tools and approaches before execution starts.

⚙️

Operator

Carries out the work — writes, formats, routes, commits, stages. The executor in the system.

✅

Auditor

Reviews output before it reaches you — checks for errors, contradictions, and failure modes.

These four workers replace the coordination you are currently doing manually between every prompt.
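
One way to picture the four workers is as functions that pass a shared job record down the line. The sketch below is an assumption about structure, not a description of any specific agent framework: the `Job` fields and the toy logic inside each worker are hypothetical, and the workers run in sequence here for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    raw_input: str
    findings: dict = field(default_factory=dict)
    plan: list = field(default_factory=list)
    draft: str = ""
    issues: list = field(default_factory=list)


def analyst(job):
    # Identify what matters and flag ambiguity before work begins.
    text = job.raw_input.strip()
    job.findings["topic"] = text
    job.findings["ambiguous"] = "?" in text  # toy ambiguity test
    return job


def planner(job):
    # Decide the sequence of steps before execution starts.
    job.plan = ["outline", "write"]
    if job.findings["ambiguous"]:
        job.plan.insert(0, "ask for clarification")
    return job


def operator(job):
    # Carry out the work (here: just assemble a placeholder draft).
    job.draft = " -> ".join(job.plan) + f": {job.findings['topic']}"
    return job


def auditor(job):
    # Review the output before it reaches the human.
    if not job.draft:
        job.issues.append("empty draft")
    if job.findings["ambiguous"]:
        job.issues.append("input was ambiguous; needs human review")
    return job


def run_workers(raw_input):
    job = Job(raw_input)
    for worker in (analyst, planner, operator, auditor):
        job = worker(job)
    return job


clear_job = run_workers("Summarize the visit")
unclear_job = run_workers("Twins or singleton?")
print(clear_job.draft, clear_job.issues)
print(unclear_job.plan, unclear_job.issues)
```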

06 / 22

ARR

Not Every Task Should Be an Agent

Use the ARR test before you build anything. If the answer to any one of these is no, you need a better-defined process before you need an agent.

A

Autonomous

The system should perform much of the work without constant supervision. If a human must approve every micro-step, it is not ready.

R

Recurring

It must happen repeatedly. Automation pays off through repetition — the investment amortizes over every future run.

R

Reviewable

There must be a clear definition of what good looks like. Vague goals produce consistent failure.
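
The ARR test is simple enough to write down as a checklist object. A minimal sketch, assuming you answer each criterion with a plain yes or no; `ArrAssessment` and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrAssessment:
    task: str
    autonomous: bool
    recurring: bool
    reviewable: bool

    def failed_criteria(self):
        checks = {
            "Autonomous": self.autonomous,
            "Recurring": self.recurring,
            "Reviewable": self.reviewable,
        }
        return [name for name, passed in checks.items() if not passed]

    def ready_for_agent(self):
        # A single "no" means the process needs more definition first.
        return not self.failed_criteria()


notes = ArrAssessment("Post-visit documentation", True, True, True)
one_off = ArrAssessment("One-time grant letter", True, False, True)
print(notes.ready_for_agent(), one_off.failed_criteria())
```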

07 / 22

Before a single line of automation runs

The GPS Check

Answer these three questions in writing. If you cannot answer any one of them clearly, stop — and do that work first. The agent will not infer your intent.

G

Goal

"Can you state the objective in one clear sentence?"

If it takes more than one sentence, the goal is not clear enough to automate.

P

Proof

"What does good actually look like, and how will you know when it arrives?"

"A good note" is not a proof. Specific, auditable criteria are.

S

Steps

"Can you describe every step precisely, without ambiguity?"

The agent executes bad instructions as efficiently as good ones.

The bottleneck is never the technology. Physicians who try to automate tasks they do not fully understand build agents around vague goals, get inconsistent outputs, and blame the tech. The bottleneck is the clarity of the human behind it.
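
Writing the GPS answers down can be as literal as filling in a small record and refusing to proceed while it is incomplete. The sketch below is one possible shape, under assumptions: `GpsCheck` is a hypothetical name, and the one-sentence test is a rough heuristic that splits on end punctuation, so abbreviations such as "e.g." would be miscounted.

```python
import re
from dataclasses import dataclass


@dataclass
class GpsCheck:
    goal: str
    proof: list[str]
    steps: list[str]

    def problems(self):
        found = []
        # Heuristic: split after ., ! or ? followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", self.goal.strip()) if s]
        if len(sentences) != 1:
            found.append("Goal must be exactly one sentence.")
        if not self.proof:
            found.append("Proof needs at least one auditable criterion.")
        if not self.steps:
            found.append("Steps must be written out.")
        return found


documentation = GpsCheck(
    goal=(
        "Convert every MFM encounter transcript into a complete, signed-ready "
        "APSO note in AthenaHealth within 10 minutes of visit end."
    ),
    proof=["All required sections present", "Correct ICD-10 + CPT codes"],
    steps=["Capture audio", "Clean transcript", "Generate note", "Append codes", "Stage draft"],
)
vague = GpsCheck(goal="Write better notes. Also code them.", proof=[], steps=[])

print(documentation.problems())
print(vague.problems())
```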

08 / 22

01

🏥 Clinical · MFM / Atlanta Perinatal Associates

The Post-Visit Documentation Agent

From ambient transcript to signed APSO note — without touching a keyboard between patients.

⏱ 12–15 min per patient (prompting)

→

⚡ 90 seconds (agent)

09 / 22

Workflow 01 · Post-Visit Documentation — GPS Check + ARR

GOAL

Convert every MFM encounter transcript into a complete, signed-ready APSO note in AthenaHealth within 10 minutes of visit end.

PROOF

Note contains all required sections (A/P/S/O), correct ICD-10 + CPT codes, no hallucinated clinical data, and passes attending review in under 90 seconds.

STEPS

Ambient scribe captures audio → transcript cleaned → APSO generator applies MFM consultation skill → coding layer appends ICD-10/CPT → note staged in AthenaHealth draft queue.

Autonomous — no manual formatting

Recurring — every patient encounter

Reviewable — note quality is auditable

10 / 22

Workflow 01 · Post-Visit Documentation — The Four Workers

🔍

Analyst

Parses the raw transcript — identifies chief complaint, gestational age, ultrasound findings, risk factors, and any ambiguous clinical language flagged for review.

🗺

Planner

Decides note structure: which APSO sections need expansion, whether a high-complexity modifier applies, and which SMFM guideline references to cite in the plan.

⚙️

Operator

Writes the full APSO note using the MFM consultation skill, maps ICD-10 and CPT codes, and pushes the draft to AthenaHealth via API.

✅

Auditor

Checks for: missing required fields, contradictions between subjective and objective, hallucinated medications or labs not in the transcript, and coding accuracy.
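
Two of the Auditor's checks, missing sections and medications that never appeared in the transcript, can be sketched in a few lines. This is an illustration under stated assumptions: `audit_note` is a hypothetical function, the medication list is assumed to have been extracted from the note upstream, the substring match is deliberately naive, and the sample encounter is invented for the example.

```python
REQUIRED_SECTIONS = ("Assessment", "Plan", "Subjective", "Objective")


def audit_note(
    note_sections: dict[str, str],
    note_medications: list[str],
    transcript: str,
) -> list[str]:
    """Return a list of issues; an empty list means both checks passed."""
    issues = []
    for name in REQUIRED_SECTIONS:
        if not note_sections.get(name, "").strip():
            issues.append(f"Missing section: {name}")
    transcript_lower = transcript.lower()
    for med in note_medications:
        # Naive check: the drug name must literally appear in the transcript.
        if med.lower() not in transcript_lower:
            issues.append(f"Medication not found in transcript: {med}")
    return issues


# Invented sample data, for illustration only.
transcript = "Patient at 28 weeks, taking labetalol twice daily. Growth scan reviewed."
note_sections = {
    "Assessment": "Chronic hypertension in pregnancy.",
    "Plan": "Continue current regimen; repeat growth scan in 4 weeks.",
    "Subjective": "No headache or visual changes.",
    "Objective": "",
}
issues = audit_note(note_sections, ["labetalol", "nifedipine"], transcript)
print(issues)
```

A non-empty issue list would route the draft back for repair or flag it for the physician, rather than staging it silently.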

11 / 22

Human Checkpoint · Workflow 01

🧑‍⚕️

The Draft Is Waiting When You Walk Out of the Room

The agent handles formatting, coding, and routing. What arrives in your queue is a complete draft — not a blank page, not a prompt to respond to.

Your job at the checkpoint: Does the plan reflect your actual clinical reasoning? Does the impression capture the nuance of this patient? Sign or revise — never rubber-stamp. The agent cannot replace clinical judgment. It was never asked to.

12 / 22

02

✍️ Content · DoctorsWhoCode.blog

The Idea-to-Published-Post Agent

A thought captured in Telegram becomes a fully drafted, voice-matched post — ready for editorial review, not a blank page.

⏱ 60–90 min per post (prompting)

→

⚡ Minutes of active time (agent)

13 / 22

Workflow 02 · Idea-to-Published-Post — GPS Check + ARR

GOAL

Transform a raw idea (voice note, Telegram message, Obsidian stub) into a publish-ready MDX draft in the DoctorsWhoCode repo within 2 hours of capture.

PROOF

Draft matches voice profile, has correct frontmatter (title, tags, description, hero image prompt), passes editorial checklist, and requires no structural rewriting — only refinement.

STEPS

Idea captured in Telegram → Obsidian vault → Agent classifies domain → Expands to full draft using voice/style skill → Generates hero image prompt + Facebook caption → Commits MDX to repo → Notifies for review.

Autonomous — pipeline runs on capture

Recurring — weekly publishing cadence

Reviewable — voice consistency is measurable
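
The frontmatter part of the proof is the easiest piece to check mechanically. A minimal sketch, assuming a flat `key: value` frontmatter block delimited by `---`; the key names in `REQUIRED_KEYS` (including `heroImagePrompt`) are hypothetical, and a real pipeline would more likely use a YAML parser than this hand-rolled one.

```python
REQUIRED_KEYS = ("title", "tags", "description", "heroImagePrompt")


def parse_frontmatter(mdx: str) -> dict[str, str]:
    """Parse flat key: value pairs between the opening and closing '---'."""
    lines = mdx.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat the frontmatter as missing


def missing_frontmatter(mdx: str) -> list[str]:
    fields = parse_frontmatter(mdx)
    return [key for key in REQUIRED_KEYS if not fields.get(key)]


draft = """---
title: You Should Be Building Agents
tags: [ai, agents]
description: Why prompting has a ceiling
---

Body text here.
"""
print(missing_frontmatter(draft))
```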

14 / 22

Workflow 02 · Idea-to-Published-Post — The Four Workers

🔍

Analyst

Reads the raw capture, classifies the content domain, identifies the core argument or insight, and surfaces related past posts to avoid duplication.

🗺

Planner

Selects the appropriate post template and determines target length and argument structure before a word is written. Maps the arc.

⚙️

Operator

Writes full MDX draft with frontmatter, hero image prompt, Facebook caption, and inline citations. Commits to the correct GitHub branch with a conventional commit message.

✅

Auditor

Checks voice consistency against the DWC style profile, verifies internal links, flags unsubstantiated claims, and confirms SEO metadata is complete.

15 / 22

Human Checkpoint · Workflow 02

✍️

The Draft Arrives Needing Your Editorial Judgment, Not Your Labor

The agent handles capture, classification, drafting, and commit. You arrive at a complete draft — not a blank page, not a voice memo waiting to be expanded.

Your job at the checkpoint: Does this paragraph earn its place? Does the theological argument hold? Is this the right week for this post? Your standards are what the agent is scaling — not replacing. Your voice is the product.

16 / 22

03

⚡ Performance · PGIS / Endurance Athletics

The Daily Readiness & Training Adaptation Agent

Your CGM, Garmin, and sleep data are reviewed overnight. You wake up to a training decision — not a spreadsheet.

⏱ Manual app-checking + prompting (~15 min)

→

⚡ Dashboard waiting at 6 AM

17 / 22

Workflow 03 · Daily Readiness — GPS Check + ARR

GOAL

Generate a personalized daily training recommendation — including zone, duration, and nutrition adjustments — based on the previous 24h of PGIS biomarker data.

PROOF

Recommendation correctly applies the Type 1/Type 2 Red physiological distinction, references the Stress-Glucose Index trend, aligns with the current training block phase, and is delivered before 6 AM.

STEPS

Garmin API pulls overnight HRV/sleep/load → CGM pulls nocturnal glucose → PGIS Readiness Engine scores the day → Agent classifies readiness tier → Generates training recommendation + nutrition note → Pushes to PGIS dashboard.

Autonomous — runs nightly on sensor data

Recurring — daily readiness is the loop

Reviewable — training response is measurable

18 / 22

Workflow 03 · Daily Readiness — The Four Workers

🔍

Analyst

Ingests last 24h: HRV, resting HR, sleep stages, step load, nocturnal glucose baseline, and any flagged Type 1 or Type 2 Red events. Identifies the dominant readiness signal.

🗺

Planner

Cross-references the readiness score against the current training block phase (base / build / peak / taper) and applies WFPB nutrition protocol parameters to determine the appropriate training stimulus.

⚙️

Operator

Writes the daily readiness report: readiness tier, session type and duration, target HR zones, pre-workout meal timing. Updates Chart.js dashboard on Railway. Sends Telegram notification.

✅

Auditor

Verifies the recommendation is internally consistent — no hard intervals on a Red day. Checks for data gaps or sensor anomalies. Flags 3-day trends warranting a recovery week override.
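
The Auditor's rules here are concrete enough to sketch. Everything specific in the block below is an assumption made for illustration: the tier names, the set of "hard" session types, the rule that three consecutive Red days counts as the 3-day trend, and the sample values. It also does not model the Type 1 / Type 2 Red distinction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DayRecord:
    tier: str                 # hypothetical tiers: "green", "yellow", "red"
    session: str              # e.g. "rest", "easy", "intervals"
    hrv: Optional[float]      # None means the sensor reported nothing
    glucose: Optional[float]


HARD_SESSIONS = {"intervals", "tempo", "race"}  # assumed definition of "hard"


def audit_recommendation(history: list[DayRecord]) -> list[str]:
    """history is ordered oldest to newest; the last entry is today's plan."""
    flags = []
    today = history[-1]
    if today.tier == "red" and today.session in HARD_SESSIONS:
        flags.append("Hard session scheduled on a Red day")
    if today.hrv is None or today.glucose is None:
        flags.append("Sensor data gap in today's inputs")
    last_three = history[-3:]
    if len(last_three) == 3 and all(day.tier == "red" for day in last_three):
        flags.append("Three consecutive Red days: consider recovery week override")
    return flags


# Invented sample values, for illustration only.
history = [
    DayRecord("red", "rest", 52.0, 88.0),
    DayRecord("red", "easy", 50.0, None),
    DayRecord("red", "intervals", 48.0, 90.0),
]
flags = audit_recommendation(history)
print(flags)
```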

19 / 22

Human Checkpoint · Workflow 03

🏃

The Agent Sees Your Numbers.
You Feel Your Legs.

The readiness report is a starting point, not a directive. The agent knows your HRV and glucose. You know about the overnight delivery, the cold coming on, the disrupted night on call.

Your job at the checkpoint: Override the recommendation with body intelligence the sensors cannot capture. The agent applies your framework — you apply your embodied knowledge. That layer of judgment is yours, and it matters.

20 / 22

The actual shift

What Changes When You Make the Shift

When you were prompting, you were producing individual outputs faster. That was the value.

When you build agents, something different happens.

1

You stop trading time for labor. You start scaling your standards. Your note format, your voice, your readiness framework — they run without you rebuilding them each time.

2

Your judgment is the scarce thing. The agent is the infrastructure that makes that judgment available at every iteration — not just when you have time to prompt.

3

The physician who will matter most is the one who has encoded their standards into systems that run without them — and who knows exactly when to step back in.

21 / 22

The Takeaway

Build the agent.
Keep the judgment.

The agent handles coordination. You handle the decisions that only a physician — and only you — can make. That division of labor is the design.

→ Read the full post · DoctorsWhoCode.blog · OpenMFM.org

DoctorsWhoCode.blog · OpenMFM · 2026

22 / 22
