Reading Time: 5 minutes
Hey Prompt Lover,
Module 4 is done. Two newsletters. Two techniques. Both built on the same principle: the best prompt for a complex task is never one prompt.
If you tested the Tree-of-Thought structure from last issue on a real decision, you already know what I mean. The evaluation step — generating multiple paths before committing to one — produces a different category of output than asking for an answer directly. Several of you replied with results. The word that kept coming up was "specific." The output finally felt specific. That's what happens when you stop asking for the answer and start asking for the search.
Today we move into Module 5.
This module is about something different from everything we've covered so far. Every technique in Modules 1 through 4 was about what you put into the prompt before you run it. Better structure. Better examples. Better reasoning instructions. Better sequencing.
Module 5 is about what happens after the first output lands.
Specifically: what happens when you make AI look at its own work before you do.
The technique is called Self-Refine. And the version of it I've been using in my workflow for the past six months has quietly become the most consistently useful thing I took from The Prompt Report.

Here's Why This Matters
Here's something the research makes clear that changes how you think about AI output.
When AI generates a first draft, it is optimizing for producing a response. Not for producing the best response.
The model moves forward. It fills the output. It reaches a conclusion. At no point in that process does it look back and ask whether what it just wrote was actually good.
That's not a flaw. It's how generation works. The model produces the most statistically likely next token at each step. Forward. Always forward. It doesn't self-edit mid-generation.
Self-Refine changes the process by adding a second pass with a completely different instruction. Instead of generating, the model is now evaluating. And evaluation activates different reasoning than generation does.
When you ask AI to find problems in its own output, it finds them. Not always all of them. But consistently more than it would catch if you just asked it to "make this better." The critique pass creates a gap between the AI and its own output. That gap is where the improvement lives.
The research found this across writing tasks, reasoning tasks, code generation, and dialogue. Self-Refine improved output quality in study after study.
The gains were consistent enough that the researchers concluded it was a reliable technique rather than a context-dependent one.

How Marketers Are Scaling With AI in 2026

61% of marketers say this is the biggest marketing shift in decades.
Get the data and trends shaping growth in 2026 with this groundbreaking state of marketing report.
Inside you’ll discover:
• Results from over 1,500 marketers centered on results, goals, and priorities in the age of AI
• Standout content and growth trends in a world full of noise
• How to scale with AI without losing humanity
• Where to invest for the best return in 2026

Download your 2026 state of marketing report today.
Get Your Report

What You'll Learn In This Newsletter
By the end of this issue, you'll have:
• A clear explanation of why critique prompts outperform revision prompts
• The exact Self-Refine structure that produces consistent improvement
• A working template you can drop into any existing workflow today
• The specific critique language that catches the problems most revision prompts miss
Let's get started.

What Most People Do Wrong
When AI produces output that isn't quite right, most people do one of two things.
They either edit it themselves, which works but puts the quality control entirely on them. Or they send it back to AI with a vague instruction. "Make this better." "Improve the flow." "Make it sound more natural."
Vague revision instructions produce vague revisions.
"Make this better" tells the AI nothing specific about what's wrong. So the AI makes surface changes. Synonyms swap. Sentences rearrange.
The output looks different. The underlying problems stay exactly where they were.
The gap between "make this better" and "find every vague claim and rewrite it with a specific example" is enormous.
The first instruction produces cosmetic changes. The second produces structural ones.
Most people never make that distinction. They treat revision as one category of instruction when it's actually two completely different things: general improvement, which produces little, and specific critique followed by targeted rewrite, which produces a lot.

Quick Reality Check
I ran a test last month. Same draft. Three revision approaches. First: "Make this better." Second: "Improve the tone and flow." Third: the Self-Refine critique prompt from this newsletter. The first two produced outputs I would have edited for another twenty minutes. The third produced an output I sent to the client with two small changes. All three took about the same amount of time to run. The difference was entirely in how specifically I told the AI what to look for.

The Prompt That Works
Self-Refine runs in two stages. The first stage produces your initial output. The second stage critiques and revises it.

▼ COPY THIS PROMPT — STAGE 1:
Run your normal task prompt here first.
[Whatever prompt you'd normally use for this task. Role, directive, context, format, examples. Run it. Get your first output. Then move to Stage 2.]
▼ COPY THIS PROMPT — STAGE 2:
Here is a draft I need you to critique and revise:
[Paste Stage 1 output here]
Work through the following critique checklist. For each item, identify every instance of the problem in the draft, explain specifically why it's a problem, and rewrite that passage to fix it.
Critique checklist:
Vague claims — Find every claim that isn't supported by a specific example, number, or detail. Flag it and rewrite it with specificity.
Generic language — Find every phrase that could have been written by anyone about anything. "Great results." "Better outcomes." "Significant improvement." Flag it and replace it with language specific to this topic and audience.
Weak transitions — Find every transition that summarizes the previous point instead of advancing to the next one. Flag it and rewrite it to move the argument forward.
Restatement — Find any conclusion or closing section that repeats what was already said instead of landing with something new. Flag it and rewrite it to end on insight rather than summary.
Passive constructions — Find sentences where the subject is acted upon rather than acting. Flag them and rewrite them in active voice.
After completing the critique, produce a clean revised version of the full draft incorporating every fix.
How To Use This Prompt
Step 1: Run your normal task prompt first. Whatever you'd usually use. Don't change it. Stage 1 is just your existing workflow.
Step 2: Copy the Stage 2 critique prompt exactly as written.
Step 3: Paste your Stage 1 output into the designated spot in Stage 2.
Step 4: Read the critique section before the revised draft. This is important. The critique tells you what the AI found and why it was a problem. Reading it builds your own eye for the same issues. Over time you'll start catching these problems in your original prompts before they reach Stage 2.
Step 5: Take the clean revised draft and do your own final read. You're not editing for the same problems anymore — the AI caught those. You're now reading for anything the checklist didn't cover and for anything specific to your voice or client relationship that only you can judge.
One note on the checklist: The five items above cover the most common problems in AI-generated content. You can add to this list for your specific work. If your client hates bullet points, add that. If your audience needs technical language kept simple, add that. The checklist is a template, not a ceiling.
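If you run this two-stage workflow often, you can chain it in a few lines of code. This is a minimal sketch under stated assumptions, not an official implementation: generate() is a hypothetical placeholder for whatever model call you actually use (OpenAI, Anthropic, a local model), and the checklist string is abbreviated. It's stubbed here only so the structure itself is runnable.

```python
# Sketch of the two-stage Self-Refine workflow.
# ASSUMPTION: generate() is a stand-in for your real model API call;
# replace its body with your provider's SDK call.

CRITIQUE_CHECKLIST = """\
Vague claims -- flag every claim without a specific example, number, or detail; rewrite with specificity.
Generic language -- flag filler phrases; replace with language specific to the topic and audience.
Weak transitions -- flag transitions that summarize instead of advancing; rewrite to move the argument forward.
Restatement -- flag conclusions that repeat earlier points; rewrite to end on insight.
Passive constructions -- flag and rewrite in active voice."""


def generate(prompt: str) -> str:
    """Placeholder model call. Swap in your provider's API here."""
    return f"[model output for prompt starting: {prompt[:30]!r}]"


def self_refine(task_prompt: str) -> str:
    # Stage 1: your normal task prompt, unchanged.
    draft = generate(task_prompt)
    # Stage 2: the critique-and-revise prompt wrapped around the draft.
    stage2_prompt = (
        "Here is a draft I need you to critique and revise:\n\n"
        f"{draft}\n\n"
        "Work through the following critique checklist. For each item, "
        "identify every instance of the problem, explain why it's a "
        "problem, and rewrite that passage to fix it.\n\n"
        f"Critique checklist:\n{CRITIQUE_CHECKLIST}\n\n"
        "After completing the critique, produce a clean revised version "
        "of the full draft incorporating every fix."
    )
    return generate(stage2_prompt)


print(self_refine("Write a product update email for existing customers."))
```

The only design decision that matters here is that Stage 2 is a second, separately instructed call rather than an edit appended to the first one: the model sees its draft as input to critique, not as text to continue.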
Why This Prompt Works
Self-Refine works because critique and generation are different cognitive tasks.
When the AI generates a first draft, it's producing. Moving forward. Filling space. When you ask it to critique that draft against specific criteria, it's evaluating. Comparing what exists against a standard. Those two modes produce different types of attention to the text.
Generation misses things because it's always looking forward. Evaluation catches them because it's looking back with specific questions. The combination — generate, then evaluate against criteria, then revise — produces output that neither pass could produce alone.
The research also found something worth noting about the specificity of the critique. General revision instructions ("improve this") activate shallow editing. Specific critique criteria ("find every vague claim") activate targeted analysis. The more specific the criteria, the more useful the critique. The more useful the critique, the more substantive the revision.
That's why the checklist in Stage 2 is specific. "Vague claims" is specific. "Weak transitions" is specific. "Make it better" is not. Specificity in the critique is what separates a cosmetic revision from a structural one.

What Most People Write Instead
Typical revision prompt: "Can you improve this draft? Make it more engaging and professional."
Why it fails: The AI doesn't know what "engaging" means to you. It doesn't know what "professional" looks like for your audience. So it makes surface changes that feel like improvement without fixing the underlying problems. You still do the same amount of editing. You just have a slightly different draft to start from.
The Self-Refine version: Stage 1 produces the draft. Stage 2 critiques it against five specific criteria, explains every problem it finds, and rewrites each problem passage before producing a clean revised version.
Why it wins: The AI is now doing the quality control work you were doing manually. Not all of it. But enough of it that your editing time drops significantly. You're reviewing and finishing rather than fixing and rebuilding.

Quick Reality Check
The research tested Self-Refine on dialogue generation, code optimization, and mathematical reasoning alongside writing tasks. It improved output quality across all of them. The researchers noted that the technique works even when the model doing the critique is the same model that produced the draft. You don't need a smarter model to catch problems in the output. You need a differently instructed one. Same model. Different instruction. Meaningfully better result.

Two Related Techniques Worth Knowing
Self-Calibration:
Before you trust any AI output on a factual or analytical task, add one follow-up prompt: "How confident are you in this answer and what are the weakest parts of your reasoning?"
The research found this produces genuinely useful uncertainty assessment. The AI identifies the parts of its own output it's least certain about. You now know exactly where to focus your verification effort instead of checking everything equally.
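In code, Self-Calibration is just one extra turn appended to the conversation. A hedged sketch: ask() and self_calibrate() are hypothetical names, and ask() is a stub standing in for your actual model call.

```python
# Sketch of the Self-Calibration follow-up turn.
# ASSUMPTION: ask() is a stand-in for your real model API call.

CALIBRATION_PROMPT = (
    "How confident are you in this answer and what are the "
    "weakest parts of your reasoning?"
)


def ask(prompt: str) -> str:
    """Placeholder model call. Swap in your provider's API here."""
    return f"[model response to prompt starting: {prompt[:25]!r}]"


def self_calibrate(question: str, answer: str) -> str:
    # One follow-up turn: show the model its own answer and ask
    # where the reasoning is weakest.
    followup = (
        f"Question: {question}\n"
        f"Your answer: {answer}\n\n"
        f"{CALIBRATION_PROMPT}"
    )
    return ask(followup)
```

The returned assessment is for you, not the model: it tells you which claims to verify first.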
Reversing Chain-of-Thought:
After getting an output, ask the AI to reconstruct the original problem from the answer it gave. Compare that reconstruction to your actual question. Any gap between them reveals where the AI's interpretation of your task diverged from your actual intention. Fix the gap in your original prompt and run again.
This one sounds strange but works consistently. If the AI can't accurately reconstruct your question from its own answer, the answer probably wasn't answering your actual question.
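The reversal step can be sketched the same way. Again, ask() and reverse_check() are hypothetical illustrative names with a stubbed model call; the final comparison is deliberately left to you, since judging the gap between the two questions is the human part of the technique.

```python
# Sketch of the Reversing Chain-of-Thought check.
# ASSUMPTION: ask() is a stand-in for your real model API call.


def ask(prompt: str) -> str:
    """Placeholder model call. Swap in your provider's API here."""
    return f"[reconstruction based on prompt starting: {prompt[:20]!r}]"


def reverse_check(original_question: str, answer: str) -> tuple[str, str]:
    reconstruction = ask(
        "Here is an answer you produced:\n\n"
        f"{answer}\n\n"
        "Reconstruct, as precisely as you can, the question this answer "
        "was responding to. State only the reconstructed question."
    )
    # Read these two side by side: any gap between them shows where the
    # model's interpretation diverged from your intent.
    return original_question, reconstruction
```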
The Bigger Lesson Here
The principle behind Self-Refine is not complicated.
First drafts have problems. Every first draft. Human or AI. The question is who finds them. If the answer is always you, you're the quality control layer in every workflow. That's slow and it scales badly.
Self-Refine makes the AI the first quality control layer. Not the only one. You still read the output. You still make judgment calls the checklist can't make. But you're making those calls on a draft that's already been through a critique pass, not a raw first draft with all the obvious problems still in it.
That shift — from you doing all the quality control to AI doing the first pass — is worth more time than almost any other technique in this series.

What Changes After Using This
The first time you run Stage 2 on a draft you were about to edit yourself, you'll notice two things.
First, the AI finds problems you would have found. That's the time it saves you. Second, occasionally it finds a problem you might have missed. That's the quality it adds.
After a few weeks of running this structure consistently, you'll stop thinking about prompting as a one-shot process. Every complex output becomes a two-stage task. Generate. Critique. Revise. The habit builds quickly because the results are obvious from the first time you use it.

Try This Right Now
Take the last piece of AI-generated content you were unhappy with. The draft that needed too much editing. The output that was almost right but not quite.
Run Stage 2 on it right now. Paste it into the critique prompt. Work through the checklist.
See what the AI finds. See how the revised version compares to what you would have produced manually.
That comparison is the argument for building this into every workflow where quality matters.

What's Coming Next
Next newsletter we go deeper into Module 5 with Chain-of-Verification — the technique where AI generates a list of questions designed to test whether its own answer is actually correct, answers each one, and uses the results to produce a verified final output.
It's Self-Refine taken further. Instead of critiquing the writing, it's verifying the facts and logic. For any task where a confident wrong answer has real consequences, this is the technique that catches what self-criticism misses.
See you then.

Reply With Your Results
Run the Self-Refine critique prompt on something real this week and reply with what it found.
Tell me which checklist item caught the most problems. Tell me if the revised output was meaningfully different from the original. Send me both versions if you want a second opinion on whether the critique actually improved it.
I read every reply.
— Prompt Guy