Ralph Wiggum for Claude Code is Insane: AI Update #11
Plus: Why OpenAI is betting ads will save the business. Everything you need to know from AI this week.
👋 Hey there, I’m Aakash. In this newsletter, I cover AI, AI PM, and getting a job. This is your weekly AI update. For more: Podcast | Cohort
Annual subscribers get a free year of 9 premium products: Dovetail, Arize, Linear, Descript, Reforge Build, DeepSky, Relay.app, Magic Patterns, and Mobbin (worth $28,336).
Welcome back to the AI Update.
Everyone is going crazy over the new Ralph Wiggum technique for Claude Code.
What the heck is it, and why does it matter? That’s today’s deep dive.
But first, I cover this week’s top AI news. And I end with my appearance on the Convergence podcast.
Your AI agents are answering questions. But are they solving problems?
Companies dropped millions on AI in 2025, but 85% of leaders still can’t prove ROI. Turns out, building agents is the easy part—measuring their impact is where everyone gets stuck.
This guide gives you the KPIs that actually matter for product teams:
Which metrics separate signal from noise
How to prove agents are speeding up workflows
Tips to catch user frustration before it hurts retention
There are a million AI news articles, resources, tools, and fundraises every week. You can’t keep track of everything. Here’s what mattered: the one big story and key news.
OpenAI Is Betting Ads Will Save the Business
Sam Altman said ads were “a last resort.” Fifteen months later, they’re here.
OpenAI dropped three announcements last week: $20B+ in annualized revenue, an $8/month ChatGPT Go plan, and ads coming to the free tier. The timing wasn’t random.
This was coordinated. Let me break down why.
The revenue story first. Sarah Friar, OpenAI’s CFO, released actual numbers. Revenue has tripled every year since 2023: $2B → $6B → $20B+. Compute grew on the same curve: 0.2 gigawatts → 0.6 → 1.9.
Their revenue is literally constrained by how many GPUs they can buy.
The flywheel: More compute → better models → more users → more revenue → more compute.
But they’re almost certainly losing money on every token. Dr. Gingerballs did the math: OpenAI makes about $1.20 per kilowatt-hour of compute. H100s cost $2-7/hour on the open market. Best case, they’re on track to lose $20B this year.
So why launch a cheaper plan?
Enter ChatGPT Go. $8/month gets you 10x the messages, file uploads, and image generation of the free tier. All powered by GPT-5.2 Instant (the smaller, cheaper model). It launched in India in August. Now it’s worldwide.
This is a loss leader. OpenAI is probably losing money on Go subscribers.
But once you’re in the ecosystem, you’re locked in. ChatGPT remembers you, personalizes to you, makes switching painful. And consumers drive enterprise. When you’re familiar with ChatGPT at home, you push for it at work.
Now the ads. Sam Altman said in October 2024 that ads were “a last resort.” Fifteen months later, they’re here.
Starting in the coming weeks, free and Go tier users will see sponsored content. OpenAI’s principles: ads won’t influence responses, conversations stay private from advertisers, Plus/Pro/Business/Enterprise stay ad-free.
Here’s why the math works:
Meta makes ~$58 per user per year from ads alone. OpenAI has 800M+ weekly active users. At 9% of Meta’s ARPU, that’s over $4B incremental. At full parity? North of $46B/year.
And AI sits closer to intent than social feeds do. When someone asks ChatGPT “what laptop should I buy,” they’re ready to buy. That’s worth more than a scroll-past impression. If ChatGPT becomes the default interface for how people research and buy, OpenAI could build one of the greatest ad businesses ever.
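To make the arithmetic concrete, here’s a back-of-envelope sketch using the article’s inputs (Meta’s ~$58 ARPU and OpenAI’s 800M+ weekly actives); the blend rates are illustrative scenarios, not forecasts:

```python
# Back-of-envelope ad revenue scenarios (inputs are the article's figures).
META_ARPU = 58              # Meta's approx. annual ad revenue per user, USD
OPENAI_WAUS = 800_000_000   # OpenAI weekly active users (lower bound)

def ad_revenue(share_of_meta_arpu: float) -> float:
    """Annual ad revenue if OpenAI monetizes at a fraction of Meta's ARPU."""
    return OPENAI_WAUS * META_ARPU * share_of_meta_arpu

print(f"At 9% of Meta ARPU: ${ad_revenue(0.09) / 1e9:.1f}B/year")
print(f"At full parity:     ${ad_revenue(1.0) / 1e9:.1f}B/year")
```

Even the 9% scenario would be a material new revenue line next to $20B in annualized subscription revenue.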
Why release all three (almost) together? If OpenAI announced ads alone, the narrative would be “they’re desperate.” By pairing it with proof of $20B revenue and continued 3x growth, the narrative becomes “they’re expanding.”
The Google contrast. Demis Hassabis told Alex Heath that Google has “no plans” for ads in Gemini. His quote: “It’s interesting they’ve gone for that so early.”
But Google can afford that stance. Their ad business made $74B last quarter. OpenAI made $5B total in Q4 and lost money on every dollar.
Google keeps Gemini ad-free because the chatbot protects search ads. Every user who stays in Google’s ecosystem eventually sees ads somewhere. That’s their moat.
OpenAI doesn’t have that luxury. They’re burning $9B annually on compute, committing $1.4 trillion in infrastructure through 2033, projecting $74B in operating losses by 2028. They need revenue streams that don’t require raising another $100B.
Why this matters: OpenAI is playing for lock-in. Every piece of this is about making ChatGPT the default AI for a billion people before Google wakes up.
If you’re building on their platform, expect pricing to get more aggressive. If you’re competing with them, expect bundling.
The bet: capture the market now, monetize it later.
News
GPT-5.2 Pro solved a decades-old math problem. Neel Somani used it to crack Erdős problem #281, which had no prior solution. Terence Tao called it “perhaps the most unambiguous instance” of AI solving an open problem. It’s a clear sign of AI’s potential to contribute to original research.
Anthropic published Claude’s new constitution: the full document used directly in training. It’s the most detailed public look at how an AI company shapes model personality.
Anna’s Archive has become the de facto shadow data broker for the entire AI industry.
Turmoil at Thinking Machines ended with its cofounder joining OpenAI.
xAI Colossus 2 became the first gigawatt-scale AI training cluster online.
Claude now connects to your health data.
Kling reached $240M ARR in 18 months.
xAI open sourced the new X algorithm.
Resources
7 NotebookLM Use Cases to try
The shorthand guide to everything Claude Code
10 metrics to measure and improve Agent performance
New Tools
Blink: Full stack agentic vibe coding tool - hit #1 on PH
Noodle Seed: build no-code AI apps that are discoverable inside ChatGPT and other AI assistants - hit #1 on PH
Market
Humans& raised $480 million seed at $4.48 billion valuation. SV Angel led with participation from Nvidia, Jeff Bezos, and GV. The three-month-old startup was founded by researchers from Anthropic, xAI, and Google.
Baseten raised $300 million Series E at $5 billion valuation, more than 2x from four months ago. IVP and CapitalG led with Nvidia contributing $150 million. The company aims to become “AWS for inference.”
OpenEvidence raised $250 million Series D at $12 billion valuation, doubling from October. Thrive Capital and DST led. The “ChatGPT for doctors” startup is now used by 40% of US physicians and crossed $100 million in revenue.
Upscale AI raised $200 million Series A to tackle data center networking bottlenecks for AI workloads. Tiger Global, Premji Invest, and Xora Innovation led.
Deepgram raised $130 million Series C to scale real-time voice AI infrastructure.
OpenAI is raising a fresh $50B from Middle East investors at a $750-830B valuation.
And now on to today’s deep dive:
Ralph Wiggum: How to Ship Code While You Sleep
Someone spent $297 on API costs and delivered work that would have cost $50,000 to outsource. Here’s the exact system.
At a YC hackathon, teams shipped 6 working repositories overnight.
Geoffrey Huntley, who invented this system, built an entire programming language over 3 months while barely touching his keyboard.
The technique is called Ralph. It’s a bash loop that runs an AI coding agent repeatedly until a task is done.
while :; do cat PROMPT.md | claude-code ; done
That’s it. While you sleep, while you eat dinner, the loop keeps going. It picks up a task, builds it, checks if it works, commits, picks the next task. When you wake up, features are finished.
Let me show you exactly how this works:
Why normal AI coding breaks
You open Claude Code with an idea. 45 minutes later you’re fixing the same bug for the third time. The AI forgot what you were building. The context got polluted.
The problem is context rot.
LLMs are autoregressive. They predict the next token based on everything before it. As conversations grow, the model loses track of instructions buried in history. Quality degrades even when you’re under the context limit.
The typical fix is compaction. The agent summarizes history and uses that as the new context. But compaction is lossy. That instruction you gave three prompts ago? Gone.
Ralph throws compaction away entirely. Each loop iteration starts fresh. Clean context. No accumulated confusion.
How it actually works
Memory lives in three files:
Git commits = the code itself
progress.txt = learnings from this sprint
prd.json = what’s done, what’s next
The AI reads what happened, picks the next task, implements it, commits, logs learnings. Loop runs again.
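The “picks the next task” step is simple enough to express in code. A minimal sketch in Python (the prd.json schema follows the article’s example; the function name is mine):

```python
import json

def next_task(prd_path: str = "prd.json"):
    """Return the first story whose "passes" flag is still false.

    Stories are assumed to be ordered by dependency/priority, as the
    article recommends, so "first unfinished" means "next task".
    """
    with open(prd_path) as f:
        prd = json.load(f)
    for story in prd["stories"]:
        if not story["passes"]:
            return story
    return None  # all done: the agent outputs <promise>complete</promise>
```

Because the selection is driven entirely by prd.json, a fresh session with zero conversation history always knows exactly what to do next.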
Here’s a real prompt.md:
You are an autonomous coding agent.
1. Read prd.json — find the highest priority task where "passes": false
2. Implement ONLY that task
3. Run: pnpm type-check && pnpm test
4. If tests pass, commit the change
5. Update prd.json — set "passes": true
6. Append learnings to progress.txt
7. If all tasks complete, output: <promise>complete</promise>
Only work on ONE task per iteration.
Here’s prd.json:
{
"branch": "feature/priority-filter",
"stories": [
{
"id": "1",
"title": "Add priority column to database",
"acceptance_criteria": [
"Column exists in tasks table",
"Default value is 'medium'",
"Accepts: high, medium, low"
],
"passes": false
},
{
"id": "2",
"title": "Add filter dropdown to UI",
"acceptance_criteria": [
"Dropdown shows: All, High, Medium, Low",
"Selecting option filters list immediately",
"Default selection is 'All'"
],
"passes": false
}
]
}
Here’s progress.txt after a few iterations:
=== Iteration 1 ===
Thread: amp-thread-abc123
Implemented: Priority column migration
Files changed: prisma/schema.prisma, src/db/migrations/001_priority.sql
Learnings: Use snake_case for DB columns. Prisma generates types automatically.
=== Iteration 2 ===
Thread: amp-thread-def456
Implemented: Filter dropdown component
Files changed: src/components/FilterDropdown.tsx, src/hooks/useFilters.ts
Learnings: Filter state lives in useFilters hook. Don't put it in component state.
The bash script you can steal
Here’s ralph.sh:
#!/bin/bash
set -e
MAX_ITERATIONS=${1:-10}
PROMPT_FILE="./prompt.md"
PRD_FILE="./prd.json"
for ((i=1; i<=MAX_ITERATIONS; i++)); do
echo "=== Ralph iteration $i of $MAX_ITERATIONS ==="
OUTPUT=$(cat "$PROMPT_FILE" | claude --dangerously-skip-permissions -p)
if echo "$OUTPUT" | grep -q "<promise>complete</promise>"; then
echo "All tasks complete!"
exit 0
fi
done
echo "Hit max iterations. Check progress.txt for status."
Run it: ./ralph.sh 14
It loops up to 14 times, stopping early if the agent outputs the completion promise.
The 3-step workflow
Step 1: Describe what you want
Talk for 2-3 minutes. Voice-to-text works great; I use the built-in dictation on Mac (Fn key twice) or the iOS app.
“I want users to filter tasks by priority. High, medium, low. A dropdown with all options. Selecting one filters the list immediately.”
Then: “Convert this into a PRD with atomic user stories. Each story should be completable in one context window. Include binary acceptance criteria.”
Step 2: Generate the prd.json
Use this prompt to convert your PRD to the Ralph format:
Convert this PRD to prd.json for the Ralph agent system.
Rules:
- Each story must be completable in ONE iteration
- Acceptance criteria must be verifiable (pass/fail, not "looks good")
- Order by dependency (foundational tasks first)
- Set all "passes" to false
Output format:
{
"branch": "feature/name",
"stories": [
{
"id": "1",
"title": "Short description",
"acceptance_criteria": ["Testable condition 1", "Testable condition 2"],
"passes": false
}
]
}
Step 3: Run Ralph
./ralph.sh 10
Walk away. Check back in an hour.
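One optional guardrail before walking away: sanity-check prd.json so a malformed file doesn’t burn iterations. A minimal sketch (schema per the article’s example; the specific checks are my own suggestions):

```python
import json

def validate_prd(prd: dict) -> list[str]:
    """Return a list of problems; an empty list means the PRD looks sane."""
    problems = []
    if "branch" not in prd:
        problems.append("missing 'branch'")
    for story in prd.get("stories", []):
        sid = story.get("id", "?")
        if not story.get("acceptance_criteria"):
            problems.append(f"story {sid}: no acceptance criteria")
        if story.get("passes") is not False:
            problems.append(f"story {sid}: 'passes' should start as false")
    return problems

if __name__ == "__main__":
    with open("prd.json") as f:
        for problem in validate_prd(json.load(f)):
            print(problem)
```

Run it once before kicking off ralph.sh; silence means go.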
Why the loop must run outside the agent
The Claude Code plugin version of Ralph runs inside the session. The loop is controlled by the agent. This defeats the purpose.
Real Ralph runs outside. The bash script kills the session and starts fresh each time. The agent can’t decide it’s “done.” Only the prd.json decides that.
Think of it as layers:
┌─────────────────────────────┐
│ BASH LOOP (ralph.sh) │ ← Controls everything
│ ┌───────────────────────┐ │
│ │ CLAUDE CODE SESSION │ │ ← Fresh each iteration
│ │ - Reads prd.json │ │
│ │ - Does ONE task │ │
│ │ - Commits │ │
│ │ - Dies │ │
│ └───────────────────────┘ │
└─────────────────────────────┘
The bash loop is the boss. Claude Code is the worker that gets reset every shift.
Feedback loops are everything
The agent needs to know if code works. Without feedback, it marks things complete that aren’t.
Add this to your prompt:
Before marking a task complete:
1. Run: pnpm type-check (must pass)
2. Run: pnpm test (must pass)
3. If UI task: use browser tool to verify visually
4. Only then update prd.json
For frontend work, hook up Playwright’s MCP server so the agent can actually see the UI:
Use the browser tool to:
- Navigate to localhost:3000
- Verify the dropdown exists
- Click each option
- Confirm the list filters correctly
The math
10 iterations ≈ $30 in API costs.
Senior developer ≈ $500/day fully loaded.
Ralph gets you 90% there. You spend an hour on cleanup. That’s an 8-hour workday compressed to 1 hour + $30.
Geoffrey Huntley calculated unit economics: $10.42/hour in raw API costs. Add your review and debugging time and the true rate is higher. Still dramatically cheaper than outsourcing.
(These numbers assume Claude Sonnet. Opus costs roughly 5x more per token. For most Ralph tasks, Sonnet handles implementation fine. Save Opus for complex architectural decisions or debugging sessions.)
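The comparison is simple enough to sanity-check yourself. A quick sketch using the article’s figures (real API pricing varies by model and token mix, so treat the output as illustrative):

```python
# Rough cost comparison, inputs from the article.
ITERATIONS = 10
COST_PER_ITERATION = 3.0      # ~$30 for 10 iterations
ralph_api_cost = ITERATIONS * COST_PER_ITERATION

DEV_DAY_COST = 500.0          # senior developer, fully loaded, per day
CLEANUP_HOURS = 1             # your review/fix time after the run
dev_hourly = DEV_DAY_COST / 8

total_ralph = ralph_api_cost + CLEANUP_HOURS * dev_hourly
print(f"Ralph run + cleanup: ${total_ralph:.2f} vs ${DEV_DAY_COST:.2f} for a dev-day")
```

Even after counting your cleanup hour at a senior rate, the run comes in well under a fifth of a fully loaded dev-day.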
Common mistakes I’ve seen
Tasks too big. If it can’t finish in one context window, it fails halfway and produces garbage. Break it down.
Vague criteria. “Make it look good” = nothing. “Button is blue #3B82F6, 44px tall, centered” = something.
Using the plugin. The Claude Code plugin runs Ralph inside the session. Context still accumulates. Use the bash script.
No tests. Without type checks and unit tests, the agent marks broken code as complete.
Rushing the PRD. An hour on requirements saves 10 hours of fixing.
The shift
You stop writing code. You write requirements, define acceptance criteria, review output.
Product designer, not engineer.
Understanding systems and requirements matters more than typing speed now.
Get started now
Ralph is free and open source:
Tell your agent to set it up:
Look at github.com/anthropics/claude-code/blob/main/plugins/ralph-wiggum/ and set up the Ralph loop in my project.
Create prompt.md, prd.json template, progress.txt, and ralph.sh.
First time: 30 minutes to understand. Second time: 10 minutes to start.
Three months from now this will be in every YouTube tutorial. Start before it’s everywhere.
That’s it for today’s deep dive. Finally, onto insights from a podcast I did:
What’s Killing AI Startups (and Apollo’s PLG Playbook)
I joined the Convergence Podcast. Here’s a breakdown of some key takeaways:
On Apollo’s PLG playbook. Most B2B companies start with sales, then bolt on self-serve later. We went the opposite direction.
We didn’t chase revenue early. We chased usage. Give away enough value that users can’t imagine going back to the old way.
This only works if your product solves a problem people feel daily. If users need reminders your product exists, PLG won’t save you.
On AI startup pitfalls. I’ve watched dozens of companies stumble on these same things. Lab teams optimizing for benchmarks while product teams optimize for users. Chasing every new model release instead of shipping. Moving fast without guardrails. Demo magic that breaks in production. Raising too much capital too early.
The companies winning? They pick one problem and solve it completely.
On remote interviews. In-person, you read the room. Remote, you perform for a camera. Different game.
Do mock interviews on video. Record yourself. It’s painful to watch but the feedback is instant.
That’s all for today. See you next week,
Aakash
P.S. You can pick and choose to only receive the AI update, or only receive Product Growth emails, or podcast emails here.