Ralph Wiggum for Claude Code is Insane: AI Update #11
Plus: Why OpenAI is betting ads will save the business. Everything you need to know from AI this week.
👋 Hey there, I’m Aakash. In this newsletter, I cover AI, AI PM, and getting a job. This is your weekly AI update. For more: Podcast | Cohort
Annual subscribers get a free year of 9 premium products: Dovetail, Arize, Linear, Descript, Reforge Build, DeepSky, Relay.app, Magic Patterns, and Mobbin (worth $28,336).
Welcome back to the AI Update.
Everyone is going crazy over the new Ralph Wiggum technique for Claude Code.
What the heck is it, and why does it matter? That’s today’s deep dive.
But first, I cover this week’s top AI news. And I end with my appearance on the Convergence podcast.
Your AI agents are answering questions. But are they solving problems?
Companies dropped millions on AI in 2025, but 85% of leaders still can’t prove ROI. Turns out, building agents is the easy part—measuring their impact is where everyone gets stuck.
This guide gives you the KPIs that actually matter for product teams:
Which metrics separate signal from noise
How to prove agents are speeding up workflows
Tips to catch user frustration before it hurts retention
There are a million AI news articles, resources, tools, and fundraises every week. You can’t keep track of everything. Here’s what mattered: the one big story and key news.
OpenAI Is Betting Ads Will Save the Business
Sam Altman said ads were “a last resort.” Fifteen months later, they’re here.
OpenAI dropped three announcements last week: $20B+ in annualized revenue, an $8/month ChatGPT Go plan, and ads coming to the free tier. The timing wasn’t random.
This was coordinated. Let me break down why.
The revenue story first. Sarah Friar, OpenAI’s CFO, released actual numbers. Revenue has tripled every year since 2023: $2B → $6B → $20B+. Compute grew on the same curve: 0.2 gigawatts → 0.6 → 1.9.
Their revenue is literally constrained by how many GPUs they can buy.
The flywheel: More compute → better models → more users → more revenue → more compute.
But they’re almost certainly losing money on every token. Dr. Gingerballs did the math: OpenAI makes about $1.20 per kilowatt-hour of compute. H100s cost $2-7/hour on the open market. Best case, they’re on track to lose $20B this year.
So why launch a cheaper plan?
Enter ChatGPT Go. $8/month gets you 10x the messages, file uploads, and image generation of the free tier. All powered by GPT-5.2 Instant (the smaller, cheaper model). It launched in India in August. Now it’s worldwide.
This is a loss leader. OpenAI is probably losing money on Go subscribers.
But once you’re in the ecosystem, you’re locked in. ChatGPT remembers you, personalizes to you, makes switching painful. And consumers drive enterprise. When you’re familiar with ChatGPT at home, you push for it at work.
Now the ads. Sam Altman said in October 2024 that ads were “a last resort.” Fifteen months later, they’re here.
Starting in the coming weeks, free and Go tier users will see sponsored content. OpenAI’s principles: ads won’t influence responses, conversations stay private from advertisers, Plus/Pro/Business/Enterprise stay ad-free.
Here’s why the math works:
Meta makes ~$58 per user per year from ads alone. OpenAI has 800M+ weekly active users. At 9% of Meta’s ARPU, that’s over $4B incremental. At full parity? North of $46B/year.
And AI sits closer to intent than social feeds do. When someone asks ChatGPT “what laptop should I buy,” they’re ready to buy. That’s worth more than a scroll-past impression. If ChatGPT becomes the default interface for how people research and buy, OpenAI could build one of the greatest ad businesses ever.
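To make the arithmetic concrete, here’s a back-of-envelope sketch using the article’s inputs (Meta’s ~$58 ARPU and OpenAI’s 800M+ weekly actives); the blend rates are illustrative scenarios, not forecasts:

```python
# Back-of-envelope ad revenue scenarios (inputs are the article's figures).
META_ARPU = 58              # Meta's approx. annual ad revenue per user, USD
OPENAI_WAUS = 800_000_000   # OpenAI weekly active users (lower bound)

def ad_revenue(share_of_meta_arpu: float) -> float:
    """Annual ad revenue if OpenAI monetizes at a fraction of Meta's ARPU."""
    return OPENAI_WAUS * META_ARPU * share_of_meta_arpu

print(f"At 9% of Meta ARPU: ${ad_revenue(0.09) / 1e9:.1f}B/year")
print(f"At full parity:     ${ad_revenue(1.0) / 1e9:.1f}B/year")
```

Even the 9% scenario would be a material new revenue line next to $20B in annualized subscription revenue.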
Why release all three (almost) together? If OpenAI announced ads alone, the narrative would be “they’re desperate.” By pairing it with proof of $20B revenue and continued 3x growth, the narrative becomes “they’re expanding.”
The Google contrast. Demis Hassabis told Alex Heath that Google has “no plans” for ads in Gemini. His quote: “It’s interesting they’ve gone for that so early.”
But Google can afford that stance. Their ad business made $74B last quarter. OpenAI made $5B total in Q4 and lost money on every dollar.
Google keeps Gemini ad-free because the chatbot protects search ads. Every user who stays in Google’s ecosystem eventually sees ads somewhere. That’s their moat.
OpenAI doesn’t have that luxury. They’re burning $9B annually on compute, committing $1.4 trillion in infrastructure through 2033, projecting $74B in operating losses by 2028. They need revenue streams that don’t require raising another $100B.
Why this matters: OpenAI is playing for lock-in. Every piece of this is about making ChatGPT the default AI for a billion people before Google wakes up.
If you’re building on their platform, expect pricing to get more aggressive. If you’re competing with them, expect bundling.
The bet: capture the market now, monetize it later.
News
GPT-5.2 Pro solved a decades-old math problem. Neel Somani used it to crack Erdős problem #281, which had no prior solution. Terence Tao called it “perhaps the most unambiguous instance” of AI solving an open problem. It’s a clear sign of AI’s potential to contribute to original research.
Anthropic published Claude’s new constitution: the full document used directly in training. It’s the most detailed public look at how an AI company shapes model personality.
Anna’s Archive has become the de facto shadow data broker for the entire AI industry.
Turmoil at Thinking Machines ended with its cofounder joining OpenAI.
xAI Colossus 2 became the first gigawatt-scale AI training cluster online.
Claude now connects to your health data.
Kling reached $240M ARR in 18 months.
xAI open sourced the new X algorithm.
Resources
7 NotebookLM Use Cases to try
The shorthand guide to everything Claude Code
10 metrics to measure and improve Agent performance
New Tools
Blink: Full stack agentic vibe coding tool - hit #1 on PH
Noodle Seed: build no-code AI apps that are discoverable inside ChatGPT and other AI assistants - hit #1 on PH
Market
Humans& raised $480 million seed at $4.48 billion valuation. SV Angel led with participation from Nvidia, Jeff Bezos, and GV. The three-month-old startup was founded by researchers from Anthropic, xAI, and Google.
Baseten raised $300 million Series E at $5 billion valuation, more than 2x from four months ago. IVP and CapitalG led with Nvidia contributing $150 million. The company aims to become “AWS for inference.”
OpenEvidence raised $250 million Series D at $12 billion valuation, doubling from October. Thrive Capital and DST led. The “ChatGPT for doctors” startup is now used by 40% of US physicians and crossed $100 million in revenue.
Upscale AI raised $200 million Series A to tackle data center networking bottlenecks for AI workloads. Tiger Global, Premji Invest, and Xora Innovation led.
Deepgram raised $130 million Series C to scale real-time voice AI infrastructure.
OpenAI is raising a fresh $50B from Middle East investors at a $750-830B valuation.
And now on to today’s deep dive:
Ralph Wiggum: How to Ship Code While You Sleep
Someone spent $297 on API costs and delivered work that would have cost $50,000 to outsource. Here’s the exact system.
At a YC hackathon, teams shipped 6 working repositories overnight.
Geoffrey Huntley, who invented this system, built an entire programming language over 3 months while barely touching his keyboard.
The technique is called Ralph. It’s a bash loop that runs an AI coding agent repeatedly until a task is done.
while :; do cat PROMPT.md | claude-code ; done
That’s it. While you sleep, while you eat dinner, the loop keeps going. It picks up a task, builds it, checks if it works, commits, picks the next task. When you wake up, features are finished.
Let me show you exactly how this works:
Why normal AI coding breaks
You open Claude Code with an idea. 45 minutes later you’re fixing the same bug for the third time. The AI forgot what you were building. The context got polluted.
The problem is context rot.
LLMs are autoregressive. They predict the next token based on everything before it. As conversations grow, the model loses track of instructions buried in history. Quality degrades even when you’re under the context limit.
The typical fix is compaction. The agent summarizes history and uses that as the new context. But compaction is lossy. That instruction you gave three prompts ago? Gone.
Ralph throws compaction away entirely. Each loop iteration starts fresh. Clean context. No accumulated confusion.
How it actually works
Memory lives in three files:
Git commits = the code itself
progress.txt = learnings from this sprint
prd.json = what’s done, what’s next
The AI reads what happened, picks the next task, implements it, commits, logs learnings. Loop runs again.
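The “picks the next task” step is simple enough to express in code. A minimal sketch in Python (the prd.json schema follows the article’s example; the function name is mine):

```python
import json

def next_task(prd_path: str = "prd.json"):
    """Return the first story whose "passes" flag is still false.

    Stories are assumed to be ordered by dependency/priority, as the
    article recommends, so "first unfinished" means "next task".
    """
    with open(prd_path) as f:
        prd = json.load(f)
    for story in prd["stories"]:
        if not story["passes"]:
            return story
    return None  # all done: the agent outputs <promise>complete</promise>
```

Because the selection is driven entirely by prd.json, a fresh session with zero conversation history always knows exactly what to do next.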
Here’s a real prompt.md:
You are an autonomous coding agent.
1. Read prd.json — find the highest priority task where "passes": false
2. Implement ONLY that task
3. Run: pnpm type-check && pnpm test
4. If tests pass, commit the change
5. Update prd.json — set "passes": true
6. Append learnings to progress.txt
7. If all tasks complete, output: <promise>complete</promise>
Only work on ONE task per iteration.
Here’s prd.json:
{
"branch": "feature/priority-filter",
"stories": [
{
"id": "1",
"title": "Add priority column to database",
"acceptance_criteria": [
"Column exists in tasks table",
"Default value is 'medium'",
"Accepts: high, medium, low"
],
"passes": false
},
{
"id": "2",
"title": "Add filter dropdown to UI",
"acceptance_criteria": [
"Dropdown shows: All, High, Medium, Low",
"Selecting option filters list immediately",
"Default selection is 'All'"
],
"passes": false
}
]
}
Here’s progress.txt after a few iterations:
=== Iteration 1 ===
Thread: amp-thread-abc123
Implemented: Priority column migration
Files changed: prisma/schema.prisma, src/db/migrations/001_priority.sql
Learnings: Use snake_case for DB columns. Prisma generates types automatically.
=== Iteration 2 ===
Thread: amp-thread-def456
Implemented: Filter dropdown component
Files changed: src/components/FilterDropdown.tsx, src/hooks/useFilters.ts
Learnings: Filter state lives in useFilters hook. Don't put it in component state.
The bash script you can steal
Here’s ralph.sh:
#!/bin/bash
set -e
MAX_ITERATIONS=${1:-10}
PROMPT_FILE="./prompt.md"
PRD_FILE="./prd.json"
for ((i=1; i<=MAX_ITERATIONS; i++)); do
echo "=== Ralph iteration $i of $MAX_ITERATIONS ==="
OUTPUT=$(cat "$PROMPT_FILE" | claude --dangerously-skip-permissions -p)
if echo "$OUTPUT" | grep -q "<promise>complete</promise>"; then
echo "All tasks complete!"
exit 0
fi
done
echo "Hit max iterations. Check progress.txt for status."
Run it: ./ralph.sh 14
It loops up to 14 times, stopping early if the agent outputs the completion promise.
The 3-step workflow
Step 1: Describe what you want
Talk for 2-3 minutes. Voice-to-text works great; I use the built-in dictation on Mac (Fn key twice) or the iOS app.
“I want users to filter tasks by priority. High, medium, low. A dropdown with all options. Selecting one filters the list immediately.”
Then: “Convert this into a PRD with atomic user stories. Each story should be completable in one context window. Include binary acceptance criteria.”
Step 2: Generate the prd.json
Use this prompt to convert your PRD to the Ralph format:
Convert this PRD to prd.json for the Ralph agent system.
Rules:
- Each story must be completable in ONE iteration
- Acceptance criteria must be verifiable (pass/fail, not "looks good")
- Order by dependency (foundational tasks first)
- Set all "passes" to false
Output format:
{
"branch": "feature/name",
"stories": [
{
"id": "1",
"title": "Short description",
"acceptance_criteria": ["Testable condition 1", "Testable condition 2"],
"passes": false
}
]
}
Step 3: Run Ralph
./ralph.sh 10
Walk away. Check back in an hour.
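One optional guardrail before walking away: sanity-check prd.json so a malformed file doesn’t burn iterations. A minimal sketch (schema per the article’s example; the specific checks are my own suggestions):

```python
import json

def validate_prd(prd: dict) -> list[str]:
    """Return a list of problems; an empty list means the PRD looks sane."""
    problems = []
    if "branch" not in prd:
        problems.append("missing 'branch'")
    for story in prd.get("stories", []):
        sid = story.get("id", "?")
        if not story.get("acceptance_criteria"):
            problems.append(f"story {sid}: no acceptance criteria")
        if story.get("passes") is not False:
            problems.append(f"story {sid}: 'passes' should start as false")
    return problems

if __name__ == "__main__":
    with open("prd.json") as f:
        for problem in validate_prd(json.load(f)):
            print(problem)
```

Run it once before kicking off ralph.sh; silence means go.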
Why the loop must run outside the agent
The Claude Code plugin version of Ralph runs inside the session. The loop is controlled by the agent. This defeats the purpose.
Real Ralph runs outside. The bash script kills the session and starts fresh each time. The agent can’t decide it’s “done.” Only the prd.json decides that.
Think of it as layers:
┌─────────────────────────────┐
│ BASH LOOP (ralph.sh) │ ← Controls everything
│ ┌───────────────────────┐ │
│ │ CLAUDE CODE SESSION │ │ ← Fresh each iteration
│ │ - Reads prd.json │ │
│ │ - Does ONE task │ │
│ │ - Commits │ │
│ │ - Dies │ │
│ └───────────────────────┘ │
└─────────────────────────────┘
The bash loop is the boss. Claude Code is the worker that gets reset every shift.
Feedback loops are everything
The agent needs to know if code works. Without feedback, it marks things complete that aren’t.
Add this to your prompt:
Before marking a task complete:
1. Run: pnpm type-check (must pass)
2. Run: pnpm test (must pass)
3. If UI task: use browser tool to verify visually
4. Only then update prd.json
For frontend work, hook up Playwright’s MCP server so the agent can actually see the UI:
Use the browser tool to:
- Navigate to localhost:3000
- Verify the dropdown exists
- Click each option
- Confirm the list filters correctly
The math
10 iterations ≈ $30 in API costs.
Senior developer ≈ $500/day fully loaded.
Ralph gets you 90% there. You spend an hour on cleanup. That’s an 8-hour workday compressed to 1 hour + $30.
Geoffrey Huntley calculated unit economics: $10.42/hour in raw API costs. Add your review and debugging time and the true rate is higher. Still dramatically cheaper than outsourcing.
(These numbers assume Claude Sonnet. Opus costs roughly 5x more per token. For most Ralph tasks, Sonnet handles implementation fine. Save Opus for complex architectural decisions or debugging sessions.)
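The comparison is simple enough to sanity-check yourself. A quick sketch using the article’s figures (real API pricing varies by model and token mix, so treat the output as illustrative):

```python
# Rough cost comparison, inputs from the article.
ITERATIONS = 10
COST_PER_ITERATION = 3.0      # ~$30 for 10 iterations
ralph_api_cost = ITERATIONS * COST_PER_ITERATION

DEV_DAY_COST = 500.0          # senior developer, fully loaded, per day
CLEANUP_HOURS = 1             # your review/fix time after the run
dev_hourly = DEV_DAY_COST / 8

total_ralph = ralph_api_cost + CLEANUP_HOURS * dev_hourly
print(f"Ralph run + cleanup: ${total_ralph:.2f} vs ${DEV_DAY_COST:.2f} for a dev-day")
```

Even after counting your cleanup hour at a senior rate, the run comes in well under a fifth of a fully loaded dev-day.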
Common mistakes I’ve seen
Tasks too big. If it can’t finish in one context window, it fails halfway and produces garbage. Break it down.
Vague criteria. “Make it look good” = nothing. “Button is blue #3B82F6, 44px tall, centered” = something.
Using the plugin. The Claude Code plugin runs Ralph inside the session. Context still accumulates. Use the bash script.
No tests. Without type checks and unit tests, the agent marks broken code as complete.
Rushing the PRD. An hour on requirements saves 10 hours of fixing.
The shift
You stop writing code. You write requirements, define acceptance criteria, review output.
Product designer, not engineer.
Understanding systems and requirements matters more than typing speed now.
Get started now
Ralph is free and open source:
Tell your agent to set it up:
Look at github.com/anthropics/claude-code/blob/main/plugins/ralph-wiggum/ and set up the Ralph loop in my project.
Create prompt.md, prd.json template, progress.txt, and ralph.sh.
First time: 30 minutes to understand. Second time: 10 minutes to start.
Three months from now this will be in every YouTube tutorial. Start before it’s everywhere.
That’s it for today’s deep dive. Finally, onto insights from a podcast I did:
What’s Killing AI Startups (and Apollo’s PLG Playbook)
I joined the Convergence Podcast. Here’s a breakdown of some key takeaways:
On Apollo’s PLG playbook. Most B2B companies start with sales, then bolt on self-serve later. We went the opposite direction.
We didn’t chase revenue early. We chased usage. Give away enough value that users can’t imagine going back to the old way.
This only works if your product solves a problem people feel daily. If users need reminders your product exists, PLG won’t save you.
On AI startup pitfalls. I’ve watched dozens of companies stumble on these same things. Lab teams optimizing for benchmarks while product teams optimize for users. Chasing every new model release instead of shipping. Moving fast without guardrails. Demo magic that breaks in production. Raising too much capital too early.
The companies winning? They pick one problem and solve it completely.
On remote interviews. In-person, you read the room. Remote, you perform for a camera. Different game.
Do mock interviews on video. Record yourself. It’s painful to watch but the feedback is instant.
That’s all for today. See you next week,
Aakash
P.S. You can pick and choose to only receive the AI update, or only receive Product Growth emails, or podcast emails here.