From Vibe Coding to Agentic Engineering: Andrej Karpathy on the Evolution of AI-Assisted Programming

Andrej Karpathy experienced a stark transition in December 2023. The researcher, who helped build modern AI, suddenly found himself “vibe coding”: trusting AI agents to generate increasingly large chunks of code without correcting them. The shift marked more than faster programming; it signaled the emergence of a new computing paradigm.

The Three Eras of Software

Karpathy frames this evolution through three distinct paradigms:

Software 1.0: Explicit programming with hand-written code
Software 2.0: Programming through datasets and neural network training
Software 3.0: Programming through prompts and context windows

In Software 3.0, the LLM becomes a programmable computer. Your context window serves as your primary interface to an interpreter that performs computation across digital information space.
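
The contrast between Software 1.0 and Software 3.0 can be sketched on a toy sentiment task. This is an illustrative sketch only: the function names, word lists, and prompt wording are assumptions made for the example, and no real model is called. The Software 3.0 half stops at building the text that would be placed in a context window.

```python
# Hypothetical word lists for the hand-written rule set.
POSITIVE = {"love", "great", "excellent"}
NEGATIVE = {"hate", "awful", "terrible"}


def classify_v1(text: str) -> str:
    """Software 1.0: behavior is spelled out in hand-written rules."""
    words = set(text.lower().split())
    if words & POSITIVE:
        return "positive"
    if words & NEGATIVE:
        return "negative"
    return "neutral"


def build_prompt_v3(text: str) -> str:
    """Software 3.0: the 'program' is the text placed in the context window."""
    return (
        "Classify the sentiment of the review as positive, negative, or neutral.\n"
        "Review: I love this phone\nSentiment: positive\n"
        "Review: The battery is awful\nSentiment: negative\n"
        f"Review: {text}\nSentiment:"
    )


label = classify_v1("I love this menu")
prompt = build_prompt_v3("I love this menu")
```

In the first function, changing behavior means editing code. In the second, it means editing the instructions and examples in the prompt, which is the sense in which the context window acts as the programming interface.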

Consider OpenClaw’s installation process. Traditional software would require complex shell scripts targeting multiple platforms. Instead, OpenClaw provides text instructions you copy-paste to your agent. The agent intelligently adapts to your environment, debugs issues, and completes installation—no precise specification required.

When Apps Become Obsolete

Karpathy’s MenuGen project illustrates this paradigm shift dramatically. He built a full application that photographs restaurant menus, extracts text via OCR, generates food images, and re-renders enhanced menus. The Software 3.0 version? Take a photo, send it to Gemini with the prompt “use NanoBanana to overlay pictures onto the menu.” The AI returns the exact menu photo with food images rendered directly into the pixels.

“All of my MenuGen is spurious,” Karpathy realized. “That app shouldn’t exist.”

This represents more than efficiency gains. Software 3.0 enables entirely new capabilities that couldn’t exist before—like automatically generating organizational knowledge bases from unstructured documents or creating intelligent interfaces that adapt in real-time.

The Verifiability Advantage

AI automates fastest in domains where outputs can be verified. This explains why models excel at math and coding while struggling with subjective tasks. During training, these models receive verification rewards in reinforcement learning environments, creating “jagged” intelligence that peaks in verifiable domains.
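
To make “verifiable” concrete, here is a minimal sketch of a pass/fail reward for a toy coding task. It assumes the task is “return the sum of a list of integers” and that a candidate solution is an ordinary Python function; the names and test cases are hypothetical and do not describe any lab's actual training setup.

```python
from typing import Callable, Iterable, List, Tuple

# Each case pairs an input list with the expected sum.
TestCase = Tuple[List[int], int]


def verifiable_reward(candidate: Callable[[List[int]], int],
                      cases: Iterable[TestCase]) -> float:
    """Return 1.0 only if the candidate passes every case, else 0.0."""
    for args, expected in cases:
        try:
            if candidate(args) != expected:
                return 0.0
        except Exception:
            # A crashing candidate earns no reward.
            return 0.0
    return 1.0


CASES: List[TestCase] = [([1, 2, 3], 6), ([], 0), ([-1, 1], 0)]


def good(xs: List[int]) -> int:
    return sum(xs)


def buggy(xs: List[int]) -> int:
    return sum(xs[1:])  # bug: silently drops the first element


reward_good = verifiable_reward(good, CASES)
reward_buggy = verifiable_reward(buggy, CASES)
```

The check is cheap, automatic, and unambiguous, which is what makes a domain like this easy to optimize against. A task such as “write a moving poem” has no comparable function to call.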

This jaggedness creates puzzling contradictions. State-of-the-art models can refactor 100,000-line codebases or find zero-day vulnerabilities, yet recommend walking to a car wash 50 meters away, missing that the car itself has to get there. Such tasks fall outside the “circuits” that training emphasized.

For founders, this suggests opportunity. Verifiable domains not yet prioritized by major labs offer potential for specialized fine-tuning and competitive advantage.

From Vibe Coding to Agentic Engineering

Karpathy distinguishes between two approaches to AI-assisted development:

Vibe coding raises the floor—anyone can build software by trusting AI output without deep verification.

Agentic engineering preserves professional quality standards while dramatically increasing speed. It treats AI agents as powerful but fallible tools requiring coordination and oversight.

Agentic engineers don’t just code faster; they achieve 10x+ productivity gains by orchestrating multiple AI agents while maintaining security and reliability standards.

The Human Role in an AI-Native World

As agents handle implementation details, humans focus on higher-level concerns:

Taste and judgment: Defining what good looks like
System design: Creating detailed specifications for agents to implement
Oversight: Catching errors like Karpathy’s agent trying to match users via email addresses instead of persistent IDs
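
The oversight example above is the kind of bug that looks fine in a demo and fails later. A minimal sketch of why, assuming a hypothetical User record and a purchase stored before the user changes their email address (all names and data here are invented for illustration):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class User:
    user_id: str  # persistent ID, never changes
    email: str    # contact detail the user may edit at any time


def find_by_email(users: List[User], email: str) -> Optional[User]:
    return next((u for u in users if u.email == email), None)


def find_by_id(users: List[User], user_id: str) -> Optional[User]:
    return next((u for u in users if u.user_id == user_id), None)


users = [User("u-001", "ana@example.com")]

# A purchase recorded while the old email address was still current.
purchase = {"user_id": "u-001", "email": "ana@example.com"}

# The user later updates their email address.
users[0].email = "ana@newdomain.example"

match_by_email = find_by_email(users, purchase["email"])  # lookup now fails
match_by_id = find_by_id(users, purchase["user_id"])      # still resolves
```

Both lookups pass a quick test run before the email changes, so catching the difference depends on a reviewer who understands which fields are stable identifiers.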

Karpathy no longer remembers PyTorch API details—whether it’s keepdims or keep_dim, dim or axis. Agents handle this recall perfectly. But he still understands tensor views, memory efficiency, and fundamental architectural decisions.

“You can outsource your thinking, but you can’t outsource your understanding,” he notes, quoting a tweet that resonates with his experience.

Building for an Agent-First World

Current infrastructure remains fundamentally human-centric. Documentation tells humans what to do rather than providing copy-paste instructions for agents. Deployment requires navigating human-designed interfaces and menus.

Karpathy envisions agent-native infrastructure where you describe what you want built, and agents handle everything from coding to deployment without human intervention. This requires rethinking systems as collections of sensors and actuators that agents can manipulate directly.

What Still Matters

Despite AI’s expanding capabilities, understanding remains uniquely human. You must still direct the thinking, define objectives, and maintain the mental models that guide agent coordination.

Karpathy builds knowledge bases from articles and documents, using AI to create different projections of information that enhance his understanding. These tools don’t replace comprehension—they amplify it.

The future belongs to those who can effectively direct AI agents while maintaining deep understanding of the domains they’re automating. Technical details become less important; architectural thinking and system design become paramount.

As we move toward agent-to-agent communication and fully automated workflows, the humans who thrive will be those who best understand how to orchestrate intelligence rather than implement it directly.
