Vibe Coding in Production

Understanding vibe coding and how to make it production-ready with SDD

This Tutorial is based on this video

Vibe coding, as defined by Andre Carpathy, is about "fully giving into the vibes, embracing exponentials, and forgetting that the code even exists". This means allowing AI to generate large chunks of code without meticulously reviewing every line. While this offers immense potential for productivity, it also comes with risks, especially in a production environment. This tutorial outlines how to leverage vibe coding responsibly.

1. Understanding Vibe Coding and its Context 🤔

  • What Vibe Coding Is Not: It's not just extensive AI code generation where you're in a tight feedback loop with the model (e.g., using Cursor or Co-pilot).
  • The True Essence: It's about forgetting the code exists, allowing non-engineers to create entire applications, and unlocking significant productivity.
  • The Downside: Uncontrolled vibe coding can lead to issues like maxed-out API usage, subscription bypasses, and random database entries, often by those new to coding.
  • Why It Matters: The Exponential: The length of tasks AI can handle is doubling every seven months. Soon, AI will generate an entire day's or week's worth of work, making it impossible for humans to keep up by reviewing every line. We must learn to trust these systems, similar to how developers trust compilers without reading the assembly code.

2. The Core Principle: Trust the Product, Not Just the Code ✅

The key to responsible vibe coding is to "forget that the code exists but not that the product exists". This means focusing on verifying the product's functionality and outcomes rather than the underlying AI-generated code. This isn't a new problem; managers constantly oversee experts in domains they don't fully understand.

  • Analogies from Management:
    • A CTO manages an expert by writing acceptance tests for the expert's work, even without understanding the implementation.
    • A Product Manager (PM) reviews an engineering feature by using the product to ensure it works as expected.
    • A CEO checks an accountant's work by spot-checking key facts and data slices to build confidence in the overall financial model.
  • Finding an Abstraction Layer: The goal is to find an abstraction layer that you can verify without needing to understand the underlying implementation.

3. Navigating the Challenge of Tech Debt ⚠️

A significant caveat to vibe coding is tech debt. Currently, there isn't a good way to measure or validate tech debt without reading the code yourself.

  • Focus on Leaf Nodes: To mitigate this, focus vibe coding on "leaf nodes" in your codebase. These are parts of the system that nothing else depends on; they are the "end feature" or "bell and whistle". It's acceptable for these parts to have some tech debt because they are less likely to change or have further features built upon them.
  • Protect Core Architecture: The "trunks and underlying branches" (the core architecture) should still be deeply understood and protected by human engineers to ensure they remain extensible, understandable, and flexible.
  • Evolving Trust: As AI models improve (e.g., Claude 4), the level of trust in their ability to write extensible code with less tech debt will increase, allowing for broader application of vibe coding.

4. How to Succeed at Vibe Coding: Be LLM's PM 💼

"Ask not what LLM can do for you but what you can do for LLM". When vibe coding, you act as a product manager for the AI.

  • Provide Comprehensive Guidance: Think about what guidance and context a new employee would need to succeed. Don't just give quick commands like "make this feature"; provide a tour of the codebase, requirements, specifications, and constraints.
  • Invest in Pre-Planning: Spend 15-20 minutes collecting guidance into a single prompt. This involves:
    • Conversing with the AI to explore the codebase.
    • Identifying files that need changing.
    • Collaboratively building a plan that captures the desired outcome and codebase patterns.
    • Once this artifact is created, then direct the AI to execute the plan. This significantly increases the AI's success rate.
  • Ask the Right Questions: Vibe coding in production is not for everyone, especially those who are fully non-technical. You need enough technical understanding to effectively "product manage" the AI and ask clarifying questions to guide it correctly.

5. Case Study: Merging a 22,000-Line Change Responsibly 🚀

The speaker shares an experience of merging a large, AI-written change to a production reinforcement learning codebase.

  • Human-Driven Requirements & Guidance: Days of human work went into defining requirements, guiding the AI, and shaping the system design.
  • Focus on Leaf Nodes: The change was primarily concentrated in leaf nodes, where tech debt was acceptable. Important, extensible parts received heavy human review.
  • Designed for Verifiability:
    • Stress Tests for Stability: Carefully designed stress tests measured stability over long durations without needing to read the code.
    • Human-Verifiable Inputs and Outputs: The system was designed to have easily verifiable inputs and outputs, allowing correctness to be checked based on these rather than the code itself.
  • Outcome: This approach allowed for the delivery of a large, high-confidence change in a fraction of the time and effort. It shifted the perspective on engineering costs, enabling bigger features and changes more readily.

6. Key Takeaways for Responsible Vibe Coding in Production 🎯

To successfully vibe code in production:

  1. Be LLM's PM: Provide clear guidance and context.
  2. Focus on Leaf Nodes: Apply vibe coding to isolated parts of the codebase where tech debt is contained.
  3. Prioritize Verifiability: Design systems with clear inputs and outputs, and use tests that can confirm correctness without requiring code review.
  4. Remember the Exponential: While optional today, embracing vibe coding will become crucial to leverage future AI models that produce vast amounts of work. Those who resist will become bottlenecks.

Q&A Insights: Learning and Balancing 🧠

  • Learning in an AI-Assisted World: While traditional "grind" of coding may diminish, AI tools can accelerate learning. Use the AI as an "always there pair programmer" to understand new libraries or design choices. This also allows engineers to take more "shots on goal" for higher-level architectural decisions, compressing learning cycles.
  • Balancing Information for the AI: The balance depends on what you care about. For simple tasks, state requirements without implementation details. For familiar codebases, provide more depth (e.g., specific classes, example features). However, avoid over-constraining the model; treat it like a junior engineer and give it what it needs to succeed, not a rigid format.
  • Cybersecurity Considerations: Security in vibe-coded apps comes down to the "PM" (the human) understanding the context, knowing what's dangerous, and where to be careful. For production systems, the human needs to be able to ask the right questions to guide the AI. Designing systems to be offline or to have "provably correct" frameworks (where critical components like auth/payments are handled by trusted systems) can help mitigate risks.
  • Test-Driven Development (TDD) with AI: TDD is very useful. Encourage the AI to write minimalist, end-to-end tests (e.g., happy path, error cases) rather than overly implementation-specific tests. Reviewing the tests first can be a good way to gain confidence in the AI-generated code.
  • Embracing the Exponential: It means understanding that AI models will improve faster than we can possibly imagine, not just steadily. Things will "go bonkers" rapidly, leading to capabilities millions of times better than today, similar to the exponential growth in computer processing power over decades.
  • Workflow with AI Coding Tools: A hybrid approach using both terminal and IDE (like VS Code with Claude Code) is common. Compacting the AI session (starting a new one) is recommended at natural stopping points, similar to when a human programmer would take a break, to manage token usage and keep the context fresh.