Jun 9, 2026

From Vibe-Coded Prototype to Production: The 12-Point Hardening Checklist

Your vibe-coded app works on your laptop. Here's the 12-point checklist that stands between that and customers actually using it without anything catching fire.

Your vibe-coded app works. The demo lands. A few friends are using it. And now you're staring at the gap between "works on my laptop" and "a stranger can pay me for this without something catching fire." That gap is where most prototypes die quietly, and it's the gap nobody publishes a real checklist for.

This is that checklist. Not anti-vibe-coding. Pro-shipping the thing.

TL;DR

  • Getting a vibe-coded app production-ready means hardening five failure classes: secrets, auth, data loss, scale, and maintainability.

  • About 25% of YC's Winter 2025 cohort runs ~95% AI-generated code, so this isn't a fringe problem anymore.

  • The 12-point checklist below is what we run before any prototype touches a real customer.

  • Hardening typically lands in the $25K–75K range; full rebuilds are $75K–500K+ per Chrono Innovation's 2026 cost report.

  • Some prototypes should stay prototypes forever. We'll tell you which.

1. Vibe coding got you 70% of the way — the dangerous part is which 70%

The answer to "is vibe coding safe for production?" is: the working parts are usually fine. The missing parts are what hurt you.

Vibe-coded apps tend to nail the happy path. A user signs up, clicks the thing, sees the result. What they skip is everything that only matters when something goes wrong: the second user signing up at the same moment, the attacker pasting a script tag into a name field, the database migration that silently drops a column, the API key sitting in the client bundle.

This isn't theoretical. TechCrunch reported in March 2025 that roughly 25% of YC's Winter 2025 cohort had codebases that were about 95% AI-generated. Those apps are taking payments. They're storing PII. They're running on production traffic. The question stopped being whether vibe-coded software reaches users and became how to harden it before it does.

And the hardening has a shape. There are five failure classes, twelve concrete checks, and an honest decision tree between patching what you have and starting over. We run this exact sequence on client codebases at Entropy's software-development practice, and the same pattern keeps showing up.

2. The five failure classes nobody warned you about

Every prototype we audit fails in some combination of the same five places. Naming them matters because it turns a vague "it's not ready" into a list you can actually work.

Secrets. API keys, database URLs, and service tokens committed to the repo, baked into the client bundle, or echoed in error messages. The model didn't know which strings were dangerous, so it treated all of them like configuration.

Auth. Routes that check authentication on the frontend only. JWTs without expiry. Password resets that email a plaintext link with no rate limit. "Admin" gated by a boolean in localStorage.

Data loss. No backups. No migrations. Schema changes applied with a casual DROP TABLE. One bad deploy and last week's signups are gone.

Scale. Queries inside loops. No indexes. Synchronous calls to third-party APIs in the request path. The app handles 10 users and dies at 100.

Maintainability. This is the quiet one. GitClear's 2025 analysis of 211 million lines of code found AI-era code carries 48% more copy-paste duplication and 60% less refactoring than pre-AI baselines. Translation: the next change is more expensive than the last, and the cost curve is the wrong shape. We wrote about that pattern in detail in AI-generated code and the new technical debt.


Comparison grid of five failure classes showing prototype behavior, production requirement, and cost of skipping

The five failure classes we find in nearly every vibe-coded prototype audit.

3. The 12-point production checklist (printable)

This is the list we run before a vibe-coded app sees a real customer. Score each item from 0 to 10, and be honest. Any item below a 7 is a blocker.

Secrets and configuration

  1. All secrets live in a secret manager, not in .env files committed to git, not in client-side code, not in error logs. Vercel, AWS Secrets Manager, Doppler — pick one. Rotate any key that has ever been in the repo.

  2. Production environment variables are separate from staging and local. Different database, different API keys, different everything. A bad seed script in dev should be unable to touch production data.
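
A minimal sketch of what those two checks can look like at startup, assuming the secret manager injects values as environment variables. The variable names (`APP_ENV`, `DATABASE_URL`, `STRIPE_SECRET_KEY`) and the rule that production database URLs contain the string "prod" are illustrative assumptions, not a standard.

```python
import os


class ConfigError(RuntimeError):
    """Raised when required configuration is missing or unsafe."""


# Hypothetical variable names; use whatever your secret manager injects.
REQUIRED_VARS = ("APP_ENV", "DATABASE_URL", "STRIPE_SECRET_KEY")


def load_config(environ=os.environ):
    """Read secrets from the process environment and fail fast if any are missing."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        # Report names only, never values.
        raise ConfigError(f"missing required settings: {', '.join(missing)}")

    config = {name: environ[name] for name in REQUIRED_VARS}

    # Environment separation: a non-production process must not hold a production
    # database URL. Assumes production URLs contain "prod"; adapt to your naming.
    if config["APP_ENV"] != "production" and "prod" in config["DATABASE_URL"]:
        raise ConfigError("non-production environment points at a production database")
    return config


def redact(value, visible=4):
    """Mask a secret for logs, keeping only the last few characters."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]
```

Failing at boot is the point: a missing or misrouted secret should stop the deploy, not surface as a mystery error on a customer's first request.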

Auth and authorization

  3. Every protected route checks auth server-side. Frontend route guards are UX, not security. The API must independently verify the session on every request.

  4. Authorization is row-level, not just role-level. User A cannot fetch User B's data by changing an ID in the URL. This is the single most common vibe-coding security risk we find, and it's usually present in apps that "have auth."

  5. Password reset, signup, and login are rate-limited. 5 attempts per minute per IP is a defensible floor. Without this, you have a credential-stuffing target.
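
The rate-limit check is the easiest of these to picture in code. Here's a rough sketch of a sliding-window limiter using the 5-per-minute figure above. It keeps state in process memory, which is an assumption that only holds for a single server; with more than one instance you'd need a shared store. The class name and its parameters are made up for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` attempts per `window` seconds for each key (for example, an IP)."""

    def __init__(self, limit=5, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)

    def allow(self, key):
        """Record an attempt and return True, or return False if the key is over the limit."""
        now = self.clock()
        hits = self._hits[key]
        # Drop attempts that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Usage inside a hypothetical login handler:
#     if not limiter.allow(client_ip):
#         return 429  # Too Many Requests
```

Rejected attempts are not recorded, so a blocked client regains access as soon as its oldest counted attempt leaves the window.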

Data integrity

  6. Automated daily backups, tested monthly. A backup you've never restored from is a hope, not a backup. Schedule one test restore on the calendar.

  7. Schema changes go through migrations, not manual SQL in the production console. Tools like Prisma, Drizzle, or Alembic exist for this; see their published documentation for setup.

  8. Destructive operations require confirmation and are reversible for 30 days. Soft deletes, not hard ones, anywhere a user can delete something they might want back.

Scale and reliability

  9. Database queries are indexed and bounded. No unbounded SELECT *. No N+1 queries in list endpoints. Add an index for any column you filter or join on.

  10. External API calls have timeouts and retries. A 30-second hang from Stripe should not take down your signup flow. Wrap every third-party call.
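
"Wrap every third-party call" can be as small as one helper. This is a sketch, not a library recommendation: `call_with_retries` and `UpstreamError` are names invented here, and it assumes the wrapped function accepts a `timeout` keyword and raises `TimeoutError` or `ConnectionError` on transient failures.

```python
import random
import time


class UpstreamError(Exception):
    """Raised when a third-party call still fails after all retries."""


def call_with_retries(fn, *, attempts=3, timeout=5.0, base_delay=0.2,
                      retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call fn(timeout=...) and retry transient failures with exponential backoff and jitter."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fn(timeout=timeout)
        except retry_on as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Backoff doubles each round; jitter keeps clients from retrying in lockstep.
                delay = base_delay * (2 ** attempt)
                sleep(delay + random.uniform(0, delay))
    raise UpstreamError(f"gave up after {attempts} attempts") from last_error
```

Two cautions. Only retry calls that are safe to repeat; for anything that moves money, check whether the provider supports idempotency keys before retrying. And catch `UpstreamError` in the request handler so the user sees a clear "try again" message instead of a hung page.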

Maintainability

  11. At least one human has read every file. Not approved, not skimmed. Read. If the answer is "the AI wrote it and I trust it," you do not have a maintainable codebase; you have a lottery ticket.

  12. There's a test for every payment path and every auth path. Everything else can wait. These two cannot.

That's the list. Print it. Run it. The honest version is that most vibe-coded apps will fail 6–9 of these on first pass. That's normal. The work to harden an AI-built app from there is mechanical, not mysterious.

4. Harden vs rebuild: deciding in one afternoon

Here's the call you actually have to make. Three questions, in order:

Does the app do roughly what you want it to do? If you're going to redesign half the features anyway, harden nothing. Rebuild with what you've learned.

Can a senior engineer read the code without flinching? Open three random files. If a competent engineer can follow what's happening in 10 minutes per file, you can harden. If every file requires archaeology, the labor of hardening exceeds the labor of rewriting.

Are the data models roughly right? Schemas are the expensive thing to change later. If your users table is missing fields you obviously need, or your relationships are tangled, fix the schema now — whether you harden or rebuild.

In our client work, roughly 60% of vibe-coded prototypes are hardenable. The other 40% are faster to rebuild with the prototype as a working spec. The prototype was never wasted; it was the cheapest possible product requirements document you'll ever write.
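
If it helps to see the three questions as logic, here's one way to write them down. This is just the decision order above restated as a function; the name and return shape are ours, and the real inputs are judgment calls, not booleans.

```python
def harden_or_rebuild(does_roughly_what_you_want,
                      readable_by_senior_engineer,
                      data_model_roughly_right):
    """Return (decision, fix_schema_first) by asking the three questions in order."""
    if not does_roughly_what_you_want:
        # Redesigning half the features anyway: rebuild with what you've learned.
        decision = "rebuild"
    elif not readable_by_senior_engineer:
        # Every file needs archaeology: hardening costs more than rewriting.
        decision = "rebuild"
    else:
        decision = "harden"
    # A wrong data model gets fixed now on either path.
    fix_schema_first = not data_model_roughly_right
    return decision, fix_schema_first
```

Note that the third question never flips the decision. It only tells you what the first task is.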

5. What a professional hardening pass costs in 2026

The honest number: a supervised hardening pass on a vibe-coded app typically lands in the $25K–75K range per Chrono Innovation's 2026 MVP cost report, which categorizes expert-AI builds in that band. A full agency rebuild from scratch runs $75K–500K+ in the same report.

What moves you inside that hardening range:

  • Codebase size. Under 10K lines is usually toward the floor. 50K+ pushes toward the ceiling.

  • Payment and PII surface. Anything touching Stripe, health data, or financial data adds compliance review time.

  • Existing test coverage. Zero tests means writing them as part of the pass. That's real hours.

  • How much the original developer is available. If the founder can answer questions, the pass is faster. If they've moved on, an engineer has to reverse-engineer intent.

We break the full math down in what an AI-assisted MVP actually costs. The short version: hardening is almost always cheaper than rebuilding, if the codebase passes the read-without-flinching test.

6. When keeping it a prototype forever is the right call

Not every vibe-coded app should graduate. Some should stay exactly what they are.

If the app is an internal tool with under 20 users who all know each other, the hardening checklist is overkill. If it's a one-time campaign microsite with a three-week life, don't bother. If it's a personal project that makes you happy, the only audit it needs is whether you still enjoy it.

The checklist exists for one specific case: a stranger is about to give you money or data, and you are responsible for what happens next. Below that bar, ship the vibe. Above it, run the twelve points.

FAQ

Is vibe coding safe for production?

The code itself is usually fine on the happy path. What's unsafe is shipping it without a hardening pass on secrets, auth, data integrity, scale, and maintainability. Roughly 25% of YC's Winter 2025 cohort runs ~95% AI-generated code in production, but the responsible ones ran a checklist like this one first.

What's the biggest vibe coding security risk?

In our client work, the most common finding is row-level authorization failures: apps that check whether you're logged in but don't check whether the row you're requesting belongs to you. Changing an ID in the URL exposes other users' data. This is almost universal in unaudited prototypes and almost trivial to fix once named.
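
The fix usually comes down to one extra condition in the query. A minimal sketch with SQLite, assuming a hypothetical `invoices` table with an `owner_id` column and a `current_user_id` that comes from the server-verified session, never from the request body or URL:

```python
import sqlite3


class NotFound(Exception):
    """Raised for missing rows and for rows owned by someone else."""


def get_invoice(conn, invoice_id, current_user_id):
    """Fetch an invoice only if it belongs to the authenticated user."""
    row = conn.execute(
        # The ownership check lives in the query itself, not in a later if-statement.
        "SELECT id, owner_id, amount_cents FROM invoices WHERE id = ? AND owner_id = ?",
        (invoice_id, current_user_id),
    ).fetchone()
    if row is None:
        # Same error for "does not exist" and "not yours", so IDs cannot be probed.
        raise NotFound(invoice_id)
    return {"id": row[0], "owner_id": row[1], "amount_cents": row[2]}
```

The broken version of this function is identical except that the query stops at `WHERE id = ?`. That is why the bug survives so many demos: it works perfectly for every user who only ever asks for their own rows.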

How long does a hardening pass take?

For a typical prototype under 20K lines of code, a supervised hardening pass takes 2–4 weeks in our experience. Larger codebases or apps with payment and compliance surfaces extend that. The 12-point checklist itself can be self-audited in an afternoon; the fixes are what take time.

Can I harden an AI-built app myself without hiring anyone?

Yes, if you or someone on your team can read code competently and has shipped production software before. The checklist is concrete enough to self-execute. What you cannot self-execute is the judgment call on harden-vs-rebuild — that one benefits from a second pair of eyes who has seen both outcomes.

Will the same AI tools that built the prototype also harden it?

Partially. Modern coding assistants are good at applying known patterns once you point at the problem. They are bad at finding the problems in the first place, because they wrote the code that contains them. The 12-point checklist is the human layer that tells the AI what to fix.

Run the checklist this week

Pick one section — secrets, auth, or data — and score your app honestly against those points before Friday. The hardening work always starts the same way: someone reads the list, opens the codebase, and writes down what's missing.

If you'd rather have a second set of eyes on it, get in touch.

© All rights reserved