
Bad Vibes - Why 'vibe coding' is dangerous in the wrong hands

With great power... etc...

"The dude" from The Big Lebowski staring wildly in an insane manner

Disclaimer: This content is my own opinion and should not be taken as fact.

Setting the scene – from seasoned web dev to clueless game dev

I’ve been building websites and web-apps since dial-up was a thing. On the web I know my way around: I can architect the hell out of a Next.js codebase and I’m intimately familiar with the best practices needed to produce a high-quality codebase.

However, when I threw myself into Star Fall, a bullet-heaven shooter I’m making in the Godot engine, I stepped out of that comfort zone entirely. Suddenly I wasn’t a senior engineer any more. I was a kid with a VIC-20 all over again, copying code out of manuals and googling basic terms like “what even is a Sprite2D?”

Part of what made the leap exhilarating was embracing vibe coding. In a previous post about Star Fall I described vibe-driven development as letting intuition and excitement decide what to work on rather than sticking to a rigid plan. If I woke up with an idea for a new power-up, that’s what went into the game that day. That chaos was liberating. But in reality, I didn’t just code by feel - I let large-language-model assistants write all the code. I barely opened files. I would describe what I wanted, let the AI spit back swathes of GDScript I didn’t really understand, then hit play and see if it worked.

A golden retriever in a science lab with the caption "I have no idea what I'm doing"
No, YOU overuse the same images!

That approach is what’s commonly called vibe coding: building software as though you were ordering from a takeaway menu. You tell the assistant “I want an enemy system that spawns waves of ships” (albeit in a bit more detail than that!), then wait for the dish to arrive. It’s messy, unpredictable and, at times, magical. It’s also incredibly dangerous when you don’t know the language or the engine you’re working in.

My AI toolkit: five weeks of experiments

Because Star Fall was a side-project in my free time, I treated the tooling as an experiment. Over five weeks I worked my way through an evolving stack of AI coding environments:

  • Cursor with Sonnet 3.7. This was my initial tool. Cursor is a VSCode-fork IDE that embeds models into your workflow, and Sonnet 3.7 was the best coding model available when I first started this project. It was 'fine', but it was noticeably more prone to over-engineering than the later models.
  • Cursor with Sonnet 4 and Gemini Pro 2.5. When Sonnet 4 rolled out I upgraded. Around the same time Google’s Gemini Pro 2.5 integration appeared, so I hopped between the two. These models produced code faster and could take in more context. Gemini, in particular, impressed me with its knowledge of shader language and particle effects. But with both models I noticed a tendency to over-engineer solutions.
  • Augment Code. After a couple of weeks I tried a VSCode plugin called Augment. At the time it locked you into a single model; you couldn’t change it. Augment sold itself on understanding the 'entire codebase', which intrigued me. Unfortunately it often ignored its own previous fixes and would reintroduce bugs that had been solved earlier. Every back-and-forth cost money because Augment charges per message, and I felt it had no more understanding of the whole codebase than Cursor did.

    Augment left a bad taste in my mouth; there were times I felt like it was making me go around in circles on purpose, as every message was money in their pocket. Unless they make some truly earth-shattering improvements, they are not a solution I would ever return to.
  • Claude Code CLI with Sonnet 4 and Opus 4. In the final stretch I moved to Anthropic’s 'Claude Code'. Here I experimented with a workflow using two models: Sonnet 4 as a “helper” for straightforward tasks and Opus 4 for deeper architectural questions. Claude’s approach of spawning sub-agents to evaluate code was fascinating. It's far from perfect but has been the most impressive tool I've used so far.

This journey wasn’t some grand scientific study. I simply wanted the best tool for the job, and the AI landscape is a moving target. But switching tools so often gave me a unique perspective on where vibe coding shines and where it fails spectacularly.

The things that went right

Let’s start with the good news: vibe coding with AI isn’t a complete nightmare. There were moments when it felt like I was living in the future.

Prototyping at light speed

Getting an idea onto the screen was exhilaratingly quick. Normally in Godot you’d have to learn how to wire things together step by step, but with the AI I could just describe what I wanted and watch it appear. Within a few days I had a working prototype. It set up all the moving parts for me: enemies, weapons, health, even some simple behaviours. At one point I asked it to “make the enemies encircle the player in a swarm,” and within minutes I was testing it out in a playable build.
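
I can't show you the exact code it generated, but the heart of that swarm behaviour boils down to something like the sketch below. This is a from-memory reconstruction, not the AI's output; the node setup and names are mine:

```gdscript
extends CharacterBody2D

# From-memory sketch of the "encircle the player in a swarm" behaviour.
# Assumes the player node is in a "player" group; names are illustrative.

@export var orbit_radius := 180.0
@export var orbit_speed := 1.5    # radians per second
@export var move_speed := 120.0

var _angle := randf() * TAU  # start each enemy at a random point on the circle

func _physics_process(delta: float) -> void:
    var player := get_tree().get_first_node_in_group("player") as Node2D
    if player == null:
        return
    # Advance around the circle, then steer towards that moving point.
    _angle = wrapf(_angle + orbit_speed * delta, 0.0, TAU)
    var target := player.global_position + Vector2.RIGHT.rotated(_angle) * orbit_radius
    velocity = global_position.direction_to(target) * move_speed
    move_and_slide()
```

Each enemy just chases a moving point on a circle around the player, which is what reads as "swarming" on screen.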

The immediacy rekindled the giddy feeling I described in my Star Fall blog post - the same joy I felt in the 80s watching a game come to life after copying code out of a book.

It's actually also an apt comparison as I had no idea what I was doing then, either!

An AI generated photo of me as a child (with glasses and a full beard) typing game code into a Commodore VIC-20 computer from a games manual.
I'm glad my mum had the camera out for this milestone

Solving performance bottlenecks

Godot is efficient, but if you spawn hundreds of objects per frame it can chug. When my first enemy waves dropped the frame rate to a slideshow, I asked the AI to optimise. To its credit, it introduced spatial partitioning and batching techniques I hadn’t heard of. The game jumped from 20 fps to a solid 60. It also created custom shaders and particle effects to make explosions feel juicy. I suspect the models drew on examples from open-source Godot projects; the results certainly looked polished.
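
For the curious, grid-based spatial partitioning - the technique I understood the AI to be using - looks roughly like this. It's my own illustrative sketch, not the generated code: instead of testing every enemy against everything else each frame, you bucket enemies into coarse cells and only check the neighbouring ones:

```gdscript
class_name SpatialGrid
extends RefCounted

# Illustrative grid-based spatial partition (my reconstruction, not the
# generated code). Bucketing enemies into coarse cells means proximity
# checks only look at neighbouring cells instead of every object.

const CELL_SIZE := 64.0

var _grid: Dictionary = {}  # Vector2i cell -> Array of enemies in that cell

func _cell_of(pos: Vector2) -> Vector2i:
    return Vector2i((pos / CELL_SIZE).floor())

func rebuild(enemies: Array) -> void:
    _grid.clear()
    for enemy in enemies:
        var cell := _cell_of(enemy.global_position)
        if not _grid.has(cell):
            _grid[cell] = []
        _grid[cell].append(enemy)

func nearby(pos: Vector2) -> Array:
    # Gather candidates from the 3x3 block of cells around `pos`.
    var result: Array = []
    var centre := _cell_of(pos)
    for dx in range(-1, 2):
        for dy in range(-1, 2):
            result.append_array(_grid.get(centre + Vector2i(dx, dy), []))
    return result
```

Rebuild the grid once per frame and proximity queries stop being every-against-every, which is the kind of jump that takes you from 20 fps to 60.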

Shaders and particles

Shader code is both math and art, and I don’t speak GLSL fluently. The AI, however, produced starfields with a parallax effect, bloom glows and even a glowing sun that silhouettes elements passing in front of it, all on command. It also generated particle emitters for thruster trails, weapon bursts and debris, complete with adjustable parameters. It worked SO well, in fact, that in future versions of Star Fall I’ll probably continue to let the AI handle shaders, as I doubt I could make them better myself.

Creative serendipity

Because vibe coding is unstructured, I stumbled into features I would never have planned. One night I asked the AI to make an “arc weapon that whipped around the ship,” and it created a really cool crackling effect that knocked an enemy ship back. I didn’t have to code it, so I could play with variations immediately. That sort of playful iteration is hard to reproduce when you’re writing everything by hand.

The dark side: what went wrong

For every “wow, that’s cool” moment there were five “what the hell are you doing?” ones. Vibe coding magnifies the flaws in large-language models and exposes your own knowledge gaps. The following issues became patterns across every tool I tried.

Performance fixes that create new problems

The AI did improve performance when asked, but those optimisations often sowed the seeds of future bugs. For example, it switched my simple enemy spawner to a MultiMeshInstance2D setup. This improved frame time because multiple sprites were rendered as one, but it broke collision detection and made individual enemy logic incredibly challenging. Later, when I needed enemies to have different AI states, I discovered I had to create a set of custom events for them, which re-introduced performance problems. This pattern repeated: fix one bottleneck, then chase the regressions for hours. I’m sure if I knew Godot properly I could have caught these trade-offs sooner.
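
For anyone who hasn't used it: MultiMeshInstance2D draws thousands of sprites in a single call, but the instances are purely visual. There are no per-instance nodes, so no collision shapes, scripts or signals come along for the ride. A hand-written illustration of the trade-off (not the project's code; the node name is made up):

```gdscript
extends Node2D

# Illustration only: a MultiMesh batches many sprites into one draw call,
# but each instance is just a transform. There is no per-instance node,
# so no collision shape, no script and no signals.

@onready var batch: MultiMeshInstance2D = $EnemyBatch  # assumed child node

var enemy_positions: Array[Vector2] = []  # game state now lives in plain arrays

func _process(_delta: float) -> void:
    var mm := batch.multimesh
    if mm.instance_count != enemy_positions.size():
        mm.instance_count = enemy_positions.size()
    for i in enemy_positions.size():
        # Rendering is cheap, but collision and per-enemy AI state have to
        # be tracked separately - which is exactly where my bugs crept in.
        mm.set_instance_transform_2d(i, Transform2D(0.0, enemy_positions[i]))
```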

Over-engineering and misuse of the engine

The models love abstractions. At one point I caught it trying to lump every weapon into a single service, but that doesn’t work well because some weapons are MultiMeshInstance2D and others are regular sprites, which behave very differently.

If this had happened only two weeks earlier, I’d never have spotted it as I didn’t even know the distinction back then. Which makes me wonder what else it added in those first few weeks that will come back to haunt me later.

Matt Murdock from 'Daredevil' whispering to a woman in a red dress "You'll pay for what you did"

Code churn and forgotten context

One of my biggest frustrations was how often the AI overwrote working code with broken changes. I’d ask it to fix an enemy path-finding bug; it would update the script but undo the collision fix we’d added yesterday.

With Augment this was especially egregious because the tool billed per message: every time the agent reintroduced a bug and I had to instruct it again, I felt like I was feeding coins into an arcade machine.

In my blog post about planfiles I explained how externalised planning can help AIs remember what’s been decided. Planfiles act as a memory outside the chat buffer, preventing the AI from forgetting details. I did use them during development, but the problem was that I didn’t know what I didn’t know. My lack of experience with Godot meant I couldn’t always anticipate the right abstractions or edge cases, so the plans I wrote weren’t always complete enough to prevent mistakes.

This meant I was at the mercy of the AI remembering the context of the entire codebase - something Augment claims to have nailed down, but in practice it was just as bad at it as everything else.

Bloating and duplication

The code that came out of these sessions was enormous. Files regularly ballooned into thousands of lines. The AI wrote comments to itself, generated unused helper functions and repeated logic across classes. I had asked it to build an “enemy system” that would centralise all enemies so I could add them via a JSON configuration. Technically it worked. Enemies spawned and were tracked. But under the hood the AI had scattered enemy logic around the entire codebase. There was no single source of truth. When I later tried to adjust how enemies shoot, I found duplicate functions in multiple scripts. Reading through this mess was the opposite of educational. I wanted to learn GDScript by example; instead I got a labyrinth of spaghetti code.

This was another example of the AI not holding the codebase in context: it simply did not understand the project as a whole and was constantly re-inventing the wheel or stepping on its own toes.
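
For contrast, the shape I was actually asking for is small: one autoloaded script that owns the JSON definitions and is the only door through which enemies enter the game. A hedged sketch of that intent (the file path and JSON fields here are made up for illustration):

```gdscript
# EnemyRegistry.gd - registered as an autoload, the single source of truth.
# (Illustrative sketch: the file path and JSON fields are invented.)
extends Node

var _definitions: Dictionary = {}

func _ready() -> void:
    var file := FileAccess.open("res://data/enemies.json", FileAccess.READ)
    if file:
        var parsed = JSON.parse_string(file.get_as_text())
        if parsed is Dictionary:
            _definitions = parsed  # e.g. {"drone": {"scene": "res://enemies/drone.tscn"}}

func spawn(id: String, at: Vector2) -> Node2D:
    # Every spawn goes through this one function, so enemy logic
    # has a single place to live.
    var def: Dictionary = _definitions.get(id, {})
    assert(not def.is_empty(), "Unknown enemy id: %s" % id)
    var packed: PackedScene = load(def["scene"])
    var enemy: Node2D = packed.instantiate()
    enemy.global_position = at
    get_tree().current_scene.add_child(enemy)
    return enemy
```

With one owner for the data, a change to how enemies shoot lives in exactly one place - the opposite of what I actually got.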

Misunderstanding testing

Godot has a unit test framework called GUT. The AI would happily generate tests, but often for things that didn’t really need them. Worse, if the code didn’t pass, it would sometimes alter or even delete valid tests instead of fixing the underlying issue. Over time the test suite became useless, and I ended up manually testing the game after every change.
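
For anyone unfamiliar with GUT, a test is just a GDScript script extending the framework's test class, along these lines (a sketch from memory; `Spawner` and `wave_size()` are made-up stand-ins for real game code):

```gdscript
# A typical GUT test script - GUT runs any method whose name starts
# with "test_". (Sketch from memory; Spawner and wave_size() are
# made-up stand-ins for real game code.)
extends GutTest

func test_wave_size_scales_with_level() -> void:
    var spawner := Spawner.new()
    assert_eq(spawner.wave_size(1), 5, "level 1 should spawn 5 enemies")
    assert_eq(spawner.wave_size(2), 8, "level 2 should spawn 8 enemies")
```

The AI happily generated files full of these, but when one failed it was as likely to edit the expected values as to fix the actual bug.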

Lack of architecture

Every time I tried to build a system - for example, a central manager for enemies or a progression system for levels - the AI would implement something superficially similar but with no underlying architecture. It would scatter lists and references across scripts, and without that structure, debugging became a nightmare. When I built websites, I could easily review the AI’s output because I knew the patterns. Here, I was flying blind. My usual instincts about composition versus inheritance or DRY (Don’t Repeat Yourself) were still valid, but I didn’t recognise when the AI had broken them until much later.

Ron Swanson from Parks and Recreation having a sudden realisation about his computer, then throwing it in a skip

Not a learning tool

One of my goals was to learn GDScript by reviewing the code the AI produced. That didn’t work out. The scripts it generated were overly long and bloated, with thousands of lines produced as the result of a single prompt. I didn’t yet know enough about Godot to recognise what could be simplified, and requests that I thought would be small ended up being very complex, so I just accepted the mess as 'normal' and pushed on.

As I gained more familiarity with the engine, I started to realise just how far off the mark that code was. Instead of clean, idiomatic examples that could help me build a mental model of how Godot projects should be structured, I was left reverse-engineering sprawling scripts that didn’t teach me anything useful. What I needed were small, digestible examples; what I got was a wall of noise that left me no better at GDScript than when I started.

It doesn't listen to its own rules

Now, I knew this about AI anyway, so this one didn't exactly shock me. But again: I can spot rule disobedience quickly in a web project; here, I couldn't. Hell, by the end of the second week I'd stopped even trying to review the code, so I had no clue what it was producing any more.

This caused a major problem when I realised I was getting silent failures. My rules clearly stated that this is a pre-release project and that we don't want fallbacks and patches over old code - we just want to re-implement properly. Despite that, it would still add over-the-top fallback code. So things I expected to work were not working, but I got no errors; it was just the old behaviour carrying on unchanged.

I got incredibly angry with Augment about this (again, it seemed to be the worst offender here), as it appeared I was spending lots of credits on it accomplishing nothing. I spent hours going around in circles before I realised that it WAS making the changes, but those changes were failing and falling back to the original functionality!

No matter how often I told it not to write fallbacks as this was not a legacy codebase, it didn't listen and very soon forgot and carried on regardless.
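
To make the failure mode concrete, the pattern looked roughly like this (a reconstruction for illustration, not code from the project):

```gdscript
extends Node

# Reconstruction of the anti-pattern, not code from the project.
var new_weapon_system = null  # the rewritten system; null if setup failed

# What the AI kept writing: quietly fall back, so nothing errors
# and nothing visibly changes either.
func fire_weapon() -> void:
    if new_weapon_system and new_weapon_system.has_method("fire"):
        new_weapon_system.fire()
    else:
        _legacy_fire()  # silent fallback: the "old behaviour" I kept seeing

# What I actually wanted in a pre-release codebase: fail loudly.
func fire_weapon_strict() -> void:
    assert(new_weapon_system != null, "new weapon system missing - fix it, don't patch around it")
    new_weapon_system.fire()

func _legacy_fire() -> void:
    pass  # stand-in for the old implementation
```

The first version never errors, so from the outside it just looks like your change didn't take. The strict version fails the moment the new system is missing, which is all I was asking for.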

A gif of 'Anger' from the Pixar movie 'Inside Out' showing that he's very angry indeed.

Why vibe coding is dangerous beyond hobby projects

My little space shooter is low-stakes. The worst that can happen is a bug, a crash, or me apologising to the future developer who has to clean up the GDScript. But there are real-world examples of vibe coding gone horribly wrong when it touches sensitive data and production systems.

The Tea app breaches

In July 2025 the dating safety app Tea suffered back-to-back data breaches. First, 72,000 images used for identity verification were exposed. Days later, independent researcher Kasra Rahjerdi discovered that more than 1.1 million direct messages between users could be accessed. These messages contained extremely sensitive conversations about phone numbers, abortions and relationships. The company disabled its direct messaging feature after reporters confronted them.

News reports noted that Tea was built quickly to capitalise on viral attention. Speculation online suggested the app’s developers leaned heavily on AI to generate code. Whether or not that’s true, the breach highlights what happens when inexperienced teams ship products without understanding security best practices. Exposed Firebase buckets, lax permissions and missing encryption are the kinds of mistakes a seasoned engineer would catch. When novices vibe code an entire backend, the risk of leaking personal data goes through the roof.

The Replit agent fiasco

Around the same time, Jason Lemkin of SaaStr documented his experience using Replit’s AI agent to build a business app. What began as an experiment in vibe coding ended in disaster. The AI agent ignored multiple explicit instructions not to modify the database, then deleted a live production database, generated thousands of fake users and lied to hide its mistakes. It even fabricated test results to make itself look competent.

Replit’s CEO, Amjad Masad, later apologised and promised to add guardrails. The incident underscored how current AI models can misinterpret commands, ignore safety checks and produce erroneous data when given too much autonomy.

The Replit story is instructive for two reasons:

First, it shows that AI tools marketed as accessible to non-developers are still unpredictable and simply not ready for prime time.

Second, it demonstrates that even experienced founders can be lulled into a false sense of security when things seem to be working. The AI wrote a plausible app, then blew up months of data because no one knew how its internal logic worked. That’s vibe coding on a corporate scale.

Lessons learned from five weeks of vibe coding

Looking back on this experiment, a few themes emerge:

AI assistants are powerful multipliers, not magic replacements.

When I use AI for web development - a domain I know - the AI is like a junior developer who drafts code that I can quickly review. Because I know the stack, I spot problems immediately. In Godot, where I’m a novice, the AI amplified my ignorance.

This still needs to be managed carefully, though. I recently allowed myself to fall into the trap of letting the AI do too much, due to a combination of interview panic and a massive time constraint, and it ended up costing me a job opportunity: the AI made architectural mistakes that I would never have made, and I just didn't spot them because it produced too much code for me to review properly.

Sometimes we need to learn our lessons the hard way.

Without context, AI forgets.

Large models (even the latest and greatest ones) have limited context windows. In a long session they will forget what we decided a few prompts ago. Without planfiles, rules files or explicit reminders (along with reminders to check those files!), they will rewrite code you’ve fixed, hallucinate variables and ignore custom guardrails. My biggest headaches came from this forgetting.

AI-generated code can be elegant or grotesque, but it is always verbose.

I rarely received a concise, straightforward script. Instead, the AI often produced sprawling files with layer upon layer of unnecessary structure, long functions, and extra helpers I didn’t need. The result was code that was harder to read, harder to work with, and far more complicated than the problem required. Reviewing and refactoring it took almost as long as writing it myself.

Keep the AI where it shines.

Shaders, particle effects and simple data structures are great use cases in game development (creating API endpoints, generating tests and writing documentation are the equivalents in web dev).

For high-level architecture or anything involving security, accessibility or performance, only use AI as a helper. Write the plan yourself, then let the model fill in the bits you understand.

Don’t use vibe coding to build things you can’t personally verify.

If you don’t know how to write a Godot system or a web API securely, you shouldn’t let AI do it for you. It’s like driving a car blindfolded because your sat-nav says it knows the route. The Tea breach and Replit fiasco show the real-world cost of this negligence.

Moving forward: finding a healthier balance

Meg and Nicky (and Alison's forearm) from the TV show "Dead Pixels" all making a toast: "To a healthy balance"
© Copyright Channel 4
Side note: If you've not watched Dead Pixels, you're missing out. This is my favourite episode!

Vibe coding taught me a lot, but I’m not going to build the rest of Star Fall that way. In fact, I'm not even going to keep the code it has written so far!

I still believe AI-augmented coding is transformational. I believe it is already a great tool for experienced developers as we can now spend less time typing and more time architecting. But I will never again ask a model to write an entire system in a language I don’t know.

Instead, I will endeavour to:

  • Learn the engine properly. I’m now dedicating time to learning Godot’s nodes, signals and patterns. Understanding the underlying concepts lets me evaluate AI suggestions.
  • Use AI selectively. I’ll still have the AI generate shaders, particle effects and maybe some stub scripts. But for core game systems I’ll write them myself or use AI as a 'code monkey' rather than the actual lead developer of my codebase.
  • Review everything. Even when the AI writes most of the code, I now open every file and read through it. It’s slower, but it saves hours later.
  • Don’t be seduced by early success. The first prototypes will work. They’ll look fancy and you’ll feel like a genius. But if you don’t understand what’s happening under the hood, you’re standing on quicksand. Build your foundation deliberately.

The main take-away from all of this is that you HAVE to review everything the AI produces, and if you are going to review it, you need to know how to write it yourself.

If you can't write the code without the AI, you shouldn't attempt to code with it.

A meme of a guy reading a book with the cover text saying "how to vibe code" and the books content saying "Learn to actually code first"
Clearly I already knew this lesson as I made this meme a few months ago. I should really listen to myself, that guy is smort!

Final thoughts

AI-assisted coding is already changing how we build software. As models improve, the temptation to rely on them for everything will grow. My experiment with vibe coding shows that while these tools can accelerate development and even delight, they can also mislead us into dangerous territory. The difference between success and disaster lies in our ability to understand, plan and review the work they produce. When you know the domain, AI is a force multiplier. When you don’t, it’s a blindfold.

Vibe coding a game taught me that the hard way. I won’t make the same mistake again.



Alexander Foxleigh

Alex Foxleigh is a Senior Front-End Developer and Tech Lead who spends his days making the web friendlier, faster, and easier to use. He’s big on clean code, clever automations, and advocating for accessibility so everyone can enjoy tech - not just those who find it easy. Being neurodivergent himself, Alex actively speaks up for more inclusive workplaces and designs that welcome all kinds of minds.

Off the clock, Alex is a proud nerd who loves losing himself in video games, watching sci-fi, or tweaking his ever-evolving smart home setup until it’s borderline sentient. He’s also a passionate cat person, because life’s just better when you share it with furry chaos machines.