LLM Limits

There's so much hype around using LLMs and code generation. Just this week, Anthropic announced Claude's artifacts feature. Some of the apps (applets?) there seem genuinely fun and useful. I saw a Slime Soccer clone, which was deeply nostalgic for me.

There was also Gemini CLI, which looks like an almost-free competitor to Claude Code. And then there was Fly.io's Phoenix.new coding tool, made by none other than the creator of Phoenix himself, Chris McCord. No idea why I'm summarising the news; I get it all from Simon Willison's blog anyway.

Either way, the news makes you think that these tools can do everything.

I think hard things will still be hard.

At work, I was trying to add tests to a React frontend project. The repo is set up in a weird way, with subdirectories that have their own package.json files. But it's not a monorepo! And Cursor was confused by that.

In the end, I wasted a lot of time trying to get the LLM to fix my issues when the best course of action would have been to just read the docs and do it myself. I wonder if this is the so-called jagged frontier.

To be fair, it's not clear-cut at all. I think it's still worth pasting your error logs into the LLM before diving in yourself, since 80% of errors are trivial and it can fix them in one go. But the remaining 20% can make you tear your hair out.