A messy experiment that changed how we think about AI code analysis
Last week, I watched our AI choke on a React codebase - again. As timeout errors flooded my terminal, something clicked. We’d been teaching AI to read code like a fresh bootcamp grad, not a senior developer.
Last month, I asked Claude to help refactor a React component. The code it wrote was beautiful - clean, well-documented, following all the best practices.
It also quietly broke our error tracking system, removed a crucial race condition check (that admittedly looked like a bug), and duplicated three utility functions with slightly different implementations.
Sound familiar?
The AI coding space is exploding right now. Cursor hit $50M ARR, Lovable.ai reached $4M in 4 weeks, and every day there’s a new “AI-powered IDE” on Product Hunt. Clearly, developers want AI assistance.
Yet despite these impressive numbers, I believe current AI coding tools are fundamentally solving the wrong problem. That’s why I’m building another one.
I’ve been running multiple WordPress blogs for my friends and family on my own VPS since ~2012. I didn’t bother checking them for updates, and surprise, surprise, they all got hacked.
This is my journey of how I fixed it and how the latest version of my blog was born.
Hmm, this is not what WordPress is supposed to look like, is it?
At Metric, my latest accounting AI automation startup, we’ve built complex interconnected pipelines that query an LLM with multiple transactions and invoices.
Of course, the entire flow from beginning to end needs to be tested. Given the manual effort of uploading files and the long wait for LLM responses, we had to automate the entire test.
In the automation, I found many cases where an early part of the test passed but a later part got stuck and errored. Retrying from the start meant waiting at least ~1-2 minutes for all the initial cases to be processed again, and I could not delete the initial cases either, because the later ones depended on the responses from the initial tests.
So I really had only one option — figure out how to save the state of a test frequently and allow resuming the test if there’s a crash.
To code this out, I broke down the problem into 2 parts:
Making generic methods to load and save state
Saving the state at needed parts
State methods to load and save
I like the pattern of Factory Functions, so I created a _stateManager factory function.
It took a testCode parameter to indicate which test is running, and created a private variable filePath, which was used to save the state of the test.
const _stateManager = (testCode) => {
  const filePath = path.join(autosaveStateDirectory, `${testCode}-state.json`)
  return {
    load: async () => {
      // load only the steps that are before loading transactions
      // i.e. transactions must always be retested
      if (
        testsSteps.indexOf(testOutput.lastStepCompleted) >
        testsSteps.indexOf('UPLOAD_INITIAL_TRANSACTIONAL_DOCUMENTS')
      )
        return false
      try {
        const rawData = await fs.readFile(filePath, 'utf8')
        const data: TestOutput = JSON.parse(rawData)
        testOutput = data
        log(`Successfully loaded state! Last step completed: ${testOutput.lastStepCompleted}`)
        return true
      } catch (e) {
        console.error(e)
        return false
      }
    },
  }
}
The save function was easy: I just had to write the state to file.
save: async () => {
  await fs.writeFile(filePath, JSON.stringify(testOutput, null, 2))
  log(`Saved state after completing ${testOutput.lastStepCompleted}`)
  return true
},
For the load function, I also had to check whether the file exists or not.
load: async () => {
  try {
    const rawData = await fs.readFile(filePath, 'utf8')
    const data: TestOutput = JSON.parse(rawData)
    testOutput = data
    log(`Successfully loaded state! Last step completed: ${testOutput.lastStepCompleted}`)
    return true
  } catch (e) {
    console.error(e)
    return false
  }
},
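The try/catch above treats every failure the same way. As a minimal, self-contained sketch of the same idea, the version below separates "no file yet" (a normal first run) from other read errors. The name createStateManager, the clear helper, the one-field TestOutput type, and the use of os.tmpdir() as the state directory are all assumptions for illustration, not the code from my test suite.

```typescript
import { promises as fs } from 'node:fs'
import * as os from 'node:os'
import * as path from 'node:path'

// Hypothetical, trimmed-down shape; a real TestOutput would carry more fields.
type TestOutput = { lastStepCompleted?: string }

// Assumption: state files live in the OS temp directory.
const autosaveStateDirectory = os.tmpdir()

let testOutput: TestOutput = {}

const createStateManager = (testCode: string) => {
  const filePath = path.join(autosaveStateDirectory, `${testCode}-state.json`)
  return {
    // Write the whole state object to disk as pretty-printed JSON.
    save: async (): Promise<boolean> => {
      await fs.writeFile(filePath, JSON.stringify(testOutput, null, 2))
      return true
    },
    // Returns true only when a saved state was found and parsed.
    load: async (): Promise<boolean> => {
      try {
        const rawData = await fs.readFile(filePath, 'utf8')
        testOutput = JSON.parse(rawData) as TestOutput
        return true
      } catch (e) {
        // A missing file just means there is nothing to resume from.
        if ((e as { code?: string }).code === 'ENOENT') return false
        // Anything else (bad JSON, permissions) is worth surfacing.
        console.error(e)
        return false
      }
    },
    // Hypothetical helper: delete the state file to force a fresh run.
    clear: async (): Promise<void> => {
      await fs.rm(filePath, { force: true })
    },
  }
}
```

Calling load() once at the top of the test and clear() after a fully green run would keep stale state from leaking into the next run, though whether that fits depends on how your tests are organised.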
Setting up test steps
My test has a few steps like logging in, uploading documents, uploading transactions, etc. I represented those in an array, testsSteps, ordered the way the test runs them.
I recorded all my state in a testOutput object, and stored the most recent completed step in lastStepCompleted.
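Roughly, the two pieces could look like the sketch below. Only CREATE_PARTIES and UPLOAD_INITIAL_TRANSACTIONAL_DOCUMENTS are step names that appear in this post; the other step names, their order, and the fields of TestOutput are placeholders I made up to keep the example self-contained.

```typescript
// The position in this array decides what counts as "already done",
// so the entries must be listed in the order the test runs them.
// LOGIN and UPLOAD_TRANSACTIONS are hypothetical names, and the order is assumed.
const testsSteps = [
  'LOGIN',
  'CREATE_PARTIES',
  'UPLOAD_INITIAL_TRANSACTIONAL_DOCUMENTS',
  'UPLOAD_TRANSACTIONS',
] as const

// Union of the step names, so a misspelled step fails at compile time.
type TestStep = (typeof testsSteps)[number]

// Hypothetical shape of the state that gets saved to disk.
interface TestOutput {
  lastStepCompleted?: TestStep
  parties?: unknown[]
}

// A fresh run starts with nothing completed.
let testOutput: TestOutput = {}
```

Typing the step names as a union is optional, but it means a typo in a step key is caught by the compiler instead of silently changing which steps get skipped.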
At the start of each part, I updated stepKey which was the same as what I used in the testsSteps variable above.
// during the test, keep updating `testOutput` with data from the API
testOutput.parties = ...
testOutput.lastStepCompleted = stepKey
await stateManager.save()
Now, all that is left is to check whether a specific part of the test is already covered by the loaded state. If it is, skip that step entirely. This check should sit at the very start of the test step.
const stepKey = 'CREATE_PARTIES'
if (testsSteps.indexOf(testOutput.lastStepCompleted) >= testsSteps.indexOf(stepKey)) {
  log('\tSkipping create parties due to loaded state')
  return false
}
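One thing to watch with the indexOf comparison: if stepKey is misspelled, indexOf returns -1 and the condition is always true, so the step is skipped silently. A possible way to guard against that is to pull the check into a helper, sketched below. shouldSkipStep and runStep are hypothetical names, not part of my test code, and the save callback stands in for stateManager.save().

```typescript
// True when `stepKey` is at or before the last completed step.
// On a fresh run there is no last completed step, so nothing is skipped.
const shouldSkipStep = (
  steps: readonly string[],
  lastStepCompleted: string | undefined,
  stepKey: string,
): boolean => {
  const target = steps.indexOf(stepKey)
  // Fail loudly on an unknown key instead of skipping the step forever.
  if (target === -1) throw new Error(`Unknown step: ${stepKey}`)
  const done = lastStepCompleted === undefined ? -1 : steps.indexOf(lastStepCompleted)
  return done >= target
}

// Hypothetical wrapper: skip the step, or run its body and then persist progress.
// Returns false when the step was skipped and true when it actually ran.
const runStep = async (
  steps: readonly string[],
  state: { lastStepCompleted?: string },
  stepKey: string,
  body: () => Promise<void>,
  save: () => Promise<unknown>,
): Promise<boolean> => {
  if (shouldSkipStep(steps, state.lastStepCompleted, stepKey)) return false
  await body()
  state.lastStepCompleted = stepKey
  await save()
  return true
}
```

With a wrapper like this, each test step shrinks to a single runStep call, at the cost of one more abstraction to read through.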
Optionally, you might have to “rollback” your server to the most recent step as well. This completely depends on your server so I won’t include code for that, but a rollback could be as easy as deleting a few rows in your DB.
Conclusion
For good DX during long-running tests, it’s important to be able to save and resume state as needed.
I came up with my own way to save and resume state, and I am sure different libraries have their own ways. But I usually prefer writing my own code in such easy cases rather than depending on any abstractions.
With a simple factory function and a few changed lines during the tests, we were able to write a simple and easy to reason about test state manager.
A simple way to implement a “search” feature in your Node app is to use the database engine to check for the presence of the tokenized search query. In my case, I’m using MySQL with the Sequelize ORM and needed to add an e-commerce-like search form with product results for a client.
Vue loads content asynchronously, which means that Google’s crawlers won’t pick up your site for indexing. That is, until you give them a rendered version to see. We’re going to discuss a common way to serve content properly for crawlers here, called “Prerendering”.
Ghost… I tried. Trust me, I really did. I stuck with you for over 3 years, developed a custom theme on you, hacked around any shortcomings you had. But yesterday, I had to give up. Trying to upgrade from 0.11.x to 1.x had to be one of the most annoying experiences I’ve had recently and enough is enough. It’s not me, it’s you.