On the Use of AI Code

In 2024, many conversations in programming circles revolve around "AI" and its impact on coding. There are people like Nvidia’s CEO Jensen Huang and others proclaiming the end of manual programming for humans. Video after video, blog after blog, celebrities and would-be celebrities from programming circles are spelling out the demise of the profession.

Bleak Reality

Behind all this fuss, researchers paint a bleak picture. A recent study by GitClear shows that code quality indicators are falling rapidly in codebases where AI assistants are used. Year-on-year numbers from GitHub repositories for 2022 to 2023 show that the number of moved (refactored) lines fell by 17%, while the number of copied lines rose by 11%. Even more alarming, code churn (the number of lines added and then deleted in less than two weeks) increased by an incredible 39%.
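To make the churn definition concrete, here is a minimal sketch of how such a metric could be computed. It is not GitClear's actual methodology: the `LineRecord` type, the `churn_rate` function and the sample history are all hypothetical, and a real tool would have to derive this per-line information from git history. The sketch also reports churn as a share of all added lines, where the definition above speaks of a raw count.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumption: "churned" means removed less than two weeks after being added.
CHURN_WINDOW = timedelta(weeks=2)


@dataclass
class LineRecord:
    """Hypothetical record of one line of code's lifetime."""
    added_at: datetime
    removed_at: Optional[datetime] = None  # None means the line still exists


def churn_rate(lines: list[LineRecord], window: timedelta = CHURN_WINDOW) -> float:
    """Return the share of added lines that were removed within the window."""
    if not lines:
        return 0.0
    churned = sum(
        1
        for line in lines
        if line.removed_at is not None
        and line.removed_at - line.added_at < window
    )
    return churned / len(lines)


# Made-up sample history: four lines added on the same day.
start = datetime(2023, 1, 2)
history = [
    LineRecord(start),                             # still present
    LineRecord(start, start + timedelta(days=3)),  # churned
    LineRecord(start, start + timedelta(days=30)), # removed, but too late to count
    LineRecord(start, start + timedelta(days=13)), # churned
]

print(churn_rate(history))  # 0.5
```

Even in this toy form, the metric only tells you that lines were short-lived, not why, which is part of the reason such numbers need careful interpretation.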

Large-scale studies that rely solely on git diff without isolating other variables or understanding the underlying mechanisms are prone to noise in their data and should be taken with a grain of salt. Nevertheless, the effects reported in this study are so large that an error of a few percentage points would not change the conclusion: AI assistants make code worse. GitClear's data also suggests that the period from 2023 to 2024 will be much worse as AI assistants proliferate.

Another study has shown that developers who use AI code assistants are far more likely to produce unsafe code across problem types, while they are at the same time significantly more confident about the correctness of their solution, making them less likely to revisit and fix the problems at a later stage.

Weighing Benefits

As I am writing this, the most popular coding assistant is still Copilot. Its site boldly claims that coding with Copilot is 55% faster. Of course, no source is given for this claim (how are you even supposed to measure this in a repeatable and isolated way?). But even if we take this claim at face value, speed is not the same as productivity. More code in less time does not mean that the end product will be finished faster. The numbers show that any short-term gain in speed is more than offset by maintenance costs over time.

“Measuring programming progress by lines of code is like measuring aircraft building progress by weight.”
- Bill Gates

Since the code assistants are mostly wrong, their suggestions should mostly be rejected. GitClear's research clearly shows that less experienced developers are much more likely than their more experienced counterparts to accept the code returned by AI coding assistants. This in turn means that much of the code written by the AI has to be rejected at the code review stage or, failing that, someone knowledgeable and experienced has to track down the substandard code and fix it a week or two later when the issues surface, increasing code churn and slowing down work. By using code assistants, organisations speed up the work of their most junior developers at the expense of their most experienced engineers, who have to spend more time on code reviews and maintenance.

Next Big Step

There is a sentiment that LLMs are the next big step and that, while they are not yet perfect, there is no way around this future. Get on the train or be thrown under the wheels. There's just one small problem. Large language models are only as good as (or slightly worse than) the data they are trained on. They have no real concept of correctness or truth. Basically, they can only ever reshuffle the data they are fed. Since all new LLMs are now trained on data that contains content created by their predecessors, we are heading for model collapse: a situation where every new LLM is just a bad copy of a bad copy of a bad copy. In truth, LLMs have probably already reached their peak in terms of usefulness.

This is not to say that code assistants are not the future. In recent years, static and dynamic code checkers have improved significantly and have become so ubiquitous that many modern languages already come with one out of the box. IDEs also have snippet and autocomplete tools that significantly reduce the time spent on the boring parts of the code and speed up coding. The future is here; it's just not as exciting as the LLM evangelists would have you believe.