AI programming tools slow software developers down

Admin Top24News

2 weeks ago

The use of AI tools in developers’ workflow is becoming increasingly common, with large language model-based pair programmers cited (by their vendors, at least) as significantly improving devs’ speed and efficiency.

However, a small, randomised controlled trial of AI tools shows that in large, complex, and important software projects (over a million lines of code), not only do developers using AI take longer than those not doing so, but they retain a false conviction that AI is making them work faster.

The authors of the paper state that evaluations of AI’s effectiveness often don’t consider the context of developers’ work. In large projects, few tasks – such as adding new features or debugging existing code – exist in isolation: developers need an awareness of the larger project so that their work doesn’t affect or break other features.

Agentic benchmarks do not take into account the risks of unvetted code entering large projects, which makes claimed benchmark results inapplicable in many cases.

The randomised controlled trial (RCT) used 16 experienced developers working on large, open-source repositories, who were given real issues to address that would improve the software—bug fixes, new features, and refactoring. In other words, working through the omnipresent list of bugs, feature requests, and general improvements common on every developer’s desk. Participants were chosen at random to use, or not use, an AI tool of their own choosing.

The developers using AI took 19% longer to complete their tasks than the control group, yet still believed they had worked 20% faster than they would have with normal, unaided programming.

In general, developers expected AI tools to cut the time they took to complete tasks by 24%.
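To put those percentages side by side, here is a minimal sketch of the arithmetic. The 100-minute baseline is a hypothetical figure chosen for illustration, not a number from the study, and the sketch assumes that "20% faster" and the 24% expectation both mean a reduction in time taken.

```python
def scaled_minutes(baseline_minutes: float, percent_change: float) -> float:
    """Return the task time after applying a percentage change to a baseline."""
    return baseline_minutes * (100 + percent_change) / 100


# Hypothetical unaided task time (an assumption, not a figure from the trial).
BASELINE_MINUTES = 100.0

# Measured in the trial: AI users took 19% longer.
measured = scaled_minutes(BASELINE_MINUTES, 19)

# Assumption: "20% faster" is read as 20% less time taken.
perceived = scaled_minutes(BASELINE_MINUTES, -20)

# Assumption: the 24% expectation is read as 24% less time taken.
expected = scaled_minutes(BASELINE_MINUTES, -24)

# Gap between what developers believed happened and what was measured.
perception_gap = measured - perceived

print(f"measured:  {measured:.0f} min")
print(f"perceived: {perceived:.0f} min")
print(f"expected:  {expected:.0f} min")
print(f"gap between perceived and measured: {perception_gap:.0f} min")
```

Under these assumptions, a task that would take 100 minutes unaided takes 119 minutes with AI, while the developer believes it took about 80, a gap of 39 minutes.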

The slowdown was present across different task types, contrary both to developers’ expectations and to benchmarks published by AI pair-programming vendors. The authors acknowledged that long-term use of a single tool or platform (Cursor, in this case) may produce better results than the few hours’ use during the trial, as there may be “strong learning effects” on the AI in the long term. They also concede that the technology is relatively nascent, and better results may be possible as such platforms evolve thanks to iterative improvements by the vendor, access to greater learning corpora, and so on.

One of the main reasons found for the slowdown in developer productivity was the need to check, amend, and debug the code output by Cursor, which often involved having to submit multiple queries to solve an initial problem or narrow down resulting output.

Code is inherently less subject to variation in style and structure than prose, and therefore it would seem logical that fewer inaccuracies would be present in an AI’s output compared to the hallucination-prone output of ‘mainstream’ AIs.

According to Baldur Bjarnason, writing in 2023, however, all AI output is hallucination, yet as humans we have a tendency to trust what AIs say. On the other hand, computers and software are literally binary: a solution is either right or wrong. And in the context of large software projects, even code that’s ‘right’ may well be unsuitable, insecure, and created by a system that’s ignorant of critical factors at play elsewhere in the project. While using an AI helper might give developers a perception of efficiency, in many instances its use is arguably unwise, at least where it matters.

A company whose products run much of the working world, Microsoft, has proudly declared that 30-40% of its new code is written by AI. That gives sceptics like Baldur Bjarnason financial hope for the future.

“But… the best is yet to come. In a few years’ time, once the effects of the ‘AI’ bubble finally dissipates… somebody’s going to get paid to fix the cr*p it left behind.”

(Image source: “Spanner in the Works” by Kapungo is licensed under CC BY-NC 2.0)
