A New Study Challenges the Hype

In the rapidly evolving world of software development, AI tools like Cursor Pro and Claude 3.5/3.7 Sonnet have been hailed as game changers, promising to supercharge productivity and streamline coding tasks. However, a recent study from METR (Model Evaluation & Threat Research) presents a surprising finding: for experienced open-source developers working on familiar codebases, AI tools may slow things down. Let’s dive into the findings of this randomized controlled trial (RCT) and explore what they mean for the future of AI in software development.
Published on July 10, 2025, METR’s study, titled "Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity", set out to test how AI tools affect developers in realistic scenarios. Unlike many benchmarks that rely on synthetic tasks or algorithmic evaluations, this study focused on real-world coding tasks, such as bug fixes, feature additions, and code refactoring, in large, mature open-source repositories.
The researchers recruited 16 experienced developers, each with an average of 5 years of experience contributing to repositories boasting over 22,000 stars and 1 million lines of code. These developers tackled 246 tasks, with each task randomly assigned to either allow or disallow AI assistance. When AI was permitted, developers primarily used Cursor Pro with Claude 3.5/3.7 Sonnet, state-of-the-art tools at the time (February to March 2025). The tasks, averaging about two hours each, were recorded via screen captures, and developers self-reported their completion times.
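METR’s exact statistical model isn’t reproduced in the post, but the basic idea behind estimating a percentage slowdown from randomized task times can be sketched in a few lines. The numbers below are invented purely for illustration; only the approach (comparing log completion times across the two randomized conditions) is the point:

```python
import numpy as np

# Invented completion times in minutes -- NOT METR's data.
# Each task was randomly assigned to one of two conditions.
ai_allowed = np.array([150.0, 95.0, 210.0, 130.0, 170.0])
ai_disallowed = np.array([120.0, 80.0, 160.0, 115.0, 140.0])

# Task durations are skewed, so compare log times: the
# exponentiated difference in mean log time is the ratio of
# geometric means, a multiplicative "AI effect" on duration.
log_ratio = np.log(ai_allowed).mean() - np.log(ai_disallowed).mean()
ratio = float(np.exp(log_ratio))

print(f"Time ratio (AI vs. no AI): {ratio:.2f}")
print(f"Implied slowdown: {ratio - 1:.0%}")
```

A real analysis would also adjust for task difficulty and self-report error, which is exactly why the randomized design and screen recordings matter.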
The headline finding? When developers used AI tools, they took 19% longer to complete tasks compared to when they worked without AI. This result flies in the face of both developer expectations and industry hype. Before the study, participants predicted AI would speed them up by 24%, and even after completing the tasks, they estimated a 20% speedup. The reality—a 19% slowdown—reveals a striking gap between perception and actual performance.
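To make that gap concrete, here’s a quick back-of-the-envelope calculation using the article’s own figures: the roughly two-hour average task, the 24% predicted speedup, and the 19% measured slowdown (the 120-minute baseline is just the study’s average, not a fixed constant):

```python
# Back-of-the-envelope: what the percentages mean for an
# average two-hour (120-minute) task from the study.
baseline_min = 120.0

expected = baseline_min * (1 - 0.24)  # 24% predicted speedup
measured = baseline_min * (1 + 0.19)  # 19% measured slowdown

print(f"Expected with AI: {expected:.0f} min")  # ~91 min
print(f"Measured with AI: {measured:.0f} min")  # ~143 min
print(f"Perception gap:   {measured - expected:.0f} min")  # ~52 min
```

In other words, developers expected to save about half an hour per task and instead lost a bit over twenty minutes, a swing of more than fifty minutes between perceived and actual time.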
Why the slowdown? METR’s analysis of screen recordings offers some clues. While AI tools reduced time spent on active coding, testing, and searching for information, developers spent significantly more time prompting AI, reviewing its outputs, and waiting for responses. In some cases, they also experienced “idle/overhead time” with no activity at all. Only 44% of AI-generated code was accepted without modification, meaning developers often had to tweak or debug suggestions, which ate into their time savings.
Several factors likely contributed to the slowdown. The developers’ deep familiarity with their repositories left little room for AI to add value, and the large, mature codebases carried implicit context and quality standards that the tools couldn’t fully access. On top of that, AI suggestions were often unreliable: with fewer than half accepted as-is, the time spent prompting, reviewing, and reworking outputs outweighed the time saved on writing code.
One of the study’s most intriguing findings is the disconnect between developers’ perceptions and the actual outcomes. Despite the measured slowdown, 69% of participants continued using AI tools after the study, suggesting they found the experience less taxing or more enjoyable, even if it wasn’t faster. This aligns with broader productivity research, which shows that self-reported productivity often doesn’t match objective metrics. The reduced cognitive effort of using AI might make developers feel more productive, even when the clock tells a different story.
This perception gap has parallels in other fields. For example, studies have shown that people often overestimate the productivity gains from tools or substances (like Adderall) because they feel more engaged or energized, even when their measured output doesn’t improve. In coding, the satisfaction of seeing AI generate a quick prototype or handle boilerplate code can create an “IKEA effect,” where developers value the results more because they interacted with the tool, even if it took longer overall.
The METR study doesn’t spell doom for AI coding tools—it’s a snapshot of early-2025 capabilities in a specific context. The researchers themselves caution against overgeneralizing, noting that AI might offer greater benefits for less experienced developers, smaller projects, or unfamiliar codebases. They also point out that AI progress is rapid, and newer models (like Claude 4 Opus or Gemini 2.5 Pro, released after the study) could shift the results.
Still, the findings challenge the narrative that AI is a universal productivity booster. They highlight the importance of rigorous, real-world testing over anecdotal hype or synthetic benchmarks. As one participant put it, AI can be a “magic bullet” for tasks like prototyping or handling tedious boilerplate, but it’s not a one-size-fits-all solution. For complex, context-heavy tasks, human expertise still holds an edge.
METR plans to continue refining this methodology to track AI’s evolving impact on software development. Future studies could explore how AI performs with junior developers, greenfield projects, or different tools and models. They might also investigate whether AI’s benefits lie in areas beyond raw speed, like improving code quality or reducing burnout for developers with cognitive challenges, as one participant with ADHD noted.
For now, the study serves as a reality check. Developers and companies banking on AI to revolutionize coding should temper their expectations and invest in training to use these tools effectively. It’s also a reminder to measure actual outcomes, not just vibes. As AI continues to advance, the balance between human expertise and machine assistance will likely shift, but for now, don’t expect miracles.
For the full details, check out the study on METR’s website: Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity (https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/).
What do you think? Have you found AI tools to be a help or a hindrance in your coding projects? Let’s keep the conversation going!
Ready to transform your software quality strategy? Visit NUCIDA to learn more about artificial intelligence solutions. The future of intelligent quality is here.
Want to know more? Watch our YouTube video, Ignite Your Business with Three Strategies in AI, to take your business processes to the next level.
Pictures from pixabay.com and NUCIDA Group
Article written and published by Torsten Zimmermann