Thanks for sharing the results on PR throughput increases. The ~10% productivity gains are also in line with what we've found from talking to engineering organizations in Switzerland. While this is nowhere near the 30-60% improvements that some consultants and vendors claim, it's still significant and valuable.
I shared more details and thoughts on this last week on my Substack: https://flowlabs.substack.com/p/4-todays-3060-ai-productivity-gains
Thanks for sharing! Where can I read the full research?
The report doesn’t talk about PR size or feature delivery rate.
What I’ve seen is that PR size has increased as well. But the impact on feature delivery is not clear.
We’re also seeing this: larger PR sizes and lower engagement with PRs (comment density).
I'm surprised (but not completely) to see most of the comments pushing back on the results.
I agree that more background data would be helpful, and that the state of AI-assisted tooling in 2024 was very different from today (BTW Sergey, GitHub Copilot launched in 2021, Cursor was first released in 2023, etc.), but what serious study, or anything more than anecdotal evidence, is anyone putting forward as a counterargument?
If your pushback is based on your own perceptions, I have bad news: perceptions have been shown in other studies, such as the METR study, to be completely unreliable.
Either most people have fallen for the marketing of companies that want you to believe their tools are much better than they really are in practice, or there is so much desire to believe in a 10x improvement that anything suggesting it isn't possible gets discarded as "bad propaganda" or a "junk study".
To all of you sharing this sentiment: please share studies conducted by independent entities proving this one (or all the other ones) wrong. Let's be scientific about the issue and forget about whether we like team A or team B.
And to be clear, I'm in no way affiliated with Engineering Enablement or GetDX.
I would very much like to see the distributions across orgs/within an org, not just the industry averages. I don't think this is telling the whole story.
And why wouldn't that compound? Cos 0.7% per *month* is how we get exponential productivity gains no?
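For what it's worth, the arithmetic behind that remark is easy to sketch. This is my own back-of-envelope, assuming the ~10% gain accrued over the roughly 15-month study window (Nov 2024 through Feb 2026); the report may define the window differently:

```python
# Back-of-envelope: what monthly rate does a ~10% gain over ~15 months
# imply, and what does that rate compound to over longer horizons?
total_gain = 0.10   # ~10% PR throughput increase reported by the study
months = 15         # Nov 2024 through Feb 2026, roughly

# Solve (1 + monthly)^months = 1 + total_gain for the monthly rate.
monthly = (1 + total_gain) ** (1 / months) - 1
print(f"implied monthly gain: {monthly:.2%}")  # ~0.64% per month

# Compounding that same monthly rate over longer horizons.
for horizon in (12, 36, 60):
    compounded = (1 + monthly) ** horizon - 1
    print(f"compounded over {horizon} months: {compounded:.1%}")
```

So even if the effect kept compounding at the same rate, it would take about five years to reach ~46%, which is a long way from 10x.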
This doesn't take into account that there's more than one kind of engineering. I think AI will have a different impact on engineers working on mission-critical software like IT support, product support, and maintaining legacy code, which seems to be the focus of this study. There are also engineers doing research, proof of concept or rapid prototyping where the goals are to explore and validate ideas. AI makes it much easier to try out ideas and discard the ideas that don't work out, long before you even get to the point of needing a code review.
“AI usage” is too broad a term. It would be more useful to understand how engineers are using AI in their daily work. Are they using it only for autocomplete, to ask questions about the codebase, to refactor code, to perform reviews, to brainstorm ideas, or as part of workflows that generate an entire feature? All of these count as AI usage, but they have very different levels of impact on the code that is actually produced.
It would also be worth asking whether PRs created with the help of AI result in better code quality, correctly implement the requirements, and truly solve the intended problem.
Grouping everything under a single “AI usage” category makes it difficult to measure its real impact accurately.
This feels much closer to reality. AI can speed up parts of coding, but the bigger bottlenecks are still review, planning, and coordination. That is why 10 percent matters, but 10x still feels more like marketing than daily work. 
Is there a full version of the study with methodology, etc. included? How did you count "AI usage"? Is this LLM code generation, linting, search? Do you have breakdowns of how much time is spent in each area of development?
OK, this is just another junk study. Anybody who was paying attention can see that immediately.
"As part of this study, we analyzed data from 40 companies between November 2024 through February 2026 to track whether teams are shipping more pull requests as AI adoption increases.
We found that, during this time, AI usage increased significantly—by an average 65%."
There was no real usable coding AI in 2024. If usage only increased by 65%, they did not actually adopt AI. I would start paying attention if they saw a 900% increase.
This 10% PR throughput increase might not even be caused by AI, because they did not really adopt AI.
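A relative "+65%" is hard to judge without the absolute baseline, which the quoted excerpt doesn't give. A quick sketch with made-up baselines (purely illustrative, not figures from the report):

```python
# Hypothetical starting adoption levels (assumed for illustration only):
# the same "+65% relative increase" means very different things depending
# on where usage started in Nov 2024.
results = {}
for baseline in (0.02, 0.10, 0.40):   # assumed fraction of work AI-assisted
    after = baseline * 1.65           # +65% relative increase
    results[baseline] = after
    print(f"{baseline:.0%} -> {after:.0%} (+65% relative)")

# For comparison, the +900% increase mentioned above would mean
# going from e.g. 2% to 20%.
print(f"{0.02:.0%} -> {0.02 * 10:.0%} (+900% relative)")
```

In other words, whether 65% growth signals real adoption depends entirely on the absolute starting point, which is exactly the kind of methodology detail the full report would need to spell out.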
So if they do trunk-based development, they don't appear in the metrics?