Copilot productivity gains at a Fintech company with 2K+ engineers
Independent data on Copilot productivity impact.
This is the latest issue of Engineering Enablement, a weekly newsletter covering the data behind world-class engineering organizations.
There's been a wave of research on the impact of AI tools like Copilot, but much of it has been driven by Microsoft and other vendors selling these solutions. So we thought it would be useful to take a look at real-world data and offer an independent perspective.
Earlier this month, we analyzed data from a US-based fintech company with 1,000 to 2,000 engineers. We’ve been working closely with this company to help them understand the impact of their Copilot rollout, particularly how it’s affecting productivity. This has been a pressing question for their CFO.
To answer this question, they’ve focused on two of the metrics we recommend: the number of pull requests merged by Copilot users versus non-users, and self-reported time savings from Copilot users. Below is an overview of what we’ve seen in the data.
Active Copilot users merge 24% more PRs than non-active users
One area we take a close look at with companies rolling out Copilot is the lift in PRs merged. As we know, the number of pull requests doesn’t tell the full story—for example, it doesn’t tell you how easy or difficult those PRs were. However, we and many other companies consider it a useful signal, which is why we’ve included it in the DX Core 4.
In this analysis, the developer population was divided into two groups: Active and Non-active Copilot users, with “Active” being defined as developers who use Copilot at least once per week.
Over a 90-day period during the past quarter, active Copilot users had 24% more PRs merged per week compared to non-active users. A 24% lift is substantial, and a result this company was happy to see.
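For readers who want to run the same comparison on their own data, here is a minimal sketch of the lift calculation. The per-developer counts are made up and chosen only so the arithmetic lands on 24%; the function names and the choice to average per developer are assumptions, not the company's actual method.

```python
from statistics import mean

# 90-day observation window, as described in the analysis.
WEEKS_IN_WINDOW = 90 / 7


def weekly_pr_rate(merged_pr_counts, weeks=WEEKS_IN_WINDOW):
    """Mean PRs merged per developer per week for one group."""
    return mean(count / weeks for count in merged_pr_counts)


def relative_lift(active_rate, non_active_rate):
    """Percentage by which the active group's rate exceeds the non-active group's."""
    return (active_rate - non_active_rate) / non_active_rate * 100


# Illustrative 90-day merged-PR counts per developer (NOT the company's data).
active_counts = [31, 28, 35, 30]
non_active_counts = [25, 24, 26, 25]

active_rate = weekly_pr_rate(active_counts)
non_active_rate = weekly_pr_rate(non_active_counts)
lift = relative_lift(active_rate, non_active_rate)

print(f"Active: {active_rate:.2f} PRs/week, non-active: {non_active_rate:.2f} PRs/week")
print(f"Lift: {lift:.0f}%")
```

Averaging per developer, rather than dividing total PRs by headcount, keeps the comparison meaningful when the two groups differ in size.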
A common critique of PR throughput metrics is that not all PRs are equal—some are larger or more complex than others. To address this, the analysis also includes the mean PR size, measured by the lines of code changed. This shows that active Copilot users not only merged more PRs, but their PRs were also larger on average.
The takeaway here is that this company is seeing a measurable increase in output, as measured by pull requests. This has helped them feel more confident in their investment in Copilot.
While a 24% lift in pull requests is a notable finding, it’s worth noting that many organizations do not see similar results. Some have even observed lower PR throughput associated with Copilot usage. We’ll explore some reasons for this in future reports, but it highlights the need to examine additional data points.
28% of developers using Copilot report saving at least 1 hour per week
As mentioned earlier, pull requests don’t tell the full story. There could be other factors contributing to the increase we saw—for example, perhaps the people adopting Copilot are generally more advanced or more active coders. It’s possible that what we’re seeing is not causation, but instead a correlation.
Another way this company is assessing Copilot’s impact is through self-reported time savings.
In their latest quarterly survey, they polled all active Copilot users and found that 28% report saving at least 1 hour per week, and 11% report saving 2 or more hours per week. These findings are generally consistent with the lift they’re seeing in PR output.
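Tallying survey responses into "at least N hours" shares takes only a few lines. The sketch below uses made-up responses, and the function name is hypothetical; it assumes each respondent reports a single number of hours saved per week.

```python
def share_saving_at_least(hours_saved, threshold):
    """Percentage of respondents reporting at least `threshold` hours saved per week."""
    if not hours_saved:
        raise ValueError("no survey responses")
    hits = sum(1 for hours in hours_saved if hours >= threshold)
    return hits * 100 / len(hours_saved)


# Illustrative self-reported hours saved per week (NOT the company's data).
responses = [0, 0.5, 1, 0, 2, 0.25, 0, 1.5, 0, 0.5]

one_plus = share_saving_at_least(responses, 1)
two_plus = share_saving_at_least(responses, 2)
print(f"{one_plus:.0f}% save at least 1 hour, {two_plus:.0f}% save 2 or more")
```

Note that the two shares are cumulative: anyone counted in the "2 or more hours" group is also counted in the "at least 1 hour" group.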
Of course, self-reported metrics like this one have their own challenges, namely that it’s hard for people to provide highly precise reports on the time they’ve saved. One of the ways to overcome this is to collect more in-the-moment data points through methods such as experience sampling.
In summary, while there's been plenty of research from vendors, we're seeing less coming directly from companies about the real-world impact of AI tools like Copilot. The data we’ve explored in this issue offers a strong example of a company experiencing a meaningful lift in output and material time savings. As Copilot continues to mature, and as developers become more skilled at leveraging the tool, we’d expect its impact to only grow.
Who’s hiring right now
Here’s a roundup of Developer Productivity job openings. Find more open roles here.
Adobe is hiring a Sr Engineering Manager - DevEx | San Jose
SiriusXM is hiring a Staff Software Engineer - Platform Observability | US
Vercel is hiring an Engineering Manager - Build and Deploy | UK
Snowflake is hiring a Senior Engineer - Developer Productivity | Bellevue
Wix is hiring a Software Team Lead - DevEx | Tel Aviv
That’s it for this week. Thanks for reading.
-Abi
I'm still curious whether that increased output had an impact on the "number of features delivered", "bugs fixed", etc.
Given the size of that org, a basic convention such as including a "taskId" in every PR would help reveal more about output beyond PR counts.
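As a rough sketch of that idea: if PR titles carried a task ID, PRs could be grouped by the task they belong to. The ID format below (an uppercase project key, a hyphen, and a number, such as PAY-101) is an assumed convention for illustration, and the titles are invented.

```python
import re

# Assumed convention: a task ID looks like "PAY-101" (project key, hyphen, number).
TASK_ID = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def task_ids(pr_title):
    """Return every task ID found in a PR title."""
    return TASK_ID.findall(pr_title)


def distinct_tasks(pr_titles):
    """Set of distinct task IDs referenced across a list of PR titles."""
    return {task_id for title in pr_titles for task_id in task_ids(title)}


# Invented PR titles for illustration.
titles = [
    "PAY-101: add retry to ledger sync",
    "PAY-101: follow-up test fixes",
    "RISK-7 tighten limits check",
    "bump dependencies",
]

tasks = distinct_tasks(titles)
untagged = [title for title in titles if not task_ids(title)]
print(f"{len(titles)} PRs -> {len(tasks)} distinct tasks, {len(untagged)} untagged")
```

Counting distinct tasks instead of PRs would show whether more merged PRs translate into more completed work items or just into smaller slices of the same work.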