Three methods for measuring GenAI adoption and impact

How telemetry metrics, surveys, and experience sampling can be used to understand GenAI adoption and best use cases, and track the impact AI-based tools are having.

Mar 08, 2024

This is the latest issue of my newsletter. Each week I share research and perspectives on developer productivity.

Today’s newsletter is an excerpt from a guide I just published on using data to inform the rollout of GenAI tools and measure their impact on productivity. Go here to download the full PDF guide.

GenAI is generating a lot of excitement right now, but it also brings challenges. Chief among them is determining the tangible impact of GenAI on developer productivity. Leaders need this information to inform their investments, but often lack the necessary signals or measurements. Additionally, some organizations are seeing lower developer adoption than expected, and are looking to understand why this is happening and how to address it.

The common thread among these challenges is that it’s difficult for leaders to get useful feedback, signals, and measurements on how GenAI is impacting developer productivity.

Three methods for collecting insights

Understanding and measuring developer productivity has always been a difficult problem, and adding GenAI into the mix has made it even harder. Thankfully, this problem is solvable, but only when organizations adopt a mixed-methods approach.

Here, we’ll outline three methods for measuring and collecting data on GenAI utilization and impact. We’ll tell you where each approach shines, how it can be used, and the common pitfalls we see organizations running into. Then, in the final section, we’ll show you how to combine the approaches into a holistic insights strategy.

The three approaches we’ll discuss are telemetry metrics, experience sampling, and surveys.

Telemetry metrics

The first place that many organizations look to for data on the productivity impact of GenAI is telemetry metrics from tools like GitHub. Common metrics used to track productivity include pull requests per developer, code review time, and cycle time. Some organizations counterbalance these measures with metrics like number of incidents, to ensure that GenAI-fueled speed increases don’t come at the cost of quality.
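As a rough sketch of how metrics like these might be derived, the Python below computes pull requests per developer and a median cycle time from a list of pull request records. The field names (author, opened_at, merged_at) and the sample data are hypothetical, and cycle time is assumed here to mean the time from a pull request being opened to being merged. A real export from GitHub or another tool will look different.

```python
from collections import Counter
from datetime import datetime
from statistics import median

# Hypothetical pull request records; the field names are illustrative only.
pull_requests = [
    {"author": "ana", "opened_at": datetime(2024, 3, 1, 9), "merged_at": datetime(2024, 3, 1, 17)},
    {"author": "ana", "opened_at": datetime(2024, 3, 2, 9), "merged_at": datetime(2024, 3, 3, 9)},
    {"author": "ben", "opened_at": datetime(2024, 3, 2, 10), "merged_at": datetime(2024, 3, 2, 14)},
]


def prs_per_developer(prs):
    """Count merged pull requests for each author."""
    return Counter(pr["author"] for pr in prs if pr["merged_at"] is not None)


def median_cycle_time_hours(prs):
    """Median hours from opened to merged (one possible definition of cycle time)."""
    durations = [
        (pr["merged_at"] - pr["opened_at"]).total_seconds() / 3600
        for pr in prs
        if pr["merged_at"] is not None
    ]
    return median(durations)


print(prs_per_developer(pull_requests))
print(median_cycle_time_hours(pull_requests))
```

The median is used rather than the mean so that a few long-running pull requests don’t dominate the number; that is a design choice, not a requirement.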

Telemetry metrics are a useful way to get a high-level gauge of how developer output and activity levels are being affected by GenAI. Many organizations observe small but noticeable increases in their metrics, ranging from 5-10%. Other organizations, however, see little or no change in the numbers at all, leaving leaders concerned about the large investments they’re making in GenAI.

The challenge we’ve seen many leaders run into with telemetry metrics is that these metrics often don’t tell a clear or compelling story on their own. There’s skepticism around whether GenAI tools are the direct driver of fluctuations in developer activity levels. And these metrics alone don’t provide a concrete picture of how GenAI tools are being utilized to realize the benefits.

Telemetry metrics can’t answer some important questions like: How much of developers’ time is actually being saved thanks to GenAI tools? How are developers using these tools? What are the most beneficial use cases for GenAI tools that can be taught to the rest of the developers?

Experience sampling

Experience sampling is not quite as familiar as the other methods we’re discussing. In technical terms, experience sampling refers to a set of data collection methods for gathering systematic self-reports of behaviors or experiences as they occur in the individual’s natural environment.

Applied to developer productivity and GenAI, experience sampling involves taking a continuous random sample of developers as they complete tasks, and surveying or interviewing them in real time to understand how they’re using GenAI tools and what benefits they’ve realized.
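One simple way to picture the sampling step is sketched below in Python: each completed task has a fixed chance of triggering a short prompt to the developer. The event shape, the select_events helper, and the 20% sampling rate are all assumptions made for illustration, not a description of any particular tool.

```python
import random


def select_events(events, sampling_rate, seed=None):
    """Return the subset of completed-task events that should trigger a prompt.

    Each event is chosen independently with probability `sampling_rate`.
    Passing a seed makes the selection reproducible, which helps when testing.
    """
    rng = random.Random(seed)
    return [event for event in events if rng.random() < sampling_rate]


# Hypothetical stream of completed tasks (for example, merged pull requests).
completed_tasks = [{"task_id": i, "developer": f"dev-{i % 5}"} for i in range(50)]

to_prompt = select_events(completed_tasks, sampling_rate=0.2, seed=42)
print(f"Prompting on {len(to_prompt)} of {len(completed_tasks)} tasks")
```

Sampling per task rather than per developer spreads prompts across the kinds of work being done; in practice you would likely also cap how often any one developer is asked.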

Experience sampling is a powerful data collection method that can provide your organization with two key insights that are difficult to obtain otherwise.

The first insight is concrete time savings, or ROI, which is critical for leaders but otherwise difficult to measure. Telemetry metrics and traditional surveys can only provide high-level numbers, whereas experience sampling can tell you how many minutes or hours were saved on specific development tasks thanks to GenAI. From this, organizations can extrapolate total estimated ROI in terms of time and dollars.
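The extrapolation can be sketched as simple arithmetic. The Python below is a minimal illustration under stated assumptions: the sampled minutes saved, the tasks per developer per week, the headcount, the hourly cost, and the 48 working weeks per year are all hypothetical inputs you would replace with your own figures.

```python
from statistics import mean


def estimate_annual_savings(minutes_saved_samples, tasks_per_dev_per_week,
                            developers, hourly_cost, weeks_per_year=48):
    """Extrapolate sampled per-task time savings to annual hours and dollars.

    Assumes the sampled tasks are representative of all tasks, which is a
    strong assumption worth checking before sharing the result.
    """
    avg_minutes_saved = mean(minutes_saved_samples)
    total_minutes = avg_minutes_saved * tasks_per_dev_per_week * developers * weeks_per_year
    hours_saved = total_minutes / 60
    return hours_saved, hours_saved * hourly_cost


# Hypothetical inputs for illustration only.
hours, dollars = estimate_annual_savings(
    minutes_saved_samples=[10, 20, 30],
    tasks_per_dev_per_week=5,
    developers=100,
    hourly_cost=75,
)
print(f"Estimated savings: {hours:,.0f} hours, ${dollars:,.0f} per year")
```

Because the result scales linearly with every input, it is worth reporting a range based on low and high estimates rather than a single figure.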

The second key insight gained through experience sampling is how exactly developers are using GenAI tools to positive effect. This is key for driving adoption: early adopters in your organization are likely to be self-driven and discover use cases on their own, but for everyone else, simply dropping a tool like GitHub Copilot on them isn’t going to lead to adoption or positive results. To achieve successful adoption, organizations must provide guidance around practical and beneficial use cases, as well as proactively identify gaps and opportunities for further tooling improvements.

Experience sampling comes with great reward, but is also the most challenging of the discussed methods to implement. We’ve seen organizations build sophisticated tooling to deploy event-sampled data collection campaigns, and the required duration for these studies can be a point of friction for organizations that want complete answers immediately.

Surveys

Surveys are a powerful tool for capturing measurements and feedback about GenAI. In particular, surveys are highly useful for measuring developer adoption, satisfaction, and self-reported productivity.

Most organizations, for example, don’t have individual-level telemetry data on how regularly developers are using GenAI tools for specific types of tasks. Periodic surveys that ask developers directly about their level of utilization for different types of tasks can provide fast and reliable data points.

Similarly, while telemetry metrics provide one lens into developer productivity, self-reported measures of satisfaction and productivity can tell a different side of the story about the benefits of GenAI in your organization (for example, we’ve seen GenAI have direct, measurable benefits on developer fulfillment and ease of completing development tasks).

The difficulties we see organizations face when it comes to surveying aren’t a surprise. Designing proper surveys is always a challenge, especially under the time pressures many leaders face to collect data. The periodic nature of surveys means that timing is important, and participation rates must be high enough for reliable insights to be drawn from responses.
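To make the survey measures concrete, here is a small Python sketch that computes a participation rate and the share of respondents who report regular GenAI use for each task type. The response format, the task names, and the choice to count "daily" and "weekly" as regular use are hypothetical choices made for this example.

```python
# Answers that count as "regular" use; an assumption for this sketch.
REGULAR_USE = {"daily", "weekly"}

# Hypothetical survey responses: one dict per respondent, task type -> frequency.
responses = [
    {"writing tests": "daily", "code review": "never"},
    {"writing tests": "weekly", "code review": "weekly"},
    {"writing tests": "never", "code review": "never"},
    {"writing tests": "daily", "code review": "monthly"},
]


def participation_rate(responses, invited):
    """Fraction of invited developers who responded."""
    return len(responses) / invited


def adoption_by_task(responses):
    """Share of respondents reporting regular GenAI use, per task type."""
    task_types = {task for response in responses for task in response}
    return {
        task: sum(r.get(task) in REGULAR_USE for r in responses) / len(responses)
        for task in task_types
    }


print(participation_rate(responses, invited=10))
print(adoption_by_task(responses))
```

Tracking the participation rate alongside the adoption numbers makes it easier to judge how much weight each survey round can bear.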

Putting it all together

We’ve outlined three methods for collecting data, and discussed the unique insights each can provide. Telemetry metrics are primarily useful for quantifying the impact of GenAI on developer output. Experience sampling is most useful for quantifying the ROI of these tools and their specific use cases. Surveys are best for measuring adoption and satisfaction with these tools, and developers’ self-reported productivity as a result of using them.

Organizations get the best results by applying all three methods together, which gives the fullest picture of how GenAI is being used by developers and how it’s impacting productivity.

Organizations should deploy surveys as soon as possible to establish baselines early, before GenAI tools have been fully rolled out. Running these surveys regularly, about every six to twelve weeks, helps track changes in developer adoption and satisfaction.

At the same time, organizations should keep an eye on their telemetry metrics to spot any changes or trends in developer productivity levels as GenAI tools are adopted. It’s important to dedicate effort to properly cleaning and normalizing data to ensure that you’re getting reliable signals.
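As one example of what cleaning and normalizing might involve, the Python sketch below filters out bot-authored pull requests, normalizes by the number of active developers, and compares a baseline period with a current one. The record shape, the bot account name, and the sample periods are hypothetical; real pipelines usually need more cleaning steps than this.

```python
def prs_per_active_developer(prs, bot_authors=frozenset()):
    """Pull requests per human author who opened at least one PR in the period."""
    human_prs = [pr for pr in prs if pr["author"] not in bot_authors]
    active_authors = {pr["author"] for pr in human_prs}
    if not active_authors:
        return 0.0
    return len(human_prs) / len(active_authors)


def percent_change(baseline, current):
    """Relative change from baseline to current, as a percentage."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero to compute a percent change")
    return (current - baseline) / baseline * 100


# Hypothetical data: "deps-bot" stands in for any automation account.
BOTS = frozenset({"deps-bot"})
baseline_prs = [{"author": a} for a in ["ana", "ana", "ben", "ben", "deps-bot"]]
current_prs = [{"author": a} for a in ["ana", "ana", "ana", "ben", "ben", "deps-bot", "deps-bot"]]

before = prs_per_active_developer(baseline_prs, BOTS)
after = prs_per_active_developer(current_prs, BOTS)
print(f"{before} -> {after} PRs per active developer ({percent_change(before, after):+.1f}%)")
```

Normalizing per active developer keeps headcount changes from showing up as productivity changes, and excluding bots avoids counting automated dependency updates as developer output.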

Lastly, we strongly recommend that organizations run experience sampling studies in focused, four-week intervals. These studies can yield powerful data on the dollar-value ROI of GenAI tools, along with close-up insights into how developers are using GenAI to realize their productivity gains. These learnings can be shared back with other developers and internal platform teams, helping make clear the best use cases for GenAI as well as gaps and opportunities.

Download the full guide here. If you have any questions, reach out to me on LinkedIn.

Who’s hiring right now

Find recent DevEx job postings here.

That’s it for this week. If you found this issue useful, consider sharing it.

-Abi

P.S. We have some live discussions coming up that you can sign up for:

March 13th: Join us for a deep-dive on the market landscape of internal developer portals
March 14th: Learn about LinkedIn’s Developer Productivity and Happiness (DPH) framework

Engineering Enablement
