Measuring Developer Flow and Friction
Google’s 3-step approach to developing quantitative metrics for developer friction and flow.
This is the latest issue of my newsletter. Each week I cover the latest research and perspectives on developer productivity.
This week I read the latest installment of the “Developer Productivity for Humans” series by the Engineering Productivity research team at Google. Previous editions were authored by Ciera Jaspan and Collin Green, whom I interviewed here. This new paper, Measuring Flow, Focus, and Friction, is authored by Adam Brown, Alison Chang, Ben Holtz, and Sarah D’Angelo, who also work with Ciera and Collin.
This paper describes the research team’s approach to developing quantitative metrics, specifically focusing on how they’ve developed metrics for developer flow and friction. While the specific metrics Google uses may be out of reach for most companies, their high-level approach to developing metrics is more broadly applicable.
My summary
It can be easy to lean into the most available signals when measuring developer behavior, but these signals often lack important context. For example, ‘number of builds’ is an easily accessible metric, but it fails to capture how the work is happening. Relying on a metric like number of builds, we can’t see whether developers are rapidly iterating in a flow state, or whether they’re experiencing friction at every step. These aspects help us understand the larger story of developer productivity.
“A slow build might be the proverbial tree falling in an empty forest: if the developer doesn’t notice the delay, then it may not constitute friction… While a single slow build is easy to detect quantitatively, hastily labeling it as friction runs the risk of crying wolf and claiming friction without considering the developer’s judgment or experience.”
Note: This reminds me of an example shared by Thomas Khalil, Head of Platform Engineering at Trivago, on my podcast. He said they surveyed developers and found that for some groups, builds were taking nearly an hour, but the developers felt that was reasonable. At the same time, another group of developers were working with builds that took less than 10 minutes, and they were very unhappy with that. The point is, we need both qualitative and quantitative insights to get the full story about what’s impacting developers.
Instead of relying on easily available data, the researchers propose a more human-centric approach to developing quantitative metrics. Their approach allows leaders to contextualize data, and also track the impact of improvement efforts. It includes the following steps:
Ask developers about their experience to identify areas of friction. The first step is to gather data directly from developers using surveys or interviews to get a better understanding of their experience.
Develop quantitative measures to better understand the areas of friction previously identified. The next step is to think about how the experiences they want to examine further manifest in system-based data. The goal here is to develop and then refine metrics that accurately represent those experiences.
Validate that quantitative measures accurately correspond to developers’ experiences. The final step is to validate the metrics by comparing them to additional feedback from developers. This ensures that the metrics are properly instrumented and accurately correspond to the specific experiences of developers that are being examined.
The rest of the paper walks through two examples that show how Google applies this approach: the first example describes how they developed quantitative metrics for the developer experience of flow, and the second describes how they developed metrics for the developer experience of friction.
Flow and friction were selected because they are important inputs to developer productivity. Developers are happier and more productive when they can work without facing friction and when they frequently experience a state of flow.
Measuring flow and focused work
Achieving a state of flow has been defined as the “optimal experience” and is often linked to feeling productive, focused, and accomplishing goals. However, it’s difficult to quantify — flow is a personal experience — so, after surveying and interviewing developers about their experience of flow, the researchers decided to develop a metric for focused work.
The justification for measuring focused work: developers achieve flow states only while they are doing focused work. The researchers recognize that developers can do focused work without achieving flow, so focused work is a broader measure than flow, but it can still serve as a helpful proxy.
To measure “focused time,” the researchers developed a metric based on task similarity: when developers perform a series of related actions within a given period of time, that is recognized as focused work. The metric was built using a natural language processing technique typically used to measure how words in a sentence are related. The researchers applied this technique to the work logs from the different tools developers use. By looking at the sequence of tasks in these logs, they could determine whether developers were concentrating on similar tasks (indicating focus) or switching between unrelated tasks (suggesting a lack of focus), and thereby identify when developers were likely doing focused work.
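The paper doesn’t publish the algorithm itself, so as a rough illustration only, here’s a toy sketch of the idea: represent each log event as a vector, then label consecutive events as part of a focused span when their similarity crosses a threshold. The term-count “embedding,” the threshold of 0.5, and the sample log are all made up for illustration.

```python
from math import sqrt

def cosine(a, b):
    # cosine similarity between two sparse term-count vectors (dicts)
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def vectorize(event):
    # toy "embedding": term counts from a log event's text; a real
    # system would use learned embeddings of tool-log events
    counts = {}
    for term in event.lower().split():
        counts[term] = counts.get(term, 0) + 1
    return counts

def focused_spans(log, threshold=0.5):
    """Label each consecutive pair of log events as focused (True)
    when their similarity exceeds the threshold."""
    return [
        cosine(vectorize(prev), vectorize(curr)) >= threshold
        for prev, curr in zip(log, log[1:])
    ]

log = [
    "edit auth service code",
    "build auth service",
    "test auth service",
    "review design doc",
]
print(focused_spans(log))  # → [True, True, False]
```

The last transition is flagged as a context switch because the “review design doc” event shares no terms with the preceding test event; a run of True labels corresponds to a candidate focused-work span.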
The researchers validated the metric by comparing it to daily self-reported data from developers about whether they felt “in flow or focused” as they completed each of their tasks throughout the workday. The researchers found high agreement between their metric for focus time and the survey data.
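The paper reports “high agreement” without specifying the statistic. One common way to quantify it is a raw agreement rate, optionally chance-corrected with Cohen’s kappa; the sketch below assumes boolean per-task labels, and all data is made up.

```python
def agreement_rate(metric_labels, self_reports):
    # fraction of tasks where the focus metric and the developer's
    # self-report assign the same boolean label
    matches = sum(m == s for m, s in zip(metric_labels, self_reports))
    return matches / len(metric_labels)

def cohens_kappa(metric_labels, self_reports):
    # agreement corrected for chance, for two boolean label lists
    n = len(metric_labels)
    observed = agreement_rate(metric_labels, self_reports)
    p_m = sum(metric_labels) / n  # fraction the metric labels "focused"
    p_s = sum(self_reports) / n   # fraction self-reported as "focused"
    expected = p_m * p_s + (1 - p_m) * (1 - p_s)
    return (observed - expected) / (1 - expected)

# hypothetical per-task labels for one developer's day
metric = [True, True, False, True, False]
survey = [True, True, False, False, False]
print(agreement_rate(metric, survey))  # → 0.8
```

Kappa is worth checking alongside the raw rate: if most tasks are labeled “focused” by both sources, a high agreement rate can be achieved by chance alone.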
One of the ways the focus time metric is used today at Google is to measure the impact of calendar management and company-wide interventions. For example, the researchers can answer these questions: do no-meeting weeks enable developers to experience more time in flow or focus? Or, can condensing meetings and supporting focus-time blocks improve developer productivity?
Measuring friction
The researchers aimed to measure “developer friction” to understand what hinders developer productivity and happiness. Friction is a broad topic, but the researchers wanted to develop a metric that could give a high-level understanding of the amount of friction experienced across groups. For example, the metric would allow them to say, “50% of developers experienced friction last week.”
They began by surveying a sample of developers at the end of their workday for two weeks. Developers were asked about whether they experienced friction, what they were working on when this friction occurred, and how they resolved the friction. Through this, they found some common areas of friction, which were associated with build and test latency, flaky tests, and issues with code changes being blocked due to continuous integration failures. These results enabled the research team to explore whether they could develop a “developer friction” metric that included these areas of friction as components.
Next, the researchers compared developers’ feedback with existing friction metrics (e.g., average build latency, number of flaky tests). They initially noticed these metrics didn’t always match developers’ experiences; however, the match improved when the metrics were refined and aggregated at the developer level. Further, they found that the quantitative and qualitative data sometimes differ: for example, in some cases, developers didn’t perceive an area of friction as an issue.
Ultimately, the researchers decided to develop a metric that tracks both how developers feel and the quantitative metrics for measuring friction. Specifically, they look at the areas of friction where higher values are negatively related to two sources of information:
developer sentiment items from the quarterly survey that are related to friction (e.g., lower ratings of satisfaction with code complexity or project velocity)
productivity metrics (e.g., fewer change lists, longer iteration loops).
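The paper doesn’t spell out the selection rule computationally, but the description suggests a sign check: a candidate signal qualifies as a friction component only if higher values track lower satisfaction and lower output. Here is a toy sketch of that check; the signal name, correlation choice (Pearson), and all data are made up for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    # Pearson correlation coefficient between two numeric series
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def qualifies_as_friction_component(signal, sentiment, output):
    """Keep a candidate signal only if higher values are negatively
    related to both survey sentiment and a productivity metric."""
    return pearson(signal, sentiment) < 0 and pearson(signal, output) < 0

# hypothetical per-developer weekly data
flaky_test_rate = [0.1, 0.3, 0.5, 0.7]  # candidate friction signal
satisfaction = [4.5, 4.0, 3.2, 2.8]     # survey rating, 1-5 scale
changelists = [12, 10, 7, 5]            # change lists merged
print(qualifies_as_friction_component(flaky_test_rate, satisfaction, changelists))  # → True
```

A real pipeline would add significance testing and many more observations, but the shape of the decision is the same: only signals that line up with both the sentiment data and the output data make it into the aggregate friction metric.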
The developer friction metric can be used to identify areas for improvement. For example, what workflows contribute to the most friction? Or, what tooling improvements in the past have reduced friction?
“Our approach to building the focus and friction metrics put the developers’ personal experience front and center, enabling us to build metrics that can look at interventions aimed at increasing focus or decreasing friction through the lens of the end impact on the developers themselves.”
Final thoughts
The specific metrics Google uses are clearly advanced, requiring instrumentation from a wide range of systems and work to fine-tune and validate metrics. Still, I think there are a few lessons we can take from their approach:
Start with surveys or interviews to understand what areas of friction developers are experiencing. Or if you’re interested in a specific topic, such as the onboarding experience or an internal tool, ask developers about their experience with that topic. How satisfied are they? Is it unnecessarily difficult?
Then, when you have a better understanding of friction, select quantitative metrics that help you better understand the areas you want to examine.
Finally, when in doubt, verify that quantitative metrics match the experience of developers by asking them. Sometimes systems can shift or data pipelines can pick up something they shouldn’t, so by asking developers, we can make sure metrics are reliable.
Is there anything you learned from Google’s approach that I missed here? I’d love to know in the comments or on LinkedIn.
That’s it for this week! If you’re interested in reading a guide for running an internal survey to identify problems impacting developer productivity, send me a connection request with the note “guide.”
Subscribe here if you haven’t already:
-Abi