Five years later: Reflecting on SPACE with the people who built it
The authors of SPACE met in person for the first time. Here's what five years of AI, remote work, and real-world use taught them.
Welcome to the latest issue of Engineering Enablement, a weekly newsletter sharing research and perspectives on developer productivity.
Last month, I had the privilege of attending the inaugural Developer Experience Research Forum at UC Irvine. It brought together researchers and practitioners from across academia and industry for a day of talks, conversations, and the kind of honest debate that only happens when the right people are in the same room.
The day included a panel that I won’t ever forget. For the first time since the SPACE framework was published in February 2021, all six of its authors were together in person: Nicole Forsgren, Margaret-Anne (Peggy) Storey, Chandra Maddila, Thomas Zimmermann, Jenna Butler, and me. I want to start by saying thank you to each of them. Collaborating on SPACE has been one of the most meaningful experiences of my career, and getting to sit alongside these colleagues five years later and take stock of what the framework has become was genuinely moving. I’m grateful to Tom, Iftekhar Ahmed, and the UCI team for making it happen, and to André van der Hoek for his amazing job moderating.
Unfortunately, I don’t have a recording to share with those who couldn’t attend, but here is my attempt to capture the highlights of that conversation. The panel was wide-ranging, driven largely by questions from the audience, and covered a lot of ground. I’ll do my best to do it justice.
How SPACE came to be
For those less familiar with the backstory, SPACE did not emerge from a formal research program or a planned initiative. The idea started with Nicole. DORA, which she co-created, provided a measurement framework for software delivery, but there was a need for something that addressed developer productivity more broadly. So she reached out to a handful of colleagues (which thankfully included me), arrived with a few dimensions sketched out in her head, and the rest took shape over a series of Teams calls during what was still largely a remote-work world.
What’s remarkable is that several of us had never met in person before that project. We built the framework together, at a distance, and then watched it travel far beyond anything we had imagined. As I said on the panel:
“We could have never imagined that it was going to sort of grow into the thing it became. I don’t think you should ever hope to do something like that, because you’ll never be able to quite capture it. It’s like... right place, right time, things came together.”
The framework itself is straightforward in concept: five dimensions to consider when thinking about developer productivity. Satisfaction and wellbeing. Performance. Activity. Communication and collaboration. Efficiency and flow. The core argument is that productivity cannot be reduced to a single metric, and that a meaningful measurement approach should draw from at least three dimensions and include at least one perceptual measure. Metrics chosen well will often create productive tension with each other, and that tension is a feature, not a flaw.
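As a rough illustration of that guidance, here is a minimal Python sketch that checks whether a proposed set of metrics draws from at least three dimensions and includes at least one perceptual measure. The `Metric` class, the `check_metric_set` function, and the sample metric names are hypothetical, chosen only for this example; they are not part of the SPACE paper.

```python
from dataclasses import dataclass

# The five SPACE dimensions, keyed by their letter.
SPACE_DIMENSIONS = {
    "S": "Satisfaction and wellbeing",
    "P": "Performance",
    "A": "Activity",
    "C": "Communication and collaboration",
    "E": "Efficiency and flow",
}


@dataclass(frozen=True)
class Metric:
    name: str         # hypothetical metric name, for illustration only
    dimension: str    # one of the keys in SPACE_DIMENSIONS
    perceptual: bool  # True if it comes from surveys or other self-report


def check_metric_set(metrics):
    """Return a list of problems; an empty list means the set passes."""
    problems = []
    valid = set(SPACE_DIMENSIONS)
    used = {m.dimension for m in metrics}

    unknown = sorted(used - valid)
    if unknown:
        problems.append(f"unknown dimensions: {', '.join(unknown)}")

    covered = used & valid
    if len(covered) < 3:
        problems.append(f"covers {len(covered)} dimension(s); aim for at least 3")

    if not any(m.perceptual for m in metrics):
        problems.append("no perceptual measure included")

    return problems


# Two hypothetical metric sets: one activity-only, one spread across dimensions.
activity_only = [
    Metric("pull requests merged", "A", perceptual=False),
    Metric("commits per week", "A", perceptual=False),
]
balanced = [
    Metric("developer satisfaction survey", "S", perceptual=True),
    Metric("pull requests merged", "A", perceptual=False),
    Metric("code review turnaround time", "E", perceptual=False),
]

print(check_metric_set(activity_only))  # two problems reported
print(check_metric_set(balanced))       # empty list
```

A check like this only confirms coverage. It says nothing about whether the chosen metrics create the productive tension described above, which still takes judgment.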
Chandra reflected on the process of building it, including a detail I had honestly forgotten: the framework's original working name was not SPACE at all. It was FACTS. Trust was in there from the beginning, under a different label. Looking back on the framework more than five years later, Peggy said:
“I think we did a really good job. I think that the five dimensions have really held up really well. But they’re big — each of those dimensions are such huge concepts. And maybe what we need to do now is look at each of these in turn.”
Activity metrics: newly controversial, newly important
No dimension generated more discussion on the panel than Activity, and that’s not an accident. Activity metrics are the ones most organizations default to because they’re the easiest to instrument. They’re also the ones most prone to misuse.
What made the conversation interesting is that the panel did not argue for abandoning Activity measurement. The argument was more nuanced than that. Jenna put it directly:
"I actually think this is one of the areas where SPACE is newly important again, because you may have seen headlines about what percent of codebases are AI generated at this point. And I'm like, we're back there. We wrote about this a decade ago... some of those activity metrics like lines of code and PRs are newly resurfacing, and people are forgetting that we knew that this wasn't the greatest plan in isolation."
Chandra added that the scale has changed in a way that makes the problem even more acute. A single developer working with a swarm of agents can now generate an extraordinary volume of pull requests. The count alone tells you almost nothing about the quality, the impact, or the experience of the work.
The more useful question is not whether to measure activity, but which activity metrics are worth measuring and what you plan to do with them. I offered an example from my own work: Time-To-First-PR, meaning how long it takes a new hire to check in their first piece of code.
“Obviously easy to game, right? Have a new hire check in a trivial first PR... Turns out when you try to game it, when you explicitly try to have a trivial first check-in, it still leads to positive long-term outcomes. Why? Because that first code check-in has nothing to do with the code. It’s about learning your environment, setting up your system.”
I believe that good metric design involves choosing metrics where gaming them still gets you the outcome you actually want.
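For readers who want to try this, here is a minimal Python sketch of how Time-To-First-PR might be computed from onboarding records. It assumes you have each new hire's start date and the dates of their pull requests; the `time_to_first_pr` function, the record layout, and the sample data are hypothetical. Whether to count a PR when it is opened or when it is merged is a choice this sketch leaves to you.

```python
from datetime import date
from statistics import median


def time_to_first_pr(start_date, pr_dates):
    """Days from start date to the first PR on or after it, or None if there is none."""
    eligible = [d for d in pr_dates if d >= start_date]
    if not eligible:
        return None
    return (min(eligible) - start_date).days


# Hypothetical onboarding records: name -> (start date, PR dates).
new_hires = {
    "hire_a": (date(2025, 3, 3), [date(2025, 3, 5), date(2025, 3, 12)]),
    "hire_b": (date(2025, 3, 3), [date(2025, 3, 17)]),
    "hire_c": (date(2025, 3, 10), []),  # no PR yet
}

days = {name: time_to_first_pr(start, prs) for name, (start, prs) in new_hires.items()}

# Summarize only hires who have a first PR; report the rest separately.
completed = [d for d in days.values() if d is not None]
median_days = median(completed)
still_waiting = [name for name, d in days.items() if d is None]

print(days)
print(median_days, still_waiting)
```

Reporting a median across a cohort, rather than a per-person number, keeps the focus on the onboarding environment instead of on individuals.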
The politics of productivity measurement
One of the most candid moments of the panel came in response to an audience question about whether productivity measurement is inherently neutral or whether it inevitably becomes a political tool. The honest answer is that it is both, and you have to design for that reality.
Tom made the point that having five dimensions rather than one makes it structurally harder to play politics with the data. When you look at multiple dimensions simultaneously, you’ll often find they point in different directions, and that tension forces more careful thinking.
Jenna was direct about something that deserves to be said plainly. There is an elephant in the room across the industry right now about how many developers organizations need, and productivity metrics are being watched closely in that context.
"We tend to decouple from products and we're very... hoard-y with our data. We will give them trends. We'll let them know this is what's happening on a broad scale, or doing this had this impact. But we are not allowing individual managers, directors to look at people's information. We protect that because in theory, happy workers are productive workers. People who are terrified are not."
Nicole added framing that I found useful:
"Some data is better than no data... I know that for many of us here, we really do our best to measure in a way that is very neutral. But I know I'll have execs and other business divisions come to me and they'll say, 'Well, I need this [metric] to go up.' And I was like, 'Amazing. That's not on me. That's on you. I can give you the information and you can figure out if it goes up [or] down and why.'"
One practical safeguard worth noting is that some organizations deliberately bucket metrics together, so that no one can drive up a single number without being held accountable for the others in the cluster. It makes the kind of narrow gaming that distorts incentives structurally harder to do.
The C in SPACE: the most underinvested dimension
If Activity is the dimension that gets the most attention, Communication and Collaboration may be the one that gets the least. That gap is growing more consequential, and, as Peggy put it, the dimension needs more focus than ever:
“Development is a team sport. And with AI, I don’t think there’s anyone in this room that doesn’t think that collaboration and communication hasn’t changed... If anybody here is thinking of using SPACE, make C one of the first things you look at.”
What makes this particularly important right now is that AI has changed collaboration patterns in ways we are only beginning to understand. Developers report asking questions of their AI tools that they used to ask colleagues. The texture of team communication is shifting. And yet the measurement infrastructure for tracking that dimension barely exists in most organizations. I shared a finding from my SPACE of AI paper that felt particularly relevant:
“C is the only dimension of SPACE that the majority of developers do not believe that AI has improved. Every other dimension showed improvement. Not C.”
That's a significant signal, and I think it points toward where the field needs to invest its energy next.

What would you add to SPACE today?
The question the audience asked that will stay with me longest was a simple one: if you were writing SPACE now, what would you add?
The five dimensions have held up well. The panel was in agreement on that. But five years of AI acceleration, remote and hybrid work, and a rapidly shifting sense of what software development even means have surfaced things the original framework didn’t fully anticipate. While we might not add new dimensions, if we were updating it today, we would add focus on:
Trust: This was the most consistent answer across the panel. Trust acts as the bedrock for satisfaction and performance.
Cognitive and Intent Debt: Drawing on Peggy’s recent work, this asks whether we are losing our overall understanding of codebases as AI writes more of them.
Deskilling: The worry that relying heavily on automated tools will cause core engineering capabilities to atrophy over time.
AI Addiction: Within the wellbeing dimension, tracking addiction-like behaviors with generative AI tools.
Ultimately, as Chandra mentioned, the strength of SPACE is that organizations can dial different dimensions up or down to meet emerging needs. You don’t need to change the structure; you just need to rebalance the rubric.
"Things like well-being are very, very, very important. So I think reducing focus a little bit on the activity side... that doesn't fundamentally change what SPACE is. You can just use SPACE but rebalance the rubric."
We acknowledge that the framework was never designed to be a perfect model. But, as Peggy put it, that doesn’t mean it isn’t valuable.
"Some models are wrong, some are useful. It was supposed to change the conversation. It was supposed to make people think about the different aspects of productivity. And I think it did that."
That feels right to me. The framework was designed to change the conversation. I think it did. The work now is to keep refining what we measure within that space, with the same care we brought to defining it in the first place.
Final thoughts
It was a remarkable day for me. I’m grateful to UC Irvine for hosting it, to my co-authors for showing up, and to everyone in that room for asking the hard questions. If the industry continues to ask them with this much rigor and honesty, I think the next five years will be even more interesting than the last.
That’s it for this week. Thanks for reading.




