Why aren't AI productivity gains higher?
Developers explain why the gains are more modest than expected.
Welcome to the latest issue of Engineering Enablement, a weekly newsletter sharing research and perspectives on developer productivity.
Last week, we shared data showing that the current productivity gains from AI in engineering are more modest than many headlines imply. To better understand this result, we’ve been interviewing developers across a range of companies to hear what they’re experiencing, and to dig into the reasons behind these more modest gains.
This newsletter summarizes the key themes we’re hearing in those conversations so far, along with direct anonymous quotes from the interviews we performed. This work is ongoing, so if you have something to add, please share your perspective in the comments or on LinkedIn. We’d love to hear from you.
Why are the real‑world gains from AI more modest than many people expect?
Developers we spoke with pointed to several reasons. The most common: coding is a relatively small share of how engineers spend their time, so even meaningful acceleration there has limited impact on overall throughput. Beyond that: new bottlenecks are being created, and organizational and social friction hampers AI’s potential impact.
Here’s what we heard.
1. Coding isn’t the main bottleneck
The most common theme we heard: AI does accelerate developers’ ability to write code, but coding represents only a fraction of the work engineers actually do. Even if AI cut coding time in half, it would only affect a small share of the total time spent delivering a feature.
A lot of the slowdown isn’t from our tools, but from either other teams, services or processes.
Many senior devs say: “writing code is the easy part”... The bottleneck is the human side: alignment, planning, scoping, code reviews, etc.
AI can help developers write code fast, but writing code faster does not always mean more productivity. Most developers spend only a small proportion of their time coding. Those who write the headlines may expect that percentage to be higher, but the reality is that most of developers’ time is spent navigating process and quality, change management, regulation, etc.
The easy tasks are a little easier. The tedious tasks are a little less annoying. Larger/more complicated tasks — I may be able to shave off hours to a day… but that’s not happening every day, and that doesn’t necessarily mean I ship 3x more PRs. Just means a 4d task can take 3.
2. Partial SDLC automation creates new bottlenecks, so AI gains cap out
AI has sped up code generation, but that has created or exposed bottlenecks downstream, particularly in code review and validation. More code is being produced, but the processes for checking that code haven’t scaled with it, and developers report that the time saved writing is often consumed by the extra scrutiny that AI-generated code requires.
AI can very significantly speed up initial engineering time, but often that saved time is spent on extended reviews, fact checking or issue remediation, resulting in net-zero productivity gain.
The opportunities for LLMs to save large amounts of time on new projects are offset by time spent, sometimes ineffectively, attempting to use them on existing projects whose complexity and intricacies lead the LLM to make changes that are at minimum difficult to review even for experienced owners of the code.
People do not have all parts of the SDLC AI-enabled; wherever AI is absent becomes a bottleneck (planning, code review, verification). When people try to remove those bottlenecks, there is often a lack of guardrails, which creates misdirection and churn.
3. Social friction is a barrier to adoption
Polarization between pro- and anti-AI engineers, ambiguity about when and how AI should be used, and the absence of peer champions all slow the rate at which teams develop effective workflows with these tools.
There’s misalignment amongst engineers – the camps are pretty polarized in terms of pro-AI and anti-AI. When anti-AI engineers have status and loud voices, it is challenging for everyone to know how to interact with AI tooling.
AI tooling isn’t discussed enough. Engineers are often unsure whether it’s high status or low status to be talking about using AI, so they just don’t. And people are reluctant to state their true views. If people say what they might really believe, they might seem crazy. In many organizations it is not socially acceptable to suggest that we re-evaluate changes to the code review process for example.
Having at least one other teammate who is bought in to using AI tools is essential, being an isolated solo-adopter does not allow you to materialize the gains in a meaningful way. Software development is a team sport.
4. Tooling and skill gaps are limiting gains
These two factors are hard to separate: immature tools make the learning curve steeper, and developers who are early in that curve get less out of the tools. Both are limiting gains, though developers who shared this view generally see it as a temporary problem.
On the skill side, using AI effectively is its own discipline. On the tooling side, AI assistants don’t always slot cleanly into existing developer workflows. The integrations that would make them feel native to how teams already work aren’t there yet, and getting agents to operate autonomously on real infrastructure remains an unsolved problem for many.
On skill gaps:
Someone just starting out with these tools is not as effective as someone who has incorporated them into their workflows for much longer. I have 1000+ hours of purely agentic coding experience and I have so much to learn. Many beginners may have just a few hours or tens of hours. Learning to agentically code is a skill.
Most of the task isn’t the actual coding, it’s articulating a fuzzy problem in a clear, easily understandable way.
This is more short term, but the pace of change in the tooling (e.g., the emergence of skills) means I’m spending a fair amount of time getting to know how to use the tools appropriately; which usually means going down various rabbit holes to discover how NOT to use them. That creates some (temporary?) friction that means I might fall back to my own skills when there’s production pressures.
Note: Leaders can distribute DX’s prompting guide and advanced prompting guide internally with developers as a way to help them build a foundational practice with AI.
On tool maturity:
The tools aren’t mature enough to fit into existing systems cleanly, and most developers haven’t figured out how to use them well yet. Not because they’re bad engineers — the workflow is just genuinely new and nobody’s handed them a playbook.
The difference between frontier models and models even a few months back can be significant.
Most developers are used to working inside well-defined tools — your IDE, your terminal, your CI pipeline. Everything has a clear place and a clear trigger. AI assistants don’t work that way… So you end up with higher adoption among developers who naturally gravitate toward that kind of open-ended tooling, and much lower adoption among developers who just want something that fits cleanly into what they’re already doing.
5. Most AI tools lack important context
AI tools perform well on problems that are self-contained and well-documented. But most real engineering work isn’t like that. The context that matters—why a system was designed a certain way, what the implicit rules are, what business constraints apply—typically isn’t written down. It lives in people’s heads. Until that knowledge is made explicit and accessible, AI will keep hitting a ceiling on the kinds of problems it can reliably help with.
The bigger issue I keep running into is that most codebases and systems aren’t set up for AI to actually help. Not an architecture problem — more that the knowledge of how things work lives in people’s heads. Why this service behaves this way, what the implicit contract between these two systems is, why that design decision got made three years ago — none of that is written down anywhere. An AI assistant can’t reason over a Slack thread from an archived channel or the mental model of the engineer who built it. Until that context gets surfaced into something concrete, you’re always going to hit a ceiling on what the tooling can do.
AI doesn’t have the context I do on making those decisions, as LLMs are built of massive stats, not direct knowledge of the niche ends of what I’m accomplishing.
I would (anecdotally) put forth that a lot of my work with AI is trying to get it set up to do the work with proper guardrails. It’s not safe to let it run loose on our codebases or production-facing infrastructure, as it’s not deterministic.
AI models are generally good at understanding what we are trying to achieve, especially when they have access to the workspace or codebase. They can often provide useful solutions for specific use cases, particularly when the problem is related to foundational aspects of a technology or programming language. However, what they often lack is a deeper understanding of the business context and the various factors that influence business logic. That broader context is important to fully understand the bigger picture of a project. In some cases, this limitation can be mitigated by carefully crafting prompts, but when the business logic becomes complex or the prompts become too large for relatively simple tasks, people may step back and revert to more traditional approaches.
Other observations about the current state of AI tools
While speaking with developers, we heard a few other interesting themes about what they’re currently experiencing.
1. Documentation robustness is improving, but its quality is yet to be determined.
The biggest quality gain I’m seeing is in improved documentation. Developers hate writing docs, so it will always be the last thing they tackle, or the first to get cut... if they tackle it at all. At least for me, I found our documentation getting much more robust thanks to AI being able to take that task off our plate. And it’s not just great for the developers to have access to these docs. Giving the AI agents access to it provides a quicker context boost to know how to tackle the next project.
Using it to write code documentation leads to terribly incorrect information, an active detriment to future work, creating more work. This is really painful. It’s just not its forte.
2. AI is especially powerful in greenfield projects.
Could you whip up an app from scratch 3x faster with AI? I would believe that. And if I had to just guess, that’s where I think a lot of the hype comes from. Someone says “look I created this thing from scratch in 1 day!” But, I think it is known that AI isn’t quite as proficient when working on a mature, large codebase...with legacy code and lots of nuance. Which is where most ICs are living in day to day, I think.
While multi-tasking on my main efforts, I was able to build an entire new microservice from scratch to test an idea and provide a proof of concept of a new paradigm (this has massive comprehension debt, I haven’t even looked at the code, but the service does what I expect it to).
It has helped a lot with doing things like fast PoCs to check if something will work.
3. There’s concern that developers are offloading critical thinking.
I would say I’m losing muscle memory on how to do things (e.g. terminal commands like checking git history for when a file changed, checking what version of a dependency is installed, etc.). I’ve grown dependent on the AI to remember the right commands for me.
I think if someone blindly uses AI for everything, they slowly forget day-to-day things.
AI helps speed up some tasks but may result in folks offloading their thinking. Particularly amongst junior developers, the ‘improved initial speed’ can come at the expense of skill development. Overall speed improvement is negligible.
I have also found myself having to explain to more junior engineers why what AI said about a problem isn’t the correct answer for us in a particular situation. AI empowers more junior engineers to work longer without asking engineering questions of the more senior folks. That is a sharp double-edged sword that I am still learning how to work with when mentoring people.
[For me, AI’s] handiness is augmented by me knowing when it’s wrong due to my engineering experience.
Final thoughts
Two takeaways stand out to me from these conversations. First, the gains from AI code assistants are structurally limited by how much of the job is actually coding. Microsoft’s research puts coding at around 16% of a developer’s time. Even dramatic acceleration of code generation can only move the overall needle so much.
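The math here is essentially Amdahl’s law: when only one share of the work is sped up, the rest bounds the overall gain. A quick sketch, taking the 16% coding share as given and using hypothetical speedup factors:

```python
def overall_speedup(coding_share: float, coding_speedup: float) -> float:
    """Amdahl's law: overall speedup when only the coding portion is accelerated."""
    return 1 / ((1 - coding_share) + coding_share / coding_speedup)

share = 0.16  # coding as a share of a developer's time (Microsoft's estimate)
for s in (1.5, 2.0, 10.0):
    print(f"{s:>4}x faster coding -> {overall_speedup(share, s):.2f}x overall")
# Even an infinite coding speedup caps out at 1 / (1 - 0.16), about 1.19x.
```

Running this shows that a 2x coding speedup yields only about a 9% overall gain, and even a 10x speedup yields roughly 17% — consistent with the modest aggregate numbers developers report.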
Second, it’s the human aspects of the software development lifecycle that are the biggest bottlenecks. Handoffs between teams, the skill required to effectively prompt and manage agents, and the review burden that AI-generated code creates—these are where time is going, and where the next wave of productivity gains may come from.
For leaders, this raises a couple of important questions. Which parts of the SDLC beyond coding are candidates for AI assistance? And are the human conditions for capturing AI’s upside in place?
The 10% figure is not the ceiling. But closing the gap will likely require focusing on the parts of the job that AI hasn’t touched yet.
If you have something to add, please share in the comments or on LinkedIn.
This week’s featured DevProd job openings. See more open roles here.
Amplitude is hiring an Engineering Manager, Cloud Platform | San Francisco, CA
BNY is hiring an SVP, Application Development Manager | Pittsburgh, PA
Figma is hiring a Staff Software Engineer, Developer Experience | Remote; US
Lob is hiring a Staff Platform Engineer | Remote; US
Plaid is hiring a Software Engineer - Platform | New York, NY
Vercel is hiring a DX Engineer | Hybrid (Austin, New York City, San Francisco)
Zillow is hiring a Senior Product Manager, Developer Experience | Remote; US
That’s it for this week. Thanks for reading.


