Setting Effective Targets for Developer Productivity Metrics in the Age of Gen AI

From:

Gregor Ojstersek and Laura Tacho from Engineering Leadership <gregorojstersek@substack.com>

Date:

5/25/2025, 2:51 PM

Using AI usage alone as a metric is not the way to go. Here is what to do instead!

Gregor Ojstersek and Laura Tacho

May 25

Measure the impact of AI in engineering with DX (Sponsored)

Everyone’s asking the same question: “What productivity gains are we getting from AI?”

With DX, you can get answers about:

How much time developers are saving with tools like Copilot
Whether AI usage is improving throughput or quality
What’s holding some teams back from using AI more

Schedule a personalized product tour to see how DX helps you measure engineering productivity and the impact of AI.

Get a personalized demo

Let’s get back to this week’s thought!

Intro

Measuring developer productivity is an important topic for many organizations, especially now, when many of them are looking at how to increase software development productivity by using AI.

Some organizations are also enforcing the use of AI or measuring AI usage as a KPI in performance reviews. You can find my thoughts on whether enforcing the use of AI is a good or a bad thing here:

Enforcing the use of AI in engineering teams - good or bad thing?

Gregor Ojstersek

Apr 13

I’ve even heard of some companies thinking about doing leaderboards of who is using the most LLM credits or who has committed the most AI-generated code to the codebase.

That is a totally wrong way to measure productivity.

To help us do it the right way and set effective targets for measuring developer productivity, I am happy to have Laura Tacho, CTO at DX, as a guest author for today's newsletter article.

Before we hand it over to Laura, I'll share some of my thoughts on measuring developer productivity by pure output.

Measuring the wrong things inspires people to try to game the system

It's really important to understand that if you measure only output and usage, people will naturally tend to maximize those numbers, which produces the wrong results for the business.

And it can also inspire pure individualism, which you don’t want in your organization.

So, if you measure how much your engineers are using AI by tracking LLM credit usage, people will set up cron jobs to use as many LLM credits as possible.

That will have ZERO effect on business success. On the contrary, the company will lose a lot of money because of it.

And the same is true of measuring:

lines of code added,
number of tasks finished,
story points completed,
number of hours online.

All such measures inspire people to do the wrong things. I would rather look at these 4 specific things:

Are they focusing on building the RIGHT things and challenging requirements?
How much are they helping others?
How are they contributing to the success of the whole team/organization?
What improvements have they implemented, and have they ensured those improvements get adopted by other engineers?

And these are the main things I value:

Team productivity > Developer productivity
Helping others > Completing your own tasks
Building the RIGHT things > Number of things built

Using AI or finishing a lot of tasks or story points doesn't mean much if you don't provide value to the business, don't share your knowledge with others, or your team is not working well together.

You can read my thoughts on metrics and how I measure developer productivity in these 2 articles:

How to use engineering metrics for the success of engineers and teams

Gregor Ojstersek

October 20, 2024

How I measure developer productivity

Gregor Ojstersek

October 8, 2023

Now, let’s hand it over to Laura!

Setting targets for developer productivity metrics takes careful consideration

In some cases, setting the wrong goals can backfire by creating unintended consequences.

Teams might start focusing on optimizing the numbers instead of the system, especially if there are anti-patterns like tying bonuses to individual metrics, or setting blanket targets on metrics teams can't directly control.

At the same time, leaders want to drive meaningful improvement and use goals for motivation and accountability. Teams want transparency and direction on where to focus. Even so, it can be difficult to figure out what kind of targets are realistic in the first place.

These three practices help engineering leaders avoid pitfalls and encourage their teams to use data to improve the system, leading to the right outcomes:

Set goals on the right type of metrics
Use multi-dimensional systems of measurement
Consider organizational context when setting the targets themselves

Without these three things, organizations run the risk of developers feeling mistrusted and micromanaged, teams gaming metrics rather than improving systems, and metrics becoming distorted so they no longer represent reality.

Set team goals on controllable input metrics, not output metrics

Not all metrics are immediately actionable. Some measure big-picture trends and are summary metrics that are influenced by many other factors.

Setting goals on these kinds of metrics – output metrics – can incentivize the wrong type of behavior and disempower developers, as they feel they can’t meaningfully influence the numbers.

On the other hand, a different type of metric, the controllable input metric, is very actionable at the team level and contributes to improving the system.

Being able to identify the difference between these different types of metrics is an important skill for any devex leader.

Output metrics: These metrics represent what you want to get to, but are not directly actionable. That’s because they’re a summary of other factors, used best as a diagnostic tool but not as something to be directly influenced by a single process, tool, or action. Some examples include:
- Change Failure Rate
- PR throughput

Controllable input metrics: These measure behaviors or processes that teams directly influence, which then result in changes to the output metrics. For example, code review turnaround SLAs are controllable and can improve PR throughput, and reducing flaky CI tests can improve Change Failure Rate.

This pattern is not unique to developer experience and can be seen in other parts of life.

Let’s imagine you have low levels of iron in your blood. This level is an output metric, and setting a goal on it – without mapping it to controllable input metrics – can make improvement seem out of reach.

Instead, you want to focus on controllable input metrics like taking supplements, eating iron-rich foods, and avoiding coffee with meals.

Doing these activities will lead to a change in the output metric, which makes them more suitable for goal-setting. Similarly, engineering teams need to identify the actionable inputs that influence the larger output metrics.

Depending on an organization's size and complexity, it might still be preferable to set goals on an output metric, like improving Change Failure Rate, in order to simplify reporting and align on a single goal.

In cases like this, it’s essential that frontline teams go through the process of metric mapping to break down the output metric into controllable input metrics, and that those input metrics have their own goals and structures of reinforcement around them.
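As a rough illustration of metric mapping, here is a minimal Python sketch that records which controllable input metrics sit under each output metric and flags output metrics that have not been mapped yet. The two mapped examples come from the text above; the class names, the owners, the target wording, and the "Lead time" entry are hypothetical and only there to show the shape of the exercise.

```python
from dataclasses import dataclass, field


@dataclass
class InputMetric:
    name: str    # a behavior or process the team directly influences
    owner: str   # hypothetical: who is accountable for it
    target: str  # hypothetical: the goal, phrased in terms the team controls


@dataclass
class OutputMetric:
    name: str
    inputs: list[InputMetric] = field(default_factory=list)

    def unmapped(self) -> bool:
        """True when no controllable input metric has been identified yet."""
        return not self.inputs


metric_map = [
    OutputMetric(
        "Change Failure Rate",
        [InputMetric("Flaky CI tests", "platform team", "reduce the flaky test count")],
    ),
    OutputMetric(
        "PR throughput",
        [InputMetric("Code review turnaround", "each product team", "meet the agreed review SLA")],
    ),
    # Hypothetical output metric that still needs a mapping session.
    OutputMetric("Lead time"),
]

needs_mapping = [m.name for m in metric_map if m.unmapped()]
print(needs_mapping)  # ['Lead time']
```

A goal set on an output metric that shows up in needs_mapping is a signal that frontline teams have nothing concrete to act on yet.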

Avoid gamification with multi-dimensional measurement and aligned incentives

A common objection to setting targets around metrics is the fear that developers will game the system.

Gamification is the phenomenon where individuals distort the data in order to make the metrics look good, without actually improving the system. Goodhart’s Law describes this phenomenon, often summarized as “when a measure becomes a target, it ceases to be a good measure.”

Gamification is dangerous for organizations because while the metrics show surface-level improvements, the reality is that the systems are usually worse off – but those negative changes are largely invisible because they aren’t being measured properly.

Setting goals amplifies the incentive for individuals to game the system, because goals create accountability and pressure to deliver specific results.

When people know they're being evaluated against a specific number, especially if rewards or advancement opportunities depend on it, the temptation to find shortcuts or manipulate metrics becomes stronger than the motivation to make genuine improvements that might take longer to reflect in the measurements.

A well-designed system of measurement and intentional culture around using metrics can help protect from the effects of gamification. We know how humans behave when metrics are used for measurement and goal-setting. With that knowledge, it’s up to us to design better systems.

Use multidimensional measurements instead of one-dimensional metrics.
When you track multiple related metrics together, manipulating one metric usually affects others negatively, making gamification more obvious. DX Core 4 is an example of a multidimensional system of measurement.
Focus on learning and improvement rather than incentivizing or rewarding hitting specific thresholds.
Give teams time and autonomy to address the root causes affecting metrics. When teams feel pressured without having resources or authority to make real improvements, they're more likely to find ways to adjust the numbers without fixing the system.
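To make the first practice more concrete, here is a minimal Python sketch of a multidimensional check: it compares two snapshots of a team's metrics and flags any metric that improved while another tracked metric got worse. The metric names and numbers are hypothetical, and this is not how DX Core 4 works internally. A flag is only a prompt for a conversation with the team, not proof that anyone gamed anything.

```python
def suspicious_improvements(before, after, higher_is_better):
    """Map each improved metric to the metrics that worsened in the same period."""

    def delta(name):
        # Normalize so that a positive delta always means "got better".
        change = after[name] - before[name]
        return change if higher_is_better[name] else -change

    improved = [m for m in before if delta(m) > 0]
    worsened = [m for m in before if delta(m) < 0]
    return {m: worsened for m in improved} if worsened else {}


# Hypothetical quarterly snapshots for one team.
before = {"pr_throughput": 10, "change_failure_rate": 0.05, "dev_satisfaction": 72}
after = {"pr_throughput": 14, "change_failure_rate": 0.09, "dev_satisfaction": 72}
higher_is_better = {
    "pr_throughput": True,
    "change_failure_rate": False,
    "dev_satisfaction": True,
}

flags = suspicious_improvements(before, after, higher_is_better)
print(flags)  # {'pr_throughput': ['change_failure_rate']}
```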

Set realistic targets based on organizational context and strategy

When determining actual target values, one size doesn't fit all. Consider:

Past performance

Different teams start from different places. Instead of blanket targets across the organization, consider percentage improvements from each team's current baseline.
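A baseline-relative target is simple arithmetic, sketched below in Python. The team names, the baseline values (imagined here as median code review turnaround in hours), and the 10% figure are all hypothetical.

```python
def relative_target(baseline, improvement_pct, lower_is_better=False):
    """Target value after a percentage improvement from a team's own baseline."""
    factor = improvement_pct / 100
    if lower_is_better:
        return baseline * (1 - factor)
    return baseline * (1 + factor)


# Hypothetical baselines: median code review turnaround in hours.
baselines = {"team_a": 30.0, "team_b": 12.0}

# Each team gets the same relative ask (10% faster), not the same absolute number.
targets = {
    team: relative_target(hours, 10, lower_is_better=True)
    for team, hours in baselines.items()
}

for team, target in targets.items():
    print(f"{team}: {baselines[team]:.1f}h -> {target:.1f}h")
```

With a blanket target of, say, 12 hours, team_b would already be done while team_a would face a 60% reduction. The relative version asks the same effort of both.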

External benchmarks

Industry benchmarks (like the 75th percentile) provide useful reference points, but remember that context matters.

Effort curves

Improvement isn't linear. For example, moving from the 50th to 75th percentile often requires less effort than moving from the 75th to 90th percentile.

Metric characteristics

For some metrics, higher isn't always better (e.g., extremely short PR cycle times might indicate inadequate code reviews). Some metrics need SLAs or thresholds rather than continuous improvement targets.
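For metrics like these, a target can be expressed as an acceptable band rather than a number to push ever lower or higher. The Python sketch below shows the idea; the 4 to 48 hour band for PR cycle time is a made-up example, not a recommended value.

```python
def check_against_band(value, lower, upper):
    """Classify a metric value against an acceptable band (inclusive bounds)."""
    if value < lower:
        return "below band"
    if value > upper:
        return "above band"
    return "within band"


# Hypothetical band for median PR cycle time, in hours.
PR_CYCLE_TIME_BAND = (4, 48)

print(check_against_band(1.5, *PR_CYCLE_TIME_BAND))  # below band: reviews may be too shallow
print(check_against_band(20, *PR_CYCLE_TIME_BAND))   # within band
print(check_against_band(90, *PR_CYCLE_TIME_BAND))   # above band: PRs are waiting too long
```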

Above all, remember that metrics don't replace strategy. They enhance it. Even with robust metrics, you still need human judgment to set appropriate goals in your specific context.

Actionable points to get you started

To apply these principles in your organization:

Clearly distinguish between controllable input metrics and output metrics
Identify the specific input metrics teams can influence
Show how these inputs connect to larger organizational goals
Set appropriate targets on those controllable metrics
Ensure teams have time and resources to address improvements
Monitor both input and output metrics to validate your approach

By following these guidelines, you can create a more productive environment focused on genuine system improvement rather than superficial number manipulation.

Last words

Special thanks to Laura for sharing her insights on this very important topic! Make sure to follow her on LinkedIn and also check out DX. They are doing a lot of great things with regard to measuring developer productivity.

We are not over yet!

What Does a CTO do?

Check out my latest video. I am sharing what a CTO does on a daily basis. The role heavily depends on the business and looks completely different based on the size of the organization.

New video every Sunday. Subscribe to not miss it here:

Subscribe to the channel!

Whenever you are ready, here is how I can help you further

Join the Cohort course Senior Engineer to Lead: Grow and thrive in the role here.
Interested in sponsoring this newsletter? Check the sponsorship options here.
Take a look at the cool swag in the Engineering Leadership Store here.
Want to work with me? You can see all the options here.

Get in touch

You can find me on LinkedIn, X, YouTube, Bluesky, Instagram or Threads.

If you would like to request a particular topic to read about, you can send me an email at info@gregorojstersek.com.

This newsletter is funded by paid subscriptions from readers like yourself.

If you aren’t already, consider becoming a paid subscriber to receive the full experience!

Check the benefits of the paid plan

You are more than welcome to find whatever interests you here and try it out in your particular case. Let me know how it went! Topics are normally about all things engineering related, leadership, management, developing scalable products, building teams etc.

A guest post by

Laura Tacho

CTO @ DX
