Engineering Productivity: Measure What Matters

Measuring the productivity of most professions is exceptionally challenging. Would you measure a lawyer's productivity by the number of cases they close? Or a doctor's productivity by the number of operations they perform?

Moving to the software engineering realm, should your company start measuring lines of code produced by engineers? Please no. How about pull requests per engineer? Nah. Sprint velocity on the company level? 😬

While it may be tempting to get started with these metrics, resist the temptation. These metrics reward output, and with it the people who put in immense amounts of busywork. They also create unhealthy team dynamics: your company won't be more successful if teams compete internally, so save that energy for competitors. Instead, your productivity metrics should focus on outcomes.

I've tried to summarize my thoughts on the company, team and individual levels.

Driving business outcomes on the company level

Let's take a look at a recent study: the book Accelerate, and the conclusions Nicole Forsgren, Jez Humble, and Gene Kim arrived at after four years of research. They identified the following key performance indicators of high-performing engineering organizations: lead time, deployment frequency, mean time to restore and change-fail percentage. Let's go through them one by one.

Lead time is the amount of time it takes for the engineering teams to ship a feature, end-to-end. While it may be challenging to measure this across the board (as features will differ in scope), you can simplify it a bit and define it as the time between someone opening a pull request and that change being deployed to production.

Deployment frequency is more straightforward: as the name suggests, it represents the number of production deployments over a given period.

Mean time to restore represents the average amount of time it takes to restore service after an outage.

Change-fail percentage tells you what percentage of your deployments result in an outage that you have to restore.
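
To make the four definitions concrete, here is a minimal sketch of how they could be computed from a list of deployment records. The `Deployment` record and its field names are hypothetical, not something from Accelerate or from any particular CI/CD tool, and the sketch assumes an outage starts at the moment of the deployment that caused it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record shape; adapt it to whatever your tooling exports.
    pr_opened_at: datetime
    deployed_at: datetime
    caused_outage: bool = False
    restored_at: Optional[datetime] = None  # only set when caused_outage is True


def mean(deltas: list[timedelta]) -> timedelta:
    """Average a non-empty list of durations."""
    return sum(deltas, timedelta()) / len(deltas)


def summarize(deployments: list[Deployment], period_days: int) -> dict:
    """Compute the four indicators for the deployments of one period."""
    if not deployments:
        raise ValueError("need at least one deployment to summarize")

    # Simplified lead time: pull request opened -> deployed to production.
    lead_times = [d.deployed_at - d.pr_opened_at for d in deployments]
    failures = [d for d in deployments if d.caused_outage]
    # Assumption: the outage begins when the failing deployment goes out.
    restore_times = [d.restored_at - d.deployed_at for d in failures if d.restored_at]

    return {
        "mean_lead_time": mean(lead_times),
        "deployments_per_day": len(deployments) / period_days,
        "mean_time_to_restore": mean(restore_times) if restore_times else None,
        "change_fail_percentage": 100 * len(failures) / len(deployments),
    }
```

In practice you would feed this from your deployment logs and incident records, and look at the trend over several periods rather than at a single number.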

The beauty of these metrics is that they are relatively simple to track. I bet you are already using some CI/CD pipelines, so the deployment frequency is a given. Mean time to restore and change-fail percentage are things that your post-mortem meetings should already record. Without question, the most challenging metric here is lead time: it will be a compound metric that you'll derive from code review times, CI build times and deployment times.
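
As a rough illustration of that compound nature, the sketch below splits the lead time of a single change into its three parts. The function name and the four timestamps are assumptions made for this example, and it assumes the stages run strictly one after another (review, then CI build, then deployment), which may not match your pipeline.

```python
from datetime import datetime, timedelta


def lead_time_breakdown(
    opened: datetime,
    approved: datetime,
    build_finished: datetime,
    deployed: datetime,
) -> dict[str, timedelta]:
    """Split one change's lead time into review, build and deployment time.

    Assumes the stages are sequential: opened <= approved <= build_finished <= deployed.
    """
    if not opened <= approved <= build_finished <= deployed:
        raise ValueError("timestamps must be in chronological order")
    return {
        "code_review": approved - opened,
        "ci_build": build_finished - approved,
        "deployment": deployed - build_finished,
        "lead_time": deployed - opened,
    }
```

Seeing the parts next to each other tells you where to invest: a long review time is a team habit problem, while a long build time is a tooling problem.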

While the mean time to restore and change-fail percentage metrics may seem like purely reliability metrics, they are not. The worse those indicators are, the more time engineers will spend on incidents, and the less on features that your customers are paying for.

Team health metrics

I believe that candid and compassionate feedback is central to a cohesive, high-functioning team. Managers, whether they are front-line managers or above, play a crucial role in creating a feedback culture: a culture where every employee feels safe and comfortable providing feedback to other members of the team, or to the broader organization.

While trust is not simple to measure either, anonymous surveys can help with it. By doing surveys for your teams regularly, you can also keep track of your progress in different areas (trust is just one of them).

Questions like these (well, more like statements that people have to agree or disagree with) might give you some idea of how you, as a manager, are doing when it comes to building trust:

"My manager proactively solicits feedback."
"My manager creates a safe environment, built on trust and respect."
"I can communicate openly with my manager and my team about our work, even when I disagree."

Other areas you might want to gauge are vision, alignment, learning and adapting. Questions like these may be useful:

"My manager learns from mistakes."
"My manager acts on the feedback they receive."
"My manager communicates how our team contributes to the company's overall mission."

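To keep track of progress between survey rounds, one option is to reduce each statement to the share of respondents who agree with it. The sketch below assumes answers are collected on a 1 to 5 scale where 4 and 5 count as agreement; the scale, the threshold and the function name are my assumptions, not a standard.

```python
from collections import defaultdict

AGREE_THRESHOLD = 4  # assumed 1-5 scale: 4 ("agree") and 5 ("strongly agree")


def agreement_rates(responses: list[tuple[str, int]]) -> dict[str, float]:
    """Return the percentage of agreeing answers for each survey statement.

    `responses` holds anonymous (statement, score) pairs.
    """
    totals: dict[str, int] = defaultdict(int)
    agreed: dict[str, int] = defaultdict(int)
    for statement, score in responses:
        totals[statement] += 1
        if score >= AGREE_THRESHOLD:
            agreed[statement] += 1
    return {s: 100 * agreed[s] / totals[s] for s in totals}
```

Comparing these percentages from one round to the next shows whether a given area is improving, without exposing any individual answer.
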
Please note that while this section focuses on team health, most of the questions are about the team's manager. The reason is that most team members will try to model their behavior on their manager's. The manager's actions have a profound effect on what the team will treat as desired, accepted or discouraged behaviors.

Competencies for individuals

So far, we've covered what productivity metrics may look like on the company and team level: let's take a look at how productivity can be measured on the individual level.

In my opinion, the work engineers have to do is way too complex to be measured by a single metric, like the number of pull requests an engineer opens, or the number of RFCs they write. No metric can do justice to the real work. So why bother?

Instead, engineering productivity should be measured against career ladders (tracks). What most companies' career ladders have in common is that they identify a handful of critical competencies and behaviors that the company values and rewards. Progression.fyi has a good collection of open-sourced career ladders that you can look at. One of the reasons most companies should have an easily accessible and discoverable career ladder is that it sets expectations for both engineers and managers: what is expected of them, and what they need to achieve to advance in their career at their current company.

Okay, but what does it have to do with engineering productivity? Everything. Creating a career ladder is the perfect opportunity for setting the example of a productive engineer for each level. As a manager, you'll have a clear set of expectations and standards for each level, which helps you coach team members, whether they are struggling in their current role, or if they are preparing for a promotion.