Chapter 28: The trouble with measurement

Book: Succeeding with OKRs in Agile (2nd Edition)
Author: Allan Kelly
Part: V — Forewarnings
Chapter: 28 — The trouble with measurement
Reading time: 8 minutes
Tags: measurement-problems goodharts-law goal-displacement tunnel-vision targeting-the-measurable metric-gaming unintended-consequences measurement-limitations okr-pitfalls balanced-measurement quantification-dangers

Summary: Kelly presents the philosophical and practical dangers of over-relying on measurement. The chapter covers Goodhart’s Law (when a measure becomes a target, it ceases to be a good measure), goal displacement (pursuing the metric instead of the underlying goal), and tunnel vision from over-focusing on what’s quantifiable. Kelly argues for balancing measurement with judgment and remaining aware of what metrics cannot capture.


The Gross National Product does not include the beauty of our poetry or the intelligence of our public debate. It measures neither our wit nor our courage, neither our wisdom nor our learning, neither our compassion nor our devotion. It measures everything, in short, except that which makes life worthwhile.

Robert Kennedy, US politician, 1925–1968

If one is looking for a reason to discredit OKRs, look no further: this chapter makes an excellent starting point. Conversely, anyone promoting OKRs should understand the opposing arguments. It is for you to choose how you want to answer these critiques.

This book, along with much of the writing on OKRs, and indeed management writing as a whole, advocates explicit goals, the quantification of those goals and the measurement of progress. I believe such practices increase focus and therefore improve effectiveness.

However, I also acknowledge that reducing everything to a number is an oversimplification. Focus can lead to a blinkered view, and chasing goals risks unintended consequences.

Obviously these two views pull in different directions. Both are valid arguments, and both extremes are wrong – the truth lies somewhere in the middle.

Such problems and indeed this whole chapter may seem irrelevant to the day-to-day use of OKRs. However, these issues are insidious; over time they will detract from the efficacy of OKRs. The first step towards avoiding them is awareness.

Ultimately there may be no solution to such differences; instead one must learn to balance between the two opposing forces. Walking this tightrope is an ongoing balancing act that demands respect for both sides.

28.1 Targeting the measurable

As the opening quote highlights, there are things that might not be measurable. While there have been attempts to measure happiness, and at least one body produces a world happiness report¹, how many of us actively quantify our happiness? How many of us target our happiness year-on-year? Are those who do quantify their happiness actually happier than those who don’t? And how can one tell?

Indeed, one can even argue that the most important decisions we make in life – whether to marry, who to marry, how many children to have, which house to buy, which jobs to accept and which to reject, and who to include in our will – get made not on data but on emotion, feelings and intuition.

Some of this might be laziness; after all, such quantification is hard and many of us lack the necessary skills. Earlier I gave suggestions about how to measure and quantify targets. I respect Tom Gilb when he says he can measure love with a number – although I have never had a chance to ask his wife’s view of this measurement.

28.2 Questions measurement can’t answer

Even if one wants to measure everything and make data-driven choices, can one afford the cost? Consider a manager who needs to arbitrate in a conflict between two employees. Would they have the time to measure the conflict, model the potential outcomes and make a data-driven decision?

Even if they had time to do this, who would see the decision as fair and legitimate? Suppose they conclude that one employee should be let go as a result of the conflict. A rational decision might be that while Sasha is in the wrong, Alex is the one to fire, because Sasha is more productive. How would other employees see this decision?

The manager could revisit their model and factor in the effects on other employees. How disgruntled would they be? How much would productivity change? What effect might the decision have on staff retention?

The model would grow and grow. The time needed to research the answer would grow, and while the manager might have a defensible position, would it satisfy others? Does rationality sit well with fairness? Maybe it would for a perfectly rational human – the so-called homo economicus, economic man – but would it for you?

28.3 Goodhart’s Law

Goodhart’s Law²: ‘When a measure becomes a target, it ceases to be a good measure.’ Charles Goodhart, professor of economics

¹https://worldhappiness.report/

²https://en.wikipedia.org/wiki/Goodhart's_law

Charles Goodhart coined his eponymous law in the 1970s while discussing government attempts to reduce inflation by targeting money supply – the amount of money in the economy. As the British Government tried to reduce inflation by reducing the money supply, the behaviour of money – or rather the behaviour of people using money – changed. Rather than cash, people could use cheques or credit cards; rather than put savings in a bank, they could use a non-bank building society. Over time it became increasingly difficult to even define what money is, something that is even more difficult in the age of bitcoin and digital money.

Similar phenomena are seen elsewhere in society: for example, hospitals chasing targets adopt behaviours that meet the target but do not contribute to patient wellbeing, and train companies extend journey times to meet punctuality targets.

Nor was Goodhart the only one to observe this phenomenon. Psychologist Donald T Campbell coined his own law:

Campbell’s law³: ‘The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.’ Donald T Campbell, social scientist

The most obvious example in software development is velocity, the popular mechanism agile teams use to track progress and make forecasts. I have seen teams where velocity only ever increased: at every sprint the team delivered more velocity points than ever before. Yet the amount of software functionality or capabilities didn’t keep pace.

Consciously or subconsciously, team members ‘devalue’ velocity points: estimates get bigger, so while the final number is larger, it represents less. That is what economists call inflation. It would be naive to think OKRs would somehow be exempt from such effects.
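To make the inflation analogy concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the number of stories delivered is a fair proxy for functionality, which is itself a debatable measure. The sprint figures and the names `SPRINTS`, `inflation_index` and `deflated_velocity` are hypothetical and not drawn from any real team.

```python
# Hypothetical sprint records: (reported velocity points, stories delivered).
# The numbers are invented for illustration only.
SPRINTS = [(20, 10), (24, 10), (30, 10), (36, 9)]


def inflation_index(sprints):
    """Points per story in each sprint, relative to the first sprint."""
    base_points, base_stories = sprints[0]
    base = base_points / base_stories
    return [(points / stories) / base for points, stories in sprints]


def deflated_velocity(sprints):
    """Reported velocity re-expressed in first-sprint points."""
    indices = inflation_index(sprints)
    return [points / index for (points, _), index in zip(sprints, indices)]


if __name__ == "__main__":
    for (points, stories), real in zip(SPRINTS, deflated_velocity(SPRINTS)):
        print(f"reported={points:>3}  stories={stories:>2}  deflated={real:.1f}")
```

In this invented data, reported velocity climbs every sprint while the deflated figure stays flat and then falls: the larger number represents less. The deflated figure is only as trustworthy as the proxy behind it, and targeting it would invite the same gaming.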

That might be a reason for changing measurements frequently. It also serves to emphasize the importance of ensuring that everyone understands why a target exists and that everyone agrees on how to measure it.

Explicit objectives and targets are good, because they set a course and serve as a guide to future decisions, enhance shared understanding and team working, promote focus and help demonstrate progress. But having a target without understanding it risks hitting the target but missing the goal. Quantitative targets need to be combined with qualitative understanding.

³https://en.wikipedia.org/wiki/Campbell's_law

28.4 Goal displacement

OKRs are good because they create focus – they allow teams to measure progress towards their goal. But sometimes people mistake the measurement for the goal itself, something sociologist Robert K Merton termed goal displacement.

For example, if a team target is ‘10% more visitors to the online shop’, it may be that ‘10%’ looms larger in the mind than ‘visitors’. There are a number of dubious means of increasing visitor numbers that might allow the target to be met while undermining its intention. Black-hat SEO techniques might boost visitor numbers in the short term while damaging them in the long term.

Similarly, focusing on the target and losing the context can lead to a drop in quality. ‘Low quality’ visitors may meet the target but not its intention.

For example, an OKR that asks a team to ‘deliver at least ten user stories per sprint’ might simply lead to teams dropping quality standards. Coders and testers could, consciously or subconsciously, overlook defects that would soon be found by customers.

The work involved in logging customer problems, administering remedial work, performing a fix, retesting and releasing a fix will probably be greater than the effort saved with the initial shortcut.

Attentive readers will recognize both these examples as cases of Goodhart’s Law and unintended consequences – another term credited to Robert K Merton.

Goals should be chased and targets should be met, but not at any cost. Teams may challenge norms: they should think of new approaches and try new ideas, but need to be conscious of company norms, culture and boundaries. They should not act maliciously or counter to long-term interests.

Of course there is a judgement call here – when are you challenging and when are you going too far? If in doubt, ask – have a conversation. If you find yourself reluctant to tell others about your approach, it may be that you already know it contravenes expected standards.

28.5 Overcoming tunnel vision

While elsewhere I have suggested taking a blinkered view during OKR delivery, one should not push that view too far. Choosing objectives to match what can be measured, pursuing objectives to the detriment of others or focusing exclusively on objectives in the midst of a crisis all represent dangerous tunnel vision.

There are times when it is right to ignore distractions and the chaos that surrounds us in order to focus on our goals. There are also times when ignoring what is happening around us is irresponsible. Unfortunately there is no rule or metric to tell us which path to follow, when to stay the course and when to go off-piste.

Rules of thumb

For a few days a quarter, when setting OKRs, default to expansive thinking – talk broadly and subjectively. Then narrow the conversation to create objective OKRs.

While executing OKRs, default to objective thinking: focus on targets and measures.

As with all defaults, sometimes you need to override them.

Such problems will only become worse if organizations sanction team members for not meeting OKRs. That might not be a direct reprimand to one’s face – it might be a sarcastic comment, a decision to promote someone with a record of achieving OKRs, a financial bonus not awarded or a smaller pay rise.

Indeed, perceived sanctions – where an employee imagines sanctions when there are none – are probably more dangerous than actual sanctions, because they easily multiply in people’s minds. Such imagined slights are also more difficult to disprove.

Leaders at all levels need to work hard to counter tunnel vision and perceptions. The difficult part is to balance the wide, unblinkered view with the absolute focus that OKRs need to succeed.

As with so much else, iteration can help: look broadly and subjectively when deciding on OKRs, allow time to think expansively and hear different views. Decide on goals, focus the goals with numbers, then execute with that focus. Accept doubts when executing OKRs but don’t jump to change course. When a time-box ends, evaluate the results and return to broad subjective mode to learn lessons and set new OKRs.

Repeat. Iterate. What could be more agile than that?

28.6 A final warning: targets

One doesn’t have to go far into history to find examples of targets that lead good people astray. Whether body counts in Vietnam or cross-selling at Wells Fargo, there are plenty of examples of what happens when targets go too far. Pursuing numerical targets for 12 out of 13 weeks is a powerful approach, but it needs to be moderated if the kind of problems described here are to be avoided.

So when goals are reviewed in the final week of the quarter and new ones set, think broadly. Let everyone speak openly and safely, listen to concerns and think again about purpose and mission.

One quarter pursuing erroneous targets or one quarter encouraging malevolent practices may be bad, but it usually isn’t the end of the world. Be big enough to recognize problems and correct them.

A far bigger mistake is not to recognize problems and to repeat erroneous or misleading targets for another quarter. Or, as agile says: inspect and adapt.

28.7 Summary

Most of this book argues for objective, clear and quantified targets; this chapter highlights the dangers inherent in such targets. Avoiding these dangers starts with awareness. Deliberately iterating between subjective setting and objective execution is one way to balance both sides.

  • Complete measurement might not be possible, and even thorough measurement can be time-consuming and costly.
  • Even a completely rational and quantified decision may not seem equitable or reasonable to employees, customers or other stakeholders.
  • Quantified targets have a bad habit of causing unexpected side effects and unintended consequences. It is therefore important to combine hard quantified targets with a softer understanding of objectives.
  • Aim to set objective OKRs while having subjective conversations about them. During execution be objective in focusing on OKRs, but allow your inner voice to raise doubts.