Book: Succeeding with OKRs in Agile (2nd Edition)
Author: Allan Kelly
Part: II — Writing OKRs
Chapter: 12 — Crafting key results
Reading time: 8 minutes
Tags: crafting-key-results experiment-based-okrs hypothesis-driven-development time-boxed-okrs survey-metrics knowing-when-to-stop key-result-techniques risk-management-okrs learning-oriented-okrs measurement-tools
Summary: Advanced techniques for crafting key results including framing them as experiments (to reduce risk), using hypothesis-driven development, time-boxing uncertain work, and leveraging surveys for qualitative measurement. Kelly emphasizes that phrasing key results as experiments gives teams permission to fail and enhances learning, while time-boxing prevents unlimited investment in uncertain outcomes.
The first order of business is to try. You must try until your brain hurts.
Elon Musk, entrepreneur
When writing key results the temptation is always to write the obvious: ‘Replace the payment screen’ or ‘Add new protocol XYZ’. If you apply the thinking outlined in the last chapter you should already be able to see some improvements here.
As you become experienced in writing OKRs, you will find that there are a number of ‘tricks’ that can be used to make key results more achievable while also giving the team more autonomy. As a result teams should find that they can aspire to greater goals because it is safe to fail.
This chapter outlines a few of the tricks I have found useful when crafting key results. Occasionally they are also useful for objectives.
12.1 Experiments
Of course you would like every key result to deliver benefit to the business, but sometimes you really don’t know what will happen. You don’t know if a solution can be crafted, or if the solution envisaged will produce the desired outcome. Traditionally teams would deal with such issues by undertaking pre-work, usually analysis and planning. But pre-work both detracts from the previous period and can be self-limiting.
Phrasing a key result as an experiment can be a useful way of attempting something with an uncertain outcome, or where the team is doubtful whether it can reach a goal. For example:
Increase customer page views by 10%
Could become:
Run three experiments to increase page views
Lest that sound too much like a plan of action rather than a goal to achieve:
Learn about increasing page views by running three experiments and share findings
Or even several discrete experiments:
Learn about SEO tool changes needed to increase page views with a series of experiments and summarize conclusions
Compare page view statistics from ten experimental modifications to the home page and outline recommended practices
Present measurements from ‘other customers liked…’ experiments added to pages
Notice the ‘share findings’ and ‘summarize conclusions’ appendages which, while vague, imply that the learning isn’t kept in one person’s head.
None of these experiments has a measurable outcome as its success criterion. The measurements that are taken are information generated by the experiment. Success is doing the experiment itself and absorbing the learning that comes from it. Success is not reaching a target; the goal is to learn.
You could include another key result that builds on the experiments to achieve a goal:
Deliver 10% more page views using the results of experiments
While this risks creating domino key results – the experiments may not find a workable solution – some dependencies are acceptable and may be unavoidable.
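The split between measurements as information and a separate target can be sketched in a few lines of Python. This is only an illustration: the class, the field names and the page view figures are invented for the example, and it assumes the baseline and experiment periods are of equal length.

```python
from dataclasses import dataclass


@dataclass
class ExperimentResult:
    """Measurements from one experiment; all names here are illustrative."""
    name: str
    baseline_views: int    # page views in a period before the change
    experiment_views: int  # page views in an equal-length period with the change

    def uplift(self) -> float:
        """Relative change in page views, e.g. 0.15 for a 15% increase."""
        return (self.experiment_views - self.baseline_views) / self.baseline_views


def meets_target(result: ExperimentResult, target: float = 0.10) -> bool:
    """True when the measured uplift reaches the target (10% by default)."""
    return result.uplift() >= target


# Invented figures, purely for illustration.
results = [
    ExperimentResult("more images", baseline_views=20_000, experiment_views=21_000),
    ExperimentResult("other customers liked", baseline_views=20_000, experiment_views=23_000),
]

for r in results:
    # The uplift is reported for every experiment: it is learning either way.
    print(f"{r.name}: {r.uplift():.1%} uplift, 10% target met: {meets_target(r)}")
```

The experiment key results are satisfied once every result has been recorded and shared. Only the follow-on ‘Deliver 10% more page views’ key result depends on `meets_target`.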
The experimental approach also works when the team needs to tackle new technologies. Suppose a team has a web page it wants to make more dynamic, but lacks experience with the necessary technologies. The team might write a key result as an experiment:
Ascertain workload and difficulty of replacing current pages with dynamic JavaScript by undertaking experimental replacement of XYZ pages
(Observant readers might notice the ‘spike story’ hiding in that goal.)
Phrasing a key result as an experiment makes it safer for the team to take on risk. The result of an experiment is learning: the team has learned something, and learning has value. That there might be working or even useful software is just a byproduct of learning. Even if there is no useful product, learning ensures that the team is in a better position to attempt the work next time.
Experiments mean that something is done, not that something is completely finished or finally decided. They just mean that something is done, the outcome is reviewed and learning happens.
Experiments can be particularly useful when dealing with people and process changes:
Experiment with a new workflow and visual board
Experiment with rotating on-call responsibility between team members; each team member should be on call at least twice
Experiments don’t necessarily change things for ever, and one might argue that ‘Learn ABC’ is an action rather than a goal. So don’t abuse this approach, and welcome challenges to it: you might find a better way still.
Still, experiments create learning and options; both have value. Our ‘stock’ of knowledge is increased, alternative ideas are investigated and information gathered before any final decision or commitment is made.
Some experiments are open-ended, for example ‘What happens if there are more images on web pages?’ Others test a particular hypothesis, for example ‘Adding images to web pages causes viewers to engage more’, or ‘Adding images to web pages increases load times and reduces viewers’.
12.2 Hypothesis-driven development
Experimentation can be formalized by stating a hypothesis to be tested up front. Hypothesis-driven development (HDD) is described by Barry O’Reilly:
Practicing hypothesis-driven development is thinking about the development of new ideas, products and services – even organizational change – as a series of experiments to determine whether an expected outcome will be achieved. The process is iterated upon until a desirable outcome is obtained or the idea is determined to be not viable.¹
¹Barry O’Reilly, How to Implement Hypothesis-Driven Development, https://barryoreilly.com/how-to-implement-hypothesisdriven-development/, retrieved September 2020
Hypothesis-based experiments may require little effort and use existing capabilities (for example, ‘$100 spent on Google Adwords will result in over $100 of additional sales’), or they may require significant effort to arrange (‘Android native app will generate over $10,000 in additional revenue’). Naturally, the less effort required to test a hypothesis, the more attractive it is to run an experiment.
Both objectives and key results can be phrased as a hypothesis to be tested by experimentation. Running a particularly involved experiment may be an objective in its own right. When there are multiple ideas for how to reach an objective, several hypotheses could be stated as independent key results, each could be tested, and the most promising used to reach the objective.
O’Reilly suggests a template for these experiments:
We believe <this capability>
Will result in <this outcome>
We will have confidence to proceed when <we see a measurable signal>
When completed that might be a bit too long to put in the OKR itself, so you might put a short description in the OKR and the complete template in an appendix.
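One way to keep the short description and the complete template together is a small record type. The following Python sketch is an assumption about how a team might store this, not part of O’Reilly’s article: the class and field names are invented, and the example wording is adapted from the Adwords hypothesis above.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One hypothesis in the three-part template; field names are illustrative."""
    short_description: str  # the short form that goes in the OKR itself
    capability: str         # completes "We believe ..."
    outcome: str            # completes "Will result in ..."
    signal: str             # completes "We will have confidence to proceed when ..."

    def full_template(self) -> str:
        """Render the complete template, for example for an appendix."""
        return (
            f"We believe {self.capability}\n"
            f"Will result in {self.outcome}\n"
            f"We will have confidence to proceed when {self.signal}"
        )


adwords = Hypothesis(
    short_description="Test whether $100 of Google Adwords pays for itself",
    capability="spending $100 on Google Adwords",
    outcome="additional sales",
    signal="the additional sales exceed $100",
)

print(adwords.short_description)  # for the OKR
print(adwords.full_template())    # for the appendix
```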
Stating an objective or key result as a hypothesis changes the success criteria. The outcome is no longer doing or delivering something; the outcome is learning: proving or disproving a theory. As long as the experiment is done, the outcome is met. Failure would be a failure to do the experiment, or the experiment itself failing.
Of course you probably want your experiments to work and the hypothesis to be proved true. However, disproving a theory can sometimes be more valuable than proving one. When an experiment proves something you already believe, you feel good. When an experiment disproves something, you need to find a new explanation, to learn more, and quite possibly to run more experiments.
12.3 Time-boxed
Careful readers will have noticed that several of the experiments suggested here are time-limited. Time-boxing is an old agile technique that can be used when setting key results. Like experimentation, it can encourage teams to take on risk or step into the unknown.
Experiment with a new workflow and visual board for four weeks.
The idea behind time-boxing is that there is far more work that could be done than there is time to do it. Doing some of the work would represent an improvement. More work might still be needed for a complete solution, but the work done nevertheless makes things better. The goal is analog rather than binary.
Therefore, rather than try to do everything, one can specify an amount of time that can be spent on an activity, knowing that some, although not all, of the benefit will be realized. For example:
Spend one person-week improving the user interface.
Similarly, if you feel you need to investigate something before you decide, then time-box it:
Spend three person-days investigating options to improve the user experience.
This ring-fences time to do the work, but leaves open the actual work to be done: deciding what is to be done is itself part of the work. It might be that day one is spent drawing up a short list of work options, followed by a group review, then someone actually doing the work.
Time-boxing is particularly useful when facing technical work that has no immediate business benefit, but potentially increases capacity to deliver work in future. For example:
Spend two weeks of one person’s time refactoring the feed polling mechanisms.
As with experimentation, setting a key result as a time-box changes the nature of result measurement. No longer is the benefit being measured; rather, what is measured is the fact that the work is actually done. Naturally this implies that one has to be fairly certain that some improvement will result from the work.
I would prefer not to allocate large swathes of time like this where the business benefit is uncertain (for example, ‘Whole team spends ten weeks improving SQL stored procedures’). Time-boxing can nevertheless be useful for striking a balance between competing demands.
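The change in measurement that a time-box brings can be made concrete with a small sketch. The Python below is illustrative only: the `TimeBox` class and its method names are invented, and the activities logged follow the three person-day investigation example above with an assumed split of the days.

```python
from dataclasses import dataclass, field


@dataclass
class TimeBox:
    """A time-boxed key result; the names here are illustrative."""
    description: str
    budget_days: float                        # agreed limit in person-days
    log: list = field(default_factory=list)   # (activity, days) pairs

    def spent(self) -> float:
        """Total person-days logged so far."""
        return sum(days for _, days in self.log)

    def record(self, activity: str, days: float) -> None:
        """Log work against the time-box, refusing to exceed the budget."""
        if self.spent() + days > self.budget_days:
            raise ValueError("time-box exhausted: stop and review")
        self.log.append((activity, days))

    def is_done(self) -> bool:
        """Done means the agreed time was spent, not that a benefit was proven."""
        return self.spent() >= self.budget_days


box = TimeBox("Investigate options to improve the user experience", budget_days=3)
box.record("draw up a short list of options", 1)
box.record("group review", 0.5)
box.record("do the chosen work", 1.5)
print(box.spent(), box.is_done())
```

Nothing in the sketch measures benefit. The only test is whether the ring-fenced time was used, and the `ValueError` marks the point at which the team stops and reviews rather than investing further.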
12.4 Survey
Sometimes you want to make changes to people. Perhaps you want to address a problem that Product Owners in your organization have, or you want customers to see your product in a new light. In such cases you might want to write a key result that is tested by a survey. For example:
‘Improve Product Owner communication by convening a fortnightly show-and-tell.’
Your test could be just ‘show-and-tell took place’. That would be a very binary test (yes it happened/no it didn’t), but it would not tell you whether communications had actually improved. A survey might help here.
‘Improve Product Owner communication by convening a fortnightly show-and-tell. Survey POs after third event and aim to have 75% agreeing that communications are better.’
You might combine this with the experimental approach above:
‘Experiment with regular Product Owner show-and-tell sessions to improve communications. Aim to have 75% of POs agreeing communications are better after six weeks.’
While one could argue that an instruction to run regular show-and-tells was itself too restrictive, there is a balance to be struck.
‘Improve communications between Product Owners. Aim for 75% of POs to agree that communications are better after six weeks.’
This key result could be criticized for being too vague. While it still uses the survey technique, it leaves open the question of what to do. Sometimes this might be the right thing to do, but sometimes something more specific is needed.
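Whichever wording is chosen, the survey test itself is simple arithmetic: the share of respondents who agree, compared with the 75% target. A minimal Python sketch follows, assuming each PO gives a yes/no answer; the function names and the sample responses are invented.

```python
from typing import Sequence


def share_agreeing(responses: Sequence[bool]) -> float:
    """Fraction of respondents who agreed that communications are better."""
    if not responses:
        raise ValueError("no survey responses to evaluate")
    return sum(responses) / len(responses)


def survey_target_met(responses: Sequence[bool], target: float = 0.75) -> bool:
    """True when the share agreeing reaches the target (75% by default)."""
    return share_agreeing(responses) >= target


# Invented responses from eight Product Owners: six agree, two do not.
po_responses = [True, True, True, False, True, True, False, True]

print(f"{share_agreeing(po_responses):.0%} agree")
print("target met:", survey_target_met(po_responses))
```

With a small group a single answer moves the result a long way (here each PO is worth 12.5 percentage points), which is worth bearing in mind when choosing the target.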
12.5 Knowing when to stop
If OKRs are going to be anything more than just statements of things to do, they need to have a clear end state: you need to know when you are done. Knowing when you are done is important because a) it allows you to move on to the next thing, and b) it allows you to take stock of where you are.
Concrete goals are great – ‘Rank top on Google for OKRs searches’ – but such goals can be very open-ended, and some goals are hard to quantify: ‘Make all employees happy’.
Once in a while it pays to step back and ask if you are pursuing the right goal, assess your chances of meeting the goal and consider the costs of doing so. Time-boxes, experiments and surveys provide break points at which you can notch up an achievement and consider your next move.
That said, once in a while you might want to write an open-ended OKR (for example ‘Fix as many bugs as possible this quarter’). Just be aware when you are doing this and ask yourself if there are alternatives.
Since OKRs are set, reviewed and reset on a regular basis, every OKR exists inside a time-box. Even the most open-ended OKR will get reviewed at the end of the quarter.
12.6 Summary
- Make sure you know how your OKRs are to be measured, and if necessary build the measuring tools.
- Phrasing results as experiments allows teams to take on more risk and enhance learning.
- ‘Just do it’ work can be set up as a time-box.
- Combine techniques and experiment with new ones to find what works best for your team in your environment.
- Think imaginatively about how you measure and learn from your own experiences with OKRs.