November 8, 2020

The Definition of Done for Digital Product Releases

How boys release: the impulse to add a feature → release
How men release: use HADI (Hypothesis → Action → Data → Insights)

Almost any digital product is a very complex system, and, to be honest, we do not know why it works or doesn't. Before you make a change, you cannot say whether it will improve whatever you want to improve. If you think you can, you are most likely biased by your own experience. There are too many parameters to foresee anything: user behavior is too complicated. Sociologists and psychologists know this and test everything. Be like them, because, in a nutshell, we do the same job. The only way to understand something is to analyze changes retrospectively. Hence the first rule, which follows directly from the statement above:

1️⃣ Make sure you test hypotheses before releasing any changes to understand what you do with your product.

But before doing anything at all, you have to understand how to do it properly. Not all hypotheses are worth testing. Remember: yet another great idea of yours ≠ a hypothesis. Moreover, at the stage of transforming an idea into a hypothesis, you may find out that you do not have enough data, or that your take on the situation was biased and the data shows the contrary. This leads us to the second rule:

2️⃣ Before testing, a hypothesis must be backed by enough data to show that the problem really exists, and not only in your imagination.

What counts as sufficient confidence differs between types of testing. If you claim that users cannot figure out how to move deeper into your purchase funnel, conduct enough usability tests (5+; the exact number depends on how many users have problems). If you see a drop-off at one of the funnel steps, calculate confidence intervals for the gathered data (do the math by using my calculator). Often you need to combine the available methods to get pre-testing insights. It depends. There is no good template for an excellent hypothesis. It's more creative work than most people think, even though it's based on hard math. Just get your thoughts together and do not ignore outside data. Reject the null hypothesis like a real researcher, with proper sampling.
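To make the confidence-interval step concrete, here is a minimal Python sketch. It assumes the Wilson score interval as the method (the post does not say which method the calculator uses), and the funnel counts are made-up numbers for illustration only.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 is roughly 95% confidence)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p = successes / trials
    denom = 1 + z ** 2 / trials
    center = (p + z ** 2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)) / denom
    # Clamp to [0, 1] to guard against tiny floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical funnel data: users who reached step 1 and users who moved on to step 2.
reached_step_1 = 1200
reached_step_2 = 780

low, high = wilson_interval(reached_step_2, reached_step_1)
print(f"Step 1 -> step 2 conversion: {reached_step_2 / reached_step_1:.1%} "
      f"(95% CI {low:.1%} to {high:.1%})")
```

If the interval for the suspicious step overlaps heavily with the intervals of neighboring steps or earlier periods, the drop-off may be noise rather than a real problem.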

The next step is to test. The main questions here are what to measure and how to test. Usability testing with the task success rate? A/B testing with the conversion for step 1 → step 2 as the key metric and ARPPU as the guardrail metric?

First of all, choose your business-related metrics (make sure that high-level abstract metrics like ARPPU and LTV are broken down into simpler metrics) and/or UX metrics.

Do not forget to set expected values for the chosen metrics. If you are going to run an A/B test, you can rely on the minimum detectable effect, or MDE (especially if you do not have enough users to detect small changes). But most likely, your expected values will depend on your preferences.

Attention! Numbers ahead! For example, you are going to change the conversion for step 1 → step 2. But the conversion itself is not what you want to increase. It's ARPPU you wish to raise by, let's say, $3 (less is not enough for you). Increasing the conversion by 0.8% will eventually give you the desired ARPPU. Hence, you can set the expected value for the conversion to +0.8%. But imagine that you have calculated the MDE and already know that, with your traffic, you cannot detect an effect smaller than +2%. So you can't set the expected value lower than that. Do you wish to continue with this hypothesis in that scenario? It's up to you. Maybe you will not get +2% by the end of the test. If so, the only conclusion you can come to is that the effect is probably smaller than +2%; it could be +0.8%, or more, or less. So maybe the better decision is to put the hypothesis aside until you can get the MDE down to +0.8% or lower, isn't it?
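The trade-off in this example can be sketched in Python. This is a rough approximation, not a definitive calculator: it assumes a two-sided test on two equal-sized variants, the normal approximation for proportions, a 5% significance level, 80% power, and it reads +0.8% and +2% as absolute percentage points. The 30% baseline conversion and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def absolute_mde(baseline: float, n_per_variant: int,
                 alpha: float = 0.05, power: float = 0.8) -> float:
    """Approximate smallest absolute lift detectable with n users per variant."""
    if not 0 < baseline < 1 or n_per_variant <= 0:
        raise ValueError("baseline must be in (0, 1) and n_per_variant positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    standard_error = math.sqrt(2 * baseline * (1 - baseline) / n_per_variant)
    return (z_alpha + z_power) * standard_error


def users_needed_per_variant(baseline: float, target_lift: float,
                             alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users per variant needed to detect an absolute target_lift."""
    if not 0 < baseline < 1 or target_lift <= 0:
        raise ValueError("baseline must be in (0, 1) and target_lift positive")
    z_total = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil(2 * baseline * (1 - baseline) * (z_total / target_lift) ** 2)


baseline_conversion = 0.30  # hypothetical step 1 -> step 2 conversion

for lift in (0.02, 0.008):  # +2 and +0.8 percentage points
    n = users_needed_per_variant(baseline_conversion, lift)
    print(f"Detecting +{lift:.1%} needs about {n} users per variant")
```

Under these assumptions the required traffic grows with the inverse square of the effect, so detecting +0.8 points takes roughly 6.25 times the users needed for +2 points. That ratio is what you weigh when deciding whether to run the test now or put the hypothesis aside.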

If you have almost nothing to derive expected values from, take a look at similar hypotheses that have already been tested. Find which of them addressed similar problems or had a similar scope of changes, and what the outcomes were.

Let's sum up everything above:

3️⃣ There is no all-purpose recipe for deciding whether to test or not. Learn the theory, do the math, know what is what.

Finally, after forming a hypothesis, conducting tests, and getting insights, you can decide whether to release or not.

About "Best Practices"

If your team releases new features and product changes without forming hypotheses and testing, they don't know what they are doing. You can use "best practices on the market," but all our products are NEW products. When people see them, they see them for the first time. We do not produce chairs. Every time you release a new feature, people use it for the first time ever. Most IT companies don't know what they are doing, and then everybody worldwide copies their "best practices." Bad practices spread more widely than good ones because good is the rare case here. More often, good is good only for them, not for you. So even "best practices" have to be tested for every particular case.

The best time for "best practices" is when you create and test an MVP. In other cases, do your job. But even with an MVP, you move from "best practices" to your own practices by conducting usability tests and interviews as early as possible.

4️⃣ Again: no such term as "best practices" for NEW things.

One More Thing to Remember

Even if all of the above has not persuaded you yet, remember that forming hypotheses gives you something more. Written-down hypotheses are the source of truth for your team in the future, when every teammate has already forgotten what you changed in your product and why. Without them, that forgetting leads to confusion.

5️⃣ Do not make yourself the future source of truth for your team — let it be well-formed hypotheses.
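One possible way to keep hypotheses as a shared record is a small structured entry per hypothesis. The sketch below is only an illustration: the class, its field names, and the sample values are all hypothetical, loosely following the HADI steps plus the evidence and expected values discussed earlier.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HypothesisRecord:
    """A hypothetical HADI-style record; every field name here is an assumption."""
    hypothesis: str                  # what you expect to happen and why
    evidence: List[str]              # data showing that the problem really exists
    action: str                      # the change you are going to test
    key_metric: str                  # e.g. conversion for step 1 -> step 2
    expected_change: str             # the expected value set before testing
    guardrail_metrics: List[str] = field(default_factory=list)
    data: Optional[str] = None       # filled in after the test
    insights: Optional[str] = None   # filled in after the analysis

    def is_ready_to_test(self) -> bool:
        # A hypothesis with no evidence or no expected value is still just an idea.
        return all([self.hypothesis, self.evidence, self.action,
                    self.key_metric, self.expected_change])


record = HypothesisRecord(
    hypothesis="Users drop off at step 1 because the next action is hard to find",
    evidence=["usability tests", "funnel drop-off with confidence intervals"],
    action="Make the step 2 entry point more prominent",
    key_metric="conversion step 1 -> step 2",
    expected_change="+0.8%",
    guardrail_metrics=["ARPPU"],
)
print(record.is_ready_to_test())
```

Whatever form the record takes, the point is that it lives with the team and not in one person's memory.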

What You Can Release Without Testing

The short list of things that you can release without hypotheses:

  • Bug fixes
  • Technical Debt
  • Design Debt

Make sure that your team properly understands what technical debt and design debt actually mean.

Instead of Conclusion

By going through the whole cycle described above, you set the Definition of Done for your releases.