In this three-part series we’ll explore Experiment Driven Development (EDD) and A/B testing. The opposite of building features based on anecdotes heard from the CEO’s next-door neighbor, EDD is iterative and seeks proof. EDD can be defined as fact-based development: development driven by evidence gathered from the field, not intuition.
In EDD, every new feature or process is validated through a formal experiment design process that tests a hypothesis about the status quo of the feature. Example hypotheses could range from “Making a button bigger does not impact clicks” all the way to “Making a web app responsive does not increase visitation.” In statistical terms, the base statement is referred to as the null hypothesis (H0), the status quo, and an alternative hypothesis (H1) is proposed against it. The null hypothesis usually states that the change introduced by the experiment will not affect current behavior, while the alternative states that there is in fact a change.
The alternative hypothesis is a prediction, made before running the experiment, of what is expected to happen. It should be a bold statement, not an open question, and it should have three parts:
- The variable (if we add/change/remove…): the change that the experiment will measure against the current state of the feature/process.
- The desired result (then we expect to see…): what we expect to see after the change is introduced, a qualitative difference between the current state and the new state.
- The rationale behind the prediction (because we have seen that…): the prior knowledge, drawn from prior observation, that led you to formulate the hypothesis.
For example, one can define the pair of hypotheses for a new registration form on a website as follows:
- H0: Changing the registration form from multiple pages to a single page will not impact the current user registration rate.
- H1: Changing the registration form from multiple pages to a single page will increase the current user registration rate by 5%, because we have previously seen a 5% abandonment rate on the multi-page form.
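To make this concrete, here is a minimal sketch of how such a hypothesis pair could be evaluated once the experiment has run, using a one-sided two-proportion z-test from statsmodels (an assumed dependency, not necessarily what we use at Sama). All counts are made-up placeholders, not real experiment data:

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical outcome counts: [treatment (single page), control (multi page)]
registrations = [310, 260]   # users who completed the form
visitors = [2000, 2000]      # users who saw each variant

# H0: the single-page form does not change the registration rate.
# H1: the single-page form increases it, so we run a one-sided test.
z_stat, p_value = proportions_ztest(
    count=registrations,
    nobs=visitors,
    alternative="larger",  # tests whether the first proportion is larger
)

alpha = 0.05  # conventional significance level
if p_value < alpha:
    print(f"Reject H0 (p={p_value:.4f}): the change likely increased registrations.")
else:
    print(f"Fail to reject H0 (p={p_value:.4f}): no significant difference detected.")
```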
EDD is based on A/B Testing, a randomized experiment method for comparing two variants of a single variable. A baseline metric is compared between a control group (status quo) and a treatment group (new feature) to determine whether the variation has a significant impact. Ideally, most decisions to release a feature would be based on the results of A/B Tests. At Sama, we want to find viable ideas or fail fast. Instead of developing a monolithic solution and pushing a release, we iterate through experiments, evaluating how features perform and, most importantly, if and how customers use them.
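As an illustration of the “randomized” part, one common approach is to assign each user to a group deterministically by hashing their ID, so a returning user always sees the same variant. The sketch below is hypothetical (the function name and the experiment label are illustrative, not Sama’s actual implementation):

```python
import hashlib

def assign_variant(user_id: str, experiment: str) -> str:
    """Return 'control' or 'treatment' for a given user and experiment."""
    # Salting the hash with the experiment name makes different
    # experiments split users independently of each other.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # map the hash to a 0-99 bucket
    return "treatment" if bucket < 50 else "control"  # 50/50 split

print(assign_variant("user-42", "single-page-registration"))
```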
Next up: A/B Testing and A/B Testing with Python.