A/B Testing, explained

What it is, how it works and why it’s important

In the year 2000, Google engineers wanted to determine the optimal number of results to display on a search results page. To gather evidence to help them make this decision, they performed something called an A/B test.

This was the first-ever A/B test done by Google, and the company is often credited as a pioneer of A/B testing in the digital age. Today, big tech companies like Amazon, Netflix, and Spotify also recognize the value of this method and have run thousands of A/B tests to streamline their services.

Aside from the tech giants, many smaller companies also use this approach to make data-driven decisions, especially marketing decisions, to help improve their content and provide the best customer experience.

Now that you’re introduced to A/B testing, let’s dive into the weeds of this test.

What is A/B testing?

A/B testing allows you to compare two versions (A and B) of something to determine which is more effective.

To understand it better, let’s go back about 100 years and see how A/B testing originated.

In the 1920s, the statistician Sir Ronald A. Fisher developed randomized controlled experiments to answer agricultural questions, such as the effect of fertilizers on crops. This work ultimately laid the foundations and principles for A/B testing.

The approach was later used in the 1960s to drive marketing strategies, such as deciding between postcards and letters, or between television and radio ads.

With the birth of the internet, A/B testing went digital. Today, it’s used to compare two versions of an app, a website, or even things like newsletters and email subject lines to determine which is more effective (according to a specific metric).

How does it work?

An A/B test is a randomized controlled experiment (assuming it meets the conditions of controlled factors and randomized users).

What does that mean? Let’s break it down.

Randomized

Randomization’s main purpose is to remove hidden bias from an experiment. It gives us the ability to say that our results were actually caused by the change we introduced (in the context of A/B testing), rather than by some other factor.

The general rule is that random sampling lets us generalize our findings to the population, and random assignment of treatment lets us establish a cause-and-effect relationship.

Controlled

It is controlled because we have a control group and a treatment group: the control group gets the current version of things, and the treatment group gets the new version. This gives us something to compare our results against; in other words, the control group is our baseline.

To put this all together, in the context of a web page design, users are randomly sampled and then randomly assigned to a control group and a treatment group. The control group sees the original version of the page, and the treatment group sees the new design.
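
To make this concrete, here’s a minimal Python sketch of both steps. The user base, sample size, and group sizes below are invented for illustration:

```python
import random

random.seed(42)  # fixed seed so the example is reproducible

all_users = [f"user-{i}" for i in range(100_000)]  # hypothetical user base

# Random sampling: choose who enters the experiment,
# so the findings generalize to the whole population.
participants = random.sample(all_users, k=2_000)

# Random assignment: split participants evenly into two groups,
# so differences in outcomes can be attributed to the change.
random.shuffle(participants)
control = participants[:1_000]    # sees the original page
treatment = participants[1_000:]  # sees the new design
```

In practice, assignment is often done deterministically (for example, by hashing user IDs) so a returning user always sees the same version; the shuffle above is just the simplest way to show the idea.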

Testing for effect

To test for any effects or significance, we use hypothesis tests. The assumed distribution of the samples determines what kind of test to use. For example, clicks follow a binomial distribution (each user either clicks or doesn’t), so the click-through rate is typically analyzed with Fisher’s exact test.
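
As a sketch of what that looks like in practice, here’s Fisher’s exact test with SciPy; the click counts below are made up for illustration:

```python
from scipy.stats import fisher_exact

# 2x2 contingency table: [clicked, did not click] for each group
control = [30, 970]    # version A: 30 clicks out of 1,000 users (hypothetical)
treatment = [45, 955]  # version B: 45 clicks out of 1,000 users (hypothetical)

odds_ratio, p_value = fisher_exact([control, treatment])
print(f"odds ratio = {odds_ratio:.2f}, p-value = {p_value:.4f}")

# A small p-value (conventionally below 0.05) would suggest the difference
# in click-through rates is unlikely to be due to chance alone.
```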

Thus, the A/B test allows you to compare two versions of a single variable and test for an “effect” of the new feature scientifically and not just by gut instinct.

Let’s solidify all these details with an example.

Example: Which button is better?

Let’s say you have a new website, and you want to find out whether a new button design will encourage more users to click to learn more about your company.

[Image: the two button variants (source: Wikimedia)]

From the image above, we see two variants of the button, which we will call A and B.

  • version A has a simple blue button
  • version B has a green button with an arrow

Before we start our experiment, we define a specific hypothesis. For this example, it is that revamping the design of the “Learn more” button will increase the rate at which users click it.

To run our experiment, we first take a simple random sample of the users and then randomly assign them, splitting them evenly into a control group (users who see version A) and a treatment group (users who see version B).

We let the experiment run for a week or two and then compare the two groups on a specific metric. In our case, the metric is the click-through rate.

Then, based on the evidence from your test, you can decide whether the change in color scheme and the addition of an arrow made a significant enough difference. From there, you can roll out the change to all users or continue iterating on the experiment to gather stronger evidence.
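
As a sketch of that final comparison, here’s one way to analyze the results with a two-proportion z-test from statsmodels (a common large-sample alternative to Fisher’s exact test); all counts are invented for illustration:

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical results after two weeks
clicks = [120, 156]   # clicks in control (A) and treatment (B)
users = [5000, 5000]  # users shown each version

print(f"click rate A: {clicks[0] / users[0]:.2%}")  # 2.40%
print(f"click rate B: {clicks[1] / users[1]:.2%}")  # 3.12%

z_stat, p_value = proportions_ztest(count=clicks, nobs=users)
print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")

# Here p < 0.05, so the observed lift is unlikely to be chance alone;
# with a larger p-value we might keep iterating instead.
```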

Now that you know what A/B testing is, here are a few more examples of different applications of A/B tests.

Bonus examples

1. E-commerce store

Goals: Increase the number of checkouts and improve customers’ shopping experience.

A/B tests on:

  • product recommendations
  • design
  • color scheme

2. Media Company

Goals: Increase views and impressions, and improve SEO

A/B tests on:

  • newsletter
  • font
  • presence of images and graphics

3. Software Company

Goals: Increase sales and attract buyers

A/B tests on:

  • special offers
  • product pricing
  • call-to-action

A/B testing is a very useful framework in this digital age. It’s relatively inexpensive to implement, and it helps leaders make data-driven decisions. It’s also popular in data science and is a valuable skill to have as a data scientist.

This was a very simple introduction to A/B testing; we didn’t get into the nitty-gritty of hypothesis testing and statistics, or how to run an A/B test yourself. No worries! I’ve linked a few resources below 👇 for you!

Thank you for reading.


Follow Bitgrit’s socials 📱 to stay updated!