A/B Testing: What to do when you do not have enough traffic on your site? (Part I)
Written by Vinay Roy
Published on 8th July 2020

The best part of product management is that you get to make dozens of impactful decisions every day. While you may have data to assist with some critical decisions, many times there aren’t enough data points. One such case is when you are managing a product, a surface area of the product, or a part of the product funnel that does not get enough users. This makes running an A/B test (or randomized controlled trial) to decide which variant to launch a challenge. If we calculate the sample size for such cases using the methodology described here, we will realize that we would end up waiting many weeks to reach significance. Waiting is not always an option. What do we do in such cases? Product managers at startups live with this every day, but even product managers at big companies run into this issue time and again.

One option is to trust your instinct, but there are many other strategies that you can employ to navigate this issue. In this article I will share some methodologies that you can use when you run into it.

User testing: This is always my most recommended method. Whether you are at a startup or a big company, if you have the budget and the opportunity to reach out to users for user testing, there is nothing like it. You will gain immensely from the qualitative analysis. Sometimes this is the only method available, when your product is either new or does not have enough traffic to justify running an A/B test.

Talk to your users — this is the single most effective A/B test that you can run

Lower the confidence level: Many times, we take a ‘95% confidence level’ as a rule without contextualizing it to our specific scenario. Just to be clear, here are the definitions of significance level and confidence level:

  1. Significance level: The significance level (alpha) is the probability of rejecting the null hypothesis when it is true. We would want it to be as small as possible.
  2. Confidence level: Calculated as (1 - alpha), it is the proportion of times that, if the test were repeated over and over again, the resulting confidence interval would contain the true value, which we have no way to know directly. At a 95% confidence level, that happens 95 percent of the time.
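To make the significance level concrete, here is a minimal simulation sketch. It runs many A/A tests, where both arms share the same true conversion rate, and counts how often a pooled two-proportion z-test wrongly declares a difference. The conversion rate, arm size, number of simulated tests, and function names are illustrative assumptions, not figures from a real experiment.

```python
import random
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions or all conversions: no evidence of a difference
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(alpha, true_rate=0.2, n_per_arm=500, n_tests=2000, seed=42):
    """Share of simulated A/A tests that reject the (true) null hypothesis."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_tests):
        # Both arms draw from the same conversion rate, so any "win" is a type I error.
        conv_a = sum(rng.random() < true_rate for _ in range(n_per_arm))
        conv_b = sum(rng.random() < true_rate for _ in range(n_per_arm))
        if two_sided_p_value(conv_a, n_per_arm, conv_b, n_per_arm) < alpha:
            false_positives += 1
    return false_positives / n_tests


for alpha in (0.05, 0.20):
    rate = false_positive_rate(alpha)
    print(f"alpha = {alpha:.2f}: {rate:.1%} of A/A tests looked significant")
```

The observed false positive rate should land close to alpha in each case, which is exactly the risk we sign up for when we pick a significance level.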

We surely want our confidence level to be high. However, a higher confidence level typically means that more samples are needed to reach significance, and hence a longer time to reach a conclusion.

If the surface area that you are testing has limited traffic, then it is okay to reduce the confidence level to 80% rather than taking 95% as a golden rule. This means accepting a higher risk of false positives, or type I errors: rejecting the null hypothesis when it is true.
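As a rough illustration of the trade-off, the sketch below estimates the sample size per group at a few confidence levels, using the standard normal-approximation formula for comparing two proportions. The baseline of 20%, the target of 22%, and the power of 80% are assumptions picked for illustration, and `sample_size_per_group` is a helper written for this example.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate users per group for a two-sided test of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    pooled_sd = sqrt(2 * p_bar * (1 - p_bar))
    unpooled_sd = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return (z_alpha * pooled_sd + z_beta * unpooled_sd) ** 2 / (p1 - p2) ** 2


for confidence in (0.95, 0.90, 0.80):
    n = sample_size_per_group(0.20, 0.22, alpha=1 - confidence)
    print(f"{confidence:.0%} confidence: about {ceil(n):,} users per group")
```

Under these assumptions, dropping from 95% to 80% confidence cuts the requirement from roughly 6,500 to roughly 3,700 users per group. The price is that about one in five tests with no real effect will still look like a win.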

Reducing the confidence level to work around the constraints of an A/B test has made me realize that:

A/B testing and decision making using statistical tools is as much a science as an art

Use directional impact until you have enough traffic to run statistical tests: Because we sometimes tend to use directional results to confirm our bias by peeking, it is best to avoid peeking as much as possible. One way to do this is to decide beforehand how long we want to run the test given the business needs, and only then interpret the directional impact on the metric of our choice. It is still better than not running an A/B test. Once you start building traffic to your site, you can always reevaluate how the directional impact maps to actual results.

Track Target Persona conversions: Not all product features cater to all personas on your website. When we look at aggregated results, we tend to miss the impact that a feature may have on the target persona. This is related to what statistics calls Simpson’s paradox, where aggregated data can hide or even reverse what is happening within segments. For example, at Zeus Living, suppose we have these two personas (just for the sake of this discussion):
  • a business traveler who travels for a month, and
  • a family relocating to a different city.
If we released a feature that caters only to the second persona, the ‘family relocating to a different city’, and this persona makes up a very small share of total traffic, then a metric that takes the whole website traffic as its denominator will miss the uplift, or will take many more days or weeks to reach significance.

However, if we disaggregate the traffic and look at the impact on just the target persona, the uplift will be much more noticeable and we will reach significance much sooner.
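Here is a small sketch of why the dilution hurts. It assumes, purely for illustration, that the target persona is 10% of traffic, that both personas convert at 5%, and that the feature lifts only the target persona to 6%. It then compares how many total visitors per group are needed when measuring on all traffic versus on the target persona alone. All numbers and names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate users per group for a two-sided test of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    pooled_sd = sqrt(2 * p_bar * (1 - p_bar))
    unpooled_sd = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return (z_alpha * pooled_sd + z_beta * unpooled_sd) ** 2 / (p1 - p2) ** 2


PERSONA_SHARE = 0.10       # hypothetical share of traffic in the target persona
BASELINE = 0.05            # hypothetical conversion rate for every persona
PERSONA_TREATMENT = 0.06   # hypothetical rate for the target persona with the feature

# Measured on all traffic, the lift is diluted by the personas the feature does not touch.
aggregate_treatment = (1 - PERSONA_SHARE) * BASELINE + PERSONA_SHARE * PERSONA_TREATMENT
aggregate_visitors = sample_size_per_group(BASELINE, aggregate_treatment)

# Measured on the persona only, we need fewer users, but only a share of visitors qualify.
segment_users = sample_size_per_group(BASELINE, PERSONA_TREATMENT)
segment_visitors = segment_users / PERSONA_SHARE

print(f"All traffic:    about {ceil(aggregate_visitors):,} visitors per group")
print(f"Target persona: about {ceil(segment_visitors):,} visitors per group")
```

With these made-up numbers, the persona-level test needs several times fewer total visitors, even after accounting for the fact that only one in ten visitors belongs to the persona.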

Track Micro conversions: There are two types of conversions that we can track on our product:

Macro conversions (or macro goals) are the actions a user takes that serve the primary objective of our website. For example, at Zeus Living, a website for stays of 30+ days, the macro goals are leases: we want users to sign a lease on our website. However, realistically only a few users will reach this goal in a day.

Micro conversions (or micro goals) are actions a user takes that are either an important step on the path towards a macro conversion or highly correlated with the macro conversion, even if they are not a necessary step towards it. An example of the first kind would be the number of users who start the booking flow, while an example of the second kind would be the number of users who add dates on the listing detail page of Zeus Living.

Now, since macro conversions are hard to come by, it makes sense to use micro conversions as a proxy when running hypothesis tests. We can also run a regression and test the significance of the correlation to understand the directional impact of a proxy metric on the final macro conversion metric.
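One way to sanity-check a proxy is sketched below. It computes the Pearson correlation between a daily micro conversion count and a daily macro conversion count, then uses a permutation test to judge whether the correlation could be chance. A permutation test is swapped in here for the usual regression t-test only to keep the example within the Python standard library, and the daily counts are invented for illustration.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_permutations=10_000, seed=0):
    """Two-sided p-value: how often shuffled data correlates at least as strongly."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between the two metrics
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Hypothetical daily counts: users who started the booking flow, and leases signed.
booking_starts = [12, 15, 9, 20, 18, 11, 14, 22, 17, 13]
leases = [2, 3, 1, 4, 4, 2, 3, 5, 3, 2]

r = pearson_r(booking_starts, leases)
p = permutation_p_value(booking_starts, leases)
print(f"correlation = {r:.2f}, permutation p-value = {p:.4f}")
```

A strong, significant correlation suggests the micro conversion moves with the macro conversion, though it does not prove that lifting the proxy will lift the macro metric.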

Avoid multivariate tests: The more variants we run, the more samples we need to make a decision. When traffic is low, running an MVT is a really bad idea. A better idea is to pick the top two contenders and pit them against each other. Don’t use A/B testing as a means to automate the human decision-making process. This way you won’t be waiting many weeks to ship the first version.
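To put rough numbers on this, the sketch below estimates total traffic needed as the number of variants grows. It assumes each variant is compared against a shared control, and it applies a Bonferroni correction (alpha divided by the number of variants) to keep the overall false positive risk in check. The correction, the 20% to 22% conversion rates, and the 500 users per day are my assumptions for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate users per group for a two-sided test of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    pooled_sd = sqrt(2 * p_bar * (1 - p_bar))
    unpooled_sd = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return (z_alpha * pooled_sd + z_beta * unpooled_sd) ** 2 / (p1 - p2) ** 2


DAILY_USERS = 500  # hypothetical traffic to the tested surface


def total_users_needed(n_variants, p_control=0.20, p_variant=0.22, alpha=0.05):
    """Total users across control plus n_variants arms, Bonferroni-adjusted."""
    adjusted_alpha = alpha / n_variants  # assumption: one comparison per variant
    n = sample_size_per_group(p_control, p_variant, alpha=adjusted_alpha)
    return ceil(n) * (n_variants + 1)


for k in (1, 2, 3, 4):
    total = total_users_needed(k)
    days = ceil(total / DAILY_USERS)
    print(f"{k} variant(s) + control: {total:,} users, about {days} days")
```

Each extra variant adds another arm to fill and also tightens the threshold every comparison has to clear, so the waiting time grows faster than the number of variants.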

Move the denominator of the metric of interest close to the point of interest: If you are asking how much add-to-cart conversion changes when you change a CTA on the product page, then the denominator for the metric should be the users who actually landed on the product page, rather than the overall traffic to the landing page of the website. Why?

For the same Minimum Detectable Effect (MDE) of 10%, the required sample size depends on how high the baseline conversion is.

[Figure: How baseline conversion impacts sample size required for MDE of 10%]

By moving your denominator closer to the point of interest, you increase the baseline conversion and hence need fewer samples than if you had taken the top-of-funnel traffic as the denominator. Another reason is that the higher in the funnel the denominator traffic is, the higher the variance, and the larger the sample size needed.
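The relationship can be sketched numerically. The example below treats the 10% MDE as a relative lift (an assumption on my part, consistent with a 20% baseline moving to 22%) and prints the approximate sample size per group at several hypothetical baselines, using the normal-approximation formula for two proportions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate users per group for a two-sided test of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    pooled_sd = sqrt(2 * p_bar * (1 - p_bar))
    unpooled_sd = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return (z_alpha * pooled_sd + z_beta * unpooled_sd) ** 2 / (p1 - p2) ** 2


RELATIVE_MDE = 0.10  # assumption: MDE is a 10% relative lift over the baseline
baselines = (0.01, 0.02, 0.05, 0.10, 0.20, 0.40)  # hypothetical baseline rates

sizes = [sample_size_per_group(b, b * (1 + RELATIVE_MDE)) for b in baselines]

for baseline, n in zip(baselines, sizes):
    print(f"baseline {baseline:>4.0%}: about {ceil(n):,} users per group")
```

The required sample size drops steeply as the baseline rises, which is why a denominator closer to the point of interest, with its higher conversion rate, reaches significance sooner.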

Use a one-sided test rather than a two-sided test: The sample size needed for a one-sided significance test is much lower than for a two-sided test.

  • If we want to validate that Control ≠ Treatment, we have to test the null hypothesis Control = Treatment using a two-sided test;
  • If we want to validate that the treatment is better than control (Treatment > Control), our null hypothesis is Treatment ≤ Control, and we should use a one-sided test.

power.prop.test(n = NULL, p1 = 0.2, p2 = 0.22, power = 0.8, alternative = "two.sided", sig.level = 0.05)

Two-sample comparison of proportions power calculation

             n = 6509.467
            p1 = 0.2
            p2 = 0.22
     sig.level = 0.05
         power = 0.8
   alternative = two.sided

NOTE: n is number in *each* group

power.prop.test(n = NULL, p1 = 0.2, p2 = 0.22, power = 0.8, alternative = "one.sided", sig.level = 0.05)

Two-sample comparison of proportions power calculation

             n = 5127.385
            p1 = 0.2
            p2 = 0.22
     sig.level = 0.05
         power = 0.8
   alternative = one.sided

NOTE: n is number in *each* group

So instead of about 6,500 samples per group, we need only about 5,100, roughly 21% fewer samples to reach significance.
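For readers who do not use R, the same comparison can be approximated in Python. This is a sketch using the closed-form normal-approximation formula for two proportions rather than a port of `power.prop.test` itself, and the function name is my own. For these inputs it should agree with the R output above to within a fraction of a sample.

```python
from math import sqrt
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.8, two_sided=True):
    """Approximate users per group for a test of two proportions."""
    tail = alpha / 2 if two_sided else alpha  # a one-sided test spends alpha in one tail
    z_alpha = NormalDist().inv_cdf(1 - tail)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    pooled_sd = sqrt(2 * p_bar * (1 - p_bar))
    unpooled_sd = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return (z_alpha * pooled_sd + z_beta * unpooled_sd) ** 2 / (p1 - p2) ** 2


n_two_sided = sample_size_per_group(0.2, 0.22, two_sided=True)
n_one_sided = sample_size_per_group(0.2, 0.22, two_sided=False)
reduction = 1 - n_one_sided / n_two_sided

print(f"two-sided: {n_two_sided:.1f} per group")
print(f"one-sided: {n_one_sided:.1f} per group")
print(f"reduction: {reduction:.1%}")
```

The saving comes entirely from the smaller critical value: a one-sided test at 5% uses the 95th percentile of the normal distribution instead of the 97.5th.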

That is it for this article. In the next post, I will discuss two of my favorite methods for calculating significance on low-traffic websites: sequential sampling and continuous monitoring.

