This is the fifth of 12 articles describing my journey through the Growth Marketing Minidegree at the CXL Institute.
At the end of last week, I started the course on statistics fundamentals for testing. I have to admit that I wasn’t a big math and statistics fan back in uni, so this was quite a challenge. Although the course is short and only lasts about half an hour, it’s very intensive and difficult, especially if you’re new to the world of statistics.
The instructor is Ben Labay, who does UX research at ConversionXL and has a background in statistics and data science. According to the course, if you don’t know basic statistics, you can’t properly evaluate test results or even A/B testing case studies. That’s why it covers the statistical concepts you need to draw conclusions from your A/B tests.
Sampling: Population, Parameters, Statistics
It starts with sampling (population, parameters, and statistics). Here are some outcomes of this lesson:
- A population can be considered all potential users or people in a group or things that we want to measure.
- The parameter is the value that we want to compare in the two cases we are testing.
- The statistics involve the standard deviation and the mean.
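These definitions can be made concrete with a tiny simulation. The sketch below is illustrative only: the population, its values, and the sample size are all made up.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is reproducible

# Toy "population": time on page (seconds) for every potential visitor
population = [random.gauss(60, 15) for _ in range(100_000)]
parameter = statistics.mean(population)   # the true value we want to know

# In practice we only ever observe a sample, and compute statistics from it
sample = random.sample(population, 500)
statistic = statistics.mean(sample)       # our estimate of the parameter

print(round(parameter, 1), round(statistic, 1))
```

The sample statistic lands close to the population parameter, which is the whole point of sampling: we estimate what we can’t measure in full.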
Mean, Variance, Confidence
Now we get to some of the more basic concepts on our way to understanding the more important ones in A/B testing, like significance, power, and the importance of sample size.
The mean is the most common measure of central tendency.
The shape of the data, meaning how spread out it is, is known as the variance. The more common measure of variability in statistics is the standard deviation. Truly understanding how it’s calculated, how it relates to the shape of the data, and how it varies depending on what the data looks like is really important for understanding why certain sample sizes are needed, and how confidence levels and intervals work together.
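As a quick illustration, here’s how the mean and standard deviation fall out of a small set of numbers; the daily conversion rates below are made up:

```python
import statistics

# Hypothetical daily conversion rates (%) for a landing page — made-up numbers
daily_rates = [2.1, 2.4, 1.9, 2.6, 2.0, 2.3, 2.2]

mean = statistics.mean(daily_rates)    # central tendency
stdev = statistics.stdev(daily_rates)  # sample standard deviation (spread)

print(round(mean, 2))   # 2.21
print(round(stdev, 2))  # 0.24
```

A wider spread of daily rates would push the standard deviation up, and as the later lessons show, noisier data needs larger samples before a test can be trusted.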
A confidence interval depends on two things:
- the sample size
- the confidence level
The confidence interval represents the amount of error allowed in an A/B test.
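To make that concrete, here’s a rough sketch of a 95% confidence interval for a conversion rate using the normal approximation; the visitor and conversion counts are assumed for illustration:

```python
import math

# Hypothetical test data — made-up numbers
conversions, visitors = 120, 2400  # observed 5% conversion rate
z = 1.96                           # z-score for a 95% confidence level

p = conversions / visitors                      # sample proportion
margin = z * math.sqrt(p * (1 - p) / visitors)  # margin of error

lower, upper = p - margin, p + margin
print(f"{p:.3f} ± {margin:.3f}")  # 0.050 ± 0.009
```

Raising the confidence level (a larger z) or shrinking the sample both widen the interval, which is exactly the trade-off the lesson describes.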
Statistical significance (p-value)
Statistical significance helps us quantify whether a result is likely due to chance. So, when a finding is significant, it means that you can feel confident that it’s real, not that you’ve just got lucky in choosing the samples.
The P-value is the probability of obtaining a difference at least as large as the one we saw in our sample when there really is no difference across all users; in other words, it’s the probability of a false positive.
The conventional, somewhat arbitrary threshold for declaring statistical significance is a P-value of less than 0.05. And what that means is that there’s a less than 5% chance of a false positive.
It’s important to remember that the P-value does not tell us the probability that B is better than A. And similarly, it doesn’t tell us the probability that we’re making a mistake in selecting B over A.
The P-value is what you get after a test is run; it tells you the probability of a false positive. The confidence level, by contrast, is what you set before the test, and it affects the width of the confidence interval.
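Here’s a sketch of how a P-value can be computed with a two-proportion z-test, the kind of calculation behind most A/B testing calculators; all the numbers are made up. With these inputs the P-value comes out around 0.016, under the 0.05 threshold:

```python
import math

# Hypothetical A/B test results — made-up numbers
conv_a, n_a = 200, 5000  # control: 4.0% conversion
conv_b, n_b = 250, 5000  # variant: 5.0% conversion

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se

# Two-sided P-value from the standard normal CDF
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(round(p_value, 4))
```

With the same lift but only 500 visitors per variant, the same formula would give a P-value far above 0.05, which shows why sample size matters so much.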
Statistical power is what lets you look at a test correctly, interpret it, understand what you’re looking at, and know what kind of sample sizes you might or might not need. Basically, the power of any test of significance is defined as the probability that it will reject a false null hypothesis.
Statistical power is the likelihood that a study will detect an effect when there is an effect to be detected.
Bigger effects are easier to detect than smaller ones, and larger samples offer greater test sensitivity than smaller samples.
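A rough power calculation shows both effects at once. This is a simplified normal-approximation sketch with made-up numbers, not a full power analysis:

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# Hypothetical scenario — assumed numbers
p1, p2 = 0.04, 0.05  # baseline rate and the lifted rate we hope to detect
n = 5000             # visitors per variant
z_alpha = 1.96       # z for a two-sided test at 0.05 significance

se = math.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
z_effect = (p2 - p1) / se
power = norm_cdf(z_effect - z_alpha)  # chance of detecting this effect
print(round(power, 2))
```

With these inputs the power lands around 0.67; a bigger gap between `p1` and `p2`, or a larger `n`, would push it toward the conventional 0.8 target.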
Sample size and how to calculate it
For A/B testing, the right sample size comes down largely to how large a difference you want to be able to detect, should one exist at all. The other factors are the confidence level you want to have, the desired power, and the variability of the data: conversion rates closer to 50% have higher variability.
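Putting those factors together, here’s a back-of-the-envelope sample size formula for comparing two proportions; the baseline rate, minimum detectable effect, and the standard 95%/80% z-scores are all assumptions for illustration:

```python
import math

# Hypothetical inputs — made-up numbers
baseline = 0.05  # current conversion rate
mde = 0.01       # minimum detectable effect (absolute lift)
z_alpha = 1.96   # 95% confidence (two-sided)
z_beta = 0.84    # 80% power

p_avg = baseline + mde / 2
n = ((z_alpha + z_beta) ** 2 * 2 * p_avg * (1 - p_avg)) / mde ** 2
print(math.ceil(n))  # visitors needed per variant: 8150
```

Note how `p_avg * (1 - p_avg)` drives the result: it peaks at a 50% rate, which is why rates near 50% are the most variable and need the largest samples.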
The final few minutes of the course were reserved for the four most common statistical traps:
- Regression to the mean and sampling error
- Too many variants
- Click rates and conversion rates
- Frequentist vs. Bayesian test procedures
The next course on the list is Google Analytics. It’s taught by Chris Mercer from MeasurementMarketing.io, someone who is all about analytics and measurement. He has a very impressive background in Google Analytics, Google Tag Manager, Google Data Studio, dashboards, Facebook Analytics, etc.
The course is very intensive. There’s really a lot to learn about Google Analytics, especially if you haven’t been using it before. Personally, I think this is one of the most important segments of growth marketing, so I’ll make sure I pay special attention to the course. The truth is, I thought I knew how Google Analytics works, but it turns out, I’d only scratched the surface. This course dives quite deep into each kind of report and how to use it, so I’m looking forward to leveling up my Google Analytics game this and next week.
I already watched the introduction to the Google Analytics basics, which shows you the difference between Google Analytics, Google Tag Manager, and Google Data Studio, and why it’s important to know them all. It also gives an overview of what you can do with Google Analytics and how to use it to complete your marketing goals.
Next, there’s the introduction to using reports, giving a short overview of each report type. Then, there is the introduction to admin, showing you what you can do when you have admin permissions of a certain GA account. Now, I’ll be learning about real-time reports and their purpose.
The course lasts about 9 hours, and there are many additional resources you should go through to understand what’s going on, so I think the entire next week will be about Google Analytics. Looking forward to seeing how that goes :)