Statistics Primer
Once tracing is complete and the data has been collected, the next step in analyzing it is to use descriptive statistics to get a feel for the data. TracerBench handles this for you and presents the results in the various TracerBench reports.
Population & Sample
Population and Sample are part of the foundation of statistical hypothesis testing.
A population is the complete collection of data about which you want to draw a conclusion. For example, in a
swimming pool the population is all of the water in the pool. Since testing every drop of water is not realistically
possible, a subset of the population (a subset of the pool water) is tested and analyzed instead. That subset is
called a sample.
To represent the population well, a sample should be randomly collected and adequately large. If the sample is
random and large enough, you can use the information collected from the sample to draw conclusions about the
larger population. A hypothesis test then assesses how strongly the sample supports a claim about the population.
Probability Sampling
Sampling starts with choosing a sampling method (how are we going to sample?). In the example of the swimming pool,
are we going to sample the pool water only from the shallow end? The deep end? Both ends? The method we decide on
will influence the data we end up with.
The two major categories of sampling are probability and non-probability sampling. In probability sampling, each
element (drop of water) of a given population (swimming pool) has a chance of being "picked" as part of the sample
(cup of water). In other words, no element of the population has a zero chance of being picked, and the odds of
picking any element are known or can be calculated. This is possible if we know the total number of elements in the
population. Because probability sampling picks elements from the population at random, every element has a nonzero
chance of being part of the sample.
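As a minimal sketch of the idea above, the following Python snippet draws a simple random sample, which is one form of probability sampling. The population of 1,000 numbered "drops", the sample size, and the seed are all made up for illustration.

```python
import random

# Hypothetical population: 1,000 numbered "drops" standing in for the pool.
population = list(range(1000))
sample_size = 50

rng = random.Random(42)  # fixed seed so the sketch is repeatable
sample = rng.sample(population, sample_size)  # draw without replacement

# Under simple random sampling every element has the same known chance
# of ending up in the sample: n / N.
inclusion_probability = sample_size / len(population)
print(inclusion_probability)  # 0.05
```

Because the population size is known, the chance of any one drop being picked can be calculated directly, which is what makes this probability sampling.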
Null hypothesis (H0)
The null hypothesis is a statement about a population. A hypothesis test uses sample data to determine whether to reject the null hypothesis. The null hypothesis states that a population parameter (such as the mean, the standard deviation, etc.) is equal to a hypothesized value. The null hypothesis is often an initial claim that is based on a previous analysis or insights.
Standard Deviation & Variance
The standard deviation is the most common measure of dispersion, that is, how spread out the data are from the
mean. The greater the standard deviation, the greater the spread in the data.
The variance measures how much the data are scattered about their mean. The variance is equal to the standard
deviation squared.
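The relationship between the two measures can be checked with Python's standard `statistics` module. The durations below are hypothetical values chosen for illustration, and the sample (n - 1) forms of both measures are assumed.

```python
import math
import statistics

# Hypothetical durations in milliseconds, made up for illustration.
durations_ms = [120.0, 130.0, 125.0, 140.0, 135.0]

mean = statistics.mean(durations_ms)
variance = statistics.variance(durations_ms)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(durations_ms)      # sample standard deviation

# The variance is the standard deviation squared.
print(mean, variance, math.isclose(std_dev ** 2, variance))
```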
Confidence Interval
A confidence interval is a range of values, derived from sample statistics, that is likely to contain the value of
an unknown population parameter. Because samples are random, it is unlikely that two samples from a population will
yield identical confidence intervals. However, if the sampling is repeated many times, a certain percentage of the
resulting confidence intervals would contain the unknown population parameter. For example, with a 95% confidence
level, about 95% of the intervals would contain the unknown population parameter and about 5% would not.
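The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is normal with a hypothetical mean of 100 and standard deviation of 15, and the interval uses the normal approximation rather than whatever method TracerBench applies internally.

```python
import math
import random
import statistics

def confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean."""
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    mean = statistics.mean(sample)
    margin = z * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - margin, mean + margin

rng = random.Random(7)            # fixed seed so the sketch is repeatable
true_mean, true_sd = 100.0, 15.0  # hypothetical population parameters
trials = 2000
hits = 0
for _ in range(trials):
    sample = [rng.gauss(true_mean, true_sd) for _ in range(50)]
    low, high = confidence_interval(sample)
    if low <= true_mean <= high:
        hits += 1

coverage = hits / trials
print(f"{coverage:.1%} of intervals contained the true mean")
```

The printed share should land close to 95%, with the remaining intervals missing the true mean.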
Confidence intervals are often more useful than hypothesis tests because they provide a way to assess practical
significance in addition to statistical significance. They help you determine what the parameter value is, instead
of what it is not.
Power
The power of a hypothesis test is the probability that the test correctly rejects the null hypothesis when it is
false. Power is affected by the sample size, the size of the difference (effect), the variance, and the significance
level of the test.
If a test has low power, you might fail to detect an effect and mistakenly conclude that none exists. If a test has
power that is too high, very small effects or changes might appear significant.
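One way to build intuition for power is to estimate it by simulation. The sketch below swaps in a simple two-sided z-test on the difference of means for brevity; it is not the test TracerBench uses. The function name `estimate_power`, the baseline of 100, and the standard deviation of 15 are all assumptions made for illustration.

```python
import math
import random
import statistics

def estimate_power(n, effect, sd=15.0, alpha=0.05, trials=1000, seed=1):
    """Fraction of simulated experiments in which a two-sided z-test rejects H0."""
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        control = [rng.gauss(100.0, sd) for _ in range(n)]
        experiment = [rng.gauss(100.0 + effect, sd) for _ in range(n)]
        diff = statistics.mean(experiment) - statistics.mean(control)
        std_err = math.sqrt(
            statistics.variance(control) / n + statistics.variance(experiment) / n
        )
        if abs(diff / std_err) > z_crit:
            rejections += 1
    return rejections / trials

# Same true effect, different sample sizes: the larger sample detects it more often.
small = estimate_power(n=10, effect=5.0)
large = estimate_power(n=100, effect=5.0)
print(small, large)
```

With the effect held constant, raising the sample size raises the estimated power, which matches the dependence on sample size described above.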
Wilcoxon rank sum with continuity correction
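The Wilcoxon rank sum test compares two independent samples. It pools all observations, ranks them from smallest to largest, and sums the ranks belonging to each sample; if one sample's rank sum is much larger or smaller than expected under the null hypothesis, the two samples likely come from different distributions. (Analyzing the differences within pairs is the related Wilcoxon signed-rank test, which applies to paired data.) Because the rank sum takes only discrete values while the normal distribution used to approximate it is continuous, a continuity correction moves the test statistic 0.5 closer to its expected value before the p-value is computed.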
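The following is a minimal sketch of a two-sided rank sum test using the normal approximation with a continuity correction and the usual adjustment for ties. The helper names `average_ranks` and `rank_sum_test` are hypothetical, and this is not taken from TracerBench's implementation.

```python
import math
import statistics
from collections import Counter

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank sum test via the normal approximation."""
    n1, n2 = len(a), len(b)
    n = n1 + n2
    pooled = list(a) + list(b)
    ranks = average_ranks(pooled)
    # U statistic: rank sum of the first sample minus its minimum possible value.
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    sd_u = math.sqrt(n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1))))
    # Continuity correction: move the statistic 0.5 closer to its mean.
    diff = u - mean_u
    correction = 0.5 if diff > 0 else -0.5 if diff < 0 else 0.0
    z = (diff - correction) / sd_u
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    return u, z, p_value

u, z, p_value = rank_sum_test([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(u, round(z, 3), round(p_value, 4))
```

In the usage example every value in the first sample is smaller than every value in the second, so the U statistic is 0 and the p-value is small.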
Statistical Significance
Statistical significance by itself doesn't imply that your results have practical consequence. If you use a test
with very high power, you might conclude that a small difference from the hypothesized value is statistically
significant. However, that small difference might be meaningless to your situation. Your technical insight should be
leveraged to determine whether the difference is practically significant. With a large enough sample, you can likely
reject the null hypothesis even though the difference is of no practical importance.
(sources: https://cbmm.mit.edu, https://www.wyzant.com, https://support.minitab.com)