How to compare two systems

Jan 04, 2023

A common situation in any project is having to compare two alternatives. For me, it comes up most often during performance analysis, for example when evaluating the improvement brought by a new platform deployment or a setup change.

These comparisons are usually settled by drawing nice charts and comparing the results visually, but that method is not always accurate: the precision of the chart, the way the data is represented and your own eyes can all lead you to the wrong conclusion.

Two simple statistical approaches can make your life much easier here, and this post gives a quick overview of both. Note that these techniques apply when there are exactly two systems to compare and the workload is similar.

Paired observations

Suppose n experiments are run on each of the two systems of interest. If there is a one-to-one correspondence between the i-th observation on one system and the i-th observation on the other, for example because the same workload is run under the same environmental conditions, the observations are paired. Otherwise, they are non-paired.

In this situation, we treat the data as a single sample of n pairs, where each component of a pair is the observation for one system. For each pair, we calculate the difference in performance (in whatever metric you are measuring), and then compute a confidence interval for the mean of those differences. If the interval includes zero, the two systems are not significantly different at that confidence level.
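The paired procedure can be sketched in a few lines of Python. This is a minimal sketch, not the spreadsheet from this post: the function names and the measurements are hypothetical, and the t quantile is passed in by hand, as if it had been looked up in a Student's t table for n - 1 degrees of freedom.

```python
import math
import statistics


def paired_confidence_interval(system_a, system_b, t_quantile):
    """Confidence interval for the mean difference of paired observations.

    t_quantile is the Student's t quantile for n - 1 degrees of freedom at
    the desired confidence level, taken from a t table (an assumption of
    this sketch: it is not computed here).
    """
    if len(system_a) != len(system_b):
        raise ValueError("paired observations need samples of equal size")
    # One difference per pair: this is the single sample we analyse.
    diffs = [a - b for a, b in zip(system_a, system_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    half_width = t_quantile * std_err
    return mean_diff - half_width, mean_diff + half_width


def includes_zero(interval):
    low, high = interval
    return low <= 0.0 <= high


# Hypothetical measurements: the same six workloads run on both systems.
system_a = [5.4, 16.6, 0.6, 1.4, 0.6, 7.3]
system_b = [19.1, 3.5, 3.4, 2.5, 3.6, 1.7]

# 0.95 quantile of Student's t with 5 degrees of freedom,
# which gives a 90% two-sided confidence interval.
T_90_PERCENT_5_DOF = 2.015

interval = paired_confidence_interval(system_a, system_b, T_90_PERCENT_5_DOF)
print(interval, includes_zero(interval))
```

With these made-up numbers the interval is roughly (-7.75, 7.12). It contains zero, so the data do not show a significant difference between the two systems at the 90% level.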

And the great thing is that you can easily run these numbers in Excel or a similar spreadsheet.

Example

Variance difference comparison

T-dist table to find the percentiles using the student distribution

As the reader can see, the confidence interval in this example includes zero, so we cannot conclude that the two systems differ at the chosen confidence level.

Non-paired observations

When the observations are not paired, which is the most common situation, the analysis is a bit more complex but still straightforward. We’re going to use a t-test.

Given two samples of sizes na and nb, one per system, we need to estimate the standard deviation of the difference between their means and the effective number of degrees of freedom.

For this, we need to:

Calculate the mean and standard deviation of each sample
Compute the difference of the means and the standard deviation of that difference
Calculate the effective number of degrees of freedom
Finally, compute the confidence interval for the mean difference

If the confidence interval includes zero, the difference is not significant; otherwise, the sign of the interval shows which system is better.
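The steps above can be sketched in Python as follows. This is a hedged sketch under stated assumptions: it uses the standard Welch-Satterthwaite approximation for the degrees of freedom (some textbooks use a slightly different variant), the function names and measurements are hypothetical, and the t quantile is again looked up by hand in a t table once the degrees of freedom are known.

```python
import math
import statistics


def unpaired_summary(sample_a, sample_b):
    """Return (mean difference, its standard deviation, degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Step 1: sample means and variances.
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    # Step 2: mean difference and the standard deviation of that difference.
    mean_diff = mean_a - mean_b
    term_a, term_b = var_a / n_a, var_b / n_b
    std_err = math.sqrt(term_a + term_b)
    # Step 3: effective degrees of freedom (Welch-Satterthwaite approximation,
    # an assumption of this sketch).
    dof = (term_a + term_b) ** 2 / (
        term_a ** 2 / (n_a - 1) + term_b ** 2 / (n_b - 1)
    )
    return mean_diff, std_err, dof


def confidence_interval(mean_diff, std_err, t_quantile):
    # Step 4: interval around the mean difference.
    half_width = t_quantile * std_err
    return mean_diff - half_width, mean_diff + half_width


# Hypothetical measurements from independent runs on each system.
system_a = [5.4, 16.6, 0.6, 1.4, 0.6, 7.3]
system_b = [19.1, 3.5, 3.4, 2.5, 3.6, 1.7]

mean_diff, std_err, dof = unpaired_summary(system_a, system_b)
print(mean_diff, std_err, dof)

# dof rounds to 10 here, so we use the 0.95 quantile of Student's t with
# 10 degrees of freedom (90% two-sided interval), taken from a t table.
T_90_PERCENT_10_DOF = 1.812

low, high = confidence_interval(mean_diff, std_err, T_90_PERCENT_10_DOF)
print((low, high), low <= 0.0 <= high)
```

The computation is split in two on purpose: the t quantile depends on the degrees of freedom, so you first compute the summary, then look up the quantile, and only then build the interval. With these made-up numbers the interval contains zero, so the difference is not significant.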

As noted, the calculation is a “bit” more complex, but anyone can still do it in Excel.

Example

T.test details for a non-paired comparison

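And if you just want to use built-in spreadsheet functions, Google Sheets, for example, provides the T.TEST function. It returns the p-value of the test: the probability of seeing a difference at least as large as the observed one if the two sets actually had the same mean. A small value (for example, below 0.05) suggests that the two systems really are different.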

Pere’s corner
