◆ Powerful

Variance as Spread

Two data sets with identical averages can be entirely different in character. Variance and standard deviation measure how far outcomes scatter around the centre — and in finance, medicine, and public policy, ignoring spread is how people get badly misled.

Time: 12 minutes

Requires: Unit 1.5

Opening Hook

Suppose you are comparing two investment products. Both are marketed as offering an average annual return of 7 percent. Your financial adviser presents them side by side on a single sheet of paper with that number in bold at the top. Same expected return. Same time horizon. Same fees. From the summary sheet, they look identical.

They are not identical. Product A has returned between 5 and 9 percent in every year for the past decade. Its worst year was 5 percent; its best was 9 percent. Product B has returned between minus 23 percent and plus 37 percent over the same period. Its worst year wiped out nearly a quarter of your capital; its best year grew it by more than a third. The average across all ten years, for both products, is 7 percent. The experience of holding them is completely different.

The number that tells you this is not the average. It is the spread.

The Concept

In the previous unit, you learned about expected value: the probability-weighted average of all possible outcomes. Expected value tells you where the centre of a distribution sits. Variance tells you how far the outcomes scatter around that centre.

Here is the formal definition. The variance of a set of numbers is the average of the squared distances from the mean. Take each value, subtract the mean, square the result, and then average all those squared results. Squaring serves two purposes: it makes all the distances positive (so that values above and below the mean do not cancel each other out), and it gives extra weight to outcomes that are far from the centre.
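The recipe in the paragraph above can be sketched in a few lines of Python. This is a minimal illustration rather than a library-grade implementation: the function name `population_variance` and the example numbers are assumptions made up for this sketch. It divides by the number of values, as in the definition given here; some textbooks divide by one less than that when the data are a sample drawn from a larger population.

```python
def population_variance(values):
    """Average of the squared distances from the mean."""
    mean = sum(values) / len(values)
    # Squaring makes every distance positive and weights far-off values more.
    squared_distances = [(v - mean) ** 2 for v in values]
    return sum(squared_distances) / len(squared_distances)


# Hypothetical data: eight values whose mean is 5.
example = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(example))  # 4.0
```

If every value in the list were the same, every distance from the mean would be zero and the function would return 0.0: no spread at all.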

The squaring, however, creates a units problem. If your data is in pounds, the variance comes out in pounds-squared, which is difficult to interpret in concrete terms. The fix is the standard deviation: the square root of the variance. This brings the measure back into the original units. If your investment returns are measured in percentage points, the standard deviation is also in percentage points. A fund with a standard deviation of 2 percentage points is doing something very different from one with a standard deviation of 15 percentage points, even if both have the same mean return.

For Product A above, the standard deviation is roughly 1.3 percentage points. For Product B, it is roughly 18 percentage points. Those two numbers, placed alongside the average return, tell you far more about the character of each investment than the average does on its own.
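The year-by-year figures behind the two products are not given here, so the lists below are invented purely for illustration: ten hypothetical annual returns per product, chosen to fit the stated ranges and the 7 percent average. With those assumed figures, Python's standard `statistics` module gives standard deviations close to the ones quoted.

```python
import statistics

# Hypothetical annual returns in percent (illustrative, not real fund data).
product_a = [5, 5, 6, 7, 7, 7, 8, 8, 9, 8]
product_b = [-23, 37, 25, -10, 12, 3, 18, -8, 22, -6]

for name, returns in [("A", product_a), ("B", product_b)]:
    mean = statistics.mean(returns)
    # pstdev divides by the number of values (the "population" form).
    spread = statistics.pstdev(returns)
    print(f"Product {name}: mean {mean:.1f}, standard deviation {spread:.1f}")
```

Both lists average exactly 7, yet the spread differs by more than a factor of ten. A different set of invented returns would give slightly different standard deviations, so treat the output as indicative only.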

One further refinement is worth knowing. If you want to compare spread across data sets that are measured in different units, or that have very different means, the standard deviation alone is not enough. A standard deviation of 3 might be large if the mean is 5, and tiny if the mean is 500. The coefficient of variation corrects for this by dividing the standard deviation by the mean and expressing the result as a percentage. A data set with a mean of 5 and a standard deviation of 3 has a coefficient of variation of 60 percent. One with a mean of 500 and a standard deviation of 3 has a coefficient of variation of 0.6 percent. The coefficient of variation lets you compare the relative spread of data sets regardless of their scale, although it is only meaningful when the mean is well away from zero.
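The two worked figures above can be checked with a short sketch. The function name `coefficient_of_variation` is an assumption for this example, and the zero-mean check is a precaution added here, since dividing by a mean of zero is undefined.

```python
def coefficient_of_variation(mean, std_dev):
    """Standard deviation expressed as a percentage of the mean."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return 100.0 * std_dev / mean


print(coefficient_of_variation(5, 3))    # 60.0
print(coefficient_of_variation(500, 3))  # 0.6
```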

In everyday terms, the most important thing to understand is this: variance is uncertainty. High variance means a wide range of possible outcomes. Low variance means a narrow range. Two things with the same average can have completely different levels of uncertainty attached to them, and that distinction often matters more than the average itself.

Why It Matters

The same-average, different-spread problem shows up in domains where the consequences of ignoring it can be severe.

In finance, the entire field of risk management is built on this distinction. When a fund manager tells you about historical returns, the mean return is only half the story. A fund with high volatility (high variance in returns) exposes you to sequence-of-returns risk: if the fund has a bad year at the wrong time, say, shortly before you need to draw on the capital, the damage may be irreversible. Two investors who earn the same average annual return over twenty years, and who are adding or withdrawing money along the way, might end up with very different amounts depending on the order in which the good and bad years arrived. The average return is silent on this.
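A rough sketch of the ordering effect, under stated assumptions: the three annual returns, the starting balance of 100,000, and the 10,000 yearly withdrawal are all hypothetical, and the withdrawal is assumed to happen at the end of each year. The helper `final_balance` is an illustrative name, not a standard function.

```python
def final_balance(start, returns_pct, yearly_withdrawal=0.0):
    """Apply each year's return, then take out the withdrawal."""
    balance = start
    for r in returns_pct:
        balance = balance * (1 + r / 100) - yearly_withdrawal
    return balance


good_years_first = [20, 10, -15]
bad_years_first = [-15, 10, 20]  # same returns, reversed order

# No money in or out: the order makes no difference to the final balance.
print(round(final_balance(100_000, good_years_first)))
print(round(final_balance(100_000, bad_years_first)))

# Withdrawing each year: the investor who meets the bad year first ends lower.
print(round(final_balance(100_000, good_years_first, 10_000)))
print(round(final_balance(100_000, bad_years_first, 10_000)))
```

With these assumed numbers the untouched investment ends at about 112,200 either way, while the withdrawing investor ends at roughly 84,350 if the good years come first and roughly 77,000 if the bad year comes first. The ordering matters once money is moving in or out of the pot.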

In medicine, the spread problem can be actively dangerous to ignore. A clinical trial might find that a new treatment produces, on average, a 10-point reduction in systolic blood pressure. That sounds straightforwardly good. But if the distribution of outcomes is wide, for example a 40-point reduction in half the patients and a 20-point increase in the other half, the average effect says almost nothing useful about what the drug will do to any particular person. A treatment that helps some patients enormously while harming others might show the same mean effect as a gentler drug that produces consistent modest improvements across the board. If the trial report shows only the mean, you cannot distinguish between these two very different drugs. Responsible clinical reporting shows the full distribution of outcomes, or at minimum the standard deviation, precisely because the mean alone can mislead.
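To make the two-drug comparison concrete, here is a small sketch with invented trial data. Both patient lists are hypothetical: "Drug X" assumes half the patients drop 40 points and half rise 20, while "Drug Y" assumes everyone improves by roughly 10 points. Negative numbers are reductions in blood pressure.

```python
import statistics

# Hypothetical changes in systolic blood pressure (negative = reduction).
drug_x = [-40] * 5 + [20] * 5
drug_y = [-12, -8, -10, -11, -9, -10, -13, -7, -10, -10]

for name, changes in [("X", drug_x), ("Y", drug_y)]:
    mean = statistics.mean(changes)
    spread = statistics.pstdev(changes)
    # Share of patients whose blood pressure went up rather than down.
    worse = sum(1 for c in changes if c > 0) / len(changes)
    print(f"Drug {name}: mean {mean:.1f}, SD {spread:.1f}, share worse off {worse:.0%}")
```

Both drugs show a mean change of minus 10, but only the standard deviation and the share of patients made worse reveal that they behave very differently.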

In policy, identical average outcomes are frequently used to declare success when the underlying distributions are moving in opposite directions. Average income rising while median income stagnates usually means the gains are concentrated at the top: the typical household is no better off, the distribution of incomes is becoming more spread out, and the average is dragged upward by a minority of high earners. Average exam scores at a school can be identical to those at a neighbouring school while concealing a completely different shape: one school with scores tightly clustered around the average, another with scores scattered from very low to very high. Whether that spread is a problem or a feature depends on what the school is trying to do, but reporting only the average makes the question invisible.
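The income case can be illustrated with a toy example. The two lists below are invented household incomes (in thousands) for the same ten households at two points in time; they are assumptions for this sketch, not real statistics.

```python
import statistics

# Hypothetical household incomes in thousands, at two points in time.
before = [18, 22, 25, 28, 30, 32, 35, 40, 60, 110]
after = [16, 20, 24, 28, 30, 32, 36, 44, 80, 190]

for label, incomes in [("before", before), ("after", after)]:
    print(
        f"{label}: mean {statistics.mean(incomes):.1f}, "
        f"median {statistics.median(incomes):.1f}, "
        f"standard deviation {statistics.pstdev(incomes):.1f}"
    )
```

In this made-up data the mean rises from 40 to 50 while the median stays at 31, and the standard deviation grows: the headline average improves even though the middle household has not moved.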

How to Spot It

The clearest documented case of variance being suppressed to mislead comes from the UK with-profits life insurance industry in the 1990s and early 2000s. With-profits policies were sold to millions of savers as products that offered “smoothed” returns: the insurer would hold back some returns in good years and release them in bad years, producing a steadier income stream than direct equity investment. The headline figure used in most marketing materials was the projected annual return, typically shown as a single number.

What many policyholders were not told was that the smoothing mechanism relied on a large buffer of retained surplus, and that the true variability of returns, while managed in the short term, was stored up and released through terminal bonuses that could be cut substantially without notice. When equity markets fell sharply in 2000 to 2002, dozens of insurers imposed Market Value Reductions, penalties that could reduce a policy’s cash-in value by 15 to 20 percent relative to what policyholders had been led to expect. The headline return figures had shown a smooth, reassuring average. The spread in actual outcomes, concentrated in the tail, had been effectively invisible.

The tell, in this case and in most cases where variance is being suppressed, is the absence of a range. Any time a financial product, a drug, a policy, or a forecast is described only by its average or expected outcome, and no range or uncertainty band is shown, variance is being withheld. The question to ask is: what does the distribution of outcomes look like? What happened to the people who got less than the average? What is the worst plausible case, and how likely is it?

A related tell is the use of long-term historical averages to imply a reliable single outcome. “This fund has returned an average of 8 percent per year over twenty years” tells you nothing about whether the returns were stable or wildly variable year to year. It is consistent with twenty consecutive years of 8 percent, or with a sequence that included minus 40 percent and plus 60 percent in adjacent years. The average is the same. The experience is not.
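The point about long-term averages can be sketched as follows. Both twenty-year sequences are hypothetical; the volatile one is constructed so that it contains minus 40 and plus 60 in adjacent years and still averages exactly 8 percent. The helper `growth_of_one` is an illustrative name that compounds one unit of money through the sequence, assuming no money is added or withdrawn.

```python
import statistics

# Two hypothetical twenty-year return sequences, both averaging 8 percent.
steady = [8] * 20
volatile = [-40, 60] + [8] * 16 + [6, 6]


def growth_of_one(returns_pct):
    """Value of one unit of money after compounding through every year."""
    value = 1.0
    for r in returns_pct:
        value *= 1 + r / 100
    return value


for label, returns in [("steady", steady), ("volatile", volatile)]:
    print(
        f"{label}: mean {statistics.mean(returns):.1f}, "
        f"standard deviation {statistics.pstdev(returns):.1f}, "
        f"growth of 1 unit {growth_of_one(returns):.2f}"
    )
```

The two averages match, but the standard deviations do not, and under these assumptions the volatile sequence also compounds to a smaller final amount.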

Your Challenge

Two secondary schools in the same area both report an average exam score of 63 out of 100 across their Year 11 students.

At School A, the scores range from 28 to 96. The school has a small number of very high scorers and a significant number of students scoring below 40.

At School B, the scores range from 51 to 74. Almost all students score within 10 points of the average.

Which school has done better? Which would you prefer your child to attend? Does your answer change depending on where you expect your child to sit in the distribution? What additional information about variance would help you decide?

There is no answer on this page. That is the point.

References

Investment variance and sequence-of-returns risk: Pfau, W.D., “Safe Savings Rates: A New Approach to Retirement Planning over the Life Cycle,” Journal of Financial Planning (2011). The concept of sequence risk and its relationship to return variance is also discussed in Kitces, M., “Understanding Sequence of Return Risk: Safe Withdrawal Rates, Bear Market Crashes, And Bad Decades,” Kitces.com (2014). URL: https://www.kitces.com/blog/understanding-sequence-of-return-risk-safe-withdrawal-rates-bear-market-crashes-and-bad-decades/

Variance in clinical outcomes: Rothwell, P.M., “Treating individuals 2. Subgroup analysis in randomised controlled trials: importance, indications, and interpretation,” The Lancet, vol. 365, pp. 176-186 (2005). The point that mean treatment effects obscure heterogeneous individual responses is foundational to the personalised medicine literature; see also Kent, D.M. et al., “The Predictive Approaches to Treatment effect Heterogeneity (PATH) Statement,” Annals of Internal Medicine, vol. 172, no. 1 (2020).

UK with-profits insurance and Market Value Reductions: Financial Services Authority, “With-Profits Review: Factsheet” (2003). Financial Services Authority, “Treating Customers Fairly — Progress and Next Steps” (2004). The FSA imposed requirements on insurers to communicate MVR risk more clearly following widespread complaints; see also Which?, “With-profits endowments: the problems explained” (2002). The scale of the problem is documented in the FSA’s CP187 consultation paper (2003) on with-profits regulation reform.

Coefficient of variation: Everitt, B.S., The Cambridge Dictionary of Statistics, 4th edition (Cambridge University Press, 2010), entry “coefficient of variation.” For applied discussion, see Wheelan, C., Naked Statistics: Stripping the Dread from the Data (W.W. Norton, 2013), Chapter 3.