Variance is a measure of dispersion. It is the expectation of the squared deviation of a random variable from its population mean or sample mean. In simple words, it is a measure of how far a set of numbers is spread out from their average value.

Symbol of Variance is **σ**^{2} (Sigma Square).

Variance is the average of thesquareddifferences from the Mean.

To calculate the variance follow these steps:

- Find the Mean (the simple average of the numbers)
- Subtract the Mean from each number
- Square the Result (the
*squared difference*) and take sum - Take the average of squared differences.

## Example

Data = 8, 9, 10, 11, 12

**Step 1:**Find the Mean (the simple average of the numbers)- Mean: (8 + 9 + 10 + 11 + 12) ÷ 5 = 10

**Step 2:**Subtract the Mean from each number

Score | Deviation from the mean |
---|---|

8 | 8 – 10 = -2 |

9 | 9 – 10 = 1 |

10 | 10 – 10 = 0 |

11 | 11 – 10 = 1 |

12 | 15 – 10 = 5 |

**Step 3**: Square each deviation from the mean and take sum- (-2)
^{2}+ (1)^{2}+ (0)^{2}+ (1)^{2}+ (2)^{2}=**10**

- (-2)
**Step 4:**Divide the sum of squares by*n – 1*or*N*- Divide the sum of the squares by
*n*– 1 (for a sample) or*N*(for a population). - Since we’re working with a sample, we’ll use
*n*– 1, where*n*= 5. - 10/(5-1) =
**2.5**

- Divide the sum of the squares by

## Advantages

- Statisticians use variance to see how individual numbers relate to each other within a data set, rather than using broader mathematical techniques such as arranging numbers into quartiles.
- The advantage of variance is that it treats all deviations from the mean as the same regardless of their direction.
- The squared deviations cannot sum to zero and give the appearance of no variability at all in the data.

## Disadvantages

- Variance gives added weight to outliers. These are the numbers far from the mean. Squaring these numbers can skew the data. Another pitfall of using variance is that it is not easily interpreted. Users often employ it primarily to take the square root of its value, which indicates the standard deviation of the data set. As noted above, investors can use standard deviation to assess how consistent returns are over time.

## Video

## Code

Starting Python 3.4, the standard library comes with the *variance function (sample variance or variance n-1)* as part of the statistics module:

```
# Python code
# variance() function of Statistics Module
# Import statistics module
import statistics
# Creating a sample of data
sample = [8, 9, 10, 11, 12]
# Print variance of the sample set
print(f"Variance of sample set is {statistics.variance(sample)}")
```

Variance of sample set is 2.5

The *population variance* (or *variance n*) can be obtained using the `pvariance`

function:

```
# Python code
# pvariance() function of Statistics Module
# Import statistics module
import statistics
# Creating a sample of data
population = [8, 9, 10, 11, 12]
# Print variance of the sample set
print(f"Variance of population set is {statistics.pvariance(population)}")
```

Variance of population set is 2

**Related Links**