## Introduction to Statistics

By John A. Knox and Alexandra Estrella

#### I - Definitions

**mean**: the average of a set of numbers, found by adding the numbers together and dividing by how many there are

**median**: the number found in the middle when the set of numbers is arranged from smallest to largest

**mode**: the most commonly occurring value in a set of numbers

- Example for mean, median, and mode:

**Sample data set**:
5°
10°
10°
7°
3°

**mean** = (5° + 10° + 10° + 7° + 3°) / 5 = 35° / 5 = 7°

**median** = 7°, the middle value of the sorted set 3° 5° 7° 10° 10°

**mode** = 10°
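The same three values can be checked with a few lines of Python. This is only a minimal sketch using the standard-library `statistics` module; the variable names (`temperatures`, `mean_value`, and so on) are chosen here for illustration and are not part of the original example.

```python
import statistics

# Sample data set from the example above (temperatures in degrees)
temperatures = [5, 10, 10, 7, 3]

mean_value = statistics.mean(temperatures)      # sum of the values divided by their count
median_value = statistics.median(temperatures)  # middle value of the sorted data
mode_value = statistics.mode(temperatures)      # most frequently occurring value

print("sorted:", sorted(temperatures))
print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

The printed mean, median, and mode should match the hand calculation: 7, 7, and 10.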

#### II - Advanced Definitions

**Variance**: a measure of how much the data points differ from the mean

- Data Set 1: 3, 5, 7, 10, 10

- Data Set 2: 7, 7, 7, 7, 7

- Data Set 1: mean = 7, median = 7

- Data Set 2: mean = 7, median = 7

But we know that the two data sets are not identical! The variance shows how they differ.

**Formula for variance**:

**S**^{2} = (1/(N-1)) × (the sum of (each data point - mean)^{2})
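For readers who prefer symbolic notation, the formula in words above can be written as follows. The symbols x_i (the i-th data point) and x̄ (the mean) are introduced here as assumptions for illustration; the original text uses only words.

```latex
S^{2} = \frac{1}{N-1} \sum_{i=1}^{N} \left( x_{i} - \bar{x} \right)^{2}
```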

**Formula applied to data set 1**:

S^{2}_{data set 1} =
( 1/(5-1) ) × ( (3-7)^{2} +
(5-7)^{2} +
(7-7)^{2} +
(10-7)^{2} +
(10-7)^{2} )

__note__: N = number of data points

S^{2}_{data set 1} =
1/4 × ( (-4)^{2} +
(-2)^{2} +
(0)^{2} +
(3)^{2} +
(3)^{2} )

S^{2}_{data set 1} =
1/4 × ( 16 + 4 + 0 + 9 + 9 )

S^{2} = 1/4 × 38

S^{2} = 38/4

S^{2} = 9.5 for data set 1

**Formula applied to data set 2**:

S^{2}_{data set 2} = (1/4) ×
(0 + 0 + 0 + 0 + 0)

S^{2} = 1/4 × 0

S^{2} = 0 for data set 2
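The two hand calculations above can be reproduced in Python. The sketch below is one possible way to do it, not a definitive implementation: the helper `sample_variance` is a hypothetical name written for this example, and it follows the same steps as the formula (subtract the mean, square, sum, divide by N - 1). The result is cross-checked against `statistics.variance` from the standard library, which also divides by N - 1.

```python
import math
import statistics


def sample_variance(data):
    """Sum of squared differences from the mean, divided by N - 1."""
    n = len(data)                                          # N = number of data points
    mean = sum(data) / n
    squared_differences = [(x - mean) ** 2 for x in data]  # (each data point - mean)^2
    return sum(squared_differences) / (n - 1)


data_set_1 = [3, 5, 7, 10, 10]
data_set_2 = [7, 7, 7, 7, 7]

var_1 = sample_variance(data_set_1)
var_2 = sample_variance(data_set_2)

print("variance of data set 1:", var_1)
print("variance of data set 2:", var_2)

# Cross-check the hand-written helper against the standard library
print(math.isclose(var_1, statistics.variance(data_set_1)))
```

Both data sets have the same mean and median, but only the first has a nonzero variance, which is the point the two worked examples make.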

**Standard Deviation** is "S," the square root of the variance:

**S** = square root of [ (1/(N-1)) × (the sum of (each data point - mean)^{2}) ]

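- S measures how far the data points typically lie from the mean. A large S means the data are spread widely around the mean; a small S means they are clustered close to it.
- the units of S are the same as the units of the data itself (the variance, by contrast, is in squared units)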

- S for data set 1 above is the square root of 9.5, or about 3.08. S for data set 2 above is the square root of 0, which is 0.
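As a quick check of these two values, the standard deviation can be computed in Python by taking the square root of the variance. This is a minimal sketch using the standard-library `math` and `statistics` modules; the variable names are illustrative only.

```python
import math
import statistics

data_set_1 = [3, 5, 7, 10, 10]
data_set_2 = [7, 7, 7, 7, 7]

# Standard deviation = square root of the (N - 1) variance
std_1 = math.sqrt(statistics.variance(data_set_1))
std_2 = math.sqrt(statistics.variance(data_set_2))

# statistics.stdev performs the same calculation in one call
std_1_direct = statistics.stdev(data_set_1)

print("S for data set 1:", round(std_1, 2))
print("S for data set 2:", std_2)
```

Rounded to two decimal places, the first value agrees with the 3.08 quoted above, and the second is 0 because every point in data set 2 equals the mean.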