Introduction to Statistics
By John A. Knox and Alexandra Estrella
I - Definitions
- mean: the average of a set of numbers
- median: the middle value when the set of numbers is ordered from smallest to largest
- mode: the most commonly occurring value in a set of numbers
- Example for mean, median, and mode:
Sample data set:
5°
10°
10°
7°
3°
mean = (5 + 10 + 10 + 7 + 3)/5 = 35/5 = 7°
median: ordered from smallest to largest, the set is 3° 5° 7° 10° 10°, so the middle value is 7°
mode = 10°
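The three results above can be checked with a short Python sketch. It assumes Python's standard statistics module; the variable names (temperatures, mean_value, and so on) are chosen here for illustration and are not part of the original notes.

```python
import statistics

# Sample data set from the example above (in degrees).
temperatures = [5, 10, 10, 7, 3]

# mean: the sum of the values divided by how many values there are
mean_value = statistics.mean(temperatures)
# median: the middle value once the data are sorted
median_value = statistics.median(temperatures)
# mode: the value that occurs most often
mode_value = statistics.mode(temperatures)

print("sorted:", sorted(temperatures))
print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```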
II - Advanced Definitions
- Variance: a measure of how data points differ from the mean
- Data Set 1: 3, 5, 7, 10, 10
- Data Set 2: 7, 7, 7, 7, 7
- Data Set 1: mean = 7, median = 7
- Data Set 2: mean = 7, median = 7
But we know that the two data sets are not identical! The variance shows how they are different.
- Formula for variance:
S^2 = (1/(N-1)) × (the sum of (each data point - mean)^2)
note: N = number of data points
Formula applied to data set 1:
S^2 (data set 1) = (1/(5-1)) × ( (3-7)^2 + (5-7)^2 + (7-7)^2 + (10-7)^2 + (10-7)^2 )
S^2 (data set 1) = (1/4) × ( (-4)^2 + (-2)^2 + (0)^2 + (3)^2 + (3)^2 )
S^2 (data set 1) = (1/4) × ( 16 + 4 + 0 + 9 + 9 )
S^2 = (1/4) × 38
S^2 = 38/4
S^2 = 9.5 for data set 1
Formula applied to data set 2:
S^2 (data set 2) = (1/4) × (0 + 0 + 0 + 0 + 0)
S^2 = (1/4) × 0
S^2 = 0 for data set 2
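As a rough check on the arithmetic for both data sets, the variance formula can be written out step by step in Python. This is a minimal sketch: the function name sample_variance is hypothetical, and the cross-check assumes the standard library's statistics.variance, which also divides by N - 1.

```python
import statistics


def sample_variance(data):
    """Sum of squared differences from the mean, divided by N - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = sum(data) / n
    squared_differences = [(x - mean) ** 2 for x in data]
    return sum(squared_differences) / (n - 1)


data_set_1 = [3, 5, 7, 10, 10]
data_set_2 = [7, 7, 7, 7, 7]

variance_1 = sample_variance(data_set_1)
variance_2 = sample_variance(data_set_2)

print("variance of data set 1:", variance_1)
print("variance of data set 2:", variance_2)

# Cross-check against the standard library's sample variance.
print("library value for data set 1:", statistics.variance(data_set_1))
```

Both data sets have the same mean and median, yet the sketch reports different variances for them, which is the distinction the worked example draws.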
- Standard Deviation is "S," the square root of the variance:
S = square root of [ (1/(N-1)) × (the sum of (each data point - mean)^2) ]
- a measure of how far the data typically fall from the mean. A large S means the data are spread widely around the mean; a small S means the data are clustered close to it.
- the units of S are the same as the units of the data itself
- S for data set 1 above is 3.08. S for data set 2 above is 0.
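The standard deviations quoted above can be reproduced with a short Python sketch. It assumes the standard library's math and statistics modules; the names std_dev_1 and std_dev_2 are illustrative only.

```python
import math
import statistics

data_set_1 = [3, 5, 7, 10, 10]
data_set_2 = [7, 7, 7, 7, 7]

# Standard deviation is the square root of the sample variance (N - 1 divisor).
std_dev_1 = math.sqrt(statistics.variance(data_set_1))
std_dev_2 = math.sqrt(statistics.variance(data_set_2))

print("S for data set 1:", round(std_dev_1, 2))
print("S for data set 2:", std_dev_2)

# statistics.stdev computes the same quantity in one call.
print("library value for data set 1:", round(statistics.stdev(data_set_1), 2))
```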