## Introduction to Statistics

#### III - Time Series

**Definition**: a time series is a set of values of one variable recorded at different points in time.

A simple example relating two time series:

| | Time 1 | Time 2 | Time 3 | Time 4 | Time 5 |
|---|---|---|---|---|---|
| Time Series (TS) I | 7 | 8 | 5 | 4 | 6 |
| Time Series (TS) II | 2 | 3 | 0 | -1 | 1 |

**TS I**: Mean = (7 + 8 + 5 + 4 + 6) / 5 = 30 / 5 = 6

**TS II**: Mean = (2 + 3 + 0 - 1 + 1) / 5 = 5 / 5 = 1

**TS I Standard Deviation**:

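$$S = \sqrt{\frac{1}{N-1} \sum_{t=1}^{N} \left(x_t - x_{\text{mean}}\right)^2}$$

where $x_t$ is the value at time $t$, $x_{\text{mean}}$ is the mean of the series, and $N$ is the number of values.

$$S = \sqrt{\tfrac{1}{4}\left[(7-6)^2 + (8-6)^2 + (5-6)^2 + (4-6)^2 + (6-6)^2\right]}$$

$$S = \sqrt{\tfrac{1}{4}\,(1 + 4 + 1 + 4 + 0)}$$

$$S = \sqrt{\tfrac{1}{4}\,(10)}$$

$$S = \sqrt{10/4}$$

$$S = \sqrt{2.5}$$

$$S \approx 1.58$$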

**TS II Standard Deviation**:

$$S = \sqrt{\tfrac{1}{4}\left[(2-1)^2 + (3-1)^2 + (0-1)^2 + (-1-1)^2 + (1-1)^2\right]}$$

$$S = \sqrt{\tfrac{1}{4}\,(1 + 4 + 1 + 4 + 0)}$$

$$S = \sqrt{10/4}$$

$$S = \sqrt{2.5}$$

$$S \approx 1.58$$
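The arithmetic above can be checked with a short Python sketch. The function name `sample_std` and the variable names `ts1` and `ts2` are chosen here for illustration and are not part of the original notes; the function simply applies the N-1 formula used in this section.

```python
import math


def sample_std(values):
    """Sample standard deviation using the 1/(N-1) formula from this section."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared deviations of each value from the mean
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


ts1 = [7, 8, 5, 4, 6]   # Time Series I
ts2 = [2, 3, 0, -1, 1]  # Time Series II

for name, series in (("TS I", ts1), ("TS II", ts2)):
    mean = sum(series) / len(series)
    print(name, "mean =", mean, "S =", round(sample_std(series), 2))
```

Both series give S ≈ 1.58 because their deviations from the mean are identical: TS II is TS I shifted down by a constant, and shifting a series does not change its spread.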

Visually, it's obvious that these two time series are related: each value of TS II is exactly 5 less than the corresponding value of TS I. We can make this even more obvious by plotting one time series against the other in a "scatter plot".

The two time series are linearly related: all the points in the scatter plot lie on a straight line. How do you express this relationship with numbers?
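As a rough illustration, the scatter-plot points can be built by pairing the values of the two series at each time. The sketch below is not a general test for linearity; it only checks the specific relation that happens to hold for this example, namely that TS II minus TS I is the same at every time. The variable names are illustrative.

```python
ts1 = [7, 8, 5, 4, 6]   # Time Series I (horizontal axis)
ts2 = [2, 3, 0, -1, 1]  # Time Series II (vertical axis)

# Each scatter-plot point pairs the two series at the same time
points = list(zip(ts1, ts2))
print(points)

# For this example, TS II - TS I is the same at every time,
# so every point lies on the line II = I - 5
differences = [y - x for x, y in points]
is_on_line = len(set(differences)) == 1
print(differences, is_on_line)
```

A constant difference means the points fall on a straight line with slope 1. Real data rarely line up this neatly, which is why a numerical measure of the relationship is useful.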

#### IV - Linear Correlation Coefficient

The statistical measure of the "relatedness" of two time series is called correlation. We can calculate a "correlation coefficient" *r* that measures how strongly two time series are linearly related; its value always lies between -1 and 1. If r = 1, the two series are perfectly positively correlated, which means that as one variable gets larger, the other one does too. If r = -1, the two time series are perfectly negatively correlated, which means that as one variable gets larger, the other one gets smaller. If r = 0, then the two variables are not linearly related.

**How do you calculate a correlation coefficient?**

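$$r = \frac{\sum_{t=1}^{N} \left(I_t - I_{\text{mean}}\right)\left(II_t - II_{\text{mean}}\right)}{(N-1)\, S_{I}\, S_{II}}$$

Where $I_t$ is the value of Time Series I at time $t$ and $I_{\text{mean}}$ is the mean of Time Series I (and likewise for Time Series II), $S_{I}$ and $S_{II}$ are the standard deviations of the two series, and $N$ is the number of time periods.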

**Example using TS I and TS II**:

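$$r = \frac{(7-6)(2-1) + (8-6)(3-1) + (5-6)(0-1) + (4-6)(-1-1) + (6-6)(1-1)}{(5-1)(1.58)(1.58)}$$

$$r = \frac{1 + 4 + 1 + 4 + 0}{4 \times 2.5}$$

Here 1.58 × 1.58 is taken as 2.5, because each standard deviation is exactly the square root of 2.5.

$$r = 10/10$$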

r = 1
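The same calculation can be sketched in Python. The function names `sample_std` and `correlation` are chosen here for illustration; the code follows the formula given in this section rather than any particular library's implementation.

```python
import math


def sample_std(values):
    """Sample standard deviation using the 1/(N-1) formula."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def correlation(xs, ys):
    """Linear correlation coefficient r of two equal-length series."""
    if len(xs) != len(ys):
        raise ValueError("both series must have the same number of values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum over time of the products of the deviations
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / ((n - 1) * sample_std(xs) * sample_std(ys))


ts1 = [7, 8, 5, 4, 6]   # Time Series I
ts2 = [2, 3, 0, -1, 1]  # Time Series II

print(round(correlation(ts1, ts2), 6))                # 1.0
# Flipping the sign of one series reverses the relationship
print(round(correlation(ts1, [-v for v in ts2]), 6))  # -1.0
```

The results are rounded because floating-point arithmetic can leave r a tiny fraction away from exactly 1 or -1.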

Therefore, the two time series are perfectly positively correlated. But in real life, r is almost never exactly 1, -1, or 0. In the next section we learn how to interpret the significance of r in a real-life situation.