Introduction to Statistics
III - Time Series
Definition: a time series is a set of data values for one variable, measured at different points in time.
Simple example of relating two time series:
Series | Time 1 | Time 2 | Time 3 | Time 4 | Time 5
---|---|---|---|---|---
Time Series (TS) I | 7 | 8 | 5 | 4 | 6
Time Series (TS) II | 2 | 3 | 0 | -1 | 1
TS I: Mean = (7 + 8 + 5 + 4 + 6) / 5 = 30/5 = 6
TS II: Mean = (2 + 3 + 0 + (-1) + 1) / 5 = 5/5 = 1
TS I Standard Deviation:
S = Square root of ( [1/(N-1)] * [Sum of (each data value - mean)^2] ), where N is the number of data values (here N = 5)
S = Square root of ( 1/4 * [ (7-6)^2 + (8-6)^2 + (5-6)^2 + (4-6)^2 + (6-6)^2 ] )
S = Square root of ( 1/4 * (1 + 4 + 1 + 4 + 0) )
S = Square root of ( 1/4 * 10 )
S = Square root of ( 10/4 )
S = Square root of 2.5
S ~ 1.58
TS II Standard Deviation:
S = Square root of ( 1/4 * [ (2-1)^2 + (3-1)^2 + (0-1)^2 + (-1-1)^2 + (1-1)^2 ] )
S = Square root of ( 1/4 * (1 + 4 + 1 + 4 + 0) )
S = Square root of ( 10/4 )
S = Square root of 2.5
S ~ 1.58
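The two hand calculations above can be checked with a few lines of Python. This is only a minimal sketch: the names ts1, ts2, mean, and sample_std are chosen here for illustration and are not part of the original notes.

```python
import math

# The two example time series from the table above.
ts1 = [7, 8, 5, 4, 6]
ts2 = [2, 3, 0, -1, 1]

def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)

def sample_std(values):
    """Standard deviation with the 1/(N-1) factor used in the formula above."""
    m = mean(values)
    squared_deviations = [(v - m) ** 2 for v in values]
    return math.sqrt(sum(squared_deviations) / (len(values) - 1))

mean1, mean2 = mean(ts1), mean(ts2)
std1, std2 = sample_std(ts1), sample_std(ts2)

print(f"TS I:  mean = {mean1}, S = {std1:.2f}")
print(f"TS II: mean = {mean2}, S = {std2:.2f}")
```

Both series give S ~ 1.58, as in the hand calculation. Python's standard library function statistics.stdev uses the same N-1 divisor and could be used in place of sample_std.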
Looking at the values, it is clear that these two time series are related: each value of TS II is exactly 5 less than the corresponding value of TS I. We can make this even more obvious by plotting one time series versus the other in a "scatter plot".
The two time series are linearly related: all the points in the scatter plot lie on a straight line. How do you express this relationship with numbers?
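One way to confirm the straight-line claim without drawing the plot is to fit a line through two of the points and test whether the others fall on it. The sketch below does that; the variable names are illustrative, and it assumes the first two points have different TS I values so the slope is defined.

```python
# The two example time series; each pair (TS I value, TS II value)
# is one point in the scatter plot.
ts1 = [7, 8, 5, 4, 6]
ts2 = [2, 3, 0, -1, 1]
points = list(zip(ts1, ts2))

# Line through the first two points (assumes their x values differ).
(x0, y0), (x1, y1) = points[0], points[1]
slope = (y1 - y0) / (x1 - x0)
intercept = y0 - slope * x0

# Every point should satisfy y = slope * x + intercept (up to rounding).
on_line = all(abs(y - (slope * x + intercept)) < 1e-9 for x, y in points)

print(f"slope = {slope}, intercept = {intercept}, all points on line: {on_line}")
```

For these data the line is TS II = TS I - 5, and every point lies on it.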
IV - Linear Correlation Coefficient
The statistical measure of the "relatedness" of two time series is called correlation. We can calculate a "correlation coefficient" r that measures how strongly two time series are linearly related. The value of r always lies between -1 and 1. If r = 1, the two time series are perfectly positively correlated, which means that as one variable gets larger, the other one does too. If r = -1, the two time series are perfectly negatively correlated, which means that as one variable gets larger the other one gets smaller. If r = 0, then the two variables are not linearly related.
How do you calculate a correlation coefficient?
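r = [ Sum over each time period t of (I_t - I_mean) (II_t - II_mean) ] / [ (N-1) * S_I * S_II ]
Where I_t is the value of Time Series I at time t, I_mean is the mean of Time Series I, and II_t and II_mean are the same quantities for Time Series II. S_I and S_II are the standard deviations of the two time series, and N is the number of time periods.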
Example using TS I and TS II:
r = [ (7-6)(2-1) + (8-6)(3-1) + (5-6)(0-1) + (4-6)(-1-1) + (6-6)(1-1) ] / [ (5-1)(1.58)(1.58) ]
r = (1 + 4 + 1 + 4 + 0) / (4 * 1.58 * 1.58)
r = 10 / (4 * 2.5), since 1.58 * 1.58 ~ 2.5
r = 10/10
r = 1
Therefore, the two time series are perfectly positively correlated. But in real life, r is almost never exactly 1, -1, or 0. In the next section we learn how to interpret the significance of r in a real-life situation.
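The formula for r can be written as a short Python function, which makes it easy to try other series. This is a sketch under stated assumptions: the function name correlation is illustrative, and the series ts2_flipped and ts2_noisy are hypothetical data made up here to show a perfectly negative and an imperfect correlation. They do not come from the original example.

```python
import math

def correlation(xs, ys):
    """r = Sum of (x - x_mean)(y - y_mean) / ((N-1) * S_x * S_y)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Standard deviations with the 1/(N-1) factor.
    s_x = math.sqrt(sum((x - x_mean) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_mean) ** 2 for y in ys) / (n - 1))
    cross = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return cross / ((n - 1) * s_x * s_y)

ts1 = [7, 8, 5, 4, 6]
ts2 = [2, 3, 0, -1, 1]

# Hypothetical series, for illustration only:
ts2_flipped = [-v for v in ts2]   # TS II with every sign reversed
ts2_noisy = [2, 3, 1, -1, 1]      # TS II with one value changed (0 -> 1)

r_same = correlation(ts1, ts2)
r_flipped = correlation(ts1, ts2_flipped)
r_noisy = correlation(ts1, ts2_noisy)

print(f"TS I vs TS II:         r = {r_same:.2f}")
print(f"TS I vs flipped TS II: r = {r_flipped:.2f}")
print(f"TS I vs noisy TS II:   r = {r_noisy:.2f}")
```

The first result reproduces r = 1 from the hand calculation, the sign-reversed series gives r = -1, and changing a single value already pulls r below 1 (to about 0.96).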