Autocorrelation and Partial Autocorrelation

Autocorrelation measures the linear relationship between a time series and lagged copies of itself. The ACF plot shows the autocorrelation coefficient at each lag. For example, $r_1$ measures the relationship between $y_t$ and $y_{t-1}$. When the data has a trend, the ACF decreases gradually as the lag increases. The partial autocorrelation function (PACF) measures the correlation between the current value and a lagged value after removing the correlation explained by the intermediate lags. In simple terms, the PACF at lag $k$ isolates the direct effect of lag $k$ by filtering out the influence of lags $1$ to $k-1$.
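To make the coefficient $r_k$ concrete, here is a minimal sketch that computes it by hand with NumPy, using the usual sample formula (lagged products of deviations from the mean, divided by the total sum of squared deviations). The helper name `sample_autocorr` and the two synthetic series are assumptions made for illustration; they are not part of the original analysis.

```python
import numpy as np

def sample_autocorr(y, lag):
    """Sample autocorrelation r_lag of a 1-D series (lag >= 1)."""
    y = np.asarray(y, dtype=float)
    dev = y - y.mean()
    # Numerator: products of deviations that are `lag` steps apart.
    # Denominator: total sum of squared deviations.
    return float(np.sum(dev[lag:] * dev[:-lag]) / np.sum(dev ** 2))

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
noise = rng.normal(size=500)   # white noise: no memory
walk = np.cumsum(noise)        # random walk: wandering, strongly persistent

print("white noise r_1:", round(sample_autocorr(noise, 1), 3))
print("random walk r_1:", round(sample_autocorr(walk, 1), 3))
```

The random walk should give an $r_1$ close to 1 while the white noise gives a value near 0, which is the contrast the ACF plot makes visible across many lags at once.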

In [61]:
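from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
# Plot the sample ACF and PACF of the price series
plot_acf(timeseries1.price.values)
plot_pacf(timeseries1.price.values);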

We have strong evidence of non-stationarity because the ACF above decays very slowly and remains well above the significance band. Moreover, the earlier plot of the series shows a significant upward trend, which also implies non-stationarity. Even so, let's run the Augmented Dickey-Fuller test to check the series for stationarity formally.

To test the series formally, we set up a null hypothesis and an alternative hypothesis.

H0: The null hypothesis: It is a statement about the population that either is believed to be true or is used to put forth an argument unless it can be shown to be incorrect beyond a reasonable doubt.

H1: The alternative hypothesis: It is a claim about the population that is contradictory to H0 and what we conclude when we reject H0.

H0: The series is non-stationary

H1: The series is stationary

That is, we take as the null hypothesis that the data is not stationary, and as the alternative hypothesis that the data is stationary.
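For reference, these two hypotheses can be written in terms of the regression that the Augmented Dickey-Fuller test fits. The form below assumes a constant and no trend term, which is the default in statsmodels' `adfuller` (`regression='c'`); the symbols $\alpha$, $\gamma$, $\delta_i$ and $p$ are notation introduced here, not taken from the original text.

$$\Delta y_t = \alpha + \gamma\, y_{t-1} + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \varepsilon_t$$

Here $\Delta y_t = y_t - y_{t-1}$ and $p$ is the number of lagged differences included. The null hypothesis corresponds to $\gamma = 0$ (a unit root, so the series is non-stationary) and the alternative to $\gamma < 0$. The reported test statistic is the t-ratio of the estimate of $\gamma$, so more negative values are stronger evidence against H0.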

In [9]:
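from statsmodels.tsa.stattools import adfuller

def adfuller_test(series):
    """Run the Augmented Dickey-Fuller test and print a short summary."""
    result = adfuller(series)
    labels = ['ADF Test Statistic','p-value','#Lags Used','Number of Observations']
    # zip stops at the shortest input, so only the first four results are printed
    for value, label in zip(result, labels):
        print(label+' : '+str(value) )
    # result[1] is the p-value; 0.05 is the significance level
    if result[1] <= 0.05:
        print("strong evidence against the null hypothesis(Ho), reject the null hypothesis. Data is stationary")
    else:
        print("weak evidence against the null hypothesis, fail to reject it. Data may be non-stationary")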
In [10]:
adfuller_test(timeseries1['price'])
ADF Test Statistic : -3.174039259406031
p-value : 0.021537430792049415
#Lags Used : 15
Number of Observations : 600
strong evidence against the null hypothesis(Ho), reject the null hypothesis. Data is stationary

The Augmented Dickey-Fuller test rejects the null hypothesis of non-stationarity at the 5% level (p ≈ 0.022). So, regardless of what we observed in the ACF, PACF, and series plots, we will proceed on the assumption that the series is stationary and fit an ARMA model. In the second part of our analysis we will apply some transformations to the data and explore it further.
