Following Michael’s idea (here), I wanted to go further, based on his intuition (and on the dataset that he kindly sent me, there). If we consider the two series of the Nikkei index and the SP500 index in euros, we have the following graph,

> library(RODBC)
> base = odbcConnectExcel("http://perso.univ-rennes1.fr/arthur.charpentier/spx_nky_eurusd.xls", readOnly = TRUE)
> series1 = sqlQuery(base,query="select * from [Tabelle1$A2:B8837]") # SPX
> series2 = sqlQuery(base,query="select * from [Tabelle1$D2:E8631]") # NKY
> series3 = sqlQuery(base,query="select * from [Tabelle1$G2:H8945]") # EURUSD
> odbcCloseAll()
> series4=merge(series1,series3)
> series4$SPEUR=series4$SPX/series4$EURUSD
> series5=merge(series4,series2)
> x=(as.Date(series5[,1])-as.Date("01/01/0000","%d/%m/%Y"))/365.25
> yl=range(series5[,4])
> xl=c(1975,2010)
> plot(x,series5[,4],axes=FALSE,xlab="",ylab="",type="l",
+ lwd=3,col="red",xlim=xl,ylim=yl)
> axis(1)
> axis(2, col="red")
> par(new=TRUE)
> yl=range(series5[,5])
> plot(x,series5[,5],axes=FALSE,xlab="",ylab="",type="l",
+ lwd=3,col="blue",xlim=xl,ylim=yl)
> axis(4, col="blue")
> mtext("SP500 in Euros", 2, line=2, col="red", cex=1.2)
> mtext("NKY", 4, line=2, col="blue", cex=1.2)
Those two series seem to have a similar pattern, so one idea is to shift the SP500 series to the left,

Once the SP500 series is shifted by 2,500 observations, the two series are extremely correlated, with a correlation of 0.9572,
> n=nrow(series5) # number of common observations
> X1=series5[2501:n,4]
> X2=series5[1:(n-2500),5]
> cor(X1,X2)
[1] 0.9572484
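The shift-then-correlate step can be sketched on simulated data. This is only an illustration under stated assumptions: the helper `lagged_cor`, the simulated series `x` and `y`, and the delay of 50 observations are hypothetical and are not part of the original dataset.

```r
# Hypothetical helper: correlation between x shifted left by `shift`
# observations and y, mirroring X1 = s[(shift+1):n] and X2 = s[1:(n-shift)]
lagged_cor <- function(x, y, shift) {
  n <- length(x)
  stopifnot(length(y) == n, shift >= 0, shift < n - 1)
  cor(x[(shift + 1):n], y[1:(n - shift)])
}

set.seed(1)
n <- 500
y <- cumsum(rnorm(n))                # simulated leading series (random walk)
x <- c(rep(0, 50), y[1:(n - 50)])    # x reproduces y with a delay of 50 observations
x <- x + rnorm(n, sd = 0.1)          # plus a small amount of noise

# Scan a range of candidate shifts and keep the one with the highest correlation
shifts <- 0:100
cors <- sapply(shifts, function(s) lagged_cor(x, y, s))
best_shift <- shifts[which.max(cors)]
```

Scanning several shifts, rather than fixing one by eye, makes the choice of the lag explicit. A high correlation between two trending series does not by itself establish a stable relationship, which is why the cointegration question matters.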
But are the two series cointegrated (see here, here or there for material on cointegration)? Well, using the standard procedure, we first have to check that the two series are integrated. First, let us look at the autocorrelograms,


> acf(X2,lag=1000,col="light green")
> acf(X1,lag=1000,col="light green")
> library(tseries)
> adf.test(X1)
Augmented Dickey-Fuller Test
data: X1
Dickey-Fuller = -1.0768, Lag order = 17, p-value = 0.9264
alternative hypothesis: stationary
> adf.test(X2)
Augmented Dickey-Fuller Test
data: X2
Dickey-Fuller = -1.2905, Lag order = 17, p-value = 0.8788
alternative hypothesis: stationary
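To see what the autocorrelogram of an integrated series looks like, here is a minimal sketch on simulated data, using only base R. The series `rw` (a random walk) and the choice of 1,000 observations are assumptions made for the illustration.

```r
set.seed(123)
n <- 1000
rw <- cumsum(rnorm(n))   # simulated integrated series (random walk)
d_rw <- diff(rw)         # its first difference, which is white noise

# Sample autocorrelations, without plotting; element 1 is lag 0, element 2 is lag 1
acf_level <- acf(rw, lag.max = 50, plot = FALSE)$acf
acf_diff  <- acf(d_rw, lag.max = 50, plot = FALSE)$acf

rho1_level <- acf_level[2]   # lag-1 autocorrelation of the level
rho1_diff  <- acf_diff[2]    # lag-1 autocorrelation of the difference
```

The autocorrelations of the level decay very slowly, while those of the differenced series are close to zero from the first lag on, which is the typical signature of a series integrated of order one.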
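Neither test rejects the unit-root null hypothesis (p-values of 0.93 and 0.88), so both series can be treated as integrated. But if we want to go further, we have to find the cointegration relationship between the two series. From a heuristic point of view, a linear regression should be a good proxy,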
> reg=lm(X1~X2)
> plot(residuals(reg))


> adf.test(residuals(reg))
Augmented Dickey-Fuller Test
data: residuals(reg)
Dickey-Fuller = -5.176, Lag order = 17, p-value = 0.01
alternative hypothesis: stationary
Warning message:
In adf.test(residuals(reg)) : p-value smaller than printed p-value
> pp.test(residuals(reg))
Phillips-Perron Unit Root Test
data: residuals(reg)
Dickey-Fuller Z(alpha) = -46.9775, Truncation lag parameter = 11,
p-value = 0.01
alternative hypothesis: stationary
Warning message:
In pp.test(residuals(reg)) : p-value smaller than printed p-value
When we look at the autocorrelation function, it looks like we do have a stationary series.
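The regression-then-test steps above can be sketched on simulated data with base R only. The series `x1` and `x2`, the coefficients 2 and 0.5, and the plain Dickey-Fuller regression without augmentation lags are assumptions made for the illustration; this is not what `adf.test` computes exactly.

```r
set.seed(42)
n <- 1000
x2 <- cumsum(rnorm(n))           # simulated integrated series (random walk)
x1 <- 2 + 0.5 * x2 + rnorm(n)    # cointegrated with x2 by construction

# Step 1: estimate the long-run relationship by ordinary least squares
step1 <- lm(x1 ~ x2)
u <- residuals(step1)

# Step 2: Dickey-Fuller regression on the residuals,
# diff(u)_t = rho * u_{t-1} + e_t, without intercept
du   <- diff(u)
ulag <- u[-n]
step2 <- lm(du ~ ulag - 1)

# t-statistic of rho: strongly negative values point towards stationary residuals
df_stat <- coef(summary(step2))["ulag", "t value"]
```

Because the residuals come from an estimated regression, `df_stat` should be compared with critical values designed for cointegration tests rather than with the standard Dickey-Fuller tables.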
This is, more or less, the idea of the Engle-Granger two-step procedure. But actually, we cannot directly use the Dickey-Fuller test to check whether the residuals are integrated: they come from an estimated regression, so the standard critical values do not apply. This was shown in Phillips and Ouliaris (1990), who also proposed a test (see e.g. here),
> library(tseries); po.test(cbind(X1,X2))
Phillips-Ouliaris Cointegration Test
data: cbind(X1, X2)
Phillips-Ouliaris demeaned = -53.1766, Truncation lag parameter = 57,
p-value = 0.01
Warning message:
In po.test(cbind(X1, X2)) : p-value smaller than printed p-value
A similar function can be found in the urca package,
> library(urca)
> summary(ca.po(cbind(X1,X2)))
########################################
# Phillips and Ouliaris Unit Root Test #
########################################
Test of type Pu
detrending of series none
Call:
lm(formula = z[, 1] ~ z[, -1] - 1)
Value of test-statistic is: 45.2032
Critical values of Pu are:
10pct 5pct 1pct
critical values 20.3933 25.9711 38.3413
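Thus, we have to admit that those series are cointegrated: the test statistic (45.20) exceeds the 1% critical value (38.34), so the null hypothesis of no cointegration is rejected.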
Based on that idea, it is possible to model the stationary component, and forecast it for the next ten years, based on the assumption that we know the behavior of one time series. Hence, if we add the confidence interval due to the stationary component uncertainty, we have the following graph,

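The forecasting step described above can be sketched as follows, using base R only. The AR(1) specification, the simulated series `u` (a stand-in for `residuals(reg)`), and the horizon `h` are assumptions made for the illustration, not the model actually used for the graph.

```r
set.seed(7)
# Simulated stationary component (AR(1) process), stand-in for residuals(reg)
u <- as.numeric(arima.sim(model = list(ar = 0.9), n = 1000))

# Fit an AR(1) model with a mean term, then forecast h steps ahead
fit <- arima(u, order = c(1, 0, 0))
h <- 250
fc <- predict(fit, n.ahead = h)

# Approximate 95% band for the stationary component
lower <- fc$pred - 1.96 * fc$se
upper <- fc$pred + 1.96 * fc$se
```

Under the assumption that the future path of one series is known, a forecast of the other is obtained by adding the fitted regression line to `fc$pred`, and the band `lower` to `upper` carries over as the uncertainty due to the stationary component only.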
[...] popular-science article in Nature, the first "[...]" and [...]: one wonders whether [...] causes [...], and to what extent (assuming that this causality is quantifiable, or measurable). The law of the process is the law of the joint process of the pair [...].
The latter is written, at date t, conditionally on the past, denoted [...] and [...],

[...]

[...] are independent white noises. Since [...].
Similarly, note that [...].
[...] a white noise.

[...], which will make it possible to interpret the vector [...] as the innovation process. Then [...] are zero, i.e. [...]
[...] are zero, i.e. [...]