
Visualizing Uncertainty Using Jackknife

Once again, I (re)discovered last week at the Rmetrics conference that old tools can be extremely useful for illustrating complex ideas, such as uncertainty in financial markets and stock prices. Take a 99.5% quantile: we look for the scenario that occurs with a probability of 1 in 200. Are there nice ways to illustrate that quantity?

Consider the monthly evolution of the S&P 500 index over the last 22 years,

> library(quantmod)
> getSymbols('^GSPC', from='1990-01-01')
[1] "GSPC"
> GSPC = adjustOHLC(GSPC,
+ symbol.name='^GSPC')
> MGSPC = to.monthly(GSPC)
> CLOSE = MGSPC$GSPC.Close
> plot(CLOSE)

It is possible to use the Jackknife technique to illustrate uncertainty. The idea of the Jackknife is to remove one of the observations, and to do that for each observation in turn. More formally, from a sample $\{x_1,\dots,x_n\}$, we define a (sub)sample $x_{(-i)}$ where observation $i$ has been removed, i.e. $x_{(-i)}=\{x_1,\dots,x_{i-1},x_{i+1},\dots,x_n\}$. Then, we can study all $n$ samples obtained by removing one observation.
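As a minimal sketch of that definition, the leave-one-out samples can be built on a toy vector. The values below are hypothetical and only illustrate the mechanics; the names `loo_samples` and `loo_means` are not from the original post.

```r
# Toy sample (hypothetical values, not market data)
x <- c(2, 4, 6, 8, 10)
n <- length(x)

# Leave-one-out samples: element i of the list is x without observation i
loo_samples <- lapply(seq_len(n), function(i) x[-i])

# One statistic per removed observation, here the mean of each subsample
loo_means <- sapply(loo_samples, mean)
print(loo_means)
```

The spread of `loo_means` shows how much a single observation can move the statistic, which is the intuition used on the index below.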

Here, in the context of financial time series, over 270 months, we can ask what the final value of the index might have been if one observation (i.e. one month) had been removed. That is exactly the idea of the Jackknife,

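> R=diff(log(CLOSE)); R=R[-1]   # monthly log-returns; drop the leading NA
> n=length(R)
> # X=rnorm(n,mean(R),sd(R))     # alternative: simulated Gaussian returns
> X=as.numeric(R)               # observed returns as a plain numeric vector
> MX=t(matrix(X,n,n))           # every row holds the full return series
> MX=exp(MX)                    # gross monthly returns
> diag(MX)=1                    # row k: month k removed (its factor set to 1)
> SMX=MX
> for(k in 2:n){SMX[,k]=SMX[,k-1]*(MX[,k])}   # cumulative product along each row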
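To see what the matrix construction above does, here is a small sketch on simulated returns (an assumption: six hypothetical monthly log-returns instead of the S&P 500 series). It builds the same kind of matrix with `apply()` and `cumprod()` rather than the explicit loop, and compares one row with a direct computation.

```r
# Hypothetical monthly log-returns (simulated, not the S&P 500 series)
set.seed(1)
X <- rnorm(6, mean = 0.005, sd = 0.04)
n <- length(X)

# Row k holds the gross returns exp(X), with month k neutralised (factor 1)
MX <- exp(t(matrix(X, n, n)))
diag(MX) <- 1

# Cumulative product along each row; apply() returns the results as
# columns, hence the transpose
SMX <- t(apply(MX, 1, cumprod))

# Direct check for one row: drop month k and compound the remaining returns
k <- 3
direct_final <- prod(exp(X[-k]))
print(c(matrix_version = SMX[k, n], direct_version = direct_final))
```

Row `k` of `SMX` is the trajectory of the index (relative to its starting value) when month `k` is removed, and its last entry is the corresponding final value.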

We can plot the different trajectories of the index, when we remove one month,

> init=as.numeric(CLOSE[1])
> plot(1:n,init*cumprod(exp(X)),type="l",
+ xlab="",ylab="",col="white")
> for(k in 1:n){lines(0:n,init*c(1,SMX[k,]),
+ col="light blue")}
> lines(0:n,init*c(1,cumprod(exp(X))),lwd=2,
+ col="blue")

 

This can be used to understand the sensitivity, or uncertainty, of financial time series,

We can then look more closely at the final value of the index over those 270 scenarios,

or we can also use a box plot,
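The code behind those two views is not shown in the post, so the following is only a sketch of how the final values might be computed and displayed. It runs on simulated returns, and the starting level `init = 350`, the return parameters, and the choice of a histogram for the first view are all assumptions.

```r
# Simulated monthly log-returns (hypothetical; stands in for the real series)
set.seed(42)
X <- rnorm(270, mean = 0.006, sd = 0.045)
n <- length(X)
init <- 350  # hypothetical starting level of the index

# Final value when month k is removed: compound every return except X[k].
# This equals init times the last column of the leave-one-out matrix.
final <- init * exp(sum(X) - X)

# Distribution of the n leave-one-out final values
hist(final, breaks = 30, col = "light blue", border = "white",
     main = "", xlab = "final value with one month removed")
abline(v = init * exp(sum(X)), col = "blue", lwd = 2)  # no month removed

# The same values as a box plot
boxplot(final, horizontal = TRUE, col = "light blue")
```

The largest final value corresponds to removing the worst month, and the smallest to removing the best month.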

Here we can clearly see the impact: if we remove one good month, the index ends around 1250, while it reaches 1650 if we remove a bad month. The difference is huge. So instead of talking about volatility (which is actually a complex concept), the Jackknife idea of removing observations might be more intuitive, and a much easier way to get a first understanding of uncertainty. More generally, those resampling ideas are great. I will post a nice application soon (but first, I will discuss it with some colleagues in Lyon).

Published at DZone with permission of Arthur Charpentier, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
