DZone
Big Data Zone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
  • Refcardz
  • Trend Reports
  • Webinars
  • Zones
  • |
    • Agile
    • AI
    • Big Data
    • Cloud
    • Database
    • DevOps
    • Integration
    • IoT
    • Java
    • Microservices
    • Open Source
    • Performance
    • Security
    • Web Dev
DZone > Big Data Zone > Visualization of probabilistic forecasts

Visualization of probabilistic forecasts

Rob J Hyndman user avatar by
Rob J Hyndman
·
Nov. 26, 14 · Big Data Zone · Interview
Like (0)
Save
Tweet
11.66K Views

Join the DZone community and get the full member experience.

Join For Free

This week my research group dis­cussed Adrian Raftery’s recent paper on “Use and Com­mu­ni­ca­tion of Prob­a­bilis­tic Fore­casts” which pro­vides a fas­ci­nat­ing but brief sur­vey of some of his work on mod­el­ling and com­mu­ni­cat­ing uncer­tain futures. Coin­ci­den­tally, today I was also sent a copy of David Spiegelhalter’s paper on “Visu­al­iz­ing Uncer­tainty About the Future”. Both are well-​​worth reading.

It made me think about my own efforts to com­mu­ni­cate future uncer­tainty through graph­ics. Of course, for time series fore­casts I nor­mally show pre­dic­tion inter­vals. I pre­fer to use more than one inter­val at a time because it helps con­vey a lit­tle more infor­ma­tion. The default in the fore­cast pack­age for R is to show both an 80% and a 95% inter­val like this:

It is some­times prefer­able to use a 50% and a 95% inter­val, rather like a box­plot:

In some cir­cles (espe­cially macro­eco­nomic fore­cast­ing), fan charts are pop­u­lar:

Per­son­ally, I don’t like these at all as they lose any spe­cific prob­a­bilis­tic inter­pretabil­ity. What does the darker shaded region actu­ally refer to? At least in the pre­vi­ous ver­sion, it is clear that the dark region con­tains 50% of the probability.

The above three exam­ples are eas­ily pro­duced using the fore­cast package:

view source
print?
1.fit <- ets(hsales)
2.plot(forecast(fit),include=120)
3.plot(forecast(fit,level=c(50,95)),include=120)
4.plot(forecast(fit,fan=TRUE),include=120)

For multi-​​modal dis­tri­b­u­tions I like to use high­est den­sity regions. Here is an exam­ple applied to Nicholson’s blowfly data using a thresh­old model:



The dark region has 50% cov­er­age and the light region has 95% cov­er­age. The fore­cast dis­tri­b­u­tions become bimodal after the first ten iter­a­tions, and so the 50% region is split in two to show that. This graphic was taken from a J. Fore­cast­ing paper I wrote in 1996, so these ideas have been around for a while!

It is easy enough to pro­duce fore­cast HDR with time series. Here is some R code to do it:

view source
print?
01.#HDR for time series object
02.# Assumes that forecasts can be computed and futures simulated from object
03.forecasthdr <- function(object, h = ifelse(object$m > 1, 2 * object$m, 10),
04.                        nsim=2000, plot=TRUE, level=c(50,95), xlim=NULL, ylim=NULL, ...)
05.{
06.  require(hdrcde)
07.  # Compute forecasts
08.  fc <- forecast(object)
09.  ft <- time(fc$mean)
10.  
11.  # Simulate future sample paths
12.  sim <- matrix(0,nrow=h,ncol=nsim)
13.  for(i in 1:nsim)
14.    sim[,i] <- simulate(object, nsim=h)
15.  
16.  # Compute HDRs
17.  nl <- length(level)
18.  hd <- array(0, c(h,nl,10))
19.  mode <- numeric(h)
20.  for(k in 1:h)
21.  {
22.    hdtmp <- hdr(sim[k,], prob=level)
23.    hd[k,,1:ncol(hdtmp$hdr)] <- hdtmp$hdr
24.    mode[k] <- hdtmp$mode
25.  }
26.  # Remove unnecessary sections of HDRs
27.  nz <- apply(abs(hd),3,sum) > 0
28.  hd <- hd[,,nz]
29.  dimnames(hd)[[1]] <- 1:h
30.  dimnames(hd)[[2]] <- level
31.  
32.  
33.  # Produce plot if required
34.  if(plot)
35.  {
36.    if(is.<span>null</span>(xlim))
37.      xlim <- range(time(object$x),ft)
38.    if(is.<span>null</span>(ylim))
39.      ylim <- range(object$x, hd)
40.    plot(object$x,xlim=xlim, ylim=ylim, ...)
41.    # Show HDRs
42.    cols <- rev(colorspace::sequential_hcl(52))[level - 49]
43.    for(k in 1:h)
44.    {
45.      for(j in 1:nl)
46.      {
47.        hdtmp <- hd[k,j,]
48.        nint <- length(hdtmp)/2
49.        for(l in 1:nint)
50.        {
51.          polygon(ft[k]+c(-1,-1,1,1)/object$m/2,
52.                  c(hdtmp[2*l-1],hdtmp[2*l],hdtmp[2*l],hdtmp[2*l-1]),
53.                col=cols[j], border=FALSE)
54.        }
55.      }
56.      points(ft[k], mode[k], pch=19, col="blue",cex=0.8)
57.    }
58.    #lines(fc$mean,col='blue',lwd=2)
59.  }
60.  
61.  # Return HDRs
62.  return(list(hdr=hd,mode=mode,level=level))
63.}

We can apply it using the exam­ple I started with:

view source
print?
1.z <- forecasthdr(fit,xlim=c(1986,1998),nsim=5000,
2.              xlab="Year",ylab="US monthly housing sales")


The dots are modes of the fore­cast dis­tri­b­u­tions, and the 50% and 95% high­est den­sity regions are also shown. In this case, the dis­tri­b­u­tions are uni­modal, and so all the regions are inter­vals.
Visualization (graphics) Distribution (differential geometry) Time series Papers (software) Uncertainty R (programming language) Data (computing) Clear (Unix)

Published at DZone with permission of Rob J Hyndman, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • The Right Way to Hybridize Your Product Development Technique
  • Demystify the Cybersecurity Risk Management Process
  • Data Analysis Using Google Cloud Data Studio
  • How to Integrate Zoom in React Application?

Comments

Big Data Partner Resources

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • MVB Program
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends:

DZone.com is powered by 

AnswerHub logo