
# The Statistics of Easter


This morning, an interesting post entitled “Why does Easter move around so much?” appeared on http://economist.com/blogs/economist-explains/…

In my time series classes, I keep saying that series can exhibit seasonality, but that the seasonal effect can be quite irregular. This is the case for river levels, where snowmelt can have a huge impact and its timing is irregular. Similarly, chocolate sales (even monthly or quarterly) depend on Easter. Because Easter can fall either in March or in April, the seasonal pattern is not as regular as that of flower sales, for instance (Valentine's Day always being on February 14th, as far as I remember). If we look at the word "eggs" on http://google.com/trends/q=eggs…, we do observe a cycle related to Easter.

The title of the article published by http://economist.com/blogs/economist-explains/… claims that there is a lot of variability in the date of Easter. Let us check! A short answer to the question “When is Easter?” is the following: Easter Sunday is the first Sunday after the first full moon on or after the vernal equinox. For more details, see e.g. http://ortelius.de/east. The algorithm used to compute the date of Easter is online, at http://smart.net/~mmontes/….

```
> # every division is an integer division (%/% in R); %% is the remainder
> year = 2024   # example year
> century = year %/% 100
> G = year %% 19
> K = (century - 17) %/% 25
> I = (century - century %/% 4 - (century - K) %/% 3 + 19*G + 15) %% 30
> I = I - (I %/% 28)*(1 - (I %/% 28)*(29 %/% (I + 1))*((21 - G) %/% 11))
> J = (year + year %/% 4 + I + 2 - century + century %/% 4) %% 7
> L = I - J
> EasterMonth = 3 + (L + 40) %/% 44
> EasterDay = L + 28 - 31*(EasterMonth %/% 4)
```
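To try these steps on several years at once, they can be wrapped in a vectorized function. The following is only a sketch: the name `easter_sunday` is hypothetical (it is not taken from any package), and it assumes the Gregorian calendar and that every division in the algorithm is an integer division.

```r
# Sketch: Easter Sunday (Gregorian calendar) for a vector of years.
# easter_sunday is a hypothetical helper name, not part of any package.
easter_sunday <- function(year) {
  century <- year %/% 100          # %/% is integer division
  G <- year %% 19                  # position in the 19-year lunar cycle
  K <- (century - 17) %/% 25
  I <- (century - century %/% 4 - (century - K) %/% 3 + 19 * G + 15) %% 30
  I <- I - (I %/% 28) * (1 - (I %/% 28) * (29 %/% (I + 1)) * ((21 - G) %/% 11))
  J <- (year + year %/% 4 + I + 2 - century + century %/% 4) %% 7
  L <- I - J
  month <- 3 + (L + 40) %/% 44     # 3 = March, 4 = April
  day <- L + 28 - 31 * (month %/% 4)
  as.Date(sprintf("%04d-%02d-%02d", year, month, day))
}

easter_sunday(c(2008, 2019, 2024, 2025))
# "2008-03-23" "2019-04-21" "2024-03-31" "2025-04-20"
```

Returning a `Date` vector keeps the result directly usable with `months()` or date arithmetic.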

Actually, this algorithm can be found in some R packages. Here we use the dates of Easter from AD 1000 to AD 3000:

```
> library(timeDate)
> E=Easter(1000:3000)
> D=as.Date(E)
> table(months(D))/2001

    april     march
0.7651174 0.2348826
```

(April comes before March only because the months are sorted alphabetically.) So Easter falls in April in about 77% of the years. The distribution of the date, counted in days from March 1st, is the following:

```
> J=as.numeric(D-as.Date(paste("01/03/",1000:3000,sep=""),"%d/%m/%Y"))
> hist(J,breaks=seq(20,55),col="light green")
```

If we look at the autocorrelation function, we can observe that, indeed, there is a strong correlation at a lag of 19 years (as the algorithm given previously suggests, since the year enters modulo 19 through `G`):

`> plot(acf(J))`
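To read the autocorrelations as numbers rather than from the plot, one option is to call `acf()` with `plot = FALSE`. The sketch below does not rely on `timeDate`: it recomputes the day offset from March 1st with a hypothetical helper, `easter_offset`, built from the integer-arithmetic steps of the algorithm given earlier (the offset is `L + 27`). It assumes the Gregorian rule applied to all years from 1000 to 3000, so it may not match the package output exactly.

```r
# Sketch: days from March 1st to Easter Sunday, via the integer steps of the
# algorithm (easter_offset is a hypothetical helper, not a timeDate function)
easter_offset <- function(year) {
  century <- year %/% 100
  G <- year %% 19
  K <- (century - 17) %/% 25
  I <- (century - century %/% 4 - (century - K) %/% 3 + 19 * G + 15) %% 30
  I <- I - (I %/% 28) * (1 - (I %/% 28) * (29 %/% (I + 1)) * ((21 - G) %/% 11))
  J <- (year + year %/% 4 + I + 2 - century + century %/% 4) %% 7
  (I - J) + 27   # March 22nd gives 21, April 25th gives 55
}

offset <- easter_offset(1000:3000)   # plays the role of J in the text
r <- acf(offset, lag.max = 40, plot = FALSE)$acf[-1]   # drop lag 0
round(r[c(1, 19, 38)], 2)            # autocorrelation at lags 1, 19 and 38
```

Comparing the value at lag 19 with the neighbouring lags gives a quick numerical check of the 19-year cycle.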

But in order to get a better understanding of the dynamics, we can also look at transition matrices. Define quartile classes for the date:

```
> Q=quantile(J,seq(0,1,by=.25))
> Q[1]=Q[1]-1
> C=cut(J,Q)
```
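The adjustment `Q[1]=Q[1]-1` is needed because `cut()` builds right-closed intervals by default, so the smallest observation would otherwise fall outside the first class. A possible alternative, sketched here on a small made-up vector `x` (not Easter data), is the `include.lowest` argument of `cut()`:

```r
# Toy values, for illustration only
x <- c(21, 25, 30, 35, 40, 45, 50, 55)
Q <- quantile(x, seq(0, 1, by = .25))

cut(x, Q)[1]                          # NA: the minimum is not in (a, b]
cut(x, Q, include.lowest = TRUE)[1]   # the minimum now falls in the first class
```

The two approaches give the same classes; only the label of the first interval differs.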

Then the one-year transition matrix is (first as counts, then as row percentages):

```
> k=1; n=length(C)
> B=data.frame(X1=(C[1:(n-k)]),X2=(C[(k+1):n]))
> (T=table(B$X1,B$X2))

          (20,31] (31,39] (39,46] (46,55]
  (20,31]       0       0     265     277
  (31,39]     316       0      13     182
  (39,46]     224     264       0       0
  (46,55]       1     247     211       0
> P=T/apply(T,1,sum)
> round(P*1000)/10

          (20,31] (31,39] (39,46] (46,55]
  (20,31]     0.0     0.0    48.9    51.1
  (31,39]    61.8     0.0     2.5    35.6
  (39,46]    45.9    54.1     0.0     0.0
  (46,55]     0.2    53.8    46.0     0.0
```

In other words, if Easter was early in the year (say in March, in the first quartile), then the year after it will be late: in this sample it never stays in the first two quartiles, and it lands in the third quartile with about a 49% chance and in the fourth with about a 51% chance.
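The same computation can be repeated for other lags, for instance 19 years, by changing `k`. A minimal sketch, assuming a hypothetical helper named `transition_matrix` and using `prop.table()` for the row normalization, illustrated on a toy sequence rather than on the Easter data:

```r
# Hypothetical helper: k-step transition matrix of a categorical sequence
transition_matrix <- function(states, k = 1) {
  states <- as.factor(states)
  n <- length(states)
  stopifnot(k >= 1, k < n)
  counts <- table(from = states[1:(n - k)], to = states[(k + 1):n])
  prop.table(counts, margin = 1)   # divide each row by its total
}

# Toy sequence, for illustration only (not Easter data)
x <- c("a", "b", "a", "b", "b", "a")
P <- transition_matrix(x)
round(100 * P, 1)   # row percentages
```

With the classes defined above, `transition_matrix(C, k = 19)` would give the 19-year transitions. A row with no observed transitions would produce `NaN` entries, which does not happen with quartile classes.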


Published at DZone with permission of

Opinions expressed by DZone contributors are their own.