Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

DZone's Guide to

# R: Markov Chain Wikipedia Example

· Big Data Zone ·
Free Resource

Comment (0)

Save
{{ articles[0].views | formatCount}} Views

Hortonworks Sandbox for HDP and HDF is your chance to get started on learning, developing, testing and trying out new features. Each download comes preconfigured with interactive tutorials, sample data and developments from the Apache community.

Over the weekend I’ve been readingaboutMarkov Chains and I thought it’d be an interesting exercise for me to translate Wikipedia’s example into R code.

But first a definition:

A Markov chain is a random process that undergoes transitions from one state to another on a state space.

It is required to possess a property that is usually characterized as “memoryless”: the probability distribution of the next state depends only on the current state and not on the sequence of events that preceded it.

that ‘random process’ could be moves in a Monopoly game, the next word in a sentence or, as in Wikipedia’s example, the next state of the Stock Market.

The diagram below shows the probabilities of transitioning between the various states:

e.g. if we’re in a Bull Market the probability of the state of the market next week being a Bull Market is 0.9, a Bear Market is 0.075 and a Stagnant Market is 0.025.

We can model the various transition probabilities as a matrix:

``````M = matrix(c(0.9, 0.075, 0.025, 0.15, 0.8, 0.05, 0.25, 0.25, 0.5),
nrow = 3,
ncol = 3,
byrow = TRUE)

> M
[,1]  [,2]  [,3]
[1,] 0.90 0.075 0.025
[2,] 0.15 0.800 0.050
[3,] 0.25 0.250 0.500``````

Rows/Cols 1-3 are Bull, Bear, Stagnant respectively.

Now let’s say we start with a Bear market and want to find the probability of each state in 3 weeks time.

We can do this is by multiplying our probability/transition matrix by itself 3 times and then multiplying the result by a vector representing the initial Bear market state.

``````threeIterations = (M %*% M %*% M)

> threeIterations
> threeIterations
[,1]    [,2]    [,3]
[1,] 0.7745 0.17875 0.04675
[2,] 0.3575 0.56825 0.07425
[3,] 0.4675 0.37125 0.16125

> c(0,1,0) %*% threeIterations
[,1]    [,2]    [,3]
[1,] 0.3575 0.56825 0.07425``````

So we have a 56.825% chance of still being in a Bear Market, 35.75% chance that we’re now in a Bull Market and only a 7.425% chance of being in a stagnant market.

I found it a bit annoying having to type ‘%*% M’ multiple times but luckily the expm library allows us to apply a Matrix power operation:

``````install.packages("expm")
library(expm)

> M %^% 3
[,1]    [,2]    [,3]
[1,] 0.7745 0.17875 0.04675
[2,] 0.3575 0.56825 0.07425
[3,] 0.4675 0.37125 0.16125``````

The nice thing about this function is that we can now easily see where the stock market will trend towards over a large number of weeks:

``````> M %^% 100
[,1]   [,2]   [,3]
[1,] 0.625 0.3125 0.0625
[2,] 0.625 0.3125 0.0625
[3,] 0.625 0.3125 0.0625``````

i.e. 62.5% of weeks we will be in a bull market, 31.25% of weeks will be in a bear market and 6.25% of weeks will be stagnant.

Hortonworks Community Connection (HCC) is an online collaboration destination for developers, DevOps, customers and partners to get answers to questions, collaborate on technical articles and share code examples from GitHub.  Join the discussion.

Topics:
bigdata ,big data ,r language ,big data applications

Comment (0)

Save
{{ articles[0].views | formatCount}} Views

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.