Addition Carries and Markov Chains
Python is a great language for data science because of mathematical capabilities. Read on to get a look at the complex math Python can handle.
Join the DZone community and get the full member experience.Join For Free
Suppose you add two long numbers together. As you work your way toward the result, adding digits from right to left, sometimes you carry a 1 and sometimes you don't. How often do you carry 1?
Now suppose you add three numbers at a time. Now your carry might be 0, 1, or 2. How often does each appear? How do things change if you're not working in base 10?
John Holte goes into this problem in detail in . We'll only look at his first result here.
Suppose we're adding m numbers together in base b, with each digit being uniformly distributed. Then at each step of the addition process, the probability distribution of the amount we carry out only depends on the amount we carried in from the previous step. In other words, the carries form a Markov chain!
This means that we can do more than describe the distribution of the amounts carried. We can specify the transition probabilities for carries, i.e. the distribution on the amount carried out, conditional on the the amount carried in.
When we add two base ten numbers, m = 2 and b = 10, we have the transition matrix
This means that on average we're as likely to carry 0 as to carry 1. But, more specifically, there's a 55% chance that we'll carry a 1 if we carried a 1 on the previous step, and also a 55% chance that we won't have a carry this time if we didn't have one last time.
If we add three numbers, things get a little more interesting. Now the transition matrix is:
This gives the probabilities of each out carry for each in carry. We can compute the long run distribution from the matrix above to find that when adding three long numbers, we'll carry 0 or 1 with probability 1/6 each, and carry 1 with probability 2/3.
When adding three binary numbers, the transition matrix is fairly different than for base 10
but the long run distributions are the same: carry a 1 two-thirds of the time, and split the remaining probability equally between carrying 0 and 2.
The transition matrices don't change much for larger bases; base 100 doesn't look much different than base 100, for example.
Finally, let's do an example adding five numbers in base 10. The transition matrix is
and the long run distribution is 1/60, 13/60, 33/60, 13/60, and 1/60 for carries of 0 through 4.
In general, when adding m numbers in base b, Holte gives the transition probabilities as:
Here is some Python code to compute the transition matrices.
from scipy import floor, zeros, array from scipy.special import comb from numpy.linalg import matrix_power # transition probability def pi(i, j, b, m): s = 0 for r in range(0, j - int(floor(i/b)) + 1): s += (-1)**r * comb(m+1, r) * comb(m-i-1 +b*(j-r+1), m) return s/b**m def transition_matrix(b, m): P = [ [pi(i, j, b, m) for j in range(m)] for i in range(m) ] return array(P)
You can find the long run probabilities by raising the transition matrix to a large power.
 John M. Holte. The American Mathematical Monthly, Vol. 104, No. 2 (Feb., 1997), pp. 138-149
Published at DZone with permission of John Cook, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.