Gelfands’s question asks whether there is a positive integer n such that the first digits of jn base 10 are all the same for j = 2, 3, 4, …, 9. (Thanks to @republicofmath for pointing out this problem.) This post will explore Gelfand’s question via probability.
The MathWorld article on Gelfand’s question says that the answer is no for values of n less than 100,000. That range seemed small to me. My intuition was that you’d need to try larger values of n to have a reasonable chance of finding a solution.
So how big a value of n should you explore? To answer that question, consider picking n at random. If the leading digits of jn were randomly distributed, what would the chances be that n would satisfy Gelfand’s question? Leading digits are not random, but they do often have regular distributions.
Suppose the leading digits are uniformly distributed among 1 to 9, and that the leading digits for each base are independent. Then the probability of jn beginning with 1 (or any other chosen digit) for all j from 2 to 9 would be (1/9)8, and the probability of all leading digits matching any digit would be 9 times larger. That would say the probability of all leading digits matching would be on the order of 2 × 107, so we should try on the order of 107 or 108 values of n to have a decent chance of finding one.
But there’s a problem with the above argument: leading digits are not uniformly distributed. Benford’s law says that leading digits are often distributed very unevenly. When a system follows Benford’s law, the probability of a leading digit being d is log10(1 + 1/d). An uneven distribution of leading digits raises the probability that all the digits will match. If the leading digits of jn follow Benford’s law, and are independent, then the probability of all matching is 6.84 × 10-5, two orders of magnitude higher than assuming a uniform distribution.
Do the leading digits of jn follow Benford’s law? Yes, at least approximately, based on testing values of n from 1 to 1,000.
Now suppose the probability of a randomly selected value of n satisfying Gelfand’s question is p = 6.84 × 10-5. If we tried 100,000 random values of n, the probability of having no successes would be about 0.00107. So my intuition was wrong. If the random heuristic given here were correct, we’d stand a good chance of seeing a solution if we tried values of n up to 100,000.
Where does the probabilistic heuristic go wrong? In the assumption that the distributions of the leading digits are independent.
Let xj be the sequence jn for n running from 1 to 1000. Some of the xj are correlated and some are not.
Some of the correlations are easy to see. For example, the sequences for x2 and x4 have correlation 0.493. That’s because 4n = (2n)2. So if you know the first digit of 2n, you can often guess the first digit of 4n. In light of this, it’s not surprising that x2 is also correlated with x8, and x3 is correlated with x9.
You might speculate that xj and xk are uncorrelated if and only if j and k are relatively prime. Some sequences, such as x2 and x3 support that guess. But x4 and x5 are negatively correlated, r = -0.433, for reasons that are not obvious. I suspect that the negative correlations are the key to understanding the question.
Update: I have verified that the answer to Gelfand’s question is no for n up to 109.