Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

DZone's Guide to

· Big Data Zone ·
Free Resource

Comment (0)

Save
{{ articles[0].views | formatCount}} Views

Hortonworks Sandbox for HDP and HDF is your chance to get started on learning, developing, testing and trying out new features. Each download comes preconfigured with interactive tutorials, sample data and developments from the Apache community.

My previous post on Gelfand's Question looked at a problem that requires repeatedly finding the first digit of kn wherek is a single digit but n may be on the order of millions or billions.

The most direct approach would be to first compute kn as a very large integer, then find it’s first digit. That approach is slow, and gets slower as n increases. A faster way is to look at the fractional part of log kn = n log k and see which digit it corresponds to.

If n is not terribly big, this can be done in ordinary precision. But when n is large, multiplying logk by n and taking the fractional part brings less significant digits into significance. So for very large n, you need extra precision. I first did this in Python using SymPy, then switched to C++ for more speed. There I used the `quadmath` library for `gcc`. (If n is big enough, even quadruple precision isn’t enough. An advantage to SymPy over `quadmath` is that the former has arbitraryprecision. You could, for example, set the precision to be 10 more than the number of decimal places in n in order to retain 10 significant figures in the fractional part of n log k.)

The `quadmath.h` header file needs to be wrapped in an `extern C `declaration. Otherwise `gcc`will give you misleading error messages.

The 128-bit floating point type `__float128` has twice as many bits as a `double`. The `quadmath`functions have the same name as their standard `math.h` counterparts, but with a `q` added on the end, such as `log10q` and `fmodq` below.

Here’s code for computing the leading digit of kn that illustrates using `quadmath`.

```#include <cmath>
extern "C" {
}

__float128 logs[11];

for (int i = 2; i <= 10; i++)
logs[i] = log10q(i + 0.0);

int first_digit(int base, long long exponent)
{
__float128 t = fmodq(exponent*logs[base], 1.0);
for (int i = 2; i <= 10; i++)
if (t < logs[i])
return i-1;
}```

The code always returns because `t` is less than 1.

Caching the values of `log10q` saves repeated calls to a relatively expensive function. So does using the search at the bottom rather than computing `powq(10, t)`.

The linear search at the end is more efficient than it may seem. First, it’s only search a list of length 9. Second, because of Benford’s law, the leading digits are searched in order of decreasing frequency, i.e. most inputs will cause `first_digit` to return early in the search.

When you compile code using `quadmath`, be sure to add `-lquadmath` to the compile command.

Hortonworks Community Connection (HCC) is an online collaboration destination for developers, DevOps, customers and partners to get answers to questions, collaborate on technical articles and share code examples from GitHub.  Join the discussion.

Topics:

Comment (0)

Save
{{ articles[0].views | formatCount}} Views

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.