# A "truly horrible" pseudorandom number generator

I’ve gone back to A History of Data Visualization, and there’s a nice discussion on the RANDU algorithm.

The short version is that it’s defined by the recurrence relation:

$V_{j+1} = 65539 V_j \bmod 2^{31}$

On 1960’s hardware it looked ok. Here’s the first few terms starting at $$V_0 = 1$$, rescaled to the unit interval [0,1]:

randu <- 1
for (i in 1:10000) {
n <- (65539 * last(randu)) %% (2^31)
randu <- append(randu, n)
}

tibble(randu = randu / (2^31)) %>%
mutate(x = row_number()) %>%
ggplot(aes(x = x, y = randu)) +
geom_point()

It takes a little time to get going, but once it’s going it looks like it’s bouncing all over the place. You can do a boxplot:

tibble(x = randu / (2^31)) %>%
ggplot(aes(x = x)) +
geom_boxplot()

And the median is at 0.5, the quartiles are at 0.25 and 0.75, the whiskers go all the way to 0 and 1…

This algorithm stinks, and Knuth has described it as “truly horrible”.

I wasn’t great at number theory, but when I saw the algo something went off in my head. Maybe 8 perfect riffle shuffles as a highly related concept.

Behold RANDU in two dimensions, plotting $$V_n$$ against $$V_{n+1}$$:

tibble(x = randu / (2^31)) %>%
mutate(y = lag(x)) %>%
ggplot(aes(x = x, y = y)) +
geom_point()

In 2d it still looks like it works. But in 3d:

tibble(x = randu / (2^31)) %>%
mutate(y = lag(x)) %>%
mutate(z = lag(y)) %>%
plot_ly(x = ~x, y = ~y, z = ~z) %>%
add_markers(size = 0.1)

Drag that around a bit. You’re looking for something like this.

I’m going to call this a special case of Schneier’s Law - anyone can construct a pseudorandom number generator that they can’t find a fault with. But with enough eyes any bug is shallow.