KEMBAR78
Algorithm designs · Issue #19 · tc39/proposal-random-functions · GitHub
Skip to content

Algorithm designs #19

@tabatkins

Description

@tabatkins

I've been reviewing the literature and existing implementations for guidance on how I'm going to specify the algorithms for .random(), .number(), .int(), and .bigint(), and I think I've come to some conclusions.

First, my writeup of a few of the algorithms is documented at https://github.com/tc39/proposal-random-functions/blob/main/generation-methods.md.

Second, my conclusions about each method:

.random()

(Number, between 0 and 1.)

I think I'm going to just use the standard "create a 53-bit integer, load it into a float, divide it by 2^53" approach used by many libraries. This uses a single 64-bit block, is fast, and is perfectly uniform.

The only downside to me is that it can generate 0, and I think it's probably most useful to default to a double-open range instead. But for .random() I can compromise and stick with the simple, widely-used solution.

I don't think it's worthwhile to take the second approach in the doc, which uses 128 bits to extend the range closer to zero and fill in every possible float, but I could be convinced. ^_^

.number()

(Number, between min and max)

The standard approach (generate a number in 1-2, subtract one, multiply by range, add min) is cheap and simple to understand, but terrible for uniformity, and only ends up exploiting 52 bits of randomness.

Instead, I think it's worthwhile to go with the approach in https://github.com/tc39/proposal-random-functions/blob/main/generation-methods.md#number-in-min-max-1, using 3 blocks of 64 bits. This approach uses two blocks to generate a perfectly uniform value in the range (with values separated by the largest distance between neighboring floats in the range), then uses the third block to fill in some additional entropy while maintaining value uniformity. It generates all possible floats in the range until you get sufficiently close to zero (where "sufficiently close" is "52 exponents down from the larger of the endpoints".

Probably defaulting to a double-closed range, but with options to include either or both ends. (The difference is trivial in the algorithm.)

.int() and .bigint() (and stepped .number())

(Integer or BigInt, between min and max)

Going with the two approaches listed in the doc. For smaller values (where the range is less than 64 bits wide), use Canon's method - 2 64-bit blocks, with a bias less than 2^-64. For larger values, use the method I've developed in the doc - use Canon's method to generate the most significant 63 bits of the range, then fill the rest randomly and do a rejection at the end if needed (less than 2^-62 chance of rejection occurring). This ensures that you generate every possible int between the endpoints, with uniform value even if you're outside the safe integer range.

Probably defaulting to a double-closed range, but with opens to exclude either or both ends. (The difference is trivial in the algorithm.)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions