optimized uniform small integers? #124

Open
nbecker opened this issue Jun 25, 2018 · 4 comments

Comments

@nbecker

nbecker commented Jun 25, 2018

I've been using my own wrapper to generate uniform random integers of specific small bit widths. The idea is to cache a block of output from random_uintegers and then demux it into a stream of small-width values. For example, you might want a stream of 1-bit (0/1) values. The output array can be any integer dtype.
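A minimal sketch of the demux step described above, written with plain NumPy. The function name `demux_bits` is hypothetical, and `numpy.random.PCG64(...).random_raw` stands in for the block of raw 64-bit output; this is not the author's wrapper, only an illustration of the idea under those assumptions.

```python
import numpy as np

def demux_bits(words, width):
    """Split each 64-bit word into 64 // width values of `width` bits each.

    Hypothetical helper: fields are taken from the low bits upward.
    """
    if width < 1 or 64 % width != 0:
        raise ValueError("width must divide 64")
    words = np.asarray(words, dtype=np.uint64)
    per_word = 64 // width
    # Shift amounts 0, width, 2*width, ... select successive fields.
    shifts = np.arange(per_word, dtype=np.uint64) * np.uint64(width)
    mask = np.uint64((1 << width) - 1)
    fields = (words[:, None] >> shifts[None, :]) & mask
    return fields.reshape(-1)

# One block of 16 raw 64-bit words yields 16 * 64 one-bit values.
block = np.random.PCG64(0).random_raw(16)
bits = demux_bits(block, 1)
```

The result is a uint64 array; casting it with `astype` to a narrower integer dtype would match the "any integer dtype" output mentioned above.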

I just benchmarked my approach for 1-bit values against randint and see quite a large speedup.

Maybe something similar could be useful for randomstate?

@bashtage
Owner

The hard part with this approach is that reproducibility requires storing any unused bits. This adds to the state, which I think is more-or-less a no-no. For example, the BM (Box-Muller) Gaussian in NumPy is a bit annoying to carry around.
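To make the state problem concrete, here is a rough sketch of a buffered stream whose saved state has to include the leftover bits. The class `BitStream` and its methods are hypothetical, and NumPy's `PCG64` bit generator is used only as a stand-in source of 64-bit words.

```python
import numpy as np

class BitStream:
    """Hypothetical buffered stream of `width`-bit values over a bit generator."""

    def __init__(self, bit_generator, width):
        if width < 1 or 64 % width != 0:
            raise ValueError("width must divide 64")
        self.bit_generator = bit_generator
        self.width = width
        self.word = 0       # current 64-bit word, held as a Python int
        self.bits_left = 0  # unused bits remaining in self.word

    def next_value(self):
        if self.bits_left == 0:
            # Buffer exhausted: draw one fresh 64-bit word.
            self.word = int(self.bit_generator.random_raw())
            self.bits_left = 64
        value = self.word & ((1 << self.width) - 1)
        self.word >>= self.width
        self.bits_left -= self.width
        return value

    def get_state(self):
        # The generator state alone is not enough; the leftover bits matter too.
        return {"prng": self.bit_generator.state,
                "word": self.word, "bits_left": self.bits_left}

    def set_state(self, state):
        self.bit_generator.state = state["prng"]
        self.word = state["word"]
        self.bits_left = state["bits_left"]
```

Restoring only the `"prng"` entry would resume from a fresh word and drop the buffered bits, so the stream would no longer line up with the original run.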

I'm a little surprised that 1-bit values are slow: they all generate 32 or 64 values from a single draw of the PRNG and store these in arrays where each element is 8 bits.

The better approach is to use randomgen, which explicitly produces a set of basic RNGs that were designed to be easily incorporated into user Cython or Numba code.

@nbecker
Author

nbecker commented Jun 25, 2018

My benchmark:
from timeit import timeit

from randomgen import RandomGenerator, Xoroshiro128
from pn2 import pn64  # my own C++ wrapper, not part of randomgen

rs = RandomGenerator(Xoroshiro128(0))
p1 = pn64(rs, 1)  # 1-bit stream

# Time 1,000,000 calls of each, 1000 values per call
print(timeit('p1(1000)', globals=globals(), number=1000000))
print(timeit('rs.randint(0, 2, size=1000)', globals=globals(), number=1000000))

Result:
3.2882123820018023
13.305304736997641

My 'pn64' class is C++ code that calls random_uintegers through the Python interface, caches a block of results, and then hands them out as a stream, M bits at a time.

I could potentially call the C code directly instead of going through the Python interface (if I knew how).
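One possible route, sketched here with NumPy's `numpy.random` bit generators rather than the randomgen version discussed in this thread (an assumption: the attribute names below are NumPy's and may differ in randomgen). The `ctypes` attribute of a bit generator exposes the underlying C function and state pointer, which is the same kind of handle that Cython or Numba code would use.

```python
import numpy as np

bit_gen = np.random.PCG64(0)
iface = bit_gen.ctypes            # namedtuple of ctypes handles
next_uint64 = iface.next_uint64   # C function: uint64_t f(void *state)
state = iface.state               # void * pointing at the generator state

def raw_words(n):
    """Draw n raw 64-bit words by calling the C function directly."""
    return [next_uint64(state) for _ in range(n)]

words = raw_words(4)
```

Calling through ctypes from Python still pays per-call overhead, so this only shows where the function pointer lives; the speed benefit would come from calling it inside compiled code.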

BTW, I am using randomgen, so I guess I should have filed this issue there.

@bashtage
Owner

bashtage commented Jun 25, 2018 via email

@bashtage
Owner

Hmm, setting np.bool doesn't shave off much time. I suspect that most of the time is in allocating the memory for the output array.
