whisprer/c-simd-rng-lib

Universal-Architecture-Rng-Lib-Devcopy


# Universal Architecture PRNG std. Replacement C++ Lib


A high-performance, cross-platform random number generation library with SIMD and GPU acceleration.

Overview - [rescued from weird formatting doldrums by the one and only RTC!!! thank you :) ]

universal_rng_lib is a fast, flexible RNG library written in modern C++. It supports a range of algorithms including Xoroshiro128++ and WyRand, with runtime autodetection of the best CPU vectorization (SSE2, AVX2, AVX-512, NEON) and optional OpenCL GPU support.

It outperforms the C++ standard library RNGs (the AVX2 path benchmarks at roughly 3–5× the speed of std::mt19937) and can replace them in scientific simulations, games, real-time systems, and more.

Features

  • ✅ Multiple PRNGs: Xoroshiro128++, WyRand
  • ✅ SIMD Acceleration: SSE2, AVX2, AVX-512, NEON (auto-detect at runtime)
  • ✅ OpenCL GPU support (optional)
  • ✅ Scalar fallback for universal compatibility
  • ✅ Batch generation for improved throughput
  • ✅ Support for 16–1024 bit generation
  • ✅ Cross-platform: Windows (MSVC, MinGW), Linux
  • ✅ MIT Licensed
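
For readers unfamiliar with the first algorithm in the list, here is a minimal scalar sketch of Xoroshiro128++ as published by Blackman and Vigna, seeded through SplitMix64. The struct name `Xoroshiro128pp` and the seeding choice are assumptions made for illustration. The library's own state layout, seeding, and SIMD variants are not shown in this README and may differ.

```cpp
#include <cstdint>

// SplitMix64: often used to expand a single 64-bit seed into generator state.
inline std::uint64_t splitmix64(std::uint64_t& x) {
    std::uint64_t z = (x += 0x9e3779b97f4a7c15ULL);
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
    return z ^ (z >> 31);
}

// Rotate left; k must be in 1..63 for the shifts to be well defined.
inline std::uint64_t rotl64(std::uint64_t x, int k) {
    return (x << k) | (x >> (64 - k));
}

// Hypothetical illustration type, not the library's internal representation.
struct Xoroshiro128pp {
    std::uint64_t s0;
    std::uint64_t s1;

    // Fill both state words from one seed (assumed seeding scheme).
    explicit Xoroshiro128pp(std::uint64_t seed) {
        s0 = splitmix64(seed);
        s1 = splitmix64(seed);
    }

    // One step of xoroshiro128++ (rotation constants 17, 49, 21, 28).
    std::uint64_t next() {
        const std::uint64_t a = s0;
        std::uint64_t b = s1;
        const std::uint64_t result = rotl64(a + b, 17) + a;
        b ^= a;
        s0 = rotl64(a, 49) ^ b ^ (b << 21);
        s1 = rotl64(b, 28);
        return result;
    }
};
```

Two instances built from the same seed produce the same sequence, which is the property the library's seeded constructor presumably relies on as well.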

Quick Start

Requirements

  • C++17-compatible compiler
  • CMake 3.15+
  • Ninja (recommended)
  • OpenCL SDK (optional)

Build Instructions

Linux/Mac (bash)

```bash
git clone https://github.com/YOUR_USERNAME/universal_rng_lib.git
cd universal_rng_lib
mkdir build && cd build
cmake .. -G Ninja -DCMAKE_BUILD_TYPE=Release
cmake --build . --parallel
./rng_selftest
```

Windows (MSYS2 MinGW64 shell)

```bash
git clone https://github.com/YOUR_USERNAME/universal_rng_lib.git
cd universal_rng_lib
mkdir build && cd build
cmake .. -G Ninja -DCMAKE_BUILD_TYPE=Release -DRNG_WITH_OPENCL=OFF
cmake --build . --parallel
./rng_selftest.exe
```

Note: If AVX2 is supported, it will automatically be compiled in and used.

Usage Example

```cpp
#include "universal_rng.h"
#include <cstdint>
#include <iostream>

int main() {
    // Create RNG instance (seed, algorithm_id, bitwidth)
    universal_rng_t* rng = universal_rng_new(1337, 0, 1);

    // Generate 64-bit random integer
    uint64_t val = universal_rng_next_u64(rng);
    std::cout << "Random u64: " << val << std::endl;

    // Generate double in range [0,1)
    double d = universal_rng_next_double(rng);
    std::cout << "Random double: " << d << std::endl;

    // Cleanup
    universal_rng_free(rng);
    return 0;
}
```
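
The README does not say how `universal_rng_next_double` derives its value. One common way to get a double in [0,1) from a 64-bit integer is to keep the top 53 bits and scale by 2^-53. The helper name `u64_to_unit_double` below is hypothetical, and this is only a sketch of that general technique, not the library's confirmed method.

```cpp
#include <cstdint>

// Map a 64-bit integer to a double in [0, 1).
// A double has a 53-bit significand, so only the top 53 bits are kept;
// 9007199254740992.0 is 2^53, so the largest result is 1 - 2^-53.
inline double u64_to_unit_double(std::uint64_t x) {
    return static_cast<double>(x >> 11) * (1.0 / 9007199254740992.0);
}
```

Using the high bits rather than the low bits is the usual choice, since in some generators the low bits are statistically weaker.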
Replace C++ Standard RNG

To use universal_rng_lib in place of std::mt19937 or std::default_random_engine:

Replace all instances of:

```cpp
std::mt19937 rng(seed);
```

**with:**

```cpp
auto* rng = universal_rng_new(seed, 0, 1);  // use algorithm 0 = Xoroshiro128++
```

Replace:

```cpp
rng(); // or dist(rng)
```

**with:**

```cpp
universal_rng_next_u64(rng);
```

Use universal_rng_next_double(rng); for floating-point needs.

Replace cleanup (if the engine was heap-allocated):

```cpp
delete rng;
```

**with:**

```cpp
universal_rng_free(rng);
```
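
Code that passes the engine to a `<random>` distribution (the `dist(rng)` case above) needs an object that meets the standard UniformRandomBitGenerator requirements, which a raw pointer does not. A minimal sketch of one way to bridge that gap follows. `UrbgAdapter` and `StandInSource` are hypothetical names that are not part of the library, and the stand-in source (SplitMix64) exists only so the sketch compiles without `universal_rng.h`.

```cpp
#include <cstdint>
#include <limits>
#include <random>
#include <utility>

// Hypothetical adapter: wraps any callable returning a 64-bit value so that
// it can be handed to std::uniform_int_distribution and friends.
template <class NextU64>
class UrbgAdapter {
public:
    using result_type = std::uint64_t;

    explicit UrbgAdapter(NextU64 next) : next_(std::move(next)) {}

    // The full 64-bit range is reported, matching a u64-producing source.
    static constexpr result_type min() { return 0; }
    static constexpr result_type max() {
        return std::numeric_limits<result_type>::max();
    }

    result_type operator()() { return next_(); }

private:
    NextU64 next_;
};

// Stand-in 64-bit source (SplitMix64), used here instead of the library call.
struct StandInSource {
    std::uint64_t state;
    std::uint64_t operator()() {
        std::uint64_t z = (state += 0x9e3779b97f4a7c15ULL);
        z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
        z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
        return z ^ (z >> 31);
    }
};
```

With the library linked in, the callable would presumably be a lambda such as `[rng] { return universal_rng_next_u64(rng); }`, and the adapter instance would then be passed wherever `std::mt19937` was passed to a distribution before.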
File Structure

```
.
├── include/                # All public headers
│   └── universal_rng.h     # Main header
├── Benchmarking/           # Benchmarking Results
│                           # [compared against C++ std. lib]
├── src/                    # Source code
│   ├── simd_avx2.cpp
│   ├── simd_sse2.cpp
│   ├── simd_avx512.cpp
│   ├── universal_rng.cpp
│   └── runtime_detect.cpp
├── lib_files/              # Prebuilt binaries
│   ├── mingw_shared/
│   ├── msvc_shared/
│   └── linux_shared/
├── extras/                 # Environment setups and tools
│   └── windows/
├── docs/                   # In-depth design documentation
│   ├── key_SIMD-implementation_design-principles.md
│   ├── explain_of_3-7's_refactor.md
│   └── opencl-implementation-details.md
└── tests/                  # Self-test and benchmarks
```
SIMD & Dispatch Design

  • Auto-detects best available instruction set at runtime
  • Gracefully falls back to scalar or SSE2
  • Batches can be used to further accelerate performance
  • Detection failures are handled gracefully

Example detection result:

```
CPU feature detection:
  SSE2: Yes
  AVX2: Yes
  AVX512: No
Trying AVX2 implementation...
Using AVX2 implementation
```
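
The detect-then-fall-back behaviour described above can be sketched as follows. This is not the code in `runtime_detect.cpp`, which the README does not show. `SimdLevel`, `Kernel`, `select_kernel`, and `fill_scalar` are hypothetical names, and detection here relies on the GCC/Clang builtin `__builtin_cpu_supports`, so on other compilers or non-x86 targets the sketch simply reports scalar.

```cpp
#include <cstddef>
#include <cstdint>

enum class SimdLevel { Scalar = 0, SSE2 = 1, AVX2 = 2, AVX512 = 3 };

using FillFn = void (*)(std::uint64_t* out, std::size_t n, std::uint64_t seed);

struct Kernel {
    SimdLevel level;   // instruction set the kernel requires
    const char* name;  // label for log output
    FillFn fill;       // nullptr means "not compiled into this build"
};

// Report the widest x86 SIMD level the running CPU supports.
inline SimdLevel detect_simd_level() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f")) return SimdLevel::AVX512;
    if (__builtin_cpu_supports("avx2")) return SimdLevel::AVX2;
    if (__builtin_cpu_supports("sse2")) return SimdLevel::SSE2;
#endif
    return SimdLevel::Scalar;  // unknown platform: take the safe path
}

// Portable fallback kernel (a SplitMix64 stream), always available.
inline void fill_scalar(std::uint64_t* out, std::size_t n, std::uint64_t seed) {
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t z = (seed += 0x9e3779b97f4a7c15ULL);
        z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
        z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
        out[i] = z ^ (z >> 31);
    }
}

// Kernels are listed widest first. Return the first one that is both built
// and supported by the CPU, or nullptr if the table has no usable entry.
inline const Kernel* select_kernel(const Kernel* kernels, std::size_t count,
                                   SimdLevel cpu) {
    for (std::size_t i = 0; i < count; ++i) {
        if (kernels[i].fill != nullptr && kernels[i].level <= cpu) {
            return &kernels[i];
        }
    }
    return nullptr;
}
```

Keeping a scalar entry at the end of the table means selection always succeeds, which is one simple way to get the graceful fallback the list describes.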
Benchmarking & Performance

  • Batch mode yields 1.7×–2.5× speedup over naive generation
  • AVX2 performs ~3–5× faster than std::mt19937
  • AVX-512 versions under development
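
To illustrate why batch mode can be faster than one call per value, the sketch below contrasts the two shapes using a scalar Xoroshiro128++ step. The names `RngState`, `next_one`, and `next_batch` are hypothetical. The library's actual batch entry point is not named in this README, and this sketch makes no claim about reproducing the speedups quoted above.

```cpp
#include <cstddef>
#include <cstdint>

struct RngState {
    std::uint64_t s0;
    std::uint64_t s1;
};

inline std::uint64_t rotl64(std::uint64_t x, int k) {
    return (x << k) | (x >> (64 - k));
}

// One value per call: the state is read from and written back to memory
// every time, and each value costs a function call if not inlined.
inline std::uint64_t next_one(RngState& st) {
    const std::uint64_t a = st.s0;
    std::uint64_t b = st.s1;
    const std::uint64_t result = rotl64(a + b, 17) + a;
    b ^= a;
    st.s0 = rotl64(a, 49) ^ b ^ (b << 21);
    st.s1 = rotl64(b, 28);
    return result;
}

// Batch: the state stays in local variables for the whole loop and is
// stored once at the end, so per-value overhead is amortized over n values.
inline void next_batch(RngState& st, std::uint64_t* out, std::size_t n) {
    std::uint64_t a = st.s0;
    std::uint64_t b = st.s1;
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = rotl64(a + b, 17) + a;
        b ^= a;
        a = rotl64(a, 49) ^ b ^ (b << 21);
        b = rotl64(b, 28);
    }
    st.s0 = a;
    st.s1 = b;
}
```

Both functions produce the same sequence from the same starting state. A SIMD kernel would extend the batch shape by advancing several independent states per loop iteration.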

License
MIT License – see LICENSE.md for full terms. 

Reference

This library is partially inspired by:

  • David Blackman & Sebastiano Vigna's paper "Scrambled Linear Pseudorandom Number Generators"
About

A Replacement for the C++ std. RNG Lib that employs SIMD
