| 1 | +# `FastPFOR` Rust Wrapper |
| 2 | + |
| 3 | +Rust wrapper for the [FastPFOR C++ library](https://github.com/fast-pack/FastPFor), providing fast integer compression codecs optimized for sorted sequences. |
| 4 | + |
| 5 | +## Quick Start |
| 6 | + |
| 7 | +```rust |
| 8 | +use fastpfor::cpp::{FastPFor128Codec, Codec32}; |
| 9 | + |
| 10 | +let codec = FastPFor128Codec::new(); |
| 11 | +let input = vec![1, 2, 3, 4, 5, 100, 200, 300]; |
| 12 | +let mut compressed = vec![0u32; input.len() + 1024]; |
| 13 | + |
| 14 | +let encoded = codec.encode32(&input, &mut compressed).unwrap(); |
| 15 | +let mut decompressed = vec![0u32; input.len()]; |
| 16 | +let decoded = codec.decode32(encoded, &mut decompressed).unwrap(); |
| 17 | + |
| 18 | +assert_eq!(decoded, input.as_slice()); |
| 19 | +``` |
| 20 | + |
| 21 | +### 64-bit Integers |
| 22 | + |
| 23 | +Some codecs support 64-bit integers via the [`Codec64`] trait: |
| 24 | + |
| 25 | +```rust |
| 26 | +use fastpfor::cpp::{FastPFor128Codec, Codec64}; |
| 27 | + |
| 28 | +let codec = FastPFor128Codec::new(); |
| 29 | +let input = vec![100u64, 200, 300, 400]; |
| 30 | +let mut compressed = vec![0u32; input.len() * 2 + 1024]; |
| 31 | +let encoded = codec.encode64(&input, &mut compressed).unwrap(); |
| 32 | + |
| 33 | +let mut decompressed = vec![0u64; input.len()]; |
| 34 | +let decoded = codec.decode64(encoded, &mut decompressed).unwrap(); |
| 35 | +assert_eq!(decoded, input.as_slice()); |
| 36 | +``` |
| 37 | + |
| 38 | +See [`Codec32`] and [`Codec64`] trait documentation for buffer sizing guidelines. |
| 39 | + |
| 40 | +## Codec Selection |
| 41 | + |
| 42 | +> **Note:** See individual codec documentation below for detailed descriptions and use-case recommendations. |
| 43 | +
|
| 44 | +### General Purpose (Recommended) |
| 45 | + |
| 46 | +- [`FastPFor128Codec`], [`FastPFor256Codec`] - Best all-around choice. Fast decode, good compression for sorted/clustered data. Support 64-bit. |
| 47 | +- [`SimdFastPFor128Codec`], [`SimdFastPFor256Codec`] - SIMD-optimized variants for maximum throughput |
| 48 | + |
| 49 | +### Patched Frame-of-Reference Variants |
| 50 | + |
| 51 | +Frame-of-reference encoding with exception handling. Excellent for monotonic sequences (timestamps, IDs). |
| 52 | + |
| 53 | +- [`PForCodec`] - Standard implementation |
| 54 | +- [`SimplePForCodec`] - Simplified variant with lower complexity |
| 55 | +- [`NewPForCodec`] - Enhanced exception handling |
| 56 | +- [`OptPForCodec`] - Optimized for common patterns |
| 57 | +- [`PFor2008Codec`] - Reference implementation from research paper |
| 58 | +- **SIMD variants:** [`SimdPForCodec`], [`SimdNewPForCodec`], [`SimdOptPForCodec`], [`SimdSimplePForCodec`] |
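
To make "frame-of-reference encoding with exception handling" concrete, here is a minimal, self-contained sketch. It assumes a toy layout (the hypothetical `PforBlock`, `pfor_encode`, and `pfor_decode` are illustration only, not the FastPFOR wire format or this crate's API): each value is stored as an offset from the block minimum, and offsets too large for the chosen bit width are recorded separately as exceptions.

```rust
/// Toy patched frame-of-reference block (hypothetical; NOT the FastPFOR format).
#[derive(Debug, PartialEq)]
pub struct PforBlock {
    pub base: u32,                     // frame of reference: the block minimum
    pub bits: u32,                     // bit width a regular slot would be packed to
    pub slots: Vec<u32>,               // offsets that fit in `bits` bits (0 for outliers)
    pub exceptions: Vec<(usize, u32)>, // (position, full offset) for each outlier
}

/// Splits `values` into small offsets and a list of exceptions.
pub fn pfor_encode(values: &[u32], bits: u32) -> PforBlock {
    assert!((1..=32).contains(&bits));
    let base = values.iter().copied().min().unwrap_or(0);
    // Largest offset representable in `bits` bits.
    let limit = if bits == 32 { u32::MAX } else { (1u32 << bits) - 1 };
    let mut slots = Vec::with_capacity(values.len());
    let mut exceptions = Vec::new();
    for (i, &v) in values.iter().enumerate() {
        let offset = v - base; // cannot underflow: `base` is the minimum
        if offset <= limit {
            slots.push(offset);
        } else {
            slots.push(0); // placeholder, patched during decode
            exceptions.push((i, offset));
        }
    }
    PforBlock { base, bits, slots, exceptions }
}

/// Rebuilds the original values, patching in the exceptions.
pub fn pfor_decode(block: &PforBlock) -> Vec<u32> {
    let mut out: Vec<u32> = block.slots.iter().map(|&o| block.base + o).collect();
    for &(i, offset) in &block.exceptions {
        out[i] = block.base + offset;
    }
    out
}
```

In this sketch the slots are kept as plain `u32` values for readability; a real codec would bit-pack them to `bits` bits each, which is where the space saving comes from.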
| 59 | + |
| 60 | +### Binary Packing |
| 61 | + |
| 62 | +Bit-packing based on maximum bit width. Good for uniform data distributions. |
| 63 | + |
| 64 | +- [`BP32Codec`] - Standard 32-bit block binary packing |
| 65 | +- [`FastBinaryPacking8Codec`], [`FastBinaryPacking16Codec`], [`FastBinaryPacking32Codec`] - Different block sizes |
| 66 | +- [`SimdBinaryPackingCodec`] - SIMD-optimized variant |
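
As an illustration of bit-packing by maximum bit width, the sketch below finds the width needed by the largest value in a block and packs every value to that width. The function names (`max_bit_width`, `pack`, `unpack`) and the low-bit-first word layout are assumptions made for this example; the library's codecs use their own block sizes and layouts.

```rust
/// Number of bits needed for the largest value in the block (0 if all zero).
pub fn max_bit_width(block: &[u32]) -> u32 {
    let max = block.iter().copied().max().unwrap_or(0);
    32 - max.leading_zeros()
}

/// Packs every value using `width` bits, filling 32-bit words from the low bit up.
/// Assumes each value fits in `width` bits and `width <= 32`.
pub fn pack(block: &[u32], width: u32) -> Vec<u32> {
    let mut out = Vec::new();
    let mut acc: u64 = 0; // bit accumulator
    let mut filled: u32 = 0; // number of valid bits in `acc`
    for &v in block {
        acc |= (v as u64) << filled;
        filled += width;
        while filled >= 32 {
            out.push(acc as u32); // flush the low 32 bits
            acc >>= 32;
            filled -= 32;
        }
    }
    if filled > 0 {
        out.push(acc as u32); // final, partially filled word
    }
    out
}

/// Reverses `pack`, reading `count` values of `width` bits each.
pub fn unpack(words: &[u32], width: u32, count: usize) -> Vec<u32> {
    let mask: u64 = if width == 0 { 0 } else { (1u64 << width) - 1 };
    let mut out = Vec::with_capacity(count);
    let mut acc: u64 = 0;
    let mut filled: u32 = 0;
    let mut iter = words.iter();
    for _ in 0..count {
        while filled < width {
            let word = *iter.next().expect("packed input too short");
            acc |= (word as u64) << filled;
            filled += 32;
        }
        out.push((acc & mask) as u32);
        acc >>= width;
        filled -= width;
    }
    out
}
```

Because a single width is chosen for the whole block, one large value inflates the cost of every other value, which is why this approach suits uniform distributions.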
| 67 | + |
| 68 | +### Variable Byte Encoding |
| 69 | + |
| 70 | +Best for unsorted data and small integers. Simple and widely compatible. |
| 71 | + |
| 72 | +- [`VByteCodec`] - Standard variable byte encoding (1-5 bytes per integer) |
| 73 | +- [`VarIntCodec`] - Standard varint format. Supports 64-bit. |
| 74 | +- [`VarIntGbCodec`] - Group varint with shared control information |
| 75 | +- **SIMD variants:** [`MaskedVByteCodec`], [`StreamVByteCodec`] - Excellent decode speed on modern CPUs |
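
The "1-5 bytes per integer" range follows from storing 7 payload bits per byte: a `u32` needs at most five such bytes. The sketch below assumes the common LEB128-style convention (high bit set means "more bytes follow"); the hypothetical `vbyte_encode` and `vbyte_decode` are for illustration, and the byte layout used by the library's codecs may differ.

```rust
/// Encodes each value as 1-5 bytes, 7 payload bits per byte, low bits first.
/// The high bit of a byte is set when more bytes follow (LEB128-style assumption).
pub fn vbyte_encode(values: &[u32]) -> Vec<u8> {
    let mut out = Vec::new();
    for &value in values {
        let mut v = value;
        while v >= 0x80 {
            out.push(((v as u8) & 0x7F) | 0x80); // low 7 bits plus continuation flag
            v >>= 7;
        }
        out.push(v as u8); // last byte: continuation flag clear
    }
    out
}

/// Decodes a well-formed byte stream produced by `vbyte_encode`.
pub fn vbyte_decode(bytes: &[u8]) -> Vec<u32> {
    let mut out = Vec::new();
    let mut value: u32 = 0;
    let mut shift: u32 = 0;
    for &b in bytes {
        value |= ((b & 0x7F) as u32) << shift;
        if (b & 0x80) == 0 {
            out.push(value); // flag clear: value complete
            value = 0;
            shift = 0;
        } else {
            shift += 7;
        }
    }
    out
}
```

Values below 128 cost a single byte regardless of their order in the input, which is why this family works well for unsorted data made of small integers.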
| 76 | + |
| 77 | +### Simple Encodings |
| 78 | + |
| 79 | +Efficient for small positive integers (typically < 2^16). |
| 80 | +**These codecs do not support arbitrary `u32` inputs.** |
| 81 | + |
| 82 | +- [`Simple16Codec`] - 16 packing modes in 32-bit words |
| 83 | +- [`Simple9Codec`] - 9 packing modes in 32-bit words |
| 84 | +- [`Simple8bCodec`] - Packs values into 64-bit words (4-bit selector plus 60 data bits) |
| 85 | +- [`Simple9RleCodec`], [`Simple8bRleCodec`] - With run-length encoding for repeated values |
| 86 | +- [`SimdGroupSimpleCodec`], [`SimdGroupSimpleRingBufCodec`] - SIMD-optimized |
| 87 | + |
| 88 | +### Utility |
| 89 | + |
| 90 | +- [`CopyCodec`] - No compression (baseline for benchmarking) |
| 91 | + |
| 92 | +## Thread Safety |
| 93 | + |
| 94 | +Codec instances **have internal state** that is **cleared after each operation**. |
| 95 | +They are **not thread-safe** during concurrent encode/decode operations. |
| 96 | + |
| 97 | +Use one of these strategies: |
| 98 | + |
| 99 | +- Create separate codec instances per thread |
| 100 | +- Synchronize access with mutexes |
| 101 | +- Use thread-local storage for codec instances |
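
The first two strategies can be sketched with the standard library alone. The example below uses a hypothetical `ToyCodec` as a stand-in for a real codec (it simply delta-encodes through an internal scratch buffer); the type and both helper functions are assumptions made for illustration, not part of this crate.

```rust
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for a codec that keeps internal scratch state.
pub struct ToyCodec {
    scratch: Vec<u32>,
}

impl ToyCodec {
    pub fn new() -> Self {
        ToyCodec { scratch: Vec::new() }
    }

    /// Delta-encodes `input` via the scratch buffer, then clears the buffer.
    pub fn encode(&mut self, input: &[u32]) -> Vec<u32> {
        let mut prev = 0u32;
        for &v in input {
            self.scratch.push(v.wrapping_sub(prev));
            prev = v;
        }
        let out = self.scratch.clone();
        self.scratch.clear();
        out
    }
}

/// Strategy 1: every thread builds and uses its own codec instance.
pub fn encode_per_thread(chunks: &[Vec<u32>]) -> Vec<Vec<u32>> {
    thread::scope(|s| {
        let handles: Vec<_> = chunks
            .iter()
            .map(|chunk| s.spawn(move || ToyCodec::new().encode(chunk)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

/// Strategy 2: one shared instance, with access serialized by a mutex.
pub fn encode_with_mutex(chunks: &[Vec<u32>]) -> Vec<Vec<u32>> {
    let codec = Mutex::new(ToyCodec::new());
    thread::scope(|s| {
        let handles: Vec<_> = chunks
            .iter()
            .map(|chunk| {
                let codec = &codec;
                s.spawn(move || {
                    let mut guard = codec.lock().unwrap();
                    guard.encode(chunk)
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Per-thread instances avoid lock contention at the cost of one allocation per thread; the mutex variant keeps a single instance but lets only one thread encode at a time.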
| 102 | + |
| 103 | +## Architecture |
| 104 | + |
| 105 | +This module uses [CXX](https://cxx.rs/) to safely bridge Rust and C++: |
| 106 | + |
| 107 | +- Each codec wraps a C++ `IntegerCODEC` instance via [`UniquePtr`] |
| 108 | +- The [`Codec32`] and [`Codec64`] traits provide the Rust API |
| 109 | +- Memory is automatically managed by CXX and Rust's ownership system |
| 110 | + |
| 111 | +See the [FastPFOR C++ library documentation](https://github.com/fast-pack/FastPFor) for underlying implementation details. |
| 112 | + |
| 113 | +[`Codec32`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/trait.Codec32.html |
| 114 | +[`Codec64`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/trait.Codec64.html |
| 115 | +[`Exception`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/type.Exception.html |
| 116 | +[`UniquePtr`]: https://docs.rs/cxx/latest/cxx/struct.UniquePtr.html |
| 117 | +[`BP32Codec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.BP32Codec.html |
| 118 | +[`CopyCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.CopyCodec.html |
| 119 | +[`FastBinaryPacking8Codec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.FastBinaryPacking8Codec.html |
| 120 | +[`FastBinaryPacking16Codec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.FastBinaryPacking16Codec.html |
| 121 | +[`FastBinaryPacking32Codec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.FastBinaryPacking32Codec.html |
| 122 | +[`FastPFor128Codec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.FastPFor128Codec.html |
| 123 | +[`FastPFor256Codec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.FastPFor256Codec.html |
| 124 | +[`MaskedVByteCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.MaskedVByteCodec.html |
| 125 | +[`NewPForCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.NewPForCodec.html |
| 126 | +[`OptPForCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.OptPForCodec.html |
| 127 | +[`PFor2008Codec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.PFor2008Codec.html |
| 128 | +[`PForCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.PForCodec.html |
| 129 | +[`SimdBinaryPackingCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.SimdBinaryPackingCodec.html |
| 130 | +[`SimdFastPFor128Codec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.SimdFastPFor128Codec.html |
| 131 | +[`SimdFastPFor256Codec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.SimdFastPFor256Codec.html |
| 132 | +[`SimdGroupSimpleCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.SimdGroupSimpleCodec.html |
| 133 | +[`SimdGroupSimpleRingBufCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.SimdGroupSimpleRingBufCodec.html |
| 134 | +[`SimdNewPForCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.SimdNewPForCodec.html |
| 135 | +[`SimdOptPForCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.SimdOptPForCodec.html |
| 136 | +[`SimdPForCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.SimdPForCodec.html |
| 137 | +[`SimdSimplePForCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.SimdSimplePForCodec.html |
| 138 | +[`Simple16Codec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.Simple16Codec.html |
| 139 | +[`Simple8bCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.Simple8bCodec.html |
| 140 | +[`Simple8bRleCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.Simple8bRleCodec.html |
| 141 | +[`Simple9Codec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.Simple9Codec.html |
| 142 | +[`Simple9RleCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.Simple9RleCodec.html |
| 143 | +[`SimplePForCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.SimplePForCodec.html |
| 144 | +[`StreamVByteCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.StreamVByteCodec.html |
| 145 | +[`VByteCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.VByteCodec.html |
| 146 | +[`VarIntCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.VarIntCodec.html |
| 147 | +[`VarIntGbCodec`]: https://docs.rs/fastpfor/latest/fastpfor/cpp/struct.VarIntGbCodec.html |