Commit cf00a48

claude authored and connortsui20 committed
optimize InnerProduct for SorfTransform and Dict-encoded constant queries
Pulls two reduction rules into `InnerProduct::execute` that together make cosine-similarity queries against TurboQuant-compressed columns land on direct codebook lookups instead of decoding the full column per row.

Case 1 (`try_execute_sorf_constant`): when one side of `InnerProduct` is `ExactScalarFn<SorfTransform>` and the other is a constant-backed tensor extension, rewrite to `InnerProduct(sorf_child, forward_rotate(zero_pad(const)))` and recursively re-execute so case 2 can fire on the rewritten tree. This works because SORF is orthogonal: `<T(R^{-1} x), c> = <x, R · zero_pad(c)>`, where `T` is the `padded_dim -> dim` truncation applied inside `SorfTransform::execute`. Gated on `element_ptype == F32` because SorfTransform's trailing `f32 -> element_ptype` cast breaks the rewrite's semantics for F16/F64.

Case 2 (`try_execute_dict_constant`): when one side's storage is `FSL(Dict(u8, f32))` with `values.len() <= 256` and the other side is a constant-backed tensor extension with F32 element ptype, compute each row's inner product via direct codebook lookup `acc += q[j] * values[codes[j] as usize]`. An explicit product table `P[j, k] = q[j] * values[k]` was prototyped and measured ~10% slower on the `similarity_search` bench because the 1 KiB `values` table stays in L1 across all rows while a 1 MiB product table does not.

End-to-end `similarity_search` bench on dim=768, 10k rows (median):

- TurboQuant: 8.84 ms -> 8.01 ms (-9%), now faster than uncompressed's 10.5 ms median.

Ten new unit tests cover both fast paths, the mirrored argument orders, empty `len == 0`, `padded_dim == dim` and `padded_dim > dim`, and the fallback cases for plain (non-Dict) FSL storage and dicts with more than 256 values.

Signed-off-by: Claude <noreply@anthropic.com>
Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
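
The two rewrites in the commit message can be illustrated with a small standalone sketch. It is not the code from this commit: every function name below is hypothetical, the 4x4 normalized Hadamard matrix is only a stand-in for the real SORF rotation (it happens to be orthogonal and its own inverse), and plain slices stand in for the `FSL(Dict(u8, f32))` storage. The first half checks the identity `<T(R^{-1} x), c> = <x, R · zero_pad(c)>` on a toy input; the second half compares the direct codebook lookup against decode-then-dot.

```rust
// Hypothetical stand-in for the SORF rotation: a normalized 4x4 Hadamard
// matrix. It is orthogonal and symmetric, so here R^{-1} == R.
const R: [[f32; 4]; 4] = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
];

/// Multiply the stand-in rotation matrix by a padded vector.
fn rotate(v: &[f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for (i, row) in R.iter().enumerate() {
        out[i] = row.iter().zip(v).map(|(a, b)| a * b).sum();
    }
    out
}

/// Extend a dim=3 constant to padded_dim=4 with a trailing zero.
fn zero_pad(c: &[f32; 3]) -> [f32; 4] {
    [c[0], c[1], c[2], 0.0]
}

/// Plain dot product over two equal-length slices.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Case 1, left-hand side: inverse-rotate x, truncate to dim, dot with c.
fn sorf_then_dot(x: &[f32; 4], c: &[f32; 3]) -> f32 {
    let unrotated = rotate(x); // R^{-1} == R for this stand-in matrix
    dot(&unrotated[..3], c)
}

/// Case 1, right-hand side: rotate the padded constant once, dot with x.
fn dot_with_rotated_constant(x: &[f32; 4], c: &[f32; 3]) -> f32 {
    dot(x, &rotate(&zero_pad(c)))
}

/// Case 2 reference path: decode a row through the codebook, then dot.
fn dot_via_decode(codes: &[u8], values: &[f32], q: &[f32]) -> f32 {
    let decoded: Vec<f32> = codes.iter().map(|&c| values[c as usize]).collect();
    dot(&decoded, q)
}

/// Case 2 fast path: accumulate q[j] * values[codes[j]] directly,
/// without materializing the decoded row.
fn dot_via_codebook(codes: &[u8], values: &[f32], q: &[f32]) -> f32 {
    assert!(values.len() <= 256, "u8 codes address at most 256 values");
    assert_eq!(codes.len(), q.len());
    let mut acc = 0.0f32;
    for (j, &code) in codes.iter().enumerate() {
        acc += q[j] * values[code as usize];
    }
    acc
}

/// Apply the fast path to every `dim`-wide row of a flat code buffer.
/// An empty buffer yields an empty result; `dim` must be non-zero.
fn inner_products(codes: &[u8], dim: usize, values: &[f32], q: &[f32]) -> Vec<f32> {
    codes
        .chunks_exact(dim)
        .map(|row| dot_via_codebook(row, values, q))
        .collect()
}
```

In this sketch the constant is rotated once per query rather than once per row, and the only table touched in the per-row loop is `values`, which at 256 `f32` entries is the 1 KiB table the message refers to.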
1 parent 56770bc commit cf00a48

1 file changed

Lines changed: 795 additions & 0 deletions
