|
| 1 | +#ifndef FARR_COMPAT_H |
| 2 | +#define FARR_COMPAT_H |
| 3 | + |
| 4 | +/** |
| 5 | + * farr_findVarInFrame — compatibility wrapper for Rf_findVarInFrame |
| 6 | + * =================================================================== |
| 7 | + * |
| 8 | + * BACKGROUND |
| 9 | + * ---------- |
| 10 | + * R 4.5.0 added R_getVar / R_getVarEx as the public C API replacements for |
| 11 | + * the non-API functions Rf_findVarInFrame and Rf_findVar. Their sibling |
| 12 | + * Rf_findVarInFrame3 was simultaneously flagged as non-API. R CMD check |
| 13 | + * already reports uses of Rf_findVarInFrame3; the remaining two may follow |
| 14 | + * in a future release. Writing R Extensions 4.5.0, §'Moving into C API |
| 15 | + * compliance' maps: |
| 16 | + * |
| 17 | + * Rf_findVarInFrame → R_getVar / R_getVarEx (added in R 4.5.0) |
| 18 | + * Rf_findVar → R_getVar / R_getVarEx (added in R 4.5.0) |
| 19 | + * |
| 20 | + * This header must be included after <Rcpp.h> (achieved via common.h, which |
| 21 | + * includes <Rcpp.h> first). Do NOT include <Rinternals.h> directly; CRAN |
| 22 | + * requires packages to access R internals only through the public headers. |
| 23 | + * |
| 24 | + * BEHAVIORAL DIFFERENCES: Rf_findVarInFrame vs R_getVarEx |
| 25 | + * -------------------------------------------------------- |
| 26 | + * |
| 27 | + * Property Rf_findVarInFrame(rho,sym) R_getVarEx(sym,rho,FALSE,dflt) |
| 28 | + * ------------------------- --------------------------- ----------------------------- |
| 29 | + * Argument order (env, symbol) (symbol, env, inherits, dflt) |
| 30 | + * Parent-frame search No No (inherits = FALSE) |
| 31 | + * Promise forcing Yes (doGet = TRUE) Yes (forces PROMSXP bindings) |
| 32 | + * Symbol not in frame Returns R_UnboundValue Returns dflt |
| 33 | + * Error on not-found No No (when dflt is provided) |
| 34 | + * R_MissingArg binding (*) Returns R_MissingArg Signals getMissingError |
| 35 | + * API status (R >= 4.5.0) Non-API (may be removed) Public/stable C API |
| 36 | + * Availability All R versions R >= 4.5.0 only |
| 37 | + * |
| 38 | + * KEY SUBTLETIES |
| 39 | + * -------------- |
| 40 | + * 1. Argument reversal. |
| 41 | + * Rf_findVarInFrame(rho, sym) ≡ R_getVarEx(sym, rho, FALSE, dflt) |
| 42 | + * The env and symbol positions swap, and two extra arguments are required. |
| 43 | + * |
| 44 | + * 2. Not-found sentinel. |
| 45 | + * Passing R_UnboundValue as the default to R_getVarEx makes it return |
| 46 | + * R_UnboundValue when the symbol is absent — matching Rf_findVarInFrame. |
| 47 | + * Using R_getVar (no default) would throw an error, like base::get(). |
| 48 | + * |
| 49 | + * 3. Promise forcing. |
| 50 | + * Both functions force PROMSXP bindings before returning. For the |
| 51 | + * '...' / R_DotsSymbol use case this is harmless: the DOTSXP itself is |
| 52 | + * not a PROMSXP; the individual dot elements inside are promises, but |
| 53 | + * they are only touched later via explicit CAR() calls by the caller. |
| 54 | + * |
| 55 | + * 4. (*) R_MissingArg — the critical difference for '...' lookup. |
| 56 | + * When a function with '...' is called with no extra arguments (e.g. |
| 57 | + * f() where f <- function(...) ...), R binds R_DotsSymbol to R_MissingArg |
| 58 | + * in the call frame. The symbol IS bound, but: |
| 59 | + * - Rf_findVarInFrame returns R_MissingArg directly, no error. |
| 60 | + * - R_getVarEx signals a getMissingError ("argument '...' is missing, |
| 61 | + * with no default"), matching base::get() semantics. |
| 62 | + * This getMissingError is a longjmp-based R condition, NOT a C++ exception, |
| 63 | + * so it CANNOT be caught with C++ try/catch — the longjmp bypasses all |
| 64 | + * catch blocks entirely. |
| 65 | + * |
| 66 | + * CHOSEN STRATEGY: We use R_existsVarInFrame (public, stable API) as a |
| 67 | + * fast pre-check. If the symbol is unbound, we return R_UnboundValue |
| 68 | + * immediately. When the symbol IS bound and IS R_DotsSymbol, we evaluate |
| 69 | + * ...length() in rho to detect empty dots (where R_getVarEx would longjmp). |
| 70 | + * ...length() is a SPECIALSXP, available since R 3.2.0, that returns 0 for |
| 71 | + * both R_NilValue and R_MissingArg bindings without forcing dot promises. |
| 72 | + * If ...length() == 0, we return R_MissingArg directly (matching the old |
| 73 | + * Rf_findVarInFrame behavior); otherwise, R_getVarEx is safe to call. |
| 74 | + * |
| 75 | + * For non-dots symbols, R_MissingArg bindings are theoretically possible |
| 76 | + * (e.g. a formal parameter with no default called without an argument) |
| 77 | + * but this wrapper is only used for R_DotsSymbol lookups in filearray. |
| 78 | + * Non-dots lookups fall through to R_getVarEx directly. |
| 79 | + * |
| 80 | + * Performance (all times per call on Apple M4, N = 1,000,000): |
| 81 | + * |
| 82 | + * Scenario Rf_findVarInFrame This wrapper |
| 83 | + * --------------------- ------------------- ---------------------- |
| 84 | + * Unbound symbol 5.8 ns 6.7 ns (1.2x) |
| 85 | + * Normal binding 6.2 ns 16.2 ns (2.6x) |
| 86 | + * Populated dots 6.0 ns 96.8 ns (16.2x) |
| 87 | + * Empty dots (MissingArg) 5.3 ns 79.2 ns (14.9x) |
| 88 | + * |
| 89 | + * vs R_tryCatchError: ~14,000 ns (1,700-2,700x) — unacceptable. |
| 90 | + * |
| 91 | + * The dots overhead (~90ns) is entirely from evaluating ...length() via |
| 92 | + * Rf_eval. This runs once per subset/assign call (not per element), so |
| 93 | + * the absolute cost is negligible in practice. |
| 94 | + * |
| 95 | + * 5. R_UnboundValue vs R_NilValue vs R_MissingArg. |
| 96 | + * Rf_findVarInFrame returns R_UnboundValue for unbound symbols, but every |
| 97 | + * caller in filearray immediately maps R_UnboundValue → R_NilValue (since |
| 98 | + * both mean "nothing to iterate"). This wrapper absorbs that mapping so |
| 99 | + * callers need no extra boilerplate — just swap Rf_findVarInFrame with |
| 100 | + * farr_findVarInFrame. |
| 101 | + * |
| 102 | + * USAGE |
| 103 | + * ----- |
| 104 | + * farr_findVarInFrame(rho, symbol) reproduces Rf_findVarInFrame behavior |
| 105 | + * with two intentional simplifications: |
| 106 | + * - Searches only frame rho, no parent-frame walk-up. |
| 107 | + * - Forces PROMSXP bindings. |
| 108 | + * - Returns R_NilValue when symbol is not bound in rho (NOT R_UnboundValue). |
| 109 | + * - Returns R_MissingArg when the binding is a missing-argument marker |
| 110 | + * (including empty '...' on R >= 4.5.0). |
| 111 | + * On R < 4.5.0 it calls Rf_findVarInFrame + maps R_UnboundValue → R_NilValue. |
| 112 | + * On R >= 4.5.0 it uses R_existsVarInFrame + ...length() + R_getVarEx. |
| 113 | + */ |
| 114 | + |
| 115 | +/* Portable wrapper. ------------------------------------------------------- */ |
| 116 | +static inline SEXP farr_findVarInFrame(SEXP rho, SEXP symbol) { |
| 117 | +#if R_VERSION >= R_Version(4, 5, 0) |
| 118 | + /* Fast-path: symbol not bound in this frame at all → R_NilValue. */ |
| 119 | + if (!R_existsVarInFrame(rho, symbol)) { |
| 120 | + return R_NilValue; |
| 121 | + } |
| 122 | + /* |
| 123 | + * Symbol IS bound. For R_DotsSymbol, the binding may be R_MissingArg |
| 124 | + * (empty dots). R_getVarEx would longjmp in that case, so we pre-check |
| 125 | + * with ...length() — a SPECIALSXP that safely returns 0 for both |
| 126 | + * R_NilValue and R_MissingArg dot bindings without forcing promises. |
| 127 | + */ |
| 128 | + if (symbol == R_DotsSymbol) { |
| 129 | + SEXP call = PROTECT(Rf_lang1(Rf_install("...length"))); |
| 130 | + int n = Rf_asInteger(Rf_eval(call, rho)); |
| 131 | + UNPROTECT(1); |
| 132 | + if (n == 0) { |
| 133 | + return R_MissingArg; |
| 134 | + } |
| 135 | + } |
| 136 | + /* Safe to call R_getVarEx: symbol exists and is not R_MissingArg. */ |
| 137 | + return R_getVarEx(symbol, rho, |
| 138 | + static_cast<Rboolean>(FALSE), R_NilValue); |
| 139 | +#else |
| 140 | + SEXP res = Rf_findVarInFrame(rho, symbol); |
| 141 | + return (res == R_UnboundValue) ? R_NilValue : res; |
| 142 | +#endif |
| 143 | +} |
| 144 | + |
| 145 | +#endif /* FARR_COMPAT_H */ |
0 commit comments