Antalya 26.3: address OAuth security audit#1777
Open
zvonand wants to merge 55 commits into
Open
Conversation
`test_arrowflight_interface` failed with `Token authentication is not configured` after the M-19 split between Basic and Bearer authentication paths. pyarrow's `authenticate_basic_token` performs a `Handshake` with `Basic <base64>`, captures the `authorization` response header, and reuses it verbatim on every subsequent RPC. The middleware unconditionally emitted `Bearer <token>` in `SendingHeaders`, so the client's next call arrived as `Bearer <base64(user:password)>` and was routed through the token-auth path -- which has no processors configured in this test, hence the failure. Echo back the same scheme the client sent: Basic stays on the Basic path, Bearer keeps going through token validation. CI report: https://altinity-build-artifacts.s3.amazonaws.com/json.html?PR=1777&sha=f1a7b011b93262d89f3e02dbff3876c965259f38&name_0=PR&name_1=Integration%20tests%20%28amd_binary%2C%203%2F5%29
test_sql_create_jwt_user_with_processor_pin / _with_claims / _no_pin_uses_auto_discovery asserted on `SHOW CREATE USER` output via the default `TabSeparated` format, which escapes single quotes -- the test then compared an unescaped SQL literal against an escaped one. Switch the three `SHOW CREATE USER` calls to `FORMAT TSVRaw` so the SQL literal makes it through verbatim. test_keycloak_auth/* all failed with `Token authentication is disabled` because the `openid` processor's `<configuration_endpoint>` is HTTP, and H-26 now refuses non-HTTPS URLs returned in the OIDC discovery document (defense in depth against poisoned discovery substituting cloud-metadata or in-cluster targets). The test Keycloak speaks plain HTTP, so wire `keycloak_discovery` in manual trust-chain mode -- exactly the escape hatch the H-26 error message documents -- and update the `test_openid_discovery` docstring accordingly. CI: - https://altinity-build-artifacts.s3.amazonaws.com/json.html?PR=1777&sha=bb1b56b0ad9ec85064ffd52990e8cf0fa403a858&name_0=PR&name_1=Integration%20tests%20%28amd_asan%2C%20db%20disk%2C%20old%20analyzer%2C%203%2F6%29 - https://altinity-build-artifacts.s3.amazonaws.com/json.html?PR=1777&sha=bb1b56b0ad9ec85064ffd52990e8cf0fa403a858&name_0=PR&name_1=Integration%20tests%20%28amd_asan%2C%20db%20disk%2C%20old%20analyzer%2C%202%2F6%29
H-26 unconditionally rejects non-HTTPS URLs returned by an OIDC discovery document. The check is defense in depth (the primary defense is <remote_url_allow_hosts>), but it has no escape hatch for operators who knowingly run their IdP over plain HTTP -- they were forced to abandon `configuration_endpoint` and hand-wire the trust chain. Add a per-processor `<allow_http_discovery_urls>` flag, false by default. When true, the HTTPS scheme check on userinfo_endpoint / introspection_endpoint / jwks_uri is suppressed (the <remote_url_allow_hosts> check still runs), and a warning is logged at processor construction so the gap is visible in operator logs. Revert the keycloak integration test back to using `configuration_endpoint` with the new opt-out, since the test Keycloak speaks HTTP.
L-02 added a cooldown branch in `JWKSClient::getJWKS`: when
`now - last_request_send < refresh_timeout` and no JWKS is cached, throw
"in cooldown after a recent failed fetch" so concurrent waiters do not
re-hammer a failing endpoint. The L-02 change updated `last_request_send`
unconditionally before issuing the HTTP call, so a thrown fetch still
arms the cooldown for subsequent callers.
But `last_request_send` is default-initialized to `time_point{}`, i.e.
the `steady_clock` epoch. On Linux the epoch is roughly system boot time
(and containerized hosts with isolated `CLOCK_MONOTONIC` can have a very
recent epoch), so `now - time_point{}` equals the steady-clock uptime.
When the host uptime is less than `jwks_cache_lifetime` (default 3600s)
the cooldown branch triggers on the very first call -- before any real
fetch has been attempted -- and serves "in cooldown after a recent
failed fetch" on freshly-started CI runners. That is what the
`test_keycloak_auth` flake on the targeted shard shows:
https://altinity-build-artifacts.s3.amazonaws.com/json.html?PR=1777&sha=ec127073b3a4fdb7d1bf3f43978ecbce7173a549&name_0=PR&name_1=Integration%20tests%20%28amd_asan%2C%20targeted%29
Change `last_request_send` to `std::optional<time_point>`; treat
`nullopt` as "no attempt has been made yet" and skip the cooldown check
in that case. Also drop the leftover `high_resolution_clock` in the
write-locked branch (libc++ aliases it to `steady_clock` so it built,
but mixing clock types reads as accidental).
`test_sql_create_jwt_user_with_processor_pin` and `test_sql_create_jwt_user_with_claims` asserted that the rejected response did NOT contain the username. ClickHouse's auth-failure body embeds the username (`Code: 516. DB::Exception: <user>: Authentication failed: password is incorrect, or there is no user with such name. (AUTHENTICATION_FAILED)`), so the assertion tripped on the exact case it was meant to verify. Switch to a positive check on the `AUTHENTICATION_FAILED` marker. CI: https://altinity-build-artifacts.s3.amazonaws.com/json.html?PR=1777&sha=495a8cf5f3052eb33a760169cb81bb2b5770d25c&name_0=PR&name_1=Integration%20tests%20%28amd_asan%2C%20targeted%29
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Address OAuth security audit
CI/CD Options
Exclude tests:
Regression jobs to run: