[v1.x] fix(auth): coerce empty-string optional URL fields to None in OAuthClientMetadata#2405
Conversation
There was a problem hiding this comment.
LGTM — clean, focused fix with comprehensive test coverage.
Extended reasoning...
Overview
This v1.x backport adds a field_validator to OAuthClientMetadata that coerces empty-string values to None for five optional URL fields (client_uri, logo_uri, tos_uri, policy_uri, jwks_uri). The diff touches exactly two files: the model definition and its test suite.
Security risks
None. The change only widens accepted input at the deserialization boundary — empty strings that previously raised ValidationError now parse as None. Valid URLs, explicit None, and omitted keys are entirely unaffected. No auth logic, token handling, or permission checks are modified.
Level of scrutiny
Low. This is a mechanical, non-breaking fix that follows an established pattern already present in the same file (normalize_token_type on OAuthToken). The validator is a simple equality check with a two-branch return. The fix is scoped entirely to DCR response parsing via OAuthClientMetadata and its subclass OAuthClientInformationFull.
Other factors
Test coverage is thorough: parametrized per-field coercion, all-fields-together, valid-URL passthrough, inheritance through OAuthClientInformationFull, and a negative case confirming non-empty invalid URLs still raise. The pre-existing gap in OAuthMetadata and ProtectedResourceMetadata flagged in the inline comment is explicitly not introduced by this PR and is worth a follow-up, but does not block merging this correct fix.
| @field_validator( | ||
| "client_uri", | ||
| "logo_uri", | ||
| "tos_uri", | ||
| "policy_uri", | ||
| "jwks_uri", | ||
| mode="before", | ||
| ) | ||
| @classmethod | ||
| def _empty_string_optional_url_to_none(cls, v: object) -> object: | ||
| # RFC 7591 §2 marks these URL fields OPTIONAL. Some authorization servers | ||
| # echo omitted metadata back as "" instead of dropping the keys, which | ||
| # AnyHttpUrl would otherwise reject — throwing away an otherwise valid | ||
| # registration response. Treat "" as absent. | ||
| if v == "": | ||
| return None | ||
| return v | ||
|
|
||
| def validate_scope(self, requested_scope: str | None) -> list[str] | None: | ||
| if requested_scope is None: | ||
| return None |
There was a problem hiding this comment.
🟣 This PR correctly adds empty-string coercion for OAuthClientMetadata, but OAuthMetadata and ProtectedResourceMetadata have the same defect for their optional AnyHttpUrl fields and are left unprotected. This is a pre-existing gap — the PR does not introduce or modify these classes — but the same Postel's law rationale applies equally to them.
Extended reasoning...
What the bug is
The PR adds a field_validator to OAuthClientMetadata that coerces empty strings to None for five optional AnyHttpUrl fields. The same class of defect exists in two sibling models that were not touched by this PR:
OAuthMetadata(RFC 8414 Authorization Server Metadata):registration_endpoint,service_documentation,op_policy_uri,op_tos_uri,revocation_endpoint,introspection_endpoint— all declaredAnyHttpUrl | Noneat lines 135, 142–149, with no empty-string validator.ProtectedResourceMetadata(RFC 9728 Protected Resource Metadata):jwks_uri,resource_documentation,resource_policy_uri,resource_tos_uri— all declaredAnyHttpUrl | Noneat lines 164, 169–171, with no empty-string validator.
How it manifests
If a non-compliant authorization server or resource server echoes any of these optional URL fields as "" (rather than omitting the key), Pydantic's AnyHttpUrl validator rejects the empty string and raises ValidationError. The entire discovery response is discarded — the same failure mode this PR was authored to fix.
Step-by-step proof
- A non-compliant AS returns its metadata with
"op_tos_uri": "". - The client calls
OAuthMetadata.model_validate(data). - Pydantic attempts to coerce
""throughAnyHttpUrl— nofield_validatorintercepts it. AnyHttpUrlraisesValidationErrorbecause an empty string is not a valid URL.- The caller discards the response and OAuth discovery fails entirely.
The same sequence applies to any of the unprotected fields in ProtectedResourceMetadata.
Why existing code does not prevent it
The new validator is defined only on OAuthClientMetadata and is not inherited by OAuthMetadata or ProtectedResourceMetadata (they are not subclasses). Neither of those models defines any field_validator for their URL fields.
Impact
Any MCP client connecting to a non-compliant server that echoes empty-string URL fields in its authorization server metadata or protected resource metadata response will fail OAuth discovery entirely. The behavior is identical to what the PR describes for DCR responses — a valid discovery document is thrown away because of a cosmetic serialization quirk in a single optional field.
How to fix
Add the same _empty_string_optional_url_to_none field_validator (or a shared mixin) to OAuthMetadata and ProtectedResourceMetadata, covering all their optional AnyHttpUrl fields.
Pre-existing status
This is a pre-existing issue that predates this PR. The PR neither introduces these models nor adds new call sites for them — it is scoped entirely to DCR response parsing via OAuthClientMetadata. The gap in coverage is worth addressing as a follow-up given the PR's stated motivation.
There was a problem hiding this comment.
LGTM — clean backport of an already-merged fix with thorough test coverage.
Extended reasoning...
Overview
Adds a to coercing empty-string values to for five optional fields (, , , , ). Two files changed: the model definition and its test file.
Security risks
None. The change widens accepted input at deserialization — empty strings that previously caused now parse as . Valid URLs and values are unaffected. No auth bypass or data exposure is possible from this coercion.
Level of scrutiny
Low. This is a v1.x backport of #2404, which already merged to main. The validator pattern (, identity passthrough for non-empty) is identical to the existing validator on in the same file. The negative test () confirms non-empty garbage URLs still raise, so the guard is not over-permissive.
Other factors
Test coverage is comprehensive: parametrized per-field, all-fields-together, valid-URL passthrough, inheritance via , and a negative case. My prior inline comment noted a pre-existing gap in and — that remains worth a follow-up issue but is out of scope for this targeted fix.
|
Pending #2435 as this fails CI until that lands. |
There was a problem hiding this comment.
LGTM — small, well-tested input-widening validator on unused optional metadata fields; my earlier note about sibling models is a pre-existing follow-up, not a blocker.
Extended reasoning...
Overview
This PR adds a single field_validator(mode="before") to OAuthClientMetadata that coerces "" → None for five optional AnyHttpUrl fields (client_uri, logo_uri, tos_uri, policy_uri, jwks_uri), plus five focused tests in tests/shared/test_auth.py. It is a v1.x backport of #2404. The CI blocker (#2435, pydantic 2.13 compat) has landed and been merged into this branch.
Security risks
None. The five affected fields are explicitly annotated in the source as "currently unused, but we support & store them for potential future use" — they are display/informational metadata, not used in any redirect, token, or authorization decision. The validator only widens accepted input by mapping the empty string to None; valid URLs, None, and omitted keys behave exactly as before, and non-empty invalid strings still raise (covered by a negative test). There is no auth-bypass, injection, or data-exposure surface here.
Level of scrutiny
Low. This is an 18-line defensive deserialization tweak following the exact pattern of the existing normalize_token_type validator on OAuthToken in the same file. The logic is a single equality check with no branching beyond if v == "". Inheritance through OAuthClientInformationFull (the actual DCR response model) is explicitly tested.
Other factors
My earlier inline comment flagged that OAuthMetadata and ProtectedResourceMetadata have the same gap — that is a pre-existing condition, out of scope for this targeted fix, and suitable for a follow-up. Test coverage is thorough (parametrized per-field, all-fields-together, passthrough, subclass inheritance, and negative case). No outstanding reviewer requests remain.
Coerce
""toNonefor the five OPTIONAL URL fields onOAuthClientMetadataso that DCR responses from servers that echo omitted metadata as empty strings parse successfully.v1.x backport of the same change targeting
main(#2404).Motivation and Context
RFC 7591 §2 marks
client_uri,logo_uri,tos_uri,policy_uri, andjwks_urias OPTIONAL. Some authorization servers in the wild (observed: Udemy, deepsense.ai, Apify, windsor.ai, firecrawl, and others) echo the client's omitted metadata back as""instead of dropping the keys:{"client_id": "abc123", "client_uri": "", "logo_uri": "", "tos_uri": "", "policy_uri": "", ...}AnyHttpUrlrejects"", sohandle_registration_responseraisesValidationErrorand callers discard an otherwise valid registration — the server returned 201 with a realclient_id.The servers are technically non-compliant (an empty string is not a valid URL), but the only reasonable interpretation of
""for an OPTIONAL field is "absent". This applies Postel's law at the deserialization boundary, matching the existingnormalize_token_typevalidator onOAuthTokenin the same file. BecauseOAuthClientInformationFullinherits fromOAuthClientMetadata, the coercion applies to parsed DCR responses as well.How Has This Been Tested?
New
TestOAuthClientMetadataEmptyUrlCoercioncovers per-field coercion (parametrized over all five fields), all-fields-together, valid-URL passthrough, inheritance throughOAuthClientInformationFull(the DCR-response model), and a negative case confirming non-empty invalid URLs still raise.Breaking Changes
None. This only widens accepted input — previously-rejected
""values now succeed asNone. Valid URLs,None, and omitted keys are unchanged.Types of changes
Checklist
Additional context
RFC 7591 §2: https://datatracker.ietf.org/doc/html/rfc7591#section-2