Skip to content

[v1.x] fix(auth): coerce empty-string optional URL fields to None in OAuthClientMetadata#2405

Merged
felixweinberger merged 3 commits intov1.xfrom
fweinberger/dcr-empty-url-coerce-v1x
Apr 13, 2026
Merged

[v1.x] fix(auth): coerce empty-string optional URL fields to None in OAuthClientMetadata#2405
felixweinberger merged 3 commits intov1.xfrom
fweinberger/dcr-empty-url-coerce-v1x

Conversation

@felixweinberger
Copy link
Copy Markdown
Contributor

@felixweinberger felixweinberger commented Apr 8, 2026

Coerce "" to None for the five OPTIONAL URL fields on OAuthClientMetadata so that DCR responses from servers that echo omitted metadata as empty strings parse successfully.

v1.x backport of the same change targeting main (#2404).

Motivation and Context

RFC 7591 §2 marks client_uri, logo_uri, tos_uri, policy_uri, and jwks_uri as OPTIONAL. Some authorization servers in the wild (observed: Udemy, deepsense.ai, Apify, windsor.ai, firecrawl, and others) echo the client's omitted metadata back as "" instead of dropping the keys:

{"client_id": "abc123", "client_uri": "", "logo_uri": "", "tos_uri": "", "policy_uri": "", ...}

AnyHttpUrl rejects "", so handle_registration_response raises ValidationError and callers discard an otherwise valid registration — the server returned 201 with a real client_id.

The servers are technically non-compliant (an empty string is not a valid URL), but the only reasonable interpretation of "" for an OPTIONAL field is "absent". This applies Postel's law at the deserialization boundary, matching the existing normalize_token_type validator on OAuthToken in the same file. Because OAuthClientInformationFull inherits from OAuthClientMetadata, the coercion applies to parsed DCR responses as well.

How Has This Been Tested?

New TestOAuthClientMetadataEmptyUrlCoercion covers per-field coercion (parametrized over all five fields), all-fields-together, valid-URL passthrough, inheritance through OAuthClientInformationFull (the DCR-response model), and a negative case confirming non-empty invalid URLs still raise.

Breaking Changes

None. This only widens accepted input — previously-rejected "" values now succeed as None. Valid URLs, None, and omitted keys are unchanged.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

RFC 7591 §2: https://datatracker.ietf.org/doc/html/rfc7591#section-2

Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — clean, focused fix with comprehensive test coverage.

Extended reasoning...

Overview

This v1.x backport adds a field_validator to OAuthClientMetadata that coerces empty-string values to None for five optional URL fields (client_uri, logo_uri, tos_uri, policy_uri, jwks_uri). The diff touches exactly two files: the model definition and its test suite.

Security risks

None. The change only widens accepted input at the deserialization boundary — empty strings that previously raised ValidationError now parse as None. Valid URLs, explicit None, and omitted keys are entirely unaffected. No auth logic, token handling, or permission checks are modified.

Level of scrutiny

Low. This is a mechanical, non-breaking fix that follows an established pattern already present in the same file (normalize_token_type on OAuthToken). The validator is a simple equality check with a two-branch return. The fix is scoped entirely to DCR response parsing via OAuthClientMetadata and its subclass OAuthClientInformationFull.

Other factors

Test coverage is thorough: parametrized per-field coercion, all-fields-together, valid-URL passthrough, inheritance through OAuthClientInformationFull, and a negative case confirming non-empty invalid URLs still raise. The pre-existing gap in OAuthMetadata and ProtectedResourceMetadata flagged in the inline comment is explicitly not introduced by this PR and is worth a follow-up, but does not block merging this correct fix.

Comment on lines +74 to 94
@field_validator(
"client_uri",
"logo_uri",
"tos_uri",
"policy_uri",
"jwks_uri",
mode="before",
)
@classmethod
def _empty_string_optional_url_to_none(cls, v: object) -> object:
# RFC 7591 §2 marks these URL fields OPTIONAL. Some authorization servers
# echo omitted metadata back as "" instead of dropping the keys, which
# AnyHttpUrl would otherwise reject — throwing away an otherwise valid
# registration response. Treat "" as absent.
if v == "":
return None
return v

def validate_scope(self, requested_scope: str | None) -> list[str] | None:
if requested_scope is None:
return None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟣 This PR correctly adds empty-string coercion for OAuthClientMetadata, but OAuthMetadata and ProtectedResourceMetadata have the same defect for their optional AnyHttpUrl fields and are left unprotected. This is a pre-existing gap — the PR does not introduce or modify these classes — but the same Postel's law rationale applies equally to them.

Extended reasoning...

What the bug is

The PR adds a field_validator to OAuthClientMetadata that coerces empty strings to None for five optional AnyHttpUrl fields. The same class of defect exists in two sibling models that were not touched by this PR:

  • OAuthMetadata (RFC 8414 Authorization Server Metadata): registration_endpoint, service_documentation, op_policy_uri, op_tos_uri, revocation_endpoint, introspection_endpoint — all declared AnyHttpUrl | None at lines 135, 142–149, with no empty-string validator.
  • ProtectedResourceMetadata (RFC 9728 Protected Resource Metadata): jwks_uri, resource_documentation, resource_policy_uri, resource_tos_uri — all declared AnyHttpUrl | None at lines 164, 169–171, with no empty-string validator.

How it manifests

If a non-compliant authorization server or resource server echoes any of these optional URL fields as "" (rather than omitting the key), Pydantic's AnyHttpUrl validator rejects the empty string and raises ValidationError. The entire discovery response is discarded — the same failure mode this PR was authored to fix.

Step-by-step proof

  1. A non-compliant AS returns its metadata with "op_tos_uri": "".
  2. The client calls OAuthMetadata.model_validate(data).
  3. Pydantic attempts to coerce "" through AnyHttpUrl — no field_validator intercepts it.
  4. AnyHttpUrl raises ValidationError because an empty string is not a valid URL.
  5. The caller discards the response and OAuth discovery fails entirely.

The same sequence applies to any of the unprotected fields in ProtectedResourceMetadata.

Why existing code does not prevent it

The new validator is defined only on OAuthClientMetadata and is not inherited by OAuthMetadata or ProtectedResourceMetadata (they are not subclasses). Neither of those models defines any field_validator for their URL fields.

Impact

Any MCP client connecting to a non-compliant server that echoes empty-string URL fields in its authorization server metadata or protected resource metadata response will fail OAuth discovery entirely. The behavior is identical to what the PR describes for DCR responses — a valid discovery document is thrown away because of a cosmetic serialization quirk in a single optional field.

How to fix

Add the same _empty_string_optional_url_to_none field_validator (or a shared mixin) to OAuthMetadata and ProtectedResourceMetadata, covering all their optional AnyHttpUrl fields.

Pre-existing status

This is a pre-existing issue that predates this PR. The PR neither introduces these models nor adds new call sites for them — it is scoped entirely to DCR response parsing via OAuthClientMetadata. The gap in coverage is worth addressing as a follow-up given the PR's stated motivation.

Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — clean backport of an already-merged fix with thorough test coverage.

Extended reasoning...

Overview

Adds a to coercing empty-string values to for five optional fields (, , , , ). Two files changed: the model definition and its test file.

Security risks

None. The change widens accepted input at deserialization — empty strings that previously caused now parse as . Valid URLs and values are unaffected. No auth bypass or data exposure is possible from this coercion.

Level of scrutiny

Low. This is a v1.x backport of #2404, which already merged to main. The validator pattern (, identity passthrough for non-empty) is identical to the existing validator on in the same file. The negative test () confirms non-empty garbage URLs still raise, so the guard is not over-permissive.

Other factors

Test coverage is comprehensive: parametrized per-field, all-fields-together, valid-URL passthrough, inheritance via , and a negative case. My prior inline comment noted a pre-existing gap in and — that remains worth a follow-up issue but is out of scope for this targeted fix.

@felixweinberger
Copy link
Copy Markdown
Contributor Author

Pending #2435 as this fails CI until that lands.

Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — small, well-tested input-widening validator on unused optional metadata fields; my earlier note about sibling models is a pre-existing follow-up, not a blocker.

Extended reasoning...

Overview

This PR adds a single field_validator(mode="before") to OAuthClientMetadata that coerces ""None for five optional AnyHttpUrl fields (client_uri, logo_uri, tos_uri, policy_uri, jwks_uri), plus five focused tests in tests/shared/test_auth.py. It is a v1.x backport of #2404. The CI blocker (#2435, pydantic 2.13 compat) has landed and been merged into this branch.

Security risks

None. The five affected fields are explicitly annotated in the source as "currently unused, but we support & store them for potential future use" — they are display/informational metadata, not used in any redirect, token, or authorization decision. The validator only widens accepted input by mapping the empty string to None; valid URLs, None, and omitted keys behave exactly as before, and non-empty invalid strings still raise (covered by a negative test). There is no auth-bypass, injection, or data-exposure surface here.

Level of scrutiny

Low. This is an 18-line defensive deserialization tweak following the exact pattern of the existing normalize_token_type validator on OAuthToken in the same file. The logic is a single equality check with no branching beyond if v == "". Inheritance through OAuthClientInformationFull (the actual DCR response model) is explicitly tested.

Other factors

My earlier inline comment flagged that OAuthMetadata and ProtectedResourceMetadata have the same gap — that is a pre-existing condition, out of scope for this targeted fix, and suitable for a follow-up. Test coverage is thorough (parametrized per-field, all-fields-together, passthrough, subclass inheritance, and negative case). No outstanding reviewer requests remain.

@felixweinberger felixweinberger merged commit 73d458b into v1.x Apr 13, 2026
22 checks passed
@felixweinberger felixweinberger deleted the fweinberger/dcr-empty-url-coerce-v1x branch April 13, 2026 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants