fix: use /responses for GitHub Copilot codex models #2320

Open

Keryer wants to merge 2 commits into logancyang:master from Keryer:master

Conversation


@Keryer Keryer commented Mar 21, 2026

Summary

Fix GitHub Copilot-authenticated Codex models (for example gpt-5.3-codex) failing with HTTP 400 because they were being sent to /chat/completions.

GitHub Copilot Codex models should use the /responses endpoint instead. This PR adds provider-side routing so Copilot Codex models
use the Responses API, while other Copilot chat-compatible models continue to use Chat Completions.

Root Cause

The GitHub Copilot chat path was hardwired to ChatOpenAICompletions, which always targets /chat/completions.

That works for normal Copilot chat models, but fails for Codex-family models that only support /responses.

Changes

  • add a dedicated GitHubCopilotResponsesModel for Copilot models that must call /responses
  • extract shared GitHub Copilot authenticated fetch logic so both API paths reuse the same token/header handling
  • add a reusable routing helper to detect when a GitHub Copilot model should use the Responses API
  • wire that routing into both runtime model creation and model ping/verification
  • keep existing GitHubCopilotChatModel behavior for models that still support /chat/completions
  • add unit coverage for the routing logic
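The routing helper described above might look something like this minimal sketch. The helper name matches the PR; the `CustomModel` shape and the codex-detection heuristic here are assumptions for illustration, not the PR's actual implementation:

```typescript
// Assumption: the project exposes a model descriptor with at least a
// name and a provider id. The real type in the codebase may differ.
interface CustomModel {
  name: string;
  provider: string;
}

/**
 * Returns true when a GitHub Copilot-authenticated model must be routed
 * to the /responses endpoint instead of /chat/completions.
 * Codex-family models (e.g. "gpt-5.3-codex") only support /responses.
 */
function shouldUseGitHubCopilotResponsesApi(model: CustomModel): boolean {
  // Only applies to models authenticated through GitHub Copilot.
  if (model.provider !== "github-copilot") return false;
  // Hypothetical heuristic: treat any "codex" model as Responses-only.
  return model.name.toLowerCase().includes("codex");
}
```

A keyword check like this is one plausible way to keep the routing general (any Codex-family model) rather than hardcoding a single model name at the call site, as the PR notes.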

Behavior After This PR

  • gpt-5.3-codex with GitHub Copilot auth no longer hits /chat/completions
  • Copilot Codex models use /responses
  • chat runtime and connection testing use consistent endpoint selection

Testing

  • added unit tests for GitHub Copilot Responses API routing logic
  • manual validation target:
    • authenticate with GitHub Copilot
    • select gpt-5.3-codex
    • send a simple prompt
    • confirm the request no longer fails with 400 from /chat/completions

Notes

This change is intentionally provider-scoped and generalizable:
it avoids hardcoding a single model at the call site and instead routes GitHub Copilot Codex-family models through the correct API.

Route Copilot codex models away from /chat/completions to avoid 400
errors, and keep ping/model verification aligned with runtime routing.
@logancyang logancyang requested a review from Emt-lin April 6, 2026 12:17
Collaborator

Emt-lin commented Apr 7, 2026

@Keryer

Nice work! CI passes and the routing approach is sound. Two suggestions before merging:

1. Deduplicate shouldUseGitHubCopilotResponsesApi calls in chatModelManager.ts

The helper is called 3 times per flow (createModelInstance and pingModel). This creates drift risk — if one call site is updated, the others can
easily be missed.

const useCopilotResponses = shouldUseGitHubCopilotResponsesApi(model);

if (useCopilotResponses) {
  constructorConfig.useResponsesApi = true;
  logInfo(`Enabling Responses API for GitHub Copilot model: ${model.name}`);
}

// later:
if (useCopilotResponses) {
  return new GitHubCopilotResponsesModel(constructorConfig);
}

Same pattern for pingModel().

2. Preserve key // Reason: comments in GitHubCopilotChatModel.ts

The PR removes several comments that explain non-obvious SDK/protocol constraints — these aren't restating the code, they document why workarounds
exist. Per our conventions ("Comment non-obvious code"), I'd suggest keeping at least:

  • 401 vs 403 retry: only retry on 401 (expired token), not 403 (no subscription — permanent)
  • typeof Request guard: some Obsidian mobile runtimes lack the Request global
  • streamUsage: false: Copilot API may not support stream_options.include_usage
  • In-place delta.content mutation: spread would lose non-enumerable properties like tool_calls

The new JSDoc @param/@returns tags are great — they complement rather than replace these rationale comments.
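As an illustration only, the kind of `// Reason:` comment being discussed might look like this hypothetical fragment; the actual retry logic in GitHubCopilotChatModel.ts is structured differently:

```typescript
// Hypothetical fragment illustrating the "// Reason:" convention for
// non-obvious constraints; not the real GitHubCopilotChatModel.ts code.
function shouldRetryCopilotAuth(status: number): boolean {
  // Reason: retry only on 401 (expired token, a token refresh can fix it),
  // not on 403 (no Copilot subscription, which is a permanent failure).
  return status === 401;
}
```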

…store rationale comments

  Cache the result in useCopilotResponses in both createModelInstance() and
  tryPing() instead of calling the helper multiple times per flow.

  Restore Reason comments removed in c96a9d8 that document non-obvious
  constraints: why ChatOpenAICompletions is used over ChatOpenAI, the
  delta.role fallback to avoid broken tool_call_chunks, defaultRole?: any
  type choice, streamUsage/baseURL SDK behavior, and the typeof Request
  guard for Obsidian mobile runtimes.
Author

Keryer commented Apr 7, 2026

@Emt-lin
Thanks for the thoughtful suggestions. I addressed both points in 39dba5e.

I deduplicated the shouldUseGitHubCopilotResponsesApi(model) checks in chatModelManager.ts, and I also restored the key rationale comments in GitHubCopilotChatModel.ts for the non-obvious SDK/protocol constraints you called out.

Collaborator

Emt-lin commented Apr 7, 2026

@Keryer Thanks for your PR.

Ready to merge cc @logancyang
