Bug Description
CitationRegistry stores its state in a class-level dictionary (_instances) shared across all requests. init_citation_registry() calls CitationRegistry.reset(), which wipes this global dict for every concurrent session. When two requests run the init_citation_registry → assign_citation_ids_stateful pipeline at the same time, one request's reset destroys the other's in-progress state, producing corrupted or swapped citation IDs in responses.
Location
servers/custom/src/custom.py, ~line 405:
class CitationRegistry:
    _instances: Dict[int, Dict[str, Any]] = {}  # class-level, shared across all requests

    @classmethod
    def reset(cls):
        cls._instances = {}  # wipes state for ALL concurrent sessions
~line 435:
@app.tool(output="q_ls->q_ls")
def init_citation_registry(q_ls: List[str]) -> Dict[str, Any]:
    CitationRegistry.reset()  # global reset triggered per request
    return {"q_ls": q_ls}
Reproduction
- Send two concurrent requests that both invoke init_citation_registry followed by assign_citation_ids_stateful.
- Request A calls reset() while Request B is mid-way through assign_citation_ids_stateful.
- Request B's accumulated citations are wiped; it returns citation IDs starting from 1 for documents it had already assigned higher IDs.
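The interleaving above can be sketched deterministically without threads. This is a minimal illustration, not the code from custom.py: the CitationRegistry below is a simplified stand-in, and the assign() helper and the document names are hypothetical.

```python
from typing import Any, Dict


class CitationRegistry:
    """Simplified stand-in for the class in custom.py (assumed shape)."""

    _instances: Dict[int, Dict[str, Any]] = {}  # class-level, shared by everyone

    @classmethod
    def reset(cls) -> None:
        cls._instances = {}

    @classmethod
    def assign(cls, q_idx: int, doc: str) -> int:
        # Hypothetical helper: return the existing ID for doc, or the next free one.
        ids = cls._instances.setdefault(q_idx, {"ids": {}})["ids"]
        if doc not in ids:
            ids[doc] = len(ids) + 1
        return ids[doc]


# Request B starts and assigns IDs to two documents.
CitationRegistry.reset()
b_first = [CitationRegistry.assign(0, "doc-a"), CitationRegistry.assign(0, "doc-b")]

# Request A arrives and resets the shared state while B is still running.
CitationRegistry.reset()

# Request B continues: doc-b previously had ID 2, but is now numbered from 1 again.
b_late = CitationRegistry.assign(0, "doc-b")

print(b_first, b_late)
```

In a real deployment the reset would land at an arbitrary point in Request B's execution, so the corruption would be intermittent rather than reproducible on every run.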
Impact
Users receive incorrect citation numbers in answers, so documents are cited under the wrong IDs. In multi-tenant deployments this is also a cross-session isolation failure: one user's request can reset another user's citation state.
Suggested Fix
Scope registry state per request using a unique session/request ID rather than global class state:
def init_citation_registry(q_ls: List[str], request_id: str) -> Dict[str, Any]:
    CitationRegistry._instances[request_id] = {}
    return {"q_ls": q_ls, "request_id": request_id}
Or pass a fresh CitationRegistry instance through the pipeline context instead of using class-level storage.
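A rough sketch of the second option, assuming the registry only needs to map documents to sequential IDs. The instance-based CitationRegistry, its assign() method, and the document names are illustrative and not the existing API in custom.py.

```python
from typing import Dict


class CitationRegistry:
    """Per-request registry: all state lives on the instance, none on the class."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}

    def assign(self, doc: str) -> int:
        # Return the existing ID for doc, or allocate the next sequential one.
        if doc not in self._ids:
            self._ids[doc] = len(self._ids) + 1
        return self._ids[doc]


# Request B creates its own registry and assigns two IDs.
registry_b = CitationRegistry()
b_first = [registry_b.assign("doc-a"), registry_b.assign("doc-b")]

# Request A starts mid-way; "resetting" now just means creating a new instance.
registry_a = CitationRegistry()
a_first = registry_a.assign("doc-x")

# Request B continues unaffected and keeps counting from where it left off.
b_late = registry_b.assign("doc-c")

print(b_first, a_first, b_late)
```

With this shape, init_citation_registry would construct the instance and hand it to later pipeline steps. How that handoff works depends on what the pipeline context allows, which the snippet above does not try to model.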
Found via automated codebase analysis. Happy to submit a PR if this is confirmed.