fix: resolve 8 bugs in core model and inference code (#244)
Open

sjhddh wants to merge 1 commit into shiyu-coder:master from
- Fix `requires_grad` typo in `FixedEmbedding` (was a no-op)
- Fix `backward()` gradient count mismatch in `DifferentiableEntropyFunction`
- Fix `int()` unpacking crash in codebook entry methods
- Fix undefined `avg_prob` when `soft_entropy` is disabled
- Fix `is_causal` misuse in cross-attention (semantic bug + crash risk)
- Fix `sample_from_logits` crash on `None` `top_k`/`top_p`
- Prevent in-place mutation of logits in `top_k_top_p_filtering`
- Add `@torch.no_grad()` to inference methods
- Replace wildcard import with explicit imports
- Extract shared DataFrame validation helper

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
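The first fix in the commit message can be illustrated in isolation. This is a minimal standalone sketch of the typo's effect, not the repository's `FixedEmbedding` code:

```python
import torch

w = torch.randn(4, 8, requires_grad=True)

# Bug: `require_grad` (missing "s") is not an attribute PyTorch reads;
# the assignment just attaches a new Python attribute to the tensor,
# so autograd tracking silently stays enabled.
w.require_grad = False
print(w.requires_grad)  # still True: the assignment was a no-op

# Fix: use the real attribute name (or the in-place helper requires_grad_()).
w.requires_grad = False
print(w.requires_grad)  # False: the tensor is now truly frozen
```

Because the typo creates a valid-looking attribute instead of raising, the bug is invisible at runtime; only checking `requires_grad` (as the test plan does) catches it.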
Summary
Systematic audit of `model/module.py` and `model/kronos.py` uncovered 8 bugs, several with crash potential:

- `requires_grad` typo in `FixedEmbedding`: `w.require_grad = False` is a silent no-op (should be `requires_grad`)
- `DifferentiableEntropyFunction.backward` returns 5 values for 4 `forward` args
- `int()` unpacking crash in `get_codebook_entry`/`get_group_codebook_entry`: `h, w = int(...)` can't unpack a scalar
- `NameError` when `soft_entropy=False`: `avg_prob` is referenced but never assigned in that branch
- `is_causal=True` in cross-attention: semantically wrong (cross-attention should attend to all keys) and crashes when `attn_mask` is also provided
- `None` comparison crash in `sample_from_logits`: `top_k > 0` raises `TypeError` when `top_k is None`
- In-place mutation in `top_k_top_p_filtering`: the caller's tensor is silently modified
- Missing `@torch.no_grad()` on `KronosPredictor.generate`/`predict`/`predict_batch`

Also:
- Replace `from model.module import *` with explicit imports
- Refactor `indexes_to_codes`/`group_indexes_to_codes`
- Extract a `_validate_and_prepare` helper to deduplicate DataFrame validation

Test plan
- Regression tests pass (`pytest tests/test_kronos_regression.py`)
- `FixedEmbedding` weights are now truly frozen (verify `requires_grad == False`)
- `sample_from_logits(logits, top_k=None, top_p=None)` no longer crashes
- `top_k_top_p_filtering` does not mutate the input tensor

🤖 Generated with Claude Code