
fix: correct encoder/decoder block count to match configured layers#248

Open
mvanhorn wants to merge 1 commit into shiyu-coder:master from mvanhorn:fix/219-encoder-decoder-block-count

Conversation

@mvanhorn

Summary

range(self.enc_layers - 1) and range(self.dec_layers - 1) in model/kronos.py create one fewer TransformerBlock than the n_enc_layers / n_dec_layers parameters specify. With the default of n_enc_layers=4, only 3 encoder blocks are created. The decoder has the same off-by-one.

Changes

  • model/kronos.py:62: range(self.enc_layers - 1) -> range(self.enc_layers)
  • model/kronos.py:67: range(self.dec_layers - 1) -> range(self.dec_layers)

Why

The parameter name (n_enc_layers) and docstring ("Number of encoder layers") both indicate the user expects N blocks when passing N. The forward pass (for layer in self.encoder) iterates all blocks with no separate final layer, so the -1 has no compensating logic elsewhere.

The embed and head linear projections are separate from the transformer blocks and shouldn't count toward the configured layer count.
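The off-by-one is easy to demonstrate with a minimal stand-in for the block construction. This is a sketch, not the real model/kronos.py code: `build_blocks` and the string placeholders are hypothetical, but the `range` counting logic is exactly what the patch changes.

```python
def build_blocks(n_layers: int, buggy: bool = False) -> list:
    """Stand-in for the nn.ModuleList construction in model/kronos.py.

    Each list element represents one TransformerBlock; the real code
    builds nn.Module instances, but the counting logic is identical.
    """
    rng = range(n_layers - 1) if buggy else range(n_layers)
    return [f"TransformerBlock_{i}" for i in rng]

# Before the fix: range(self.enc_layers - 1) drops one block.
print(len(build_blocks(4, buggy=True)))   # 3

# After the fix: range(self.enc_layers) matches n_enc_layers exactly.
print(len(build_blocks(4)))               # 4

# The forward pass iterates every block with no separate final layer,
# so nothing elsewhere compensates for the missing block:
x = "input"
for layer in build_blocks(4):
    x = f"{layer}({x})"
```

Since the forward pass is a plain loop over the container, fixing the construction count is sufficient; no forward-pass change is needed.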

Fixes #219

This contribution was developed with AI assistance (Claude Code).

range(self.enc_layers - 1) created one fewer TransformerBlock than
the n_enc_layers parameter specified. Same issue in decoder. When
n_enc_layers=4, only 3 blocks were created. Now creates the exact
number of blocks requested.

Fixes shiyu-coder#219


Development

Successfully merging this pull request may close these issues.

the number of encoder and decoder blocks is wrong in the code