pytorch-optimizer v3.10.0

Released by @kozistr on 01 Mar 06:19

Change Log

Feature

  • Add support for foreach. (#287, #476, #477)
    • More than 10 optimizers (e.g. AdaFactor, StableAdamW, Lion, AdaBelief, Amos, ...) now support foreach.
    • In most cases, foreach improves training speed by 1.1x to 1.5x, with a moderate increase in memory usage.
    • Like official PyTorch optimizers, the default value of foreach is None. When foreach=None, CUDA paths prefer the foreach implementation over the for-loop implementation.
    • If you need the previous for-loop behavior, set foreach=False explicitly.
  • Update the Emo-series optimizers. (#472, #478)
    • Update EmoNavi, EmoFact, and EmoLynx.
    • Begin deprecating EmoNeco and EmoZeal.
  • Implement SpectralSphere optimizer. (#483, #485)
  • Support various coefficients for zero_power_via_newton_schulz_5. (#487)
    • Add coefficient presets: original, quintic, polar_express, and polar_express_safer.
    • Support custom coefficient schedules and expose ns_coeffs in Muon, DistributedMuon, AdaMuon, and AdaGO.
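The `foreach` dispatch rule described above (default `None` prefers the fused path on CUDA, `foreach=False` forces the per-parameter loop) can be sketched in plain Python. This is a hypothetical illustration of the semantics, not the library's code: `sgd_step` and the `fused_available` flag are made up for the sketch, and the batched path stands in for horizontally fused `torch._foreach_*` kernels.

```python
def sgd_step(params, grads, lr=0.1, foreach=None, fused_available=True):
    """Tiny SGD sketch of the foreach dispatch rule (hypothetical helper).

    foreach=None  -> prefer the batched path when it is available
    foreach=True  -> force the batched path
    foreach=False -> force the per-parameter for-loop path
    """
    use_foreach = fused_available if foreach is None else foreach
    if use_foreach:
        # batched path: one pass over the whole parameter list at a time
        # (stands in for fused torch._foreach_* kernels on CUDA)
        deltas = [-lr * g for g in grads]
        return [p + d for p, d in zip(params, deltas)]
    # for-loop path: update each parameter individually (the previous behavior)
    out = []
    for p, g in zip(params, grads):
        out.append(p - lr * g)
    return out
```

Both paths compute the same update; the fused path simply issues fewer, larger operations, which is where the 1.1x to 1.5x speedup and the extra memory come from.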
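For intuition on what the `zero_power_via_newton_schulz_5` coefficients control: each quintic Newton-Schulz step X <- aX + b(XX^T)X + c(XX^T)^2 X pushes the singular values of X toward 1, and the (a, b, c) preset trades convergence speed against how tightly the values settle around 1. The sketch below uses the widely cited Muon quintic coefficients (3.4445, -4.7750, 2.0315) applied to a diagonal matrix, where the iteration acts on each singular value independently; the library's actual preset values may differ.

```python
import math

# Quintic Newton-Schulz coefficients as popularized by Muon; the presets in the
# library (original, quintic, polar_express, polar_express_safer) may use
# different values.
A, B, C = 3.4445, -4.7750, 2.0315

def ns_quintic(singular_values, steps=5):
    """Apply the quintic Newton-Schulz map to each singular value independently.

    For a diagonal X, X <- A*X + B*(X X^T) X + C*(X X^T)^2 X reduces to the
    scalar map s <- A*s + B*s**3 + C*s**5 on each diagonal entry.
    """
    # normalize by the Frobenius norm first, as the matrix iteration does
    norm = math.sqrt(sum(s * s for s in singular_values))
    xs = [s / norm for s in singular_values]
    for _ in range(steps):
        xs = [A * x + B * x**3 + C * x**5 for x in xs]
    return xs
```

With these coefficients the singular values do not converge exactly to 1; they oscillate in a band around 1, which is the intended speed/accuracy trade-off, and alternative presets or custom `ns_coeffs` schedules tighten or loosen that band.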

Refactor

  • Rename and organize type aliases. (#488)

Fix

  • Fix misbehavior in AdaFactor optimizer. (#477)
  • Fix a potential NaN issue in AdamP optimizer. (#480, #481)
  • Fix Lookahead wrapper compatibility with accelerate by normalizing lookahead_state serialization. (#484, #489)
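The Lookahead fix concerns how the wrapper's extra state (the slow weights) is serialized so that generic checkpointing code can round-trip it. The minimal wrapper below is a hypothetical sketch of that idea, not the library's implementation: the point is that state beyond the inner optimizer's must appear in `state_dict()` under plain, consistently named keys.

```python
class TinyLookahead:
    """Minimal Lookahead-style wrapper (hypothetical sketch, not the library's code).

    Every k steps the slow weights move toward the fast weights by factor alpha.
    The slow weights are extra state beyond the inner optimizer's, so they are
    exposed in state_dict() in a normalized, flat form that generic checkpoint
    code can save and restore.
    """

    def __init__(self, params, k=5, alpha=0.5):
        self.params = list(params)     # "fast" weights, updated every step
        self.slow = list(self.params)  # "slow" weights, updated every k steps
        self.k, self.alpha, self.counter = k, alpha, 0

    def step(self, grads, lr=0.1):
        self.params = [p - lr * g for p, g in zip(self.params, grads)]
        self.counter += 1
        if self.counter % self.k == 0:
            # interpolate slow toward fast, then reset fast to slow
            self.slow = [s + self.alpha * (p - s)
                         for s, p in zip(self.slow, self.params)]
            self.params = list(self.slow)

    def state_dict(self):
        # normalized representation: plain lists and ints under stable keys
        return {"lookahead_params": list(self.slow), "counter": self.counter}

    def load_state_dict(self, state):
        self.slow = list(state["lookahead_params"])
        self.counter = int(state["counter"])
```

A save/load round trip through `state_dict()`/`load_state_dict()` then restores the slow weights and step counter exactly, which is the property the fix establishes for checkpointing frameworks such as accelerate.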

Docs

  • Convert docstrings to Google style. (#487)
  • Add py.typed so the distribution ships its typing information (PEP 561). (#487)
  • Rework the README.md. (#491, #492)

CI/CD

  • Introduce uv. (#473)