
Commit f60aa6a

refactor: keep_loaded --> eject_models Boolean switch. Use it to eject all other models prior to loading a model for inference; helpful, for example, to maximize the VRAM available on-device (e.g., for latents) prior to UNet inference.
This changes behavior that was only just released in 2.5.0, but most users should see either an improvement or no change. `keep_loaded` was the weakest and jankiest part of 2.5.0: my attempt to manage a CPU memory leak turned into an overly aggressive solution with unwanted side effects. The new approach provides a better way to manage `compute` VRAM. The most requested feature has been a way to remove everything else from VRAM prior to main UNet inference, which this accomplishes nicely, while also reporting back accurate information about DisTorch2 on-device shard sizes.
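
As a rough illustration only (not the node's actual code): a minimal sketch of what an eject-before-load switch like this can look like, assuming ComfyUI's `comfy.model_management.unload_all_models` and `soft_empty_cache` helpers; the wrapper name `load_for_inference` and the `loader_fn` callback are made up for the example.

```python
# Hypothetical sketch of an eject_models switch; assumes ComfyUI's
# comfy.model_management helpers, not the repository's real implementation.
import comfy.model_management as mm

def load_for_inference(loader_fn, eject_models=False):
    """Optionally evict everything else from VRAM, then load the model."""
    if eject_models:
        mm.unload_all_models()   # push all currently loaded models off the device
        mm.soft_empty_cache()    # release cached allocator blocks back to the GPU
    return loader_fn()           # load the requested model with VRAM maximally free
```

The point of making this a simple Boolean on the loader is that the eviction happens immediately before the load that needs the space, rather than trying to keep a global "keep loaded" policy in sync across nodes.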
1 parent 72a2033 commit f60aa6a

2 files changed

Lines changed: 2 additions & 2 deletions


__init__.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -21,7 +21,7 @@
 )
 
 WEB_DIRECTORY = "./web"
-MGPU_MM_LOG = True
+MGPU_MM_LOG = False
 DEBUG_LOG = False
 
 logger = logging.getLogger("MultiGPU")
```
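
For context, a module-level flag like `MGPU_MM_LOG` is typically consulted as a simple gate around memory-management log lines, so this change turns that logging off by default. The sketch below is an assumption about usage, not code from the repo; the `mm_log` helper is hypothetical.

```python
# Hypothetical illustration of a module-level logging gate.
import logging

MGPU_MM_LOG = False  # new default from this commit: memory-management logging off
logger = logging.getLogger("MultiGPU")

def mm_log(msg: str) -> None:
    """Emit a memory-management log line only when MGPU_MM_LOG is enabled."""
    if MGPU_MM_LOG:
        logger.info(msg)
```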

pyproject.toml

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,7 +1,7 @@
 [project]
 name = "comfyui-multigpu"
 description = "Provides a suite of custom nodes to manage multiple GPUs for ComfyUI, including advanced model offloading for both GGUF and Safetensor formats with DisTorch, and bespoke MultiGPU support for WanVideoWrapper and other custom nodes."
-version = "2.5.0"
+version = "2.5.1"
 license = {file = "LICENSE"}
 
 [project.urls]
```
