Commit f60aa6a
refactor: `keep_loaded` --> `eject_models` Boolean switch. Use it to eject all other models prior to loading a model for inference; helpful for maximizing available VRAM on-device prior to UNet inference, for example.
This changes behavior that was only just released in 2.5.0, but most users should see either an improvement or no change. That was the weakest and jankiest part of 2.5.0: my decision to manage a CPU memory leak turned into an overly aggressive solution with unwanted side effects.
This solution should provide a better way to manage `compute` VRAM. The most-requested feature is a way to remove everything else from VRAM prior to main UNet inference, which this accomplishes nicely, while also reporting back accurate DisTorch2 on-device shard sizes.

1 parent 72a2033
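A minimal sketch of the eject-before-load behavior described above. All names here (`ModelManager`, `load`, the `eject_models` parameter) are illustrative stand-ins, not the actual DisTorch2 API; model "sizes" stand in for VRAM allocations.

```python
# Hypothetical illustration of an eject_models Boolean switch: before
# loading a model for inference, optionally unload every other model so
# the target model sees the maximum free VRAM. Not the real API.

class ModelManager:
    def __init__(self):
        self.loaded = {}  # name -> stand-in VRAM footprint in MB

    def load(self, name, size_mb, eject_models=False):
        if eject_models:
            # Eject all other models first, freeing their VRAM.
            for other in list(self.loaded):
                if other != name:
                    del self.loaded[other]
        self.loaded[name] = size_mb
        return name

    def vram_in_use(self):
        return sum(self.loaded.values())


mgr = ModelManager()
mgr.load("clip", 500)
mgr.load("vae", 300)
# Loading the UNet with eject_models=True removes clip and vae first,
# so only the UNet's footprint remains on-device.
mgr.load("unet", 6000, eject_models=True)
print(sorted(mgr.loaded))   # ['unet']
print(mgr.vram_in_use())    # 6000
```

With `eject_models=False` (the default), loading behaves as before and previously loaded models stay resident.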
2 files changed
Lines changed: 2 additions & 2 deletions
(Diff contents were not captured in this page extract; file 1: line 24 replaced, file 2: line 4 replaced.)