Commit 7e0d484

docs: Add documentation for standard, gguf, and DisTorch nodes/wrappers
1 parent a8a5a6f · commit 7e0d484

34 files changed

Lines changed: 1215 additions & 0 deletions (commit total)

Lines changed: 52 additions & 0 deletions

# CLIPLoaderDisTorch2MultiGPU

The `CLIPLoaderDisTorch2MultiGPU` node is used to load standard CLIP text encoder models with DisTorch2 distributed tensor allocation, enabling advanced multi-device VRAM management to handle larger text encoding models across multiple GPUs.

This node automatically detects models located in the `ComfyUI/models/clip` folder, and it will also read models from additional paths configured in the `extra_model_paths.yaml` file. Sometimes, you may need to **refresh the ComfyUI interface** to allow it to read the model files from the corresponding folder.

## Inputs

| Parameter | Data Type | Description |
| --- | --- | --- |
| `clip_name` | `STRING` | The name of the CLIP model to load. |
| `type` | `STRING` | The type of CLIP model (e.g., 'stable_diffusion', 'stable_diffusion_xl'). |
| `device` | `STRING` | Target device for text encoder compute operations (e.g., 'cuda:0', 'cuda:1', 'cpu'). Selected from available devices on your system. |
| `virtual_vram_gb` | `FLOAT` | Amount of virtual VRAM in gigabytes to allocate for distributed tensor management (default: 4.0, range: 0.0-128.0). |
| `donor_device` | `STRING` | Device to donate VRAM from when allocating virtual memory (default: 'cpu'). |
| `expert_mode_allocations` | `STRING` | Advanced allocation string for expert users to manually specify device/ratio distributions (e.g., 'cuda:0,50%;cpu,*'). |
| `keep_loaded` | `BOOLEAN` | Whether to keep the model loaded when triggering memory cleanup operations (default: true). |

## Outputs

| Output Name | Data Type | Description |
| --- | --- | --- |
| `CLIP` | `CLIP` | The loaded CLIP text encoder model with DisTorch2 distributed allocation applied. |

## DisTorch2 Distributed Loading

DisTorch2 is an advanced memory management system that enables loading and running large diffusion models across multiple GPUs by intelligently distributing tensor allocations. Instead of loading an entire model on a single device, DisTorch2 splits the model's layers across available devices while maintaining computational efficiency.

### Key Concepts

**Virtual VRAM Allocation**: Artificially increases the available VRAM on the compute device by borrowing memory capacity from donor devices through intelligent tensor distribution.

**Expert Mode Allocations**: Advanced users can manually specify exactly how much of the model should be placed on each device using ratio- or byte-based allocation strings.

### Allocation Examples

**Basic Virtual VRAM Mode**:
- `device`: `cuda:0`
- `virtual_vram_gb`: `8.0`
- `donor_device`: `cuda:1`
- Result: Loads the model as if cuda:0 had 8GB more VRAM available, using cuda:1 as the memory donor.

**Expert Ratio Allocation**:
- `expert_mode_allocations`: `cuda:0,60%;cuda:1,30%;cpu,10%`
- Distributes model layers with 60% on GPU 0, 30% on GPU 1, and 10% on the CPU.

**Expert Byte Allocation**:
- `expert_mode_allocations`: `cuda:0,4gb;cuda:1,2gb;cpu,*`
- Allocates exactly 4GB to cuda:0, 2GB to cuda:1, and the remainder to the CPU.

**Mixed Mode**:
Combines virtual VRAM with expert allocations for complex multi-device scenarios.
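
To make the `expert_mode_allocations` syntax concrete, here is a minimal sketch of how such a string could be interpreted as per-device byte budgets. The `parse_allocations` helper, the 1024³-byte "gb" arithmetic, and the example model size are illustrative assumptions, not the node's actual parser.

```python
# Illustration only: a plausible reading of an expert_mode_allocations string.
# parse_allocations is a hypothetical helper, not part of ComfyUI-MultiGPU.

def parse_allocations(spec: str, model_bytes: int) -> dict:
    """Turn 'cuda:0,4gb;cuda:1,2gb;cpu,*' into a bytes-per-device plan."""
    plan = {}
    wildcard_device = None
    remaining = model_bytes
    for entry in filter(None, spec.split(";")):
        device, amount = entry.split(",")
        device = device.strip()
        amount = amount.strip().lower()
        if amount == "*":                 # wildcard: takes whatever is left over
            wildcard_device = device
        elif amount.endswith("%"):        # ratio of the whole model
            share = int(model_bytes * float(amount[:-1]) / 100.0)
            plan[device] = share
            remaining -= share
        elif amount.endswith("gb"):       # absolute budget, assumed 1024**3 bytes
            share = int(float(amount[:-2]) * 1024**3)
            plan[device] = share
            remaining -= share
    if wildcard_device is not None:
        plan[wildcard_device] = max(remaining, 0)
    return plan

# A hypothetical 9 GiB text encoder split with the byte-allocation example:
print(parse_allocations("cuda:0,4gb;cuda:1,2gb;cpu,*", 9 * 1024**3))
# {'cuda:0': 4294967296, 'cuda:1': 2147483648, 'cpu': 3221225472}
```
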
Lines changed: 52 additions & 0 deletions

# CLIPLoaderGGUFDisTorch2MultiGPU

The `CLIPLoaderGGUFDisTorch2MultiGPU` node is used to load GGUF format CLIP text encoder models with DisTorch2 distributed tensor allocation, enabling advanced multi-device VRAM management to handle larger text encoding models across multiple GPUs.

This node automatically detects models located in the `ComfyUI/models/clip` and `ComfyUI/models/clip_gguf` folders, and it will also read models from additional paths configured in the `extra_model_paths.yaml` file. Sometimes, you may need to **refresh the ComfyUI interface** to allow it to read the model files from the corresponding folder.

## Inputs

| Parameter | Data Type | Description |
| --- | --- | --- |
| `clip_name` | `STRING` | The name of the CLIP model to load from combined clip and clip_gguf folders. |
| `type` | `STRING` | The type of CLIP model (e.g., 'stable_diffusion', 'stable_diffusion_xl'). |
| `device` | `STRING` | Target device for text encoder compute operations (e.g., 'cuda:0', 'cuda:1', 'cpu'). Selected from available devices on your system. |
| `virtual_vram_gb` | `FLOAT` | Amount of virtual VRAM in gigabytes to allocate for distributed tensor management (default: 4.0, range: 0.0-128.0). |
| `donor_device` | `STRING` | Device to donate VRAM from when allocating virtual memory (default: 'cpu'). |
| `expert_mode_allocations` | `STRING` | Advanced allocation string for expert users to manually specify device/ratio distributions (e.g., 'cuda:0,50%;cpu,*'). |
| `keep_loaded` | `BOOLEAN` | Whether to keep the model loaded when triggering memory cleanup operations (default: true). |

## Outputs

| Output Name | Data Type | Description |
| --- | --- | --- |
| `CLIP` | `CLIP` | The loaded CLIP text encoder model with DisTorch2 distributed allocation applied. |

## DisTorch2 Distributed Loading

DisTorch2 is an advanced memory management system that enables loading and running large diffusion models across multiple GPUs by intelligently distributing tensor allocations. Instead of loading an entire model on a single device, DisTorch2 splits the model's layers across available devices while maintaining computational efficiency.

### Key Concepts

**Virtual VRAM Allocation**: Artificially increases the available VRAM on the compute device by borrowing memory capacity from donor devices through intelligent tensor distribution.

**Expert Mode Allocations**: Advanced users can manually specify exactly how much of the model should be placed on each device using ratio- or byte-based allocation strings.

### Allocation Examples

**Basic Virtual VRAM Mode**:
- `device`: `cuda:0`
- `virtual_vram_gb`: `8.0`
- `donor_device`: `cuda:1`
- Result: Loads the model as if cuda:0 had 8GB more VRAM available, using cuda:1 as the memory donor.

**Expert Ratio Allocation**:
- `expert_mode_allocations`: `cuda:0,60%;cuda:1,30%;cpu,10%`
- Distributes model layers with 60% on GPU 0, 30% on GPU 1, and 10% on the CPU.

**Expert Byte Allocation**:
- `expert_mode_allocations`: `cuda:0,4gb;cuda:1,2gb;cpu,*`
- Allocates exactly 4GB to cuda:0, 2GB to cuda:1, and the remainder to the CPU.

**Mixed Mode**:
Combines virtual VRAM with expert allocations for complex multi-device scenarios.

web/docs/CLIPLoaderGGUFMultiGPU.md

Lines changed: 19 additions & 0 deletions

# CLIPLoaderGGUFMultiGPU

The `CLIPLoaderGGUFMultiGPU` node is used to load GGUF format CLIP text encoder models with device selection capability, enabling users to specify which GPU or device should be used for model execution.

This node automatically detects models located in the `ComfyUI/models/clip` and `ComfyUI/models/clip_gguf` folders, and it will also read models from additional paths configured in the `extra_model_paths.yaml` file. Sometimes, you may need to **refresh the ComfyUI interface** to allow it to read the model files from the corresponding folder.

## Inputs

| Parameter | Data Type | Description |
| --- | --- | --- |
| `clip_name` | `STRING` | The name of the CLIP model to load from combined clip and clip_gguf folders. |
| `type` | `STRING` | The type of CLIP model (e.g., 'stable_diffusion', 'stable_diffusion_xl'). |
| `device` | `STRING` | Target device for text encoder compute operations (e.g., 'cuda:0', 'cuda:1', 'cpu'). Selected from available devices on your system. |

## Outputs

| Output Name | Data Type | Description |
| --- | --- | --- |
| `CLIP` | `CLIP` | The loaded CLIP text encoder model. |
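
As a usage sketch, the node can be driven through ComfyUI's HTTP `/prompt` API like any other node. The fragment below is hypothetical: the node id, the GGUF file name, and the `type` value are placeholders, and the downstream nodes that would consume the `CLIP` output are omitted.

```python
# Hypothetical API-format fragment using CLIPLoaderGGUFMultiGPU. Node id, file
# name, and 'type' value are placeholders; pick real values from the node's
# dropdowns in your installation. A complete workflow also needs downstream
# nodes (text encode, sampler, an output node) before it will run.
import json
import urllib.request

prompt = {
    "1": {
        "class_type": "CLIPLoaderGGUFMultiGPU",
        "inputs": {
            "clip_name": "t5xxl_fp16.gguf",  # placeholder GGUF text encoder
            "type": "flux",                  # placeholder CLIP type
            "device": "cuda:1",              # run the text encoder on GPU 1
        },
    },
    # ...downstream nodes that consume the CLIP output of node "1" go here
}

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": prompt}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req)  # submit to a locally running ComfyUI instance
```
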

web/docs/CLIPLoaderMultiGPU.md

Lines changed: 19 additions & 0 deletions

# CLIPLoaderMultiGPU

The `CLIPLoaderMultiGPU` node is used to load CLIP text encoder models with device selection capability, enabling users to specify which GPU or device should be used for model execution.

This node automatically detects models located in the `ComfyUI/models/clip` folder, and it will also read models from additional paths configured in the `extra_model_paths.yaml` file. Sometimes, you may need to **refresh the ComfyUI interface** to allow it to read the model files from the corresponding folder.

## Inputs

| Parameter | Data Type | Description |
| --- | --- | --- |
| `clip_name` | `STRING` | The name of the CLIP model to load. |
| `type` | `STRING` | The type of CLIP model (e.g., 'stable_diffusion', 'stable_diffusion_xl'). |
| `device` | `STRING` | Target device for text encoder compute operations (e.g., 'cuda:0', 'cuda:1', 'cpu'). Selected from available devices on your system. |

## Outputs

| Output Name | Data Type | Description |
| --- | --- | --- |
| `CLIP` | `CLIP` | The loaded CLIP text encoder model. |

Lines changed: 51 additions & 0 deletions

# CLIPVisionLoaderDisTorch2MultiGPU

The `CLIPVisionLoaderDisTorch2MultiGPU` node is used to load CLIP Vision models with DisTorch2 distributed tensor allocation, enabling advanced multi-device VRAM management to handle larger vision encoder models across multiple GPUs.

This node automatically detects models located in the `ComfyUI/models/clip_vision` folder, and it will also read models from additional paths configured in the `extra_model_paths.yaml` file. Sometimes, you may need to **refresh the ComfyUI interface** to allow it to read the model files from the corresponding folder.

## Inputs

| Parameter | Data Type | Description |
| --- | --- | --- |
| `clip_vision` | `STRING` | The name of the CLIP Vision model to load. |
| `device` | `STRING` | Target device for vision encoder compute operations (e.g., 'cuda:0', 'cuda:1', 'cpu'). Selected from available devices on your system. |
| `virtual_vram_gb` | `FLOAT` | Amount of virtual VRAM in gigabytes to allocate for distributed tensor management (default: 4.0, range: 0.0-128.0). |
| `donor_device` | `STRING` | Device to donate VRAM from when allocating virtual memory (default: 'cpu'). |
| `expert_mode_allocations` | `STRING` | Advanced allocation string for expert users to manually specify device/ratio distributions (e.g., 'cuda:0,50%;cpu,*'). |
| `keep_loaded` | `BOOLEAN` | Whether to keep the model loaded when triggering memory cleanup operations (default: true). |

## Outputs

| Output Name | Data Type | Description |
| --- | --- | --- |
| `CLIP_VISION` | `CLIP_VISION` | The loaded CLIP Vision model with DisTorch2 distributed allocation applied. |

## DisTorch2 Distributed Loading

DisTorch2 is an advanced memory management system that enables loading and running large diffusion models across multiple GPUs by intelligently distributing tensor allocations. Instead of loading an entire model on a single device, DisTorch2 splits the model's layers across available devices while maintaining computational efficiency.

### Key Concepts

**Virtual VRAM Allocation**: Artificially increases the available VRAM on the compute device by borrowing memory capacity from donor devices through intelligent tensor distribution.

**Expert Mode Allocations**: Advanced users can manually specify exactly how much of the model should be placed on each device using ratio- or byte-based allocation strings.

### Allocation Examples

**Basic Virtual VRAM Mode**:
- `device`: `cuda:0`
- `virtual_vram_gb`: `8.0`
- `donor_device`: `cuda:1`
- Result: Loads the model as if cuda:0 had 8GB more VRAM available, using cuda:1 as the memory donor.

**Expert Ratio Allocation**:
- `expert_mode_allocations`: `cuda:0,60%;cuda:1,30%;cpu,10%`
- Distributes model layers with 60% on GPU 0, 30% on GPU 1, and 10% on the CPU.

**Expert Byte Allocation**:
- `expert_mode_allocations`: `cuda:0,4gb;cuda:1,2gb;cpu,*`
- Allocates exactly 4GB to cuda:0, 2GB to cuda:1, and the remainder to the CPU.

**Mixed Mode**:
Combines virtual VRAM with expert allocations for complex multi-device scenarios.

Lines changed: 18 additions & 0 deletions

# CLIPVisionLoaderMultiGPU

The `CLIPVisionLoaderMultiGPU` node is used to load CLIP Vision models with device selection capability, enabling users to specify which GPU or device should be used for vision encoder execution.

This node automatically detects models located in the `ComfyUI/models/clip_vision` folder, and it will also read models from additional paths configured in the `extra_model_paths.yaml` file. Sometimes, you may need to **refresh the ComfyUI interface** to allow it to read the model files from the corresponding folder.

## Inputs

| Parameter | Data Type | Description |
| --- | --- | --- |
| `clip_vision` | `STRING` | The name of the CLIP Vision model to load. |
| `device` | `STRING` | Target device for vision encoder compute operations (e.g., 'cuda:0', 'cuda:1', 'cpu'). Selected from available devices on your system. |

## Outputs

| Output Name | Data Type | Description |
| --- | --- | --- |
| `CLIP_VISION` | `CLIP_VISION` | The loaded CLIP Vision model. |

Lines changed: 53 additions & 0 deletions

# CheckpointLoaderSimpleDisTorch2MultiGPU

The `CheckpointLoaderSimpleDisTorch2MultiGPU` node is used to load checkpoint models (complete diffusion models containing UNet, CLIP, and VAE components) with DisTorch2 distributed tensor allocation, enabling advanced multi-device VRAM management to handle larger models across multiple GPUs.

This node automatically detects models located in the `ComfyUI/models/checkpoints` folder, and it will also read models from additional paths configured in the `extra_model_paths.yaml` file. Sometimes, you may need to **refresh the ComfyUI interface** to allow it to read the model files from the corresponding folder.

## Inputs

| Parameter | Data Type | Description |
| --- | --- | --- |
| `ckpt_name` | `STRING` | The name of the checkpoint model to load. |
| `compute_device` | `STRING` | Target device for compute operations (e.g., 'cuda:0', 'cuda:1', 'cpu'). Selected from available devices on your system. |
| `virtual_vram_gb` | `FLOAT` | Amount of virtual VRAM in gigabytes to allocate for distributed tensor management (default: 4.0, range: 0.0-128.0). |
| `donor_device` | `STRING` | Device to donate VRAM from when allocating virtual memory (default: 'cpu'). |
| `expert_mode_allocations` | `STRING` | Advanced allocation string for expert users to manually specify device/ratio distributions (e.g., 'cuda:0,50%;cpu,*'). |
| `keep_loaded` | `BOOLEAN` | Whether to keep the model loaded when triggering memory cleanup operations (default: true). |

## Outputs

| Output Name | Data Type | Description |
| --- | --- | --- |
| `MODEL` | `MODEL` | The loaded UNet diffusion model with DisTorch2 distributed allocation applied. |
| `CLIP` | `CLIP` | The loaded CLIP text encoder model. |
| `VAE` | `VAE` | The loaded VAE decoder/encoder model. |

## DisTorch2 Distributed Loading

DisTorch2 is an advanced memory management system that enables loading and running large diffusion models across multiple GPUs by intelligently distributing tensor allocations. Instead of loading an entire model on a single device, DisTorch2 splits the model's layers across available devices while maintaining computational efficiency.

### Key Concepts

**Virtual VRAM Allocation**: Artificially increases the available VRAM on the compute device by borrowing memory capacity from donor devices through intelligent tensor distribution.

**Expert Mode Allocations**: Advanced users can manually specify exactly how much of the model should be placed on each device using ratio- or byte-based allocation strings.

### Allocation Examples

**Basic Virtual VRAM Mode**:
- `compute_device`: `cuda:0`
- `virtual_vram_gb`: `8.0`
- `donor_device`: `cuda:1`
- Result: Loads the model as if cuda:0 had 8GB more VRAM available, using cuda:1 as the memory donor.

**Expert Ratio Allocation**:
- `expert_mode_allocations`: `cuda:0,60%;cuda:1,30%;cpu,10%`
- Distributes model layers with 60% on GPU 0, 30% on GPU 1, and 10% on the CPU.

**Expert Byte Allocation**:
- `expert_mode_allocations`: `cuda:0,4gb;cuda:1,2gb;cpu,*`
- Allocates exactly 4GB to cuda:0, 2GB to cuda:1, and the remainder to the CPU.

**Mixed Mode**:
Combines virtual VRAM with expert allocations for complex multi-device scenarios.
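
As a back-of-the-envelope illustration of the expert ratio mode for a full checkpoint, the sketch below splits a hypothetical 6.5 GB of UNet weights using the ratio string from the example above. The model size is invented, and since DisTorch2 distributes whole layers, the real split only approximates these numbers.

```python
# Illustration only: approximate split of a hypothetical 6.5 GB UNet under
# 'cuda:0,60%;cuda:1,30%;cpu,10%'. The real allocator places whole layers,
# so actual splits land on layer boundaries rather than exact percentages.
unet_gb = 6.5
ratios = {"cuda:0": 0.60, "cuda:1": 0.30, "cpu": 0.10}

for device, ratio in ratios.items():
    print(f"{device}: ~{unet_gb * ratio:.2f} GB")
# cuda:0: ~3.90 GB
# cuda:1: ~1.95 GB
# cpu:    ~0.65 GB
```
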
Lines changed: 20 additions & 0 deletions

# CheckpointLoaderSimpleMultiGPU

The `CheckpointLoaderSimpleMultiGPU` node is used to load checkpoint models (complete diffusion models containing UNet, CLIP, and VAE components) with device selection capability, enabling users to specify which GPU or device should be used for model execution.

This node automatically detects models located in the `ComfyUI/models/checkpoints` folder, and it will also read models from additional paths configured in the `extra_model_paths.yaml` file. Sometimes, you may need to **refresh the ComfyUI interface** to allow it to read the model files from the corresponding folder.

## Inputs

| Parameter | Data Type | Description |
| --- | --- | --- |
| `ckpt_name` | `STRING` | The name of the checkpoint model to load. |
| `device` | `STRING` | Target device for compute operations (e.g., 'cuda:0', 'cuda:1', 'cpu'). Selected from available devices on your system. |

## Outputs

| Output Name | Data Type | Description |
| --- | --- | --- |
| `MODEL` | `MODEL` | The loaded UNet diffusion model. |
| `CLIP` | `CLIP` | The loaded CLIP text encoder model. |
| `VAE` | `VAE` | The loaded VAE decoder/encoder model. |
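
The main reason to prefer this node over the core checkpoint loader is to pin different workflow components to different devices. The fragment below sketches that idea in ComfyUI's API prompt format; node ids, file names, and the rest of the workflow are placeholders.

```python
# Hypothetical fragment: keep the checkpoint's UNet and VAE on cuda:0 while a
# separately loaded text encoder runs on cuda:1. Names and ids are placeholders.
prompt_fragment = {
    "10": {
        "class_type": "CheckpointLoaderSimpleMultiGPU",
        "inputs": {
            "ckpt_name": "sd_xl_base_1.0.safetensors",  # placeholder checkpoint
            "device": "cuda:0",
        },
    },
    "11": {
        "class_type": "CLIPLoaderMultiGPU",
        "inputs": {
            "clip_name": "clip_l.safetensors",          # placeholder text encoder
            "type": "stable_diffusion",
            "device": "cuda:1",
        },
    },
    # Downstream nodes would take MODEL/VAE from node "10" and CLIP from "11".
}
```
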
Lines changed: 51 additions & 0 deletions

# ControlNetLoaderDisTorch2MultiGPU

The `ControlNetLoaderDisTorch2MultiGPU` node is used to load ControlNet models with DisTorch2 distributed tensor allocation, enabling advanced multi-device VRAM management to handle larger conditional generation models across multiple GPUs.

This node automatically detects models located in the `ComfyUI/models/controlnet` folder, and it will also read models from additional paths configured in the `extra_model_paths.yaml` file. Sometimes, you may need to **refresh the ComfyUI interface** to allow it to read the model files from the corresponding folder.

## Inputs

| Parameter | Data Type | Description |
| --- | --- | --- |
| `control_net_name` | `STRING` | The name of the ControlNet model to load. |
| `compute_device` | `STRING` | Target device for compute operations (e.g., 'cuda:0', 'cuda:1', 'cpu'). Selected from available devices on your system. |
| `virtual_vram_gb` | `FLOAT` | Amount of virtual VRAM in gigabytes to allocate for distributed tensor management (default: 4.0, range: 0.0-128.0). |
| `donor_device` | `STRING` | Device to donate VRAM from when allocating virtual memory (default: 'cpu'). |
| `expert_mode_allocations` | `STRING` | Advanced allocation string for expert users to manually specify device/ratio distributions (e.g., 'cuda:0,50%;cpu,*'). |
| `keep_loaded` | `BOOLEAN` | Whether to keep the model loaded when triggering memory cleanup operations (default: true). |

## Outputs

| Output Name | Data Type | Description |
| --- | --- | --- |
| `CONTROL_NET` | `CONTROL_NET` | The loaded ControlNet model with DisTorch2 distributed allocation applied. |

## DisTorch2 Distributed Loading

DisTorch2 is an advanced memory management system that enables loading and running large diffusion models across multiple GPUs by intelligently distributing tensor allocations. Instead of loading an entire model on a single device, DisTorch2 splits the model's layers across available devices while maintaining computational efficiency.

### Key Concepts

**Virtual VRAM Allocation**: Artificially increases the available VRAM on the compute device by borrowing memory capacity from donor devices through intelligent tensor distribution.

**Expert Mode Allocations**: Advanced users can manually specify exactly how much of the model should be placed on each device using ratio- or byte-based allocation strings.

### Allocation Examples

**Basic Virtual VRAM Mode**:
- `compute_device`: `cuda:0`
- `virtual_vram_gb`: `8.0`
- `donor_device`: `cuda:1`
- Result: Loads the model as if cuda:0 had 8GB more VRAM available, using cuda:1 as the memory donor.

**Expert Ratio Allocation**:
- `expert_mode_allocations`: `cuda:0,60%;cuda:1,30%;cpu,10%`
- Distributes model layers with 60% on GPU 0, 30% on GPU 1, and 10% on the CPU.

**Expert Byte Allocation**:
- `expert_mode_allocations`: `cuda:0,4gb;cuda:1,2gb;cpu,*`
- Allocates exactly 4GB to cuda:0, 2GB to cuda:1, and the remainder to the CPU.

**Mixed Mode**:
Combines virtual VRAM with expert allocations for complex multi-device scenarios.
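
Mixed mode is not spelled out above, so the following is only a guess at what such a configuration might look like: virtual VRAM handles the bulk of the offload while an expert string pins a fixed share elsewhere. How the two interact may differ between versions; treat this as a sketch of the inputs, not a specification of the behaviour.

```python
# Illustration only: one plausible "mixed mode" input combination for this
# node. All values are invented; verify against your DisTorch2 version.
mixed_mode_inputs = {
    "control_net_name": "controlnet_union.safetensors",  # placeholder file name
    "compute_device": "cuda:0",
    "virtual_vram_gb": 4.0,            # borrow up to 4 GB of extra capacity
    "donor_device": "cpu",             # system RAM donates that capacity
    "expert_mode_allocations": "cuda:1,25%;cpu,*",  # pin a quarter on GPU 1
    "keep_loaded": True,
}
```
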
