Exports accurate, easy-to-aggregate ratios of allocated (requested) to allocatable resources across your Kubernetes cluster, per node group (via label combinations), or per node.
Kubernetes clusters can waste resources due to fragmentation, or unaccounted overhead from DaemonSets and other system components. Improvements in binpacking are hard to measure and can only be observed over extended periods of time. You can use KBE to monitor how well your cluster bin-packs over time, and how (in)effective scheduling tweaks are. It is like running eks-node-viewer in a loop to export historical metrics to your o11y stack.
A combination of kube-state-metrics, kubelet, and cAdvisor metrics can be used for this, but it falls short because:
- These metrics are pulled from different sources at different intervals, so aggregating them does not give an accurate snapshot of the cluster state per scrape. When aggregating over long periods (days or more), the inaccuracies compound.
- Queries get extremely complex, and you have to handle many cases (e.g. exclude failed and completed pods, handle init containers, skip pending pods, complex joins to group by node labels).
- Some o11y tools' query languages (looking at you, Datadog) lack the flexibility to join and combine metrics from different data sources.
How is KBE better? KBE tracks the cluster state and returns an atomic snapshot of the cluster's binpacking state on each scrape.
```shell
helm install kube-binpacking-exporter \
  oci://ghcr.io/procore-oss/charts/kube-binpacking-exporter \
  --version <chart-version>
```

Check the releases for the latest version; note that the Helm chart version has no `v` prefix.
Check the Helm chart's values.yaml for options, most importantly how your o11y stack pulls the metrics from `/metrics` on port `:9101`.
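For Prometheus, a minimal scrape config against the defaults (port `:9101`, path `/metrics`) could look like the sketch below; the job name and target address are placeholders for your environment:

```yaml
scrape_configs:
  - job_name: kube-binpacking-exporter
    metrics_path: /metrics
    static_configs:
      - targets: ["kube-binpacking-exporter.monitoring.svc:9101"]
```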
## To Run Locally
Note: you must have `get|list|watch` permissions on pods and nodes to run KBE locally.
```shell
# Build
docker build -t kube-binpacking-exporter:dev .

# Run (mount your kubeconfig)
docker run --rm -p 9101:9101 \
  -v ~/.kube/config:/home/nonroot/.kube/config:ro \
  kube-binpacking-exporter:dev \
  --kubeconfig /home/nonroot/.kube/config
```

Once running, open http://localhost:9101 for the homepage with links to all endpoints.
```shell
# Basic (uses your current kubeconfig context)
go run . --kubeconfig ~/.kube/config

# Debug logging
go run . --kubeconfig ~/.kube/config --log-level=debug

# With label grouping
go run . --kubeconfig ~/.kube/config \
  --label-group=topology.kubernetes.io/zone,node.kubernetes.io/instance-type \
  --label-group=topology.kubernetes.io/zone

# Node filtering — only production nodes
go run . --kubeconfig ~/.kube/config --node-selector="environment=production"

# Node filtering — exclude control plane and spot instances
go run . --kubeconfig ~/.kube/config \
  --node-selector='!node-role.kubernetes.io/control-plane,spot notin (true)'
```

- Informer-based: Zero API calls per metric scrape, so it's very light on the Kubernetes API server.
- Per-node and cluster-wide metrics: Individual node utilization plus cluster aggregates.
- Combination label grouping: Calculate binpacking metrics grouped by node label combinations (e.g., per-zone, per-zone+instance-type).
- Cardinality control: Disable per-node metrics via `--disable-node-metrics`.

Planned (tracked in TODO.md):

- Track DaemonSet overhead.
- Support more node resources (e.g. `storage` and `gpu`).
- Prometheus & Datadog dashboards.
KBE's only concern is whether pods' requests are being satisfied in the most efficient way possible. Tracking whether pods set the correct requests, or under-utilize their requests, is out of scope for this tool.
- The project provides immutable releases, with a signed image and chart, as well as GitHub attestations.
- The Helm chart values file defaults to using the image `@digest`.
- The image runs in a distroless, read-only filesystem, as a non-root user with restricted permissions.
Resource allocation metrics use a `resource` label to identify the resource type (`cpu`, `memory`, etc.):
| Metric | Type | Labels | Description |
|---|---|---|---|
| `kube_binpacking_node_allocated` | Gauge | `node`, `resource` | Total resource requested by pods on this node |
| `kube_binpacking_node_allocatable` | Gauge | `node`, `resource` | Total allocatable resource on this node |
| `kube_binpacking_node_utilization_ratio` | Gauge | `node`, `resource` | Ratio of allocated to allocatable (0.0–1.0+) |
| `kube_binpacking_cluster_allocated` | Gauge | `resource` | Cluster-wide total resource requested |
| `kube_binpacking_cluster_allocatable` | Gauge | `resource` | Cluster-wide total allocatable resource |
| `kube_binpacking_cluster_utilization_ratio` | Gauge | `resource` | Cluster-wide allocation ratio |
| `kube_binpacking_cluster_node_count` | Gauge | - | Total number of nodes in the cluster |
| `kube_binpacking_group_allocated` | Gauge | `label_group`, `label_group_value`, `resource` | Total resource requested on nodes in this label group |
| `kube_binpacking_group_allocatable` | Gauge | `label_group`, `label_group_value`, `resource` | Total allocatable resource on nodes in this label group |
| `kube_binpacking_group_utilization_ratio` | Gauge | `label_group`, `label_group_value`, `resource` | Ratio for nodes in this label group (0.0–1.0+) |
| `kube_binpacking_group_node_count` | Gauge | `label_group`, `label_group_value` | Number of nodes in this label group |
Notes:
- Per-node metrics can be disabled via `--disable-node-metrics` to reduce cardinality in large clusters
- Group metrics are only emitted when `--label-group` is configured
## Example Output
```text
kube_binpacking_node_allocated{node="worker-1",resource="cpu"} 3.5
kube_binpacking_node_allocatable{node="worker-1",resource="cpu"} 4
kube_binpacking_node_utilization_ratio{node="worker-1",resource="cpu"} 0.875
kube_binpacking_node_allocated{node="worker-1",resource="memory"} 4294967296
kube_binpacking_node_allocatable{node="worker-1",resource="memory"} 8589934592
kube_binpacking_node_utilization_ratio{node="worker-1",resource="memory"} 0.5
kube_binpacking_cluster_allocated{resource="cpu"} 12.5
kube_binpacking_cluster_allocatable{resource="cpu"} 16
kube_binpacking_cluster_utilization_ratio{resource="cpu"} 0.78125
kube_binpacking_cluster_node_count 4
kube_binpacking_group_allocated{label_group="topology.kubernetes.io/zone",label_group_value="us-east-1a",resource="cpu"} 6.5
kube_binpacking_group_allocatable{label_group="topology.kubernetes.io/zone",label_group_value="us-east-1a",resource="cpu"} 8
kube_binpacking_group_utilization_ratio{label_group="topology.kubernetes.io/zone",label_group_value="us-east-1a",resource="cpu"} 0.8125
kube_binpacking_group_node_count{label_group="topology.kubernetes.io/zone",label_group_value="us-east-1a"} 2
```
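Because each scrape is an atomic snapshot, downstream queries stay simple. For example, in PromQL (assuming Prometheus scrapes KBE directly, with the metric names as documented above):

```promql
# Average cluster CPU binpacking over the last 7 days
avg_over_time(kube_binpacking_cluster_utilization_ratio{resource="cpu"}[7d])

# Worst-packed zone right now
bottomk(1, kube_binpacking_group_utilization_ratio{label_group="topology.kubernetes.io/zone",resource="cpu"})
```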
| Flag | Default | Description |
|---|---|---|
| `--kubeconfig` | (auto) | Path to kubeconfig (uses in-cluster config if empty) |
| `--metrics-addr` | `:9101` | Address to serve metrics on |
| `--metrics-path` | `/metrics` | HTTP path for metrics endpoint |
| `--resources` | `cpu,memory` | Comma-separated list of resources to track |
| `--label-group` | (none) | Repeatable. Comma-separated label keys defining one combination group (e.g., `--label-group=zone,instance-type --label-group=zone`) |
| `--node-selector` | (none) | Kubernetes label selector to filter which nodes are tracked (e.g., `environment=production,!spot`). Uses set-based syntax. Filtered server-side via the node informer |
| `--disable-node-metrics` | `false` | Disable per-node metrics to reduce cardinality (only emit cluster-wide and label-group metrics) |
| `--log-level` | `info` | Log level: `debug`, `info`, `warn`, `error` |
| `--log-format` | `json` | Log format: `json`, `text` |
| `--resync-period` | `30m` | Informer cache resync period (e.g., `1m`, `30s`, `1h30m`) |
| `--list-page-size` | `500` | Number of resources to fetch per page during initial sync (0 = no pagination) |
All endpoints are served on port `:9101` by default.
| Endpoint | Purpose |
|---|---|
| `/metrics` | Prometheus metrics (configured via `--metrics-path`) |
| `/sync` | Cache sync status - returns JSON with last sync time, age, and sync state |
| `/healthz` | Liveness probe - returns 200 if process is alive |
| `/readyz` | Readiness probe - returns 200 if informer cache is synced, 503 otherwise |
```shell
# Build binary
go build -o kube-binpacking-exporter .

# Build Docker image
docker build -t kube-binpacking-exporter:dev .
```

```shell
# Run all tests
go test -v ./...

# Run tests with coverage
go test -v -coverprofile=coverage.out ./...
go tool cover -func=coverage.out   # summary
go tool cover -html=coverage.out   # detailed HTML report

# Run a specific test
go test -v -run TestCalculatePodRequest

# Race detector
go test -race ./...
```

Tests use mock listers — no cluster required. See TESTING.md for full details.
```shell
go vet ./...
golangci-lint run
helm lint chart
```

See TODO.md for planned features and improvements.
This project was developed with the assistance of AI agents, specifically Claude Code. All code has been reviewed and approved by the maintainer.
MIT