feat: defaults-only sidecar injection without AgentRuntime CR#276
feat: defaults-only sidecar injection without AgentRuntime CR#276pdettori merged 3 commits intokagenti:mainfrom
Conversation
…CR exists The webhook previously skipped sidecar injection when no AgentRuntime CR was found. This change removes that gate so workloads with the kagenti.io/type label receive sidecars using platform + namespace defaults even without a CR. A new DefaultsConfigReconciler watches cluster and namespace ConfigMaps and updates the kagenti.io/config-hash annotation on workloads that are not managed by an AgentRuntime CR, triggering rolling updates when defaults change. Closes: RHAIENG-4177 Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com> Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
cb39d10 to
4ecf46d
Compare
| Scheme *runtime.Scheme | ||
| } | ||
|
|
||
| func (r *DefaultsConfigReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) { |
There was a problem hiding this comment.
This is a part of webhook for now. Ideally we would want the WH to be light weight, and eventually this should be a separate reconciler added to the operator.
- Remove unused `defaultsLog` variable (golangci-lint/unused) - Fix gofmt alignment in test struct literals (golangci-lint/gofmt) - Add `kagenti.io/inject: disabled` to noproto-agent E2E fixture so sidecar injection doesn't break the pause container deployment Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com> Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com> Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
cwiklik
left a comment
There was a problem hiding this comment.
Review Summary
Well-structured feature that fills an important gap: pods with kagenti.io/type label now get sidecar injection even without an AgentRuntime CR, using platform + namespace defaults. The new DefaultsConfigReconciler correctly watches cluster/namespace ConfigMaps and propagates config-hash changes to unmanaged workloads. Pod mutator change is minimal (3-line removal of early return).
The reconciler uses proper controller-runtime patterns: retry.RetryOnConflict, client.IgnoreNotFound, error accumulation via errors.Join, and server-side label filtering for namespace listing. Test coverage is comprehensive (18 tests covering happy path, managed/unmanaged, deletion, multi-namespace, idempotency, feature gates).
Areas reviewed: Go (reconciler, pod mutator, tests), E2E fixtures
Commits: 2 commits, all signed-off ✓
CI status: all passing (E2E pending trigger)
| } | ||
|
|
||
| // isClusterConfigMapKey checks whether a NamespacedName refers to one of the | ||
| // cluster-level defaults ConfigMaps. Used for both live objects and deletion |
There was a problem hiding this comment.
suggestion (non-blocking): namespacesWithKagentiWorkloads() lists all Deployments + StatefulSets cluster-wide with the kagenti.io/type label on every cluster ConfigMap change. The label selector makes this server-side filtered, which is fine for moderate clusters. For very large clusters, consider adding a cache or index to avoid the double full-list. Low priority — current approach is correct and simple.
| } | ||
| if arOverrides == nil { | ||
| mutatorLog.Info("Skipping mutation: no matching AgentRuntime CR found", | ||
| mutatorLog.Info("No AgentRuntime CR found, injecting with defaults-only config", |
There was a problem hiding this comment.
👍 Clean change — removing the early return and letting ResolveConfig handle nil overrides is the right call. The log message clearly distinguishes this path.
| labels: | ||
| app.kubernetes.io/name: noproto-agent | ||
| kagenti.io/type: agent | ||
| kagenti.io/inject: disabled |
There was a problem hiding this comment.
nit: Worth adding a comment explaining why noproto-agent needs kagenti.io/inject: disabled — it's not obvious to future readers that this fixture specifically tests operator behavior without sidecar injection, and the new defaults-only path would otherwise inject into it.
pdettori
left a comment
There was a problem hiding this comment.
Review Summary
Well-structured PR that changes a key behavior: the webhook now injects sidecars using platform/namespace defaults even when no AgentRuntime CR exists, making CR creation optional rather than a prerequisite. The new DefaultsConfigReconciler propagates ConfigMap changes to unmanaged workloads. Tests are comprehensive (18 new unit tests + updated integration test).
Two lint issues must be fixed before merge, and the E2E failure needs a rerun to determine if it's a flake.
CI Status
- Build, Unit Tests, Integration Tests, DCO, Helm Lint, CodeQL, Trivy, Hadolint, ShellCheck, YAML Lint — all pass
- Lint: unused var
defaultsLog+ gofmt formatting in test file - E2E:
AgentCard > should not create AgentCard for workload without protocol label— webhook service unreachable (context deadline exceeded). This looks like a timing flake (webhook pod not ready), not a logic regression, but should be confirmed with a rerun since this PR touches the webhook.
Areas reviewed: Go (reconciler, webhook, tests), CI
All AgentCard E2E fixtures (echo-agent, audit-agent, signed-agent) now set kagenti.io/inject=disabled on their PodTemplateSpec. These tests validate AgentCard sync behaviour, not sidecar injection, and the defaults-only injection path would otherwise inject sidecars whose infrastructure dependencies are not present in the E2E environment. Also adds a TODO for field indexer optimisation on namespacesWithKagentiWorkloads per reviewer feedback. Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com> Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com> Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Summary
DefaultsConfigReconcilerwatches cluster and namespace ConfigMaps and updateskagenti.io/config-hashon workloads not managed by an AgentRuntime CR, triggering rolling updates when defaults changeCloses: RHAIENG-4177
Changes
internal/webhook/injector/pod_mutator.gointernal/webhook/injector/defaults_config_reconciler.gointernal/webhook/injector/defaults_config_reconciler_test.gointernal/webhook/injector/pod_mutator_test.gointernal/webhook/v1alpha1/authbridge_webhook_test.gocmd/main.goauthBridgeWebhooksEnabled()guardTest plan
TestInjectAuthBridge_NoAgentRuntime_InjectsWithDefaults— verifies envoy-proxy, spiffe-helper, proxy-init injectedTestDefaultsConfigReconciler_UpdatesUnmanagedWorkload— config-hash set on orphaned DeploymentTestDefaultsConfigReconciler_SkipsManagedWorkload— managed workloads untouchedTestDefaultsConfigReconciler_ConfigMapDeleted_UpdatesWorkloads— deletion triggers re-hashTestDefaultsConfigReconciler_MultiNamespaceFanOut— cluster ConfigMap change fans out to all namespacesTestDefaultsConfigReconciler_MixedManagedAndUnmanaged— only unmanaged workloads updatedTestDefaultsConfigReconciler_IdempotentWhenHashUnchanged— no spurious updateskagenti.io/type=agentand no CRSigned-off-by: Varsha Prasad Narsing varshaprasad96@gmail.com