[RFC] Add the ability to give HTTP filters names, and a utility to get them.#44356
Open
nshipilov wants to merge 206 commits intoenvoyproxy:mainfrom
Open
[RFC] Add the ability to give HTTP filters names, and a utility to get them.#44356nshipilov wants to merge 206 commits intoenvoyproxy:mainfrom
nshipilov wants to merge 206 commits intoenvoyproxy:mainfrom
Conversation
|
Hi @nshipilov, welcome and thank you for your contribution. We will try to review your Pull Request as quickly as possible. In the meantime, please take a look at the contribution guidelines if you have not done so already. |
Member
|
could you fix DCO please @nshipilov |
Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
<!-- !!!ATTENTION!!! If you are fixing *any* crash or *any* potential security issue, *do not* open a pull request in this repo. Please report the issue via emailing envoy-security@googlegroups.com where the issue will be triaged appropriately. Thank you in advance for helping to keep Envoy secure. !!!ATTENTION!!! For an explanation of how to fill out the fields, please see the relevant section in [PULL_REQUESTS.md](https://github.com/envoyproxy/envoy/blob/main/PULL_REQUESTS.md) !!!ATTENTION!!! Please check the [use of generative AI policy](https://github.com/envoyproxy/envoy/blob/main/CONTRIBUTING.md?plain=1#L41). You may use generative AI only if you fully understand the code. You need to disclose this usage in the PR description to ensure transparency. --> Commit Message: Additional Description: Risk Level: Testing: Docs Changes: Release Notes: Platform Specific Features: [Optional Runtime guard:] [Optional Fixes #Issue] [Optional Fixes commit #PR or SHA] [Optional Deprecated:] [Optional [API Considerations](https://github.com/envoyproxy/envoy/blob/main/api/review_checklist.md):] --------- Signed-off-by: tyxia <tyxia@google.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…xy#43544) Commit Message: Adds extension common util for parsing AWS eventstream Additional Description: this PR adds common utility for parsing the AWS eventstream encoding which is used in various AWS products for streaming content. More details here https://smithy.io/2.0/aws/amazon-eventstream.html. Part of the work for envoyproxy#43492 Risk Level: Testing: Unit + fuzzy Docs Changes: Release Notes: Platform Specific Features: [Optional Runtime guard:] [Optional Fixes:] [Optional Fixes commit #PR or SHA] [Optional Deprecated:] [Optional [API Considerations](https://github.com/envoyproxy/envoy/blob/main/api/review_checklist.md):] --------- Signed-off-by: Peter Leng <yleng@pinterest.com> Signed-off-by: Peter Leng <peterleng1234@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
Uses of sed and xargs assumed GNU extensions. Fix the commands so that they'll work on either BSD or GNU versions of tools. Signed-off-by: Greg Greenway <ggreenway@apple.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…igh watermark (envoyproxy#42176) Commit Message: Add timeout guard for connections stuck above buffer high watermark * extend listener and cluster APIs with `per_connection_buffer_high_watermark_timeout`, allowing operators to cap how long downstream or upstream connections can remain at/above their configured buffer high watermark * close connections when their buffers stay at/above the watermark past the configured timeout Background: In our large (internal) multi-tenant use case where Envoy runs as a host-agent shared by all the pods on the host, a buggy pods can accept connections but fail to drain them, leaving Envoy buffers full until overload-manager kicks in or OOMs. This change clamps such stalled connections once they remain fully buffered for the configured duration. Additional Description: N/A Risk Level: Low Testing: Unit tests Docs Changes: N/A Release Notes: N/A Platform-Specific Features: N/A --------- Signed-off-by: Ronak Jain <ronakjainc@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
Add the filter state key to the IS_ENVOY_BUG messages to help with debugging Risk Level: low Testing: unit tests Signed-off-by: Elisha Ziskind <eziskind@google.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…rap DM (envoyproxy#43962) ## Description This PR added listener lifecycle callbacks including `on_listener_add_or_update` and `on_listener_removal` to the dynamic module bootstrap extension enabling modules to respond to listeners lifecycle events. --- **Commit Message:** dynamic_modules: add listener lifecycle event callbacks to the bootstrap DM **Additional Description:** Added listener lifecycle callbacks including `on_listener_add_or_update` and `on_listener_removal` for Bootstrap. **Risk Level:** Low **Testing:** Added Tests **Docs Changes:** Added **Release Notes:** N/A Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…#43703) ## Description This PR adds support for writing new upstream extensions using Dynamic Modules. --- **Commit Message:** dynamic_modules: add dynamic modules support for upstream extensions **Additional Description:** Adds support for writing new upstream extensions using Dynamic Modules. **Risk Level:** Low **Testing:** Added Unit + Integration Tests **Docs Changes:** Added **Release Notes:** Added --------- Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…ic modules ABI (envoyproxy#43964) ## Description This PR added a process-wide shared data registry to the dynamic modules ABI. --- **Commit Message:** dynamic_modules: add a process-wide shared data registry to the dynamic modules ABI **Additional Description:** Added a process-wide shared data registry to the dynamic modules ABI. **Risk Level:** Low **Testing:** Added Tests **Docs Changes:** Added **Release Notes**: Added Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…oxy#43965) The clang toolchain is now hermetic and automatically downloaded by Bazel, so bazel/setup_clang.sh no longer exists. Updated `bazel/README.md` to reflect the current setup process, fixed a stale comment in `.bazelrc`, and removed outdated shell_commands in `.bazelci/presubmit.yml`. Signed-off-by: stekole <stefan@sandnetworks.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…nvoyproxy#43957) Commit Message: I believe that when you check out a repo and run tests, they should work. Tests that do not make sense on the platform you are running should be skipped. In this case the underlying system throws if you don't have the right processor, so I changed the test to just catch it, check the exception string is what's expected, and stop. Additional Description: Risk Level: low (test-only) Testing: Docs Changes: Release Notes: Platform Specific Features: [Optional Runtime guard:] [Optional Fixes #Issue] [Optional Fixes commit #PR or SHA] [Optional Deprecated:] [Optional [API Considerations](https://github.com/envoyproxy/envoy/blob/main/api/review_checklist.md):] --------- Signed-off-by: Joshua Marantz <jmarantz@google.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…nvoyproxy#43952) ## Summary - Document the evaluation order of rewrite rules within a matched `local_reply_config` mapper: `body` → `headers_to_add` → `status_code` → `body_format_override` - Explain that `%RESPONSE_CODE%` resolves to different values in `headers_to_add` (original code) vs `body_format_override` (overridden code) due to this ordering - Provide a workaround: capture the original code in a response header and reference it via `%RESP(header-name)%` Verified against `source/common/local_reply/local_reply.cc` lines 103-132 (matchAndRewrite) and lines 178-207 (rewrite). Fixes envoyproxy#37577 Risk Level: Low (docs only) Testing: N/A Docs Changes: Added evaluation order documentation to local_reply.rst Release Notes: N/A Signed-off-by: Kit <kit@kovan.dev> 🤖 Generated with [Claude Code](https://claude.com/claude-code) Signed-off-by: Kit <kit@kovan.dev> Signed-off-by: kovan <xaum.io@gmail.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…y#43987) On macOS, the primary group for regular users is GID 20 (staff). run_envoy_docker.sh always unconditionally overwrites USER_GID with the result of id -g, then passes that value into the Docker container entrypoint. The entrypoint runs groupmod -g "$USER_GID" envoybuild to remap the envoybuild group to match the host user's GID so that volume-mounted file permissions work correctly. However, GID 20 is already assigned to a different group inside the Linux-based build container (e.g. dialout), so groupmod fails with: ``` groupmod: GID '20' already exists ``` Because USER_GID was unconditionally assigned, any attempt to override it by prefixing the command (e.g. USER_GID=1000 ./ci/run_envoy_docker.sh bash) had no effect — the script always clobbered the value. Fix Changed the USER_UID and USER_GID assignments in run_envoy_docker.sh to use the existing environment variable value if already set, falling back to id -u / id -g only when not provided: ``` USER_UID="${USER_UID:-$(id -u)}" USER_GID="${USER_GID:-$(id -g)}" ``` This allows macOS users to override the GID with one that doesn't conflict with groups already present inside the container: ``` USER_GID=1000 ./ci/run_envoy_docker.sh bash ``` Signed-off-by: Juan Manuel Ollé <jolle@salesforce.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com> Signed-off-by: Ryan Northey <ryan@synca.io> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com> Signed-off-by: Ryan Northey <ryan@synca.io> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com> Signed-off-by: Ryan Northey <ryan@synca.io> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com> Signed-off-by: Ryan Northey <ryan@synca.io> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
… implementation from Stats::AllocatorImpl to Stats::Allocator (envoyproxy#43968) A long time ago, all Envoy stats were stored in a large shared memory block, significantly limiting how many you could get, and the lengths of their names. This facilitated keeping stat continuity across parent/child hot-restart. The fixed shared-mem block size was very limiting and a bit inefficient, so an alternative mechanism of keeping stats on the heap was added for when hot-restart was not needed. To switch between them, we used the Allocator interface, and kept two implementations of that. Then we removed the shared-memory version in favor of using message passing to update counters from parent to child during hot restart. Now it's time to remove that interface; it's making it a bit harder to reason about guaranteeing that stats users don't call the allocator directly; it's really an implementation detail of the stats system. So this is a pure refactor (no functional change) which can help that, and make it easier to reason about the safety of envoyproxy#43958 . As removing headers and class names can be a breaking change, temporary forwarding headers are left behind, temporarily; out-of-repo referencess should be changed by June. Signed-off-by: Joshua Marantz <jmarantz@google.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…yproxy#43974) **Commit Message**: dynamic_modules: remote source: add caching for fetched modules **Motivation** Follow up from envoyproxy#43818 and envoyproxy#43333 Next PR: `nack_on_cache_miss` support (NACK on cache miss + background fetch) **Additional Description**: `newDynamicModuleFromBytes` already writes fetched modules to a deterministic path based on SHA256. This PR adds a check before the async fetch: if the file already exists on disk, load it directly and skip the HTTP fetch entirely. It is a simple approach that forms base for the next PR. **Risk Level**: Low **Testing**: Added unit tests **Docs Changes**: NA **Release Notes**: Added Platform Specific Features: NA [Optional Runtime guard:] Not needed [Optional Fixes #Issue]: NA [Optional Fixes commit #PR or SHA]: NA [Optional Deprecated:]: NA [Optional [API Considerations](https://github.com/envoyproxy/envoy/blob/main/api/review_checklist.md):] NA --------- Signed-off-by: Anurag Aggarwal <kanurag94@gmail.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
Commit Message: mcp: add tools/call response transcoding. Additional Description: Risk Level: Low Testing: Unit test and Integration test Docs Changes: N/A Release Notes: N/A Platform Specific Features: N/A [Optional Runtime guard:] [Optional Fixes #Issue] [Optional Fixes commit #PR or SHA] [Optional Deprecated:] [Optional [API Considerations](https://github.com/envoyproxy/envoy/blob/main/api/review_checklist.md):] --------- Signed-off-by: Yilin Guo <guoyilin@google.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…anic safety (envoyproxy#43828) Fixes envoyproxy#43827 Adds an opt-in `CatchUnwind<F>` wrapper type to the Rust SDK that module authors can use to catch panics in filter callbacks. When a wrapped filter panics: 1. The inner filter is immediately dropped (preventing access to potentially corrupted state). 2. The panic payload is logged via `envoy_log_error!`. 3. The wrapper fails closed — sends a 500 response for HTTP filters, closes the connection for network filters, and returns `StopIteration` for all status-returning methods. 4. Subsequent status-returning callbacks on a poisoned filter abort the process (this should never happen if the fail-closed mechanism works correctly). 5. Void cleanup callbacks (e.g. `on_destroy`, `on_stream_complete`) silently skip on a poisoned filter. Module authors opt in by wrapping their filter at construction time: ```rust fn new_http_filter(&self, envoy: &mut EHF) -> Box<dyn HttpFilter<EHF>> { Box::new(CatchUnwind::new(MyFilter::new())) } ``` Without the wrapper, panics abort the process as before (the default Rust behavior at `extern "C"` boundaries). Init and config callbacks are intentionally not wrapped — panics during startup indicate fatal misconfiguration and should crash. **AI Disclosure:** This PR was developed with AI assistance (Claude via Cursor). **Risk Level:** Low **Testing:** Unit tests for HTTP, network, and listener filter panic paths. **Docs Changes:** Release note in `changelogs/current.yaml` + dynamic modules docs update. **Release Notes:** Rust SDK adds `CatchUnwind` wrapper for opt-in panic safety in filter callbacks. --------- Signed-off-by: Conrad Ludgate <conradludgate@gmail.com> Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com> Co-authored-by: Rohit Agrawal <rohit.agrawal@databricks.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…oyproxy#44009) Commit Message: deps: switch aspect_bazel_lib to download from official releases Additional Description: The bazel-lib published release tarballs include a generated file `tools/integrity.bzl` which is stubbed out in the regular source tree (ex: https://github.com/bazel-contrib/bazel-lib/blob/v2.21.2/tools/integrity.bzl). If this file is missing, some bazel-lib rules which make use of extra toolchains will break due to the missing checksums when downloading binaries. None of the affected rules are used in Envoy right now, but this can affect downstream consumers. Risk Level: low Testing: Docs Changes: Release Notes: Platform Specific Features: [Optional Runtime guard:] [Optional Fixes #Issue] [Optional Fixes commit #PR or SHA] [Optional Deprecated:] [Optional [API Considerations](https://github.com/envoyproxy/envoy/blob/main/api/review_checklist.md):] Signed-off-by: Joe Kralicky <joekralicky@gmail.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…t populated (envoyproxy#43876) Fixing a case where the initial_resource_versions field in the delta-xDS discovery request isn't being populated correctly when xDS-Failover is used, and the primary doesn't respond on the first attempt. Risk Level: low - xDS-Failover is gated by an already existing `envoy.restart_features.xds_failover_support` and is still experimental. Testing: Added an integration test. Docs Changes: N/A Release Notes: Added. Platform Specific Features: N/A Signed-off-by: Adi Suissa-Peleg <adip@google.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…envoyproxy#43975) Commit Message: userspace socket: rename some IoHandle functions to be less confusing Additional Description: Userspace IoHandleImpl implements a Network::IoHandle read/write interface to be exposed to the transport socket. It also implements a UserSpace::IoHandle interface, exposed to the peer handle, to indicate to the peer whether its receive buffer can accept more data. Rather confusingly, this is described as being "writable". This overloading of terms means that a handle can be blocked for `write()`s while `isWritable()` returns true, since it is the handle's _peer_ whose `write()` function can now be called because the peer is what calls `isWritable()`. Similarly, `isWritable()` means that the handle is writable by the peer, while `isReadable()` means that the handle is readable _by the owner_. I think this would be less confusing by avoiding most references to "write" in the UserSpace::IoHandle API, except where it refers specifically to the `write()` function on the same handle: - `setWriteEnd()` is called by the peer to promise that it will send no more data -> `setEof()` - `isPeerShutDownWrite()` is called to determine if the above has happened -> `hasReceivedEof()` - `getWriteBuffer()` -> `getReceiveBuffer()` - `isWritable()` is called to determine if the receive buffer can accept data -> `canReceiveData()` - `isPeerWritable()` _actually_ indicates whether write() calls will succeed -> `isWritable()` Risk Level: none; renames some symbols Testing: existing CI Docs Changes: n/a Release Notes: n/a --------- Signed-off-by: Eugene Chan <eugenechan@google.com> Signed-off-by: pianiststickman <34144687+pianiststickman@users.noreply.github.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…#43934) Fixed bug where providing an empty inner socket config would cause Envoy to crash. Risk Level: Low. Testing: UT Docs Changes: n/a Release Notes: done --------- Signed-off-by: Tony Allen <txallen@google.com> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
envoyproxy#44348) …ep (envoyproxy#43715)" This reverts commit d74384b. --------- Signed-off-by: Ryan Northey <ryan@synca.io> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
Signed-off-by: Ryan Northey <ryan@synca.io> Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
…to resolve it Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
Author
Done. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Commit Message: Add the ability to give HTTP filters names, and a utility to get them (or fall back to RTTI if not overwritten).
Additional Description:
At Uber, we have added logging in filter_manager.h/cc for filter operations (ctors, dtors, OnDestroy, encodeHeaders, decodeHeaders) to debug slow p99.9 forward latencies in our filter chains.
To do so, we need a way to identify filters, and this is my proposal for how to do so. It allows gradual onboarding with a reasonable fallback.
Additionally, we would like some guidance on whether or not you would be open to adding logging similar to what we have upstream.
Currently, we have a configurable "slowness" level in runtime configuration, and filter_manager emits logs if specific functions cross that threshold. On top of this, we could emit stats about these latencies in the filter_manager.
Let me know what you folks think, and if the direction is agreeable, I can put up a pull request. Thanks!
Risk Level: Low
Testing: Unit tests.
Docs Changes: None
Release Notes: None
Platform Specific Features: None