Skip to content

vhds: use vhds virtual host over rds one upon name conflict#44401

Open
Arkelenia wants to merge 1 commit intoenvoyproxy:mainfrom
Arkelenia:frederic/vhds-rds-collision-resolution
Open

vhds: use vhds virtual host over rds one upon name conflict#44401
Arkelenia wants to merge 1 commit intoenvoyproxy:mainfrom
Arkelenia:frederic/vhds-rds-collision-resolution

Conversation

@Arkelenia
Copy link
Copy Markdown

Commit Message: This PR allows VHDS to explicitly take precedence if virtual host names collide.
Additional Description: If a virtual host is distributed through VHDS and RDS at the same time, then all further VHDS or RDS updates are stuck.
Risk Level: Medium
Testing: added unit tests
Docs Changes: Nothing as the docs already indicate that VHDS is meant to take precedence over RDS.
Release Notes: Allow VHDS to take precedence over RDS for virtual hosts with colliding names
Platform Specific Features: N/A

@Arkelenia Arkelenia had a problem deploying to external-contributors April 12, 2026 01:10 — with GitHub Actions Error
@repokitteh-read-only
Copy link
Copy Markdown

Hi @Arkelenia, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #44401 was opened by Arkelenia.

see: more, trace.

@Arkelenia Arkelenia force-pushed the frederic/vhds-rds-collision-resolution branch from c72de58 to 0db315e Compare April 12, 2026 03:43
@Arkelenia Arkelenia temporarily deployed to external-contributors April 12, 2026 03:43 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@wdauchy wdauchy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice. A few suggestions below.

Comment thread source/common/router/route_config_update_receiver_impl.cc
Comment thread source/common/router/route_config_update_receiver_impl.cc
if (vh.name() == "overlapping_vhost") {
// The VHDS version should have "vhost.vhds.domain", not "vhost.rds.domain"
EXPECT_EQ("vhost.vhds.domain", vh.domains(0));
found_vhds_version = true;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertions here iterate and match by name, but never fail explicitly if overlapping_vhost is missing. If the loop doesn't find it, found_vhds_version stays false and the EXPECT_TRUE below catches it — that works, but the failure message would be opaque.

Consider a helper or ASSERT_TRUE(found_vhds_version) << "VHDS version of overlapping_vhost not found"; for a clearer diagnostic on failure.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

envoy_grpc:
cluster_name: xds_cluster
)EOF");
config_update_info->onRdsUpdate(updated_route_config, "2");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't assert that overlapping_vhost was actually found during the loop. If the vhost is somehow absent, the loop silently passes. Add a boolean + final assertion like the first test does.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

Comment thread test/common/router/vhds_test.cc
@Arkelenia Arkelenia had a problem deploying to external-contributors April 13, 2026 15:37 — with GitHub Actions Error
@Arkelenia Arkelenia force-pushed the frederic/vhds-rds-collision-resolution branch from 4b7d962 to 6080190 Compare April 13, 2026 15:49
@Arkelenia Arkelenia temporarily deployed to external-contributors April 13, 2026 15:49 — with GitHub Actions Inactive
@wdauchy
Copy link
Copy Markdown
Contributor

wdauchy commented Apr 13, 2026

@botengyao do you think you could take a look?

Signed-off-by: Frederic Hemery <frederic.hemery@datadoghq.com>
@Arkelenia Arkelenia force-pushed the frederic/vhds-rds-collision-resolution branch from 6080190 to 3d79966 Compare April 13, 2026 20:43
@Arkelenia Arkelenia temporarily deployed to external-contributors April 13, 2026 20:44 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@adisuissa adisuissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the issue.
Can you add a pointer to the relevant docs?
Looking at the proto description and the docs it is unclear why VHDS take precedence (just that the virtual-host is merged between the RDS and VHDS).

@Arkelenia
Copy link
Copy Markdown
Author

@adisuissa The proto description you linked explicitly states that VHDS is supposed to take precedence over RDS when they get merged. I think the choice was made with the intent of RDS providing fleet-wide defaults and VHDS providing node-specific virtual hosts. I'll defer to you on whether we want to change the behavior considering that no one can currently rely on it anyway.

Do you want me to link those ion the PR description or commit message?

@adisuissa
Copy link
Copy Markdown
Contributor

@adisuissa The proto description you linked explicitly states that VHDS is supposed to take precedence over RDS when they get merged. I think the choice was made with the intent of RDS providing fleet-wide defaults and VHDS providing node-specific virtual hosts. I'll defer to you on whether we want to change the behavior considering that no one can currently rely on it anyway.

Do you want me to link those ion the PR description or commit message?

AFAICT it says the following:

The contents of these two fields will be merged to generate a routing table for a given RouteConfiguration, with ``vhds`` derived configuration taking precedence.

My reading of the code is that it takes the virtual host from the routes and then adds the virtual hosts from the VHDS. Should it be that the contents of the virtual-hosts are merged instead?

@adisuissa
Copy link
Copy Markdown
Contributor

I'm not saying your approach is incorrect, just that I'm not sure what was the intent of the original VHDS users.

@Arkelenia
Copy link
Copy Markdown
Author

Arkelenia commented Apr 14, 2026

Ah I see, I think I misunderstood your original message. Sorry about that. My read on it was that virtual host contents would not be merged together and instead one vhost would take precedence over the other upon name collisions. But you are right that the documentation is very vague on what constitutes a collision and what merging means.

My reading of the code is that it takes the virtual host from the routes and then adds the virtual hosts from the VHDS. Should it be that the contents of the virtual-hosts are merged instead?

The code today only appends VHDS vhosts to the RDS vhost array but merging the content of vhosts might have been the original intent. I think that merging the contents of a vhost is quite tricky though, especially when it comes to routes because the ordering has semantic meaning there. We could try to build some heuristic for merging routes if we want to go in that direction. That being said, if someone is already able to leverage VHDS, they already have the control plane infrastructure in place to merge the vhosts together and distribute them through VHDS (relying on the fact that the whole VHDS vhost will take precedence). I think that merging two colliding vhosts is too ambiguous for Envoy to get consistently right. What do you think?

I can also update the docs to make it more explicit what collision and merge mean in the context of VHDS.

@wdauchy
Copy link
Copy Markdown
Contributor

wdauchy commented Apr 15, 2026

Ah I see, I think I misunderstood your original message. Sorry about that. My read on it was that virtual host contents would not be merged together and instead one vhost would take precedence over the other upon name collisions. But you are right that the documentation is very vague on what constitutes a collision and what merging means.

My reading of the code is that it takes the virtual host from the routes and then adds the virtual hosts from the VHDS. Should it be that the contents of the virtual-hosts are merged instead?

The code today only appends VHDS vhosts to the RDS vhost array but merging the content of vhosts might have been the original intent. I think that merging the contents of a vhost is quite tricky though, especially when it comes to routes because the ordering has semantic meaning there. We could try to build some heuristic for merging routes if we want to go in that direction. That being said, if someone is already able to leverage VHDS, they already have the control plane infrastructure in place to merge the vhosts together and distribute them through VHDS (relying on the fact that the whole VHDS vhost will take precedence). I think that merging two colliding vhosts is too ambiguous for Envoy to get consistently right. What do you think?

I can also update the docs to make it more explicit what collision and merge mean in the context of VHDS.

+1 lets make the doc clearer in the same PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants