Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added redistribute live and check_xroutes #114

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

Dando-Real-ITA
Copy link
Contributor

I was unable to reliably make the fake default route IPv4 and IPv6 on dev lo work. For example, sometimes babeld on the distributing server was setting the xroute metric to unreachable even if it was ok, and only at one of ipv4 or ipv6.
Result was that the client would get, with another default route retracted, one of ipv4 or ipv6 completely unreachable

I am not sure what was causing the metric not reflect the state, but I suppose it was having 2 default routes on the same master table

Using redistribute live update + check_xroutes I was able to have consistent results of refmetric on the client with any combination or order of default routes working or not

The router id allows to assign origin-based pref-src.

A use case is, when installing default routes originated by two edge routers with different subnets, and the client has 2 ips matching the separate subnets

install ip 0.0.0.0/0                  eq 0    id 00:02:00:00:00:00:1d:01      pref-src 66.199.5.163
install ip 0.0.0.0/0                  eq 0    id 00:02:00:00:00:00:1d:00      pref-src 12.144.66.186
When two similar routes with pref-src were switched, the pref-src was not updated to the new one.
With this fix, switch_routes passes the new->src to change_route, which is used to reapply the install_filter and retrieve the newpref_src from the filter_result.
kernel_route for netlink has been updated to support the newpref_src on ROUTE_MODIFY
It allows to change the metric of an existing redistribute filter using the live config
Allows to send an update of a redistribute metric changed to INFINITY
It adds a command check_xroutes that does not ignore the INFINITY metric of kernel routes
@jech
Copy link
Owner

jech commented Jul 8, 2024

Thanks for the PR. Sorry for the delay, you submitted it at a bad time and then it fell into the cracks.

At first sight it looks good, please give me a few days to review it carefully.

@Dando-Real-ITA
Copy link
Contributor Author

No problem, if interested after these commits I added a multi default support, however it is a more pervasive change and to work requires policy rules and to install the multiple default routes in separate tables

Also have a few changes to support table id > 255 and a delay on first update when using hmac to after the cycle is running and the handshake is completed

I am on hold currently on the project but I will use it to upgrade our network to multi-isp clos model

@jech
Copy link
Owner

jech commented Jul 20, 2024

I've applied the first two patches of your PR.

In the third patch, I have a problem: why is applying the install filter conditional on newsrc being non-null? I think that's a bug, we should be applying the install filter unconditionally in change_route and flushing the route if the install_filter causes it to be discarded.

The fourth patch is incomplete. It only works in a very specific case (the filter already exists and only the add_metric field is being changed), which is fine, no problem only doing the specific case that solves your problem. However, if the requested operation could not be performed, it does not return an error message to the user.

As to the last patch, I don't understand why it is useful. It also generates an error message that does not fit the established pattern (error messages start with either "no" or "bad").

@Dando-Real-ITA
Copy link
Contributor Author

3rd patch: change_route is called by multiple code paths that do not change the src: install_route, uninstall_route, change_route_metric. All these do not change the pref_src, and the first run of the install_filter in change_route gets pref_src, and by default newpref_src = pref_src;. This means that in these cases, the code works as before patch.

Only in switch_routes there is a new source, which causes to reevaluate the install_filter to retrieve the newpref_src

4th patch: Yes, I handled not finding the filter as a noop, the update filter was meant for that very specific case, but I could add a fail return message

5th patch: It is needed to send the metric change update when the route is changed to deny. Before patch, redistribute filters are fixed once the configuration is read, so check_xroutes when it finds a route with deny, ignores it and no further processing happens, including sending updates to peers. With the patch, a route could switch from having a valid metric to be deny, in this case a retraction must be sent to peers. Only in this case, which is manually triggered by the new command, the INFINITY metric routes are not ignored so that an update can be sent. All other calls to check_xroutes keep the default behaviour. I can update the message

@jech
Copy link
Owner

jech commented Jul 27, 2024

3rd patch: change_route is called by multiple code paths that do not change the src: install_route, uninstall_route, change_route_metric. All these do not change the pref_src, and the first run of the install_filter in change_route gets pref_src, and by default newpref_src = pref_src;. This means that in these cases, the code works as before patch.

Agreed, but I think that the existing code is buggy in case of a non-trivial install filter. I could be wrong, but I think we need to invoke install_filter unconditionally in switch_routes.

4th patch: Yes, I handled not finding the filter as a noop, the update filter was meant for that very specific case, but I could add a fail return message

You really need to add proper error handling there.

5th patch: It is needed to send the metric change update

Hmm.

First of all, it's not strictly needed, since the routes will still expire. However, it might take some time.

I'm not keen on adding a manual trigger. Can we work things out so that the update happens automatically?

@Dando-Real-ITA
Copy link
Contributor Author

Agreed, but I think that the existing code is buggy in case of a non-trivial install filter. I could be wrong, but I think we need to invoke install_filter unconditionally in switch_routes.

install_filter in change_route is always executed on the babel_route *route, and again if also source *newsrc is passed. What other use case do you mean?

You really need to add proper error handling there.

Done

Hmm.

First of all, it's not strictly needed, since the routes will still expire. However, it might take some time.

I'm not keen on adding a manual trigger. Can we work things out so that the update happens automatically?

A first option could be to trigger CONFIG_ACTION_CHECK_XROUTES directly with the redistribute update. However in my use case I change 4 redistribute before updating the xroutes, and would trigger the check_xroutes 4 times instead of 1:

echo "
redistribute ip 0.0.0.0/0 eq 0 $METRIC
redistribute ip 0.0.0.0/0 eq 0 proto 3 $METRIC
redistribute ip ::/0      eq 0 $METRIC
redistribute ip ::/0      eq 0 proto 3 $METRIC
check_xroutes
" | socat - UNIX-CONNECT:/var/run/babeld.sock 1>&2 > /dev/null

Another option could be to remove the infinity check altogether from:

babeld/xroute.c

Line 491 in 130366f

if(i < numroutes && routes[i].metric >= INFINITY) {

Then at kernel-check-interval timeout the new metric would be detected and update sent instead of ignored

The kernel default routes are never touched, as they are used by a timed external script to check if internet is available, only the redistribution parameter in babel is changed. So there is no kernel netlink update to use to trigger an update.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants