Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IP and policy removal not working on connection update #426

Open
LionelJouin opened this issue Mar 11, 2022 · 17 comments
Open

IP and policy removal not working on connection update #426

LionelJouin opened this issue Mar 11, 2022 · 17 comments
Assignees
Labels
bug Something isn't working enhancement New feature or request
Milestone

Comments

@LionelJouin
Copy link
Member

After some testing with this change: networkservicemesh/sdk#1231. I found it was possible to add new IP addresses and policy routes by just requesting (updating) a new time without having to close the connection.

The issue I noticed is that it is not possible to remove any IP or policy route while they are removed from the IP context in the request.

@denis-tingaikin
Copy link
Member

denis-tingaikin commented Mar 13, 2022

I feel this is totally related to networkservicemesh/cmd-forwarder-vpp#441

To resolve the issue we should rework each chain element from sdk-vpp and add a chec on change.
If we have a change then we need to delete previosly created things.
otherwise do as we do currently.

@denis-tingaikin
Copy link
Member

cc @edwarnicke

@denis-tingaikin denis-tingaikin added the bug Something isn't working label Mar 13, 2022
@LionelJouin
Copy link
Member Author

LionelJouin commented Mar 13, 2022

Right, it is related. Just in my use case, the NSC is intentionally changing the content of the IP context. From my tests, the ipadress and iprule chain elements are working fine for adding but not for removing.

@edwarnicke
Copy link
Member

@LionelJouin That likely is related to needing to update the chain elements that do that work.

@denis-tingaikin One other thing to watch carefully is that we are only doing the work on the return stroke of the call in those chain elements. If the NSE is OK allowing the NSC to update IP context... that's fine... but its not OK to act on NSC requests to update IPContext until the NSE has returned those updated (meaning it thinks they are valid).

@LionelJouin Does that make sense to you?

@LionelJouin
Copy link
Member Author

yes, it's clear to me.

@denis-tingaikin
Copy link
Member

@LionelJouin networkservicemesh/sdk#1234 is merged. Could you test is it solves the problem?

@LionelJouin
Copy link
Member Author

I tried it, it works fine, It solved this issue: networkservicemesh/sdk#1231

@edwarnicke
Copy link
Member

@LionelJouin and does it also solve your issue here with IP policy removal on connection update? Or is that still open?

@LionelJouin
Copy link
Member Author

LionelJouin commented Mar 14, 2022

no, it is not solved, this issue is still open (#426).

@LionelJouin
Copy link
Member Author

@edwarnicke @denis-tingaikin How the forwarder resiliency could be handled? Handling IP addresses would be easy since it is possible to get the IP addresses by checking the interface directly and then compare them to the ones in the IPContext to find which ones have to be added or removed. But I believe this would be more complex for the policy routes, how the forwarder could remember which rule/table belongs to which policy? (in case the forwarder crashes/restarts/...)

@glazychev-art
Copy link
Contributor

Hello @LionelJouin,
Maybe I misunderstood you. Is the problem still present after this fix? - #437

@LionelJouin
Copy link
Member Author

Hi, yes, I think so. For instance, if the forwarder restarts, then the policies map will be emptied, and then the forwarder will loose track of which ip tables/rules it was managing previously.

@glazychev-art
Copy link
Contributor

Ah.. got it.
So, if forwarder restarts - client starts healing with the connection, where we should have our rules. And I think we can restore them in forwarder.
At this line we can check tables map, and if it is empty for a given connection Id - try to find added table IDs using netlink.RuleList().

@edwarnicke
Copy link
Member

@LionelJouin One other thing to keep in mind is that if your NSE wants to trigger a change... it should probably use:

https://github.com/networkservicemesh/api/blob/main/pkg/api/networkservice/connection.proto#L36

Which is to say:

  1. NSE sends a ConnectionEvent over Monitor towards the NSE with State set to REFRESH_REQUESTED
  2. This will cause the NSC to send a new Request
  3. The NSE can return a changed Connection.ConnectionContext

Basically... this way you don't have to wait around for a Refresh to happen at Expiration.

@denis-tingaikin denis-tingaikin added this to the v1.6.0 milestone Aug 17, 2022
@LionelJouin LionelJouin moved this to 👀 In review in Meridio Aug 23, 2022
@LionelJouin LionelJouin moved this from 👀 In review to 🏗 In progress in Meridio Aug 23, 2022
@LionelJouin LionelJouin self-assigned this Aug 23, 2022
@LionelJouin LionelJouin moved this from 🏗 In progress to 👀 In review in Meridio Aug 23, 2022
@LionelJouin LionelJouin moved this from 👀 In review to 🏗 In progress in Meridio Aug 23, 2022
@LionelJouin LionelJouin moved this from 🏗 In progress to 👀 In review in Meridio Aug 25, 2022
@glazychev-art
Copy link
Contributor

@LionelJouin
As far as I can see, all main PRs have been merged.
Can we close this issue?

@LionelJouin
Copy link
Member Author

There is still 2e2 tests to develop and other properties to support. We could keep this issue, otherwise I would open new ones for the e2e and properties

@edwarnicke
Copy link
Member

@LionelJouin Agree on tracking the other issues. I'd suggest opening new issues and referencing this one.

@LionelJouin LionelJouin moved this from 👀 In review to ✅ Done in Meridio Sep 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants