Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IP address and policy removal on connection update #433

Merged
merged 3 commits into from
Mar 25, 2022

Conversation

LionelJouin
Copy link
Member

  • Based on interfaces addresses, IPs in IPContext are filtered: to add/to removed
  • Based on memory, Policy routes are filtered: to add/to removed
  • Doesn't support forwarder resiliency. If the forwarder restart, the existing policy routes are forgotten

issue: #426

* Based on interfaces addresses, IPs in IPContext are filtered: to add/to removed

Signed-off-by: Lionel Jouin <[email protected]>
Comment on lines 161 to 162
toAdd := []*net.IPNet{}
toRemove := []*net.IPNet{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: you can append into nil slices

Suggested change
toAdd := []*net.IPNet{}
toRemove := []*net.IPNet{}
var toAdd []*net.IPNet{}
var toRemove []*net.IPNet{}

if !ok {
counter.Inc()
tableID = int(counter.Load())
ps, ok := tableIDs.Load(mechanism.GetNetNSURL())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps before that we should check if len(conn.Context.IpContext.Policies) == 0 { return nil } so as not to create new tableID entry.

@@ -254,6 +282,8 @@ func delRule(ctx context.Context, handle *netlink.Handle, policy *networkservice
return errors.WithStack(err)
}

// TODO: Flush table
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that we need to delete an entry from Maps. Perhaps after deleting current NSC, a new one will come with the same mechanism.GetNetNSURL(). It may cause an error

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of having mechanism.GetNetNSURL() as key is to sync the table IDs for every connection handled by a pod, if we use, for instance, the connection ID as key, and the pod requests 2 NSM connections with policies, there might be table ID collision.

Otherwise, with the mechanism.GetNetNSURL() as key, I am not sure when to delete the entry from the map

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, got it.
I thought that we can delete here an entry from tableIDs.Load(mechanism.GetNetNSURL()).policies
And somewhere here https://github.com/networkservicemesh/sdk-kernel/blob/main/pkg/kernel/networkservice/connectioncontextkernel/ipcontext/iprule/common.go#L246
we can check:

if len(tableIDs.Load(mechanism.GetNetNSURL()).policies) == 0 {
  tableIDs.Delete(mechanism.GetNetNSURL()) 
  }

Do I understand correctly? Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think this is a good idea. And probably we could iterate over the policy routes entries in the tableIDs map instead of the ones in the connection.

Also, before implementing it, flushing the table should be done since the counter will be reset. Otherwise when the entry will be remove and a new one will be added again, and tables will contain old routes.

}
ps.policies[tableID] = policy
tableIDs.Store(mechanism.GetNetNSURL(), ps)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add ps to tableIDs after loop?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure, if for some reason, routeAdd/ruleAdd fails, the previously added rules/routes will be forgotten, so I think it is better to keep the tableIDs update whithin the loop.

@NikitaSkrynnik
Copy link
Contributor

NikitaSkrynnik commented Mar 22, 2022

@LionelJouin I've built cmd-forwarder-vpp with your changes and launched examples/features/policy-based-routing test. It looks like iprule doesn't create tables for some routes. As you can see the second route is in default table. It also doesn't have nsm-1 interface. I tried to debug it but didn't find anything yet.

image

Could you please check this test?

@LionelJouin
Copy link
Member Author

@NikitaSkrynnik I just tried it:
With the changes:

/ # ip rule
0:      from all lookup local
32764:  from all ipproto udp lookup 2
32765:  from 172.16.2.201/24 ipproto tcp lookup 1
32766:  from all lookup main
32767:  from all lookup default
/ # ip route show table 1
172.16.3.0/24 via 172.16.2.200 dev nsm-1 onlink 
/ # ip route show table 2
Dump terminated
/ # ip -6 rule
0:      from all lookup local
32765:  from 2004::3/120 ipproto udp lookup 3
32766:  from all lookup main
/ # ip -6 route show table 3
2004::/120 via 2004::6 dev nsm-1 metric 1024 onlink pref medium
/ # ip route
default via 10.244.1.1 dev eth0 
10.244.1.0/24 via 10.244.1.1 dev eth0 src 10.244.1.4 
10.244.1.1 dev eth0 scope link src 10.244.1.4 
172.16.1.100 dev nsm-1 
172.16.2.0/24 dev nsm-1 proto kernel scope link src 172.16.2.201 
/ # ip -6 route
2004::/120 dev nsm-1 proto kernel metric 256 pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
fe80::/64 dev nsm-1 proto kernel metric 256 pref medium

Without the changes:

/ # ip rule
0:      from all lookup local
32764:  from all ipproto udp lookup 2
32765:  from 172.16.2.201/24 ipproto tcp lookup 1
32766:  from all lookup main
32767:  from all lookup default
/ # ip route show table 1
172.16.3.0/24 via 172.16.2.200 dev nsm-1 onlink 
/ # ip route show table 2
default dev nsm-1 
/ # ip -6 rule
0:      from all lookup local
32765:  from 2004::3/120 ipproto udp lookup 3
32766:  from all lookup main
/ # ip -6 route show table 3
2004::/120 via 2004::6 dev nsm-1 metric 1024 onlink pref medium
/ # ip route
default via 10.244.2.1 dev eth0 
10.244.2.0/24 via 10.244.2.1 dev eth0 src 10.244.2.6 
10.244.2.1 dev eth0 scope link src 10.244.2.6 
172.16.1.100 dev nsm-1 
172.16.2.0/24 dev nsm-1 proto kernel scope link src 172.16.2.201 
/ # ip -6 route
2004::/120 dev nsm-1 proto kernel metric 256 pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
fe80::/64 dev nsm-1 proto kernel metric 256 pref medium

With the changes, the table 2 is missing the default route. I removed these lines:
https://github.com/networkservicemesh/sdk-kernel/blob/main/pkg/kernel/networkservice/connectioncontextkernel/ipcontext/iprule/common.go#L73
I will add them again to fix it.

* Based on memory, Policies are filtered: to add/to removed
* Doesn't support forwarder resiliency. If the forwarder restart, the
  existing policy routes are forgotten

Signed-off-by: Lionel Jouin <[email protected]>
* Flush table
* Remove empty ip rule entries on close
* Remove unused code

Signed-off-by: Lionel Jouin <[email protected]>
@LionelJouin
Copy link
Member Author

LionelJouin commented Mar 23, 2022

@NikitaSkrynnik @glazychev-art @denis-tingaikin I fixed what have been discussed. Now the tables are flushed on connection close and the entries are removed from the map if they are empty.

@glazychev-art
Copy link
Contributor

glazychev-art commented Mar 24, 2022

Thank you @LionelJouin !

We've tested this solution, and found a problem - mechanism.GetNetNSURL() can be different, even if the Request comes from the same NSC.

Example:
NSC requests the NSE-1

  1. Forwarder receives the Request with mechanism.GetNetNSURL() is inode://4/4026533591
  2. recvFDServer receives NetNSURL, and mechanism.GetNetNSURL() becomes file:///proc/13811/fd/19

Next, NSC requests the NSE-2:

  1. Forwarder receives the Request with mechanism.GetNetNSURL() is inode://4/4026533591 (the same as in the first Request)
  2. recvFDServer receives NetNSURL, and mechanism.GetNetNSURL() becomes file:///proc/13811/fd/34(different from the first Request)

Since the use of mechanism.GetNetNSURL() as a key solves the problem of identifying the same client, we have the following options:

1. Fix recvFDServer

It uses a map with ConnectionId as a key. Perhaps, we should change it to the NetNSURL (this will be inode://4/4026533591 from the example above)
https://github.com/networkservicemesh/sdk/blob/main/pkg/networkservice/common/mechanisms/recvfd/server.go#L53-L56

2. Add additional field to API

Currently we use only InodeURL field, and change it as needed (file:// or inode://).
https://github.com/networkservicemesh/api/blob/main/pkg/api/networkservice/mechanisms/common/constants.go#L45-L46
Perhaps, we need to something like InodeURLOrigin and always store there the original inode://... value.

3. Dump client for existing tables

We can try to use dump NSC using netlink (if possible), and determine which tables are already in use.

@glazychev-art
Copy link
Contributor

@edwarnicke @denis-tingaikin @NikitaSkrynnik
Any thoughts?

@denis-tingaikin
Copy link
Member

@LionelJouin , @edwarnicke , @glazychev-art I feel we need to try option 3

@denis-tingaikin denis-tingaikin merged commit 61d18a5 into networkservicemesh:main Mar 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants