pimd sometimes doesn't recover after link drops and comes back #183

Open
masaraksh79 opened this issue Dec 17, 2020 · 4 comments

Comments

@masaraksh79

Running v2.3.2 patched (with the fixes for #79 and #137), we are experiencing an issue with pimd not recovering after a radio link goes down (the radio link is between Radio1 and Radio2). It does not happen every time, but it reliably shows up after a few link down/up cycles.

We would really appreciate any clue towards a solution, since this issue has a noticeable effect on our multicast.
Our network looks like this (our pimd is only running on Radio1 and Radio2):

[Image: network diagram, "Screen Shot 2020-12-17 at 14 53 35"]

Note: in logs our comments are inside # lines for easier reading.

################################################################################
We have captured logs at the highest log level with pimd in both scenarios:
first when it does not recover (pimd-no-recover.txt; it recovers only if pimd
is restarted), and second when it does recover (pimd-recover.txt). At the
beginning of each log we added timestamps of when the link was taken down and
brought back up (+/- a few seconds).

We noticed one message that appears in the non-recovery scenario but not when
pimd manages to recover:

...
00:10:54.043 delete_mrtentry_all_kernel_cache: SG
00:10:54.044 Removed MFC entry src 10.200.55.101, grp 239.0.4.1
...
################################################################################
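One way to check whether other messages are also unique to the failing run is to compare the two attached logs by message type rather than line by line. The following is only a minimal sketch: it assumes every log line starts with an `HH:MM:SS.mmm` timestamp, as in the excerpt above, and the helper names (`normalize`, `only_in_first`, `compare_files`) are hypothetical and not part of pimd.

```python
import re

# Assumed line shape, taken from the two excerpt lines: "HH:MM:SS.mmm message"
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d+\s+")
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
NUMBER = re.compile(r"\b\d+\b")


def normalize(line):
    """Strip the timestamp and mask addresses and numbers, so lines that
    differ only in those details count as the same kind of message."""
    msg = TIMESTAMP.sub("", line.strip())
    msg = IPV4.sub("<ip>", msg)
    msg = NUMBER.sub("<n>", msg)
    return msg


def message_kinds(lines):
    """Return the set of distinct normalized messages in an iterable of lines."""
    return {normalize(line) for line in lines if line.strip()}


def only_in_first(first_lines, second_lines):
    """Message kinds that occur in the first log but never in the second."""
    return sorted(message_kinds(first_lines) - message_kinds(second_lines))


def compare_files(first_path, second_path):
    """Convenience wrapper that reads both logs from disk."""
    with open(first_path, errors="replace") as first, \
         open(second_path, errors="replace") as second:
        return only_in_first(first, second)
```

Calling `compare_files("pimd-no-recover.txt", "pimd-recover.txt")` would list the message kinds seen only in the non-recovery log. Masking addresses and numbers keeps the output short, at the cost of hiding differences that exist only in those values.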

In a good state, pimd status on each side of the link is as follows:

pimd-logs.txt

Please let me know if you require any additional info. We have a setup where the problem is quite reproducible, so we can try out anything you suggest.

Thanks ahead!

Best regards
Yakir

@troglobit
Owner

I hope you understand that this is incredibly hard for me to help you with. Working on this project has been a hobby for the last 10+ years, nobody pays me, and even if they did I don't have the time anyway since $DAYJOB takes up 110% of my waking time.

The only thing I can do is ask you to please test the latest GIT master branch and see if that works better.

A few years back I got patches from a company to support point-to-point links. Some of them have been merged; look for "Ventus" in the GIT log. Here's the rest, from an old git stash, which probably doesn't apply cleanly on anything but should give an idea of what's left: ventus-point-to-point_patch.txt

It would be great if you could help test both the latest master and this remaining patch set. That would greatly increase the chances of having a new release out during my time off over the holiday season.

@masaraksh79
Author

Fully understand, we're in similar boats in this sense heh, we shall try out the latest. I'll update the ticket once done. Cheerios!

@troglobit
Owner

Thank you! <3

@masaraksh79
Author

While we're looking at the patches, we are trying to fire all cannons. Dear @troglobit or other devs involved in pimd, we are looking for support in our attempt to resolve this issue. My current employer has hired a Linux dev with no past experience with PIM or the project to debug this, and is willing to pay for consultancy from people with relevant expertise. Please contact me at my work email [email protected] if you're interested in discussing this short-term opportunity. Cheers
