Serf Node Join Issues #745

Open
anjumm opened this issue Oct 1, 2024 · 0 comments

anjumm commented Oct 1, 2024

Hi everyone,

I'm currently working on a network topology using 100 containers running Serf. The setup is a 2-spine, 4-leaf topology, with each leaf hosting 25 nodes. All nodes are in the same subnet, and the configuration is quite simple.

However, I'm facing an issue when trying to join nodes to the Serf cluster:

- Up to 40 nodes, the serf join commands work seamlessly, and all nodes can see each other in the cluster.
- Beyond 40 nodes, I encounter issues like I/O timeout, No route to host, and failed joins.
- I've also noticed a significant increase in ARP broadcast traffic as more nodes are added.

I suspect this may be a network-related issue within the Containerlab setup or Serf's handling of ARP broadcasts. Has anyone encountered similar issues, or does anyone have suggestions on how to mitigate the ARP broadcast or join failures?
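
To give a rough sense of why ARP traffic might grow quickly here, the sketch below counts the directed (requester, target) pairs in a single flat subnet. It assumes that every node eventually needs the MAC address of every other node, which is a simplifying assumption about how gossip spreads contact across the cluster, not a measurement from this setup. The function name `pairwise_arp_resolutions` is made up for illustration.

```python
def pairwise_arp_resolutions(nodes: int) -> int:
    """Upper bound on directed (requester, target) pairs in one flat subnet.

    Assumption: each node resolves the MAC address of every other node
    at least once, so the count grows roughly with the square of the
    cluster size.
    """
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return nodes * (nodes - 1)


if __name__ == "__main__":
    # Sizes taken from the description above: one leaf (25 nodes),
    # the point where joins start failing (40), and the full topology (100).
    for n in (25, 40, 100):
        print(f"{n} nodes -> {pairwise_arp_resolutions(n)} pairs")
```

Under that assumption, going from 40 to 100 nodes raises the pair count from 1560 to 9900, so the neighbor-resolution load grows much faster than the node count itself.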

I changed the gossip interval to 5 seconds to see whether that would resolve the issue, but it did not help.
Furthermore, I use a simple JSON file in each container to initialize Serf:
image
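
The actual file is only visible in the screenshot, so the following is not a copy of it. It is a hypothetical sketch of how per-container files of this kind could be generated, assuming the Serf agent options `node_name`, `bind`, `retry_join`, `retry_interval` and `retry_max_attempts`. The helper name `build_agent_config`, the `10.0.0.x` addresses, the `node-NNN` naming and the retry values are all illustrative assumptions.

```python
import json


def build_agent_config(index: int, seed_addr: str,
                       subnet_prefix: str = "10.0.0.",
                       port: int = 7946) -> dict:
    """Return a Serf agent config dict for container number `index` (1-based).

    All addresses and names here are placeholders, not values from the
    original setup.
    """
    if not 1 <= index <= 254:
        raise ValueError("index must fit in the last octet of a /24")
    return {
        "node_name": f"node-{index:03d}",
        "bind": f"{subnet_prefix}{index}:{port}",
        # Keep retrying the seed instead of failing on the first timeout.
        "retry_join": [seed_addr],
        "retry_interval": "5s",
        "retry_max_attempts": 10,
    }


if __name__ == "__main__":
    # Print the config for the third container as pretty-printed JSON.
    print(json.dumps(build_agent_config(3, "10.0.0.1:7946"), indent=2))
```

Letting the agent retry the join from its own config, rather than issuing a single `serf join` per node, is one way to tolerate transient timeouts while the cluster is still converging.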

Thanks in advance!

I have added images for review:

image

One of the container's interfaces:
image
