"Calico node failed to start" when scaling out docker cluster #301

Open
aledsage opened this issue May 4, 2016 · 0 comments

aledsage commented May 4, 2016

Using clocker 1.2.0-SNAPSHOT (at commit 7c9346c, while testing a couple of unrelated fixes for issues #288 and #290)...

I successfully deployed a 2 host clocker+calico cluster in BlueBox. I then deployed many entities that created containers (using Brooklyn's MachineEntity) to cause the cluster to auto-scale.

It created a third host, but this hung on startup (waiting forever for post-start to finish). It is blocked on the call: SdnAgent agent = Entities.attributeSupplierWhenReady(this, SdnAgent.SDN_AGENT).get();

Looking at the CalicoNode for that host, its service.state is "ON_FIRE" and its service.isUp is "false". Looking in the debug log (grep -E "OKsRTXuY|10.101.1.162"), I see the following error:

2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout] Pulling Docker image calico/node:v0.19.0
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout] Running Docker container with the following command:
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout] docker run -d --restart=always --net=host --privileged --name=calico-node -e HOSTNAME=brooklyn-o6o7oy-aled-clocker-bl-fgdo-docker-host-hhfw-bb3 -e IP=10.101.1.162 -e IP6= -e CALICO_NETWORKING=true -e AS= -e NO_DEFAULT_POOLS= -e ETCD_AUTHORITY=10.101.1.162:2379 -e ETCD_SCHEME=http -v /var/log/calico:/var/log/calico -v /lib/modules:/lib/modules -v /var/run/calico:/var/run/calico calico/node:v0.19.0
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout] Calico node is running with id: 06dc7cbec5c7241fbdf0dec2cecce312908f7ce90224e90844b5a494765b6b1c
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout] Waiting for successful startup
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout] Traceback (most recent call last):
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]   File "startup.py", line 295, in <module>
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]     main()
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]   File "startup.py", line 285, in main
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]     _ensure_host_tunnel_addr(ipv4_pools, ipip_pools)
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]   File "startup.py", line 55, in _ensure_host_tunnel_addr
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]     _assign_host_tunnel_addr(ipip_pools)
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]   File "startup.py", line 74, in _assign_host_tunnel_addr
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]     host=hostname
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]   File "/usr/lib/python2.7/site-packages/pycalico/datastore.py", line 128, in wrapped
2016-05-04 22:03:08,221 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]     return fn(*args, **kwargs)
2016-05-04 22:03:08,222 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]   File "/usr/lib/python2.7/site-packages/pycalico/ipam.py", line 618, in auto_assign_ips
2016-05-04 22:03:08,222 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]     pool[0], host)
2016-05-04 22:03:08,222 DEBUG brooklyn.SSH [brooklyn-execmanager-FnS0lXyr-1063]: launching CalicoNodeImpl{id=OKsRTXuY}, on machine SshMachineLocation[10.101.1.162:[email protected]/10.101.1.162:22(id=hKrZGyax)], completed: return status 0
2016-05-04 22:03:08,222 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]   File "/usr/lib/python2.7/site-packages/pycalico/ipam.py", line 723, in _auto_assign
2016-05-04 22:03:08,222 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]     ipam_config)
2016-05-04 22:03:08,222 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]   File "/usr/lib/python2.7/site-packages/pycalico/ipam.py", line 189, in _new_affine_block
2016-05-04 22:03:08,222 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout]     "wrong attributes" % pool)
2016-05-04 22:03:08,222 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout] pycalico.datastore_errors.PoolNotFound: Requested pool 50.0.3.0/24 is not configured or haswrong attributes
2016-05-04 22:03:08,222 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout] Calico node failed to start
2016-05-04 22:03:08,222 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout] Pulling Docker image calico/node-libnetwork:v0.8.0
2016-05-04 22:03:08,222 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout] Calico libnetwork driver is running with id: dc8372dbd5e8e821dfc102f1d6e89c1384592870cd0766316d365bbae496ae1d
2016-05-04 22:03:08,222 DEBUG brooklyn.SSH [Thread-24165]: [[email protected]:stdout] Executed /tmp/brooklyn-20160504-220243832-D1kk-launching_CalicoNodeImpl_id_OK.sh, result 0
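
The traceback above ends in PoolNotFound for pool 50.0.3.0/24. As a rough illustration of the kind of check that produces such an error, here is a minimal sketch. It is not pycalico's implementation: require_pool, the local PoolNotFound class, and the configured pool list are all hypothetical, and the real check may also compare pool attributes.

```python
from ipaddress import ip_network


class PoolNotFound(Exception):
    """Local stand-in for pycalico.datastore_errors.PoolNotFound (illustrative only)."""


def require_pool(requested, configured):
    """Return the requested pool if it exactly matches a configured pool, else raise."""
    wanted = ip_network(requested)
    known = {ip_network(cidr) for cidr in configured}
    if wanted not in known:
        raise PoolNotFound("Requested pool %s is not configured" % wanted)
    return wanted


# Hypothetical pool list that does not contain the pool named in the traceback.
configured_pools = ["192.168.0.0/16"]
try:
    require_pool("50.0.3.0/24", configured_pools)
    result = "found"
except PoolNotFound as exc:
    result = str(exc)
print(result)
```

If something like this is what is happening, the new host would be asking for a pool that the datastore it reads from does not list, but that is a guess from the error text alone.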

It then goes on to repeatedly fail the check-running for CalicoNodeImpl{id=OKsRTXuY}.

2016-05-04 22:05:14,762 DEBUG brooklyn.SSH [brooklyn-execmanager-FnS0lXyr-1348]: check-running CalicoNodeImpl{id=OKsRTXuY}, on machine SshMachineLocation[10.101.1.162:[email protected]/10.101.1.162:22(id=hKrZGyax)], completed: return status 1
2016-05-04 22:05:14,762 DEBUG brooklyn.SSH [Thread-33364]: [[email protected]:stdout] calico-node container not running
2016-05-04 22:05:14,762 DEBUG brooklyn.SSH [Thread-33364]: [[email protected]:stdout] Executed /tmp/brooklyn-20160504-220514011-ZJFA-check-running_CalicoNodeImpl_i.sh, result 1
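
For anyone trying to pull the same lines out of their own debug log, the grep filter mentioned earlier can be sketched in Python roughly as follows. The function name filter_log and the two sample lines are made up for illustration; the pattern matches the entity id and host IP used in this report, with the dots escaped so that similar addresses do not match.

```python
import re


def filter_log(lines, pattern=r"OKsRTXuY|10\.101\.1\.162"):
    """Keep only the lines that mention the entity id or the host IP."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


# Abbreviated, hypothetical sample lines (not copied from the real log).
sample = [
    "2016-05-04 22:03:08,222 DEBUG brooklyn.SSH launching CalicoNodeImpl{id=OKsRTXuY}",
    "2016-05-04 22:03:09,000 DEBUG brooklyn.SSH unrelated entity on 10.101.1.16",
]
matches = filter_log(sample)
for line in matches:
    print(line)
```

To run it against a real file, pass the open file object instead of the sample list, e.g. filter_log(open(path)) where path points at the Brooklyn debug log.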