During the last two Dydx updates (v0.1.2 and v0.1.3), we had to stop the Dydx service. Once the updates were finished, the node caught up with the network. After that, we noticed that we were no longer signing on the dydx-testnet-2 network.
However:
from Horcrux's point of view, everything is working and we are signing correctly
from the dydx network's point of view, we are not signing
It seems that the signature is not propagated to the network in time (we are not sure about that).
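One way to confirm the network-side view without relying on an explorer is to inspect the commit stored on chain. The following is a minimal sketch, assuming the node exposes the standard CometBFT RPC and that the JSON returned by `/block` for height H+1 has already been loaded into a dict (its `last_commit` carries the signatures for height H). The validator address and the sample response are made up for illustration, and the accepted `block_id_flag` encodings are an assumption about the response format.

```python
# Sketch: check whether a validator's signature made it into a block's commit.
# `block_response` is assumed to be the parsed JSON of a CometBFT `/block` RPC call.

COMMIT_FLAGS = {2, "2", "BLOCK_ID_FLAG_COMMIT"}  # assumed encodings of "signed the block"


def signed_in_commit(block_response, validator_address):
    """Return True if `validator_address` has a commit signature in last_commit."""
    signatures = block_response["result"]["block"]["last_commit"]["signatures"]
    target = validator_address.upper()
    for sig in signatures:
        if (sig.get("validator_address") or "").upper() == target:
            return sig.get("block_id_flag") in COMMIT_FLAGS
    return False


# Hypothetical, hand-written response fragment (not real chain data).
sample_block = {
    "result": {
        "block": {
            "last_commit": {
                "height": "379479",
                "signatures": [
                    {"block_id_flag": 2, "validator_address": "A" * 40},
                    {"block_id_flag": 1, "validator_address": ""},  # absent validator
                ],
            }
        }
    }
}

print(signed_in_commit(sample_block, "a" * 40))
print(signed_in_commit(sample_block, "B" * 40))
```

If this reports the signature as missing for heights that the signer logs show as signed, the vote was produced but did not reach the commit, which is consistent with the propagation hypothesis rather than proving it.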
Environment:
Horcrux cluster with 3 remote signers (Brussels): v2.1.1
1 dydx node (Helsinki): v0.1.3 (running since the beginning of this network)
Here is a test I ran this morning with Horcrux v2.1.1 and Dydx v0.1.3:
Stop the dydx service for ~1 minute
Start dydx again
Let the node catch up on the few missed blocks
Then, on Mintscan for example, we can see that we didn't sign block 379479 (nor the other blocks within ±60 of it), even though the node was fully synchronized:
But on the horcrux remote signer :
Shard 1 :
| | 2023-08-23 11:34:37.373 | I[2023-08-23\|11:34:37.373] Signed with share module=validator height=379509 round=1 step=3 |
| | 2023-08-23 11:34:15.266 | I[2023-08-23\|11:34:15.266] Signed with share module=validator height=379505 round=0 step=3 |
| | 2023-08-23 11:34:02.535 | I[2023-08-23\|11:34:02.534] Signed with share module=validator height=379502 round=1 step=2 |
| | 2023-08-23 11:33:50.031 | I[2023-08-23\|11:33:50.031] Signed with share module=validator height=379501 round=2 step=2 |
| | 2023-08-23 11:33:44.274 | I[2023-08-23\|11:33:44.274] Signed with share module=validator height=379501 round=1 step=3 |
| | 2023-08-23 11:33:34.729 | I[2023-08-23\|11:33:34.729] Signed with share module=validator height=379500 round=0 step=3 |
| | 2023-08-23 11:33:30.843 | I[2023-08-23\|11:33:30.843] Signed with share module=validator height=379498 round=0 step=2 |
| | 2023-08-23 11:33:27.539 | I[2023-08-23\|11:33:27.539] Signed with share module=validator height=379496 round=1 step=3 |
| | 2023-08-23 11:33:21.171 | I[2023-08-23\|11:33:21.171] Signed with share module=validator height=379495 round=0 step=3 |
| | 2023-08-23 11:33:07.282 | I[2023-08-23\|11:33:07.281] Signed with share module=validator height=379490 round=0 step=3 |
| | 2023-08-23 11:32:42.336 | I[2023-08-23\|11:32:42.336] Signed with share module=validator height=379477 round=0 step=3 |
| | 2023-08-23 11:32:37.784 | I[2023-08-23\|11:32:37.784] Signed with share module=validator height=379476 round=0 step=3 |
| | 2023-08-23 11:32:16.417 | I[2023-08-23\|11:32:16.416] Signed with share module=validator height=379465 round=0 step=3 |
| | 2023-08-23 11:32:11.598 | I[2023-08-23\|11:32:11.597] Signed with share module=validator height=379464 round=0 step=2 |
| | 2023-08-23 11:32:06.579 | I[2023-08-23\|11:32:06.579] Signed with share module=validator height=379461 round=0 step=3 |
| | 2023-08-23 11:32:04.629 | I[2023-08-23\|11:32:04.628] Signed with share module=validator height=379460 round=0 step=2 |
| | 2023-08-23 11:32:03.199 | I[2023-08-23\|11:32:03.198] Signed with share module=validator height=379459 round=0 step=3
Shard 2 :
| | 2023-08-23 11:32:54.776 | D[2023-08-23\|11:32:54.776] I am not the raft leader. Proxying request to the leader module=validator |
| | 2023-08-23 11:32:54.627 | I[2023-08-23\|11:32:54.627] Signed vote module=validator height=379483 round=0 type=SIGNED_MSG_TYPE_PREVOTE sig=FC1740A77F0D ts=1692783174 node=tcp://1.2.3.4:1234 |
| | 2023-08-23 11:32:54.616 | I[2023-08-23\|11:32:54.616] Signed with share module=validator height=379483 round=0 step=2 |
| | 2023-08-23 11:32:54.569 | D[2023-08-23\|11:32:54.569] I am not the raft leader. Proxying request to the leader module=validator |
| | 2023-08-23 11:32:53.040 | I[2023-08-23\|11:32:53.040] Signed vote module=validator height=379482 round=0 type=SIGNED_MSG_TYPE_PRECOMMIT sig=7A1273B4DD4F ts=1692783172 node=tcp://1.2.3.4:1234 |
| | 2023-08-23 11:32:53.033 | I[2023-08-23\|11:32:53.033] Signed with share module=validator height=379482 round=0 step=3 |
| | 2023-08-23 11:32:52.981 | D[2023-08-23\|11:32:52.981] I am not the raft leader. Proxying request to the leader module=validator |
| | 2023-08-23 11:32:52.819 | I[2023-08-23\|11:32:52.819] Signed vote module=validator height=379482 round=0 type=SIGNED_MSG_TYPE_PREVOTE sig=F32A5636014E ts=1692783172 node=tcp://1.2.3.4:1234 |
| | 2023-08-23 11:32:52.811 | I[2023-08-23\|11:32:52.811] Signed with share module=validator height=379482 round=0 step=2 |
| | 2023-08-23 11:32:52.757 | D[2023-08-23\|11:32:52.757] I am not the raft leader. Proxying request to the leader module=validator |
| | 2023-08-23 11:32:51.322 | I[2023-08-23\|11:32:51.322] Signed vote module=validator height=379481 round=0 type=SIGNED_MSG_TYPE_PRECOMMIT sig=B87A29FF55FB ts=1692783171 node=tcp://1.2.3.4:1234 |
| | 2023-08-23 11:32:51.314 | I[2023-08-23\|11:32:51.314] Signed with share module=validator height=379481 round=0 step=3 |
| | 2023-08-23 11:32:51.266 | D[2023-08-23\|11:32:51.266] I am not the raft leader. Proxying request to the leader module=validator |
| | 2023-08-23 11:32:50.991 | I[2023-08-23\|11:32:50.991] Signed vote module=validator height=379481 round=0 type=SIGNED_MSG_TYPE_PREVOTE sig=499F07D44A5B ts=1692783170 node=tcp://1.2.3.4:1234 |
| | 2023-08-23 11:32:50.983 | I[2023-08-23\|11:32:50.983] Signed with share module=validator height=379481 round=0 step=2 |
| | 2023-08-23 11:32:50.937 | D[2023-08-23\|11:32:50.937] I am not the raft leader. Proxying request to the leader module=validator |
| | 2023-08-23 11:32:49.500 | I[2023-08-23\|11:32:49.500] Signed vote module=validator height=379480 round=0 type=SIGNED_MSG_TYPE_PRECOMMIT sig=EE240DF6B917 ts=1692783169 node=tcp://1.2.3.4:1234 |
| | 2023-08-23 11:32:49.495 | I[2023-08-23\|11:32:49.494] Signed with share module=validator height=379480 round=0 step=3 |
| | 2023-08-23 11:32:49.443 | D[2023-08-23\|11:32:49.443] I am not the raft leader. Proxying request to the leader module=validator |
| | 2023-08-23 11:32:49.341 | I[2023-08-23\|11:32:49.341] Signed vote module=validator height=379480 round=0 type=SIGNED_MSG_TYPE_PREVOTE sig=4E879F90A788 ts=1692783169 node=tcp://1.2.3.4:1234 |
| | 2023-08-23 11:32:49.335 | I[2023-08-23\|11:32:49.335] Signed with share module=validator height=379480 round=0 step=2 |
| | 2023-08-23 11:32:49.288 | D[2023-08-23\|11:32:49.288] I am not the raft leader. Proxying request to the leader module=validator |
| | 2023-08-23 11:32:47.697 | I[2023-08-23\|11:32:47.697] Signed vote module=validator height=379479 round=0 type=SIGNED_MSG_TYPE_PRECOMMIT sig=862FC30D860C ts=1692783167 node=tcp://1.2.3.4:1234 |
| | 2023-08-23 11:32:47.688 | I[2023-08-23\|11:32:47.688] Signed with share module=validator height=379479 round=0 step=3 |
| | 2023-08-23 11:32:47.633 | D[2023-08-23\|11:32:47.633] I am not the raft leader. Proxying request to the leader module=validator |
| | 2023-08-23 11:32:47.386 | I[2023-08-23\|11:32:47.386] Signed vote module=validator height=379479 round=0 type=SIGNED_MSG_TYPE_PREVOTE sig=298E01D041AC ts=1692783167 node=tcp://1.2.3.4:1234 |
| | 2023-08-23 11:32:47.376 | I[2023-08-23\|11:32:47.376] Signed with share module=validator height=379479 round=0 step=2 |
| | 2023-08-23 11:32:47.329 | D[2023-08-23\|11:32:47.328] I am not the raft leader. Proxying request to the leader module=validator |
| | 2023-08-23 11:32:45.936 | I[2023-08-23\|11:32:45.936] Signed vote module=validator height=379478 round=0 type=SIGNED_MSG_TYPE_PRECOMMIT sig=019D744A6BAB ts=1692783165 node=tcp://1.2.3.4:1234 |
| | 2023-08-23 11:32:45.929 | I[2023-08-23\|11:32:45.929] Signed with share module=validator height=379478 round=0 step=3 |
| | 2023-08-23 11:32:45.882 | D[2023-08-23\|11:32:45.882] I am not the raft leader. Proxying request to the leader module=validator |
| | 2023-08-23 11:32:45.633 | I[2023-08-23\|11:32:45.633] Signed vote module=validator height=379478 round=0 type=SIGNED_MSG_TYPE_PREVOTE sig=9593871AD2AD ts=1692783165 node=tcp://1.2.3.4:1234 |
| | 2023-08-23 11:32:45.624 | I[2023-08-23\|11:32:45.624] Signed with share module=validator height=379478 round=0 step=2 |
| | 2023-08-23 11:32:45.576 | D[2023-08-23\|11:32:45.576] I am not the raft leader. Proxying request to the leader module=validator |
Shard 3 :
| | 2023-08-23 11:32:54.829 | D[2023-08-23\|11:32:54.829] Received signature from 2 module=validator |
| | 2023-08-23 11:32:54.828 | D[2023-08-23\|11:32:54.828] Received signature from 3 module=validator |
| | 2023-08-23 11:32:54.814 | D[2023-08-23\|11:32:54.814] Number of eph parts for peer module=validator peer=2 count=1 |
| | 2023-08-23 11:32:54.814 | D[2023-08-23\|11:32:54.814] Number of eph parts for peer module=validator peer=3 count=1 |
| | 2023-08-23 11:32:54.813 | D[2023-08-23\|11:32:54.813] Have threshold peers module=validator |
| | 2023-08-23 11:32:54.777 | D[2023-08-23\|11:32:54.777] I am the raft leader. Managing the sign process for this block module=validator |
| | 2023-08-23 11:32:54.619 | D[2023-08-23\|11:32:54.618] Done waiting for cosigners, assembling signatures module=validator |
| | 2023-08-23 11:32:54.618 | D[2023-08-23\|11:32:54.618] Received signature from 3 module=validator |
| | 2023-08-23 11:32:54.617 | D[2023-08-23\|11:32:54.617] Received signature from 2 module=validator |
| | 2023-08-23 11:32:54.603 | D[2023-08-23\|11:32:54.602] Number of eph parts for peer module=validator peer=2 count=1 |
| | 2023-08-23 11:32:54.603 | D[2023-08-23\|11:32:54.602] Number of eph parts for peer module=validator peer=3 count=1 |
| | 2023-08-23 11:32:54.602 | D[2023-08-23\|11:32:54.602] Have threshold peers module=validator |
| | 2023-08-23 11:32:54.571 | D[2023-08-23\|11:32:54.571] I am the raft leader. Managing the sign process for this block module=validator |
| | 2023-08-23 11:32:53.034 | D[2023-08-23\|11:32:53.034] Done waiting for cosigners, assembling signatures module=validator |
| | 2023-08-23 11:32:53.034 | D[2023-08-23\|11:32:53.034] Received signature from 2 module=validator |
| | 2023-08-23 11:32:53.033 | D[2023-08-23\|11:32:53.033] Received signature from 3 module=validator |
| | 2023-08-23 11:32:53.019 | D[2023-08-23\|11:32:53.017] Number of eph parts for peer module=validator peer=2 count=1 |
| | 2023-08-23 11:32:53.017 | D[2023-08-23\|11:32:53.017] Have threshold peers module=validator |
| | 2023-08-23 11:32:53.017 | D[2023-08-23\|11:32:53.017] Number of eph parts for peer module=validator peer=3 count=1 |
| | 2023-08-23 11:32:52.983 | D[2023-08-23\|11:32:52.983] I am the raft leader. Managing the sign process for this block module=validator |
| | 2023-08-23 11:32:52.812 | D[2023-08-23\|11:32:52.812] Done waiting for cosigners, assembling signatures module=validator |
| | 2023-08-23 11:32:52.812 | D[2023-08-23\|11:32:52.812] Received signature from 2 module=validator |
| | 2023-08-23 11:32:52.806 | D[2023-08-23\|11:32:52.806] Received signature from 3 module=validator |
| | 2023-08-23 11:32:52.790 | D[2023-08-23\|11:32:52.790] Number of eph parts for peer module=validator peer=2 count=1 |
| | 2023-08-23 11:32:52.790 | D[2023-08-23\|11:32:52.790] Number of eph parts for peer module=validator peer=3 count=1 |
| | 2023-08-23 11:32:52.790 | D[2023-08-23\|11:32:52.790] Have threshold peers module=validator |
| | 2023-08-23 11:32:52.758 | D[2023-08-23\|11:32:52.758] I am the raft leader. Managing the sign process for this block module=validator |
| | 2023-08-23 11:32:51.316 | D[2023-08-23\|11:32:51.316] Done waiting for cosigners, assembling signatures module=validator |
| | 2023-08-23 11:32:51.316 | D[2023-08-23\|11:32:51.316] Received signature from 2 module=validator |
| | 2023-08-23 11:32:51.314 | D[2023-08-23\|11:32:51.314] Received signature from 3 module=validator |
| | 2023-08-23 11:32:51.299 | D[2023-08-23\|11:32:51.299] Number of eph parts for peer module=validator peer=3 count=1 |
| | 2023-08-23 11:32:51.299 | D[2023-08-23\|11:32:51.299] Number of eph parts for peer module=validator peer=2 count=1 |
| | 2023-08-23 11:32:51.299 | D[2023-08-23\|11:32:51.299] Have threshold peers module=validator |
| | 2023-08-23 11:32:51.268 | D[2023-08-23\|11:32:51.268] I am the raft leader. Managing the sign process for this block module=validator |
| | 2023-08-23 11:32:50.984 | D[2023-08-23\|11:32:50.984] Done waiting for cosigners, assembling signatures module=validator |
| | 2023-08-23 11:32:50.983 | D[2023-08-23\|11:32:50.983] Received signature from 2 module=validator |
| | 2023-08-23 11:32:50.983 | D[2023-08-23\|11:32:50.983] Received signature from 3 module=validator |
| | 2023-08-23 11:32:50.969 | D[2023-08-23\|11:32:50.969] Number of eph parts for peer module=validator peer=3 count=1 |
| | 2023-08-23 11:32:50.969 | D[2023-08-23\|11:32:50.969] Number of eph parts for peer module=validator peer=2 count=1 |
| | 2023-08-23 11:32:50.969 | D[2023-08-23\|11:32:50.969] Have threshold peers module=validator |
| | 2023-08-23 11:32:50.938 | D[2023-08-23\|11:32:50.938] I am the raft leader. Managing the sign process for this block module=validator |
| | 2023-08-23 11:32:49.495 | D[2023-08-23\|11:32:49.495] Done waiting for cosigners, assembling signatures module=validator |
| | 2023-08-23 11:32:49.495 | D[2023-08-23\|11:32:49.495] Received signature from 2 module=validator |
| | 2023-08-23 11:32:49.494 | D[2023-08-23\|11:32:49.494] Received signature from 3 module=validator |
| | 2023-08-23 11:32:49.481 | D[2023-08-23\|11:32:49.481] Number of eph parts for peer module=validator peer=3 count=1 |
| | 2023-08-23 11:32:49.481 | D[2023-08-23\|11:32:49.481] Number of eph parts for peer module=validator peer=2 count=1 |
| | 2023-08-23 11:32:49.480 | D[2023-08-23\|11:32:49.480] Have threshold peers module=validator |
| | 2023-08-23 11:32:49.445 | D[2023-08-23\|11:32:49.445] I am the raft leader. Managing the sign process for this block module=validator |
| | 2023-08-23 11:32:49.336 | D[2023-08-23\|11:32:49.336] Done waiting for cosigners, assembling signatures module=validator |
| | 2023-08-23 11:32:49.336 | D[2023-08-23\|11:32:49.336] Received signature from 2 module=validator |
| | 2023-08-23 11:32:49.334 | D[2023-08-23\|11:32:49.334] Received signature from 3 module=validator |
| | 2023-08-23 11:32:49.321 | D[2023-08-23\|11:32:49.321] Number of eph parts for peer module=validator peer=3 count=1 |
| | 2023-08-23 11:32:49.320 | D[2023-08-23\|11:32:49.320] Number of eph parts for peer module=validator peer=2 count=1 |
| | 2023-08-23 11:32:49.320 | D[2023-08-23\|11:32:49.320] Have threshold peers module=validator |
| | 2023-08-23 11:32:49.290 | D[2023-08-23\|11:32:49.290] I am the raft leader. Managing the sign process for this block module=validator |
| | 2023-08-23 11:32:47.691 | D[2023-08-23\|11:32:47.691] Done waiting for cosigners, assembling signatures module=validator |
| | 2023-08-23 11:32:47.690 | D[2023-08-23\|11:32:47.690] Received signature from 3 module=validator |
| | 2023-08-23 11:32:47.689 | D[2023-08-23\|11:32:47.689] Received signature from 2 module=validator |
| | 2023-08-23 11:32:47.675 | D[2023-08-23\|11:32:47.675] Number of eph parts for peer module=validator peer=3 count=1 |
| | 2023-08-23 11:32:47.674 | D[2023-08-23\|11:32:47.673] Number of eph parts for peer module=validator peer=2 count=1 |
| | 2023-08-23 11:32:47.672 | D[2023-08-23\|11:32:47.672] Have threshold peers module=validator |
| | 2023-08-23 11:32:47.635 | D[2023-08-23\|11:32:47.635] I am the raft leader. Managing the sign process for this block module=validator |
| | 2023-08-23 11:32:47.379 | D[2023-08-23\|11:32:47.379] Done waiting for cosigners, assembling signatures module=validator |
| | 2023-08-23 11:32:47.378 | D[2023-08-23\|11:32:47.378] Received signature from 3 module=validator |
| | 2023-08-23 11:32:47.377 | D[2023-08-23\|11:32:47.377] Received signature from 2 module=validator |
| | 2023-08-23 11:32:47.362 | D[2023-08-23\|11:32:47.362] Number of eph parts for peer module=validator peer=3 count=1 |
| | 2023-08-23 11:32:47.362 | D[2023-08-23\|11:32:47.362] Have threshold peers module=validator |
| | 2023-08-23 11:32:47.362 | D[2023-08-23\|11:32:47.362] Number of eph parts for peer module=validator peer=2 count=1 |
| | 2023-08-23 11:32:47.330 | D[2023-08-23\|11:32:47.330] I am the raft leader. Managing the sign process for this block module=validator |
| | 2023-08-23 11:32:45.931 | D[2023-08-23\|11:32:45.931] Done waiting for cosigners, assembling signatures module=validator |
| | 2023-08-23 11:32:45.931 | D[2023-08-23\|11:32:45.931] Received signature from 2 module=validator |
| | 2023-08-23 11:32:45.929 | D[2023-08-23\|11:32:45.929] Received signature from 3 module=validator |
| | 2023-08-23 11:32:45.916 | D[2023-08-23\|11:32:45.916] Number of eph parts for peer module=validator peer=3 count=1 |
| | 2023-08-23 11:32:45.915 | D[2023-08-23\|11:32:45.915] Number of eph parts for peer module=validator peer=2 count=1 |
| | 2023-08-23 11:32:45.915 | D[2023-08-23\|11:32:45.915] Have threshold peers module=validator |
| | 2023-08-23 11:32:45.884 | D[2023-08-23\|11:32:45.884] I am the raft leader. Managing the sign process for this block module=validator |
| | 2023-08-23 11:32:45.626 | D[2023-08-23\|11:32:45.626] Done waiting for cosigners, assembling signatures module=validator |
| | 2023-08-23 11:32:45.626 | D[2023-08-23\|11:32:45.626] Received signature from 3 module=validator |
| | 2023-08-23 11:32:45.626 | D[2023-08-23\|11:32:45.626] Received signature from 2 module=validator |
| | 2023-08-23 11:32:45.610 | D[2023-08-23\|11:32:45.610] Number of eph parts for peer module=validator peer=3 count=1 |
| | 2023-08-23 11:32:45.610 | D[2023-08-23\|11:32:45.610] Number of eph parts for peer module=validator peer=2 count=1 |
| | 2023-08-23 11:32:45.609 | D[2023-08-23\|11:32:45.609] Have threshold peers module=validator
In these logs, we can see that for block 379479 the precommit and prevote votes have been cast.
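The same check can be done mechanically over a longer log window. This is a minimal sketch, assuming the `Signed vote` line format shown in the Shard 2 log above; the function names are hypothetical and not part of Horcrux.

```python
import re
from collections import defaultdict

# Matches the "Signed vote" lines as they appear in the Shard 2 log.
SIGNED_VOTE = re.compile(
    r"Signed vote\s+module=validator\s+height=(\d+)\s+round=(\d+)\s+"
    r"type=SIGNED_MSG_TYPE_(PREVOTE|PRECOMMIT)"
)


def signed_votes_by_height(log_lines):
    """Map each height to the set of vote types the signer reports as signed."""
    votes = defaultdict(set)
    for line in log_lines:
        match = SIGNED_VOTE.search(line)
        if match:
            height, _round, vote_type = match.groups()
            votes[int(height)].add(vote_type)
    return dict(votes)


def fully_signed(votes, height):
    """True if both a prevote and a precommit were signed for `height`."""
    return votes.get(height, set()) >= {"PREVOTE", "PRECOMMIT"}


# Lines copied from the Shard 2 log above.
sample_log = [
    r"I[2023-08-23\|11:32:54.627] Signed vote module=validator height=379483 round=0 type=SIGNED_MSG_TYPE_PREVOTE sig=FC1740A77F0D ts=1692783174 node=tcp://1.2.3.4:1234",
    r"I[2023-08-23\|11:32:47.697] Signed vote module=validator height=379479 round=0 type=SIGNED_MSG_TYPE_PRECOMMIT sig=862FC30D860C ts=1692783167 node=tcp://1.2.3.4:1234",
    r"I[2023-08-23\|11:32:47.386] Signed vote module=validator height=379479 round=0 type=SIGNED_MSG_TYPE_PREVOTE sig=298E01D041AC ts=1692783167 node=tcp://1.2.3.4:1234",
]

votes = signed_votes_by_height(sample_log)
print(fully_signed(votes, 379479))
print(fully_signed(votes, 379483))
```

Height 379483 only has a prevote in this excerpt because the pasted log ends before its precommit, so a result of "not fully signed" at the edge of a window should be read with that in mind.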
If we look at the Horcrux dashboard for the same period:
The graphs show that there is no problem.
To correct this problem, I just have to restart the dydx node without restarting the remote signers:
systemctl restart dydx
What could be the cause? We did set timeout_commit = "999ms" on the dydx node, as the team asked. We would rather not migrate to v3 at the moment, because we can currently reproduce the problem live if you wish (and we don't know whether v3 will fix it).
What I find odd is that this happens almost every time we have to catch up with the network after stopping the dydx service; we were able to reproduce the behaviour several times. The fix is simply to restart the service once it is synchronized.
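To put a number on how long the cluster takes per signature, the leader's debug log can be used. This is a rough sketch, assuming the Shard 3 line format shown above, that measures the time between "I am the raft leader" and "Done waiting for cosigners" for each signing request. The helper names are hypothetical, and comparing the result with the 999 ms `timeout_commit` is only an indication, since that setting is not itself a signing deadline.

```python
import re
from datetime import datetime, timedelta

# Timestamp prefix such as "D[2023-08-23\|11:32:47.330] message ...".
# The backslash before the pipe comes from the pasted table and is optional.
STAMP = re.compile(
    r"[DI]\[(\d{4}-\d{2}-\d{2})\\?\|(\d{2}:\d{2}:\d{2}\.\d{3})\]\s+(.*)"
)


def parse(line):
    """Return (timestamp, message) for a log line, or None if it does not match."""
    match = STAMP.search(line)
    if not match:
        return None
    stamp = datetime.strptime(
        f"{match.group(1)} {match.group(2)}", "%Y-%m-%d %H:%M:%S.%f"
    )
    return stamp, match.group(3)


def signing_latencies_ms(log_lines):
    """Milliseconds from leader start to signature assembly, per request."""
    events = sorted(e for e in map(parse, log_lines) if e is not None)
    latencies, start = [], None
    for stamp, message in events:
        if message.startswith("I am the raft leader"):
            start = stamp
        elif message.startswith("Done waiting for cosigners") and start is not None:
            latencies.append((stamp - start) / timedelta(milliseconds=1))
            start = None
    return latencies


# Lines copied from the Shard 3 log above (newest first, as pasted).
sample_log = [
    r"D[2023-08-23\|11:32:47.691] Done waiting for cosigners, assembling signatures module=validator",
    r"D[2023-08-23\|11:32:47.635] I am the raft leader. Managing the sign process for this block module=validator",
    r"D[2023-08-23\|11:32:47.379] Done waiting for cosigners, assembling signatures module=validator",
    r"D[2023-08-23\|11:32:47.330] I am the raft leader. Managing the sign process for this block module=validator",
]

latencies = signing_latencies_ms(sample_log)
print(latencies)
```

On the excerpt in this report the leader-side signing time is a few tens of milliseconds per vote, so any delay large enough to miss blocks would have to come from elsewhere in the path between the signer and the network.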
Hi @chichi13 , thanks for your detailed report.
This sounds like either a p2p peering issue on the dydx node, or potentially a bug in the dydx node software when using a remote signer.
Since horcrux is signing and returning the signatures, and does not require restarting to fix, I am not sure it can be resolved on the horcrux side. I would suggest upgrading to the latest release, v3.1.0, to see if that has any impact on the situation.
I'm happy to help debug further, let me know what you discover.
We updated Horcrux to v3 and it appears to have resolved the behavior @chichi13 described here: we are no longer able to reproduce it with v3.
Perhaps the Horcrux v2 cluster with 3 shards was indeed not fast enough for the dydx node.