-
Notifications
You must be signed in to change notification settings - Fork 491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not unnecessarily retransmit commitment_signed
in dual funding
#1214
base: master
Are you sure you want to change the base?
Do not unnecessarily retransmit commitment_signed
in dual funding
#1214
Conversation
…umber` As pointed out in lightning/bolts#1214, when reconnecting a partially signed `interactive-tx` session, we should set `next_commitment_number` to the current commitment number if we haven't received our peer's `commit_sig`, which tells them they need to retransmit it. That's not what we're currently doing: we're currently setting this value to the next commitment number, regardless of whether or not we have received our peer's `commit_sig`. And we always retransmit our `commit_sig` if our peer is setting `next_funding_txid`, even if they have already received it. More importantly, if our peer behaves correctly and sends us the current commitment number, we will think that they're late and will halt, waiting for them to send `error`. This commit fixes that by allowing our peers to use the current commitment number when they set `next_funding_txid`. Note that this doesn't yet make us spec-compliant, but in order to guarantee backwards-compatibility, we must first deploy that change before we can start removing spurious `commit_sig` retransmissions.
We fully implement lightning/bolts#1214 to stop retransmitting `commit_sig` when our peer has already received it. We also correctly set `next_commitment_number` to let our peer know whether we have received their `commit_sig` or not.
As pointed out in lightning/bolts#1214, when reconnecting a partially signed `interactive-tx` session, we should set `next_commitment_number` to the current commitment number if we haven't received our peer's `commit_sig`, which tells them they need to retransmit it. More importantly, if our peer behaves correctly and sends us the current commitment number, we must not think that they're late and halt, waiting for them to send `error`. This commit fixes that by allowing our peers to use the current commitment number when they set `next_funding_txid`. Note that we keep retransmitting our `commit_sig` regardless of the value our peer set in `next_commitment_number`, because we need to wait for them to have an opportunity to upgrade. In a future commit we will stop sending spurious `commit_sig`.
We fully implement lightning/bolts#1214 to stop retransmitting `commit_sig` when our peer has already received it. We also correctly set `next_commitment_number` to let our peer know whether we have received their `commit_sig` or not.
Core lightning currently follows these rules during
|
It's worth noting that the proposed change (setting |
…umber` (#2965) As pointed out in lightning/bolts#1214, when reconnecting a partially signed `interactive-tx` session, we should set `next_commitment_number` to the current commitment number if we haven't received our peer's `commit_sig`, which tells them they need to retransmit it. That's not what we're currently doing: we're currently setting this value to the next commitment number, regardless of whether or not we have received our peer's `commit_sig`. And we always retransmit our `commit_sig` if our peer is setting `next_funding_txid`, even if they have already received it. More importantly, if our peer behaves correctly and sends us the current commitment number, we will think that they're late and will halt, waiting for them to send `error`. This commit fixes that by allowing our peers to use the current commitment number when they set `next_funding_txid`. Note that this doesn't yet make us spec-compliant, but in order to guarantee backwards-compatibility, we must first deploy that change before we can start removing spurious `commit_sig` retransmissions.
But if I understand correctly it will only fail on the reconnect corner case, so maybe nobody will notice? Technically it's still experimental for CLN, so we could change it. Better would be to change the feature bit, but that is pretty intrusive as it's been merged in the spec already. |
Yes exactly. And I just combed our node's logs, and it never happened that we were in this reconnection case with a non-phoenix node. So I really feel that we shouldn't bother with backwards-compat and that nobody will run into this issue in practice.
Agreed, using a feature bit here would be annoying...I think it's safe to update this without it and consider it an implementation issue? |
On reconnection in the middle of the dual-funding flow, if both nodes have exchanged the initial `commitment_signed` and node A had sent its (initial) `tx_signatures` but node B never received them, both nodes should send a `channel_reestablish` with `next_funding_txid` set and a `next_commitment_number` of 1 (as they've already received the commitment transaction for commitment number 0). The spec indicates in this case that both nodes should retransmit their `commitment_signed`, however, as this is only gated on `next_funding_txid` and not the `next_commitment_number` field. This may cause implementations which assume that each new `commitment_signed` is for a new state to fail and potentially fail the channel. Instead, we should rely both the presence of `next_funding_txid` *and* `next_commitment_number` being zero to decide if we need to resend our `commitment_signed`. Sadly, we cannot rely on just `next_commitment_number` as that is used to request a force-closure in a non-standard way to work around implementations not honoring the `error` message.
Sending a `channel_reestablish` with `next_commitment_number` of zero is used in practice to request a peer fail a channel and broadcast the latest state (for implementations which continue to ignore the `error` message). Because its used in practice we should document it to avoid accidentally writing up incompatible things in the future.
90772ff
to
d8cfa95
Compare
Fixed the issue dusty pointed out: $ git diff-tree -U5 90772ff d8cfa95
diff --git a/02-peer-protocol.md b/02-peer-protocol.md
index e9ba8bc..3e7ecb2 100644
--- a/02-peer-protocol.md
+++ b/02-peer-protocol.md
@@ -2480,16 +2480,12 @@ A node:
- MUST ignore any redundant `channel_ready` it receives.
- if `next_commitment_number` is equal to the commitment number of
the last `commitment_signed` message the receiving node has sent:
- MUST reuse the same commitment number for its next `commitment_signed`.
- otherwise:
- - if `next_commitment_number` is not 1 greater than the
- commitment number of the last `commitment_signed` message the receiving
- node has sent:
- - SHOULD send an `error` and fail the channel.
- - if it has not sent `commitment_signed`, AND `next_commitment_number`
- is not equal to 1:
+ - if `next_commitment_number` is not equal to the commitment number of the
+ next `commitment_signed` the receiving node will send:
- SHOULD send an `error` and fail the channel.
- if `next_revocation_number` is equal to the commitment number of
the last `revoke_and_ack` the receiving node sent, AND the receiving node
hasn't already received a `closing_signed`:
- MUST re-send the `revoke_and_ack`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK d8cfa95, thanks!
As pointed out in lightning/bolts#1214, when reconnecting a partially signed `interactive-tx` session, we should set `next_commitment_number` to the current commitment number if we haven't received our peer's `commit_sig`, which tells them they need to retransmit it. More importantly, if our peer behaves correctly and sends us the current commitment number, we must not think that they're late and halt, waiting for them to send `error`. This commit fixes that by allowing our peers to use the current commitment number when they set `next_funding_txid`. Note that we keep retransmitting our `commit_sig` regardless of the value our peer set in `next_commitment_number`, because we need to wait for them to have an opportunity to upgrade. In a future commit we will stop sending spurious `commit_sig`.
As pointed out in lightning/bolts#1214, when reconnecting a partially signed `interactive-tx` session, we should set `next_commitment_number` to the current commitment number if we haven't received our peer's `commit_sig`, which tells them they need to retransmit it. More importantly, if our peer behaves correctly and sends us the current commitment number, we must not think that they're late and halt, waiting for them to send `error`. This commit fixes that by allowing our peers to use the current commitment number when they set `next_funding_txid`. Note that we keep retransmitting our `commit_sig` regardless of the value our peer set in `next_commitment_number`, because we need to wait for them to have an opportunity to upgrade. In a future commit we will stop sending spurious `commit_sig`.
We fully implement lightning/bolts#1214 to stop retransmitting `commit_sig` when our peer has already received it. We also correctly set `next_commitment_number` to let our peer know whether we have received their `commit_sig` or not.
ACK d8cfa95 |
As pointed out in lightning/bolts#1214, when reconnecting a partially signed `interactive-tx` session, we should set `next_commitment_number` to the current commitment number if we haven't received our peer's `commit_sig`, which tells them they need to retransmit it. More importantly, if our peer behaves correctly and sends us the current commitment number, we must not think that they're late and halt, waiting for them to send `error`. This commit fixes that by allowing our peers to use the current commitment number when they set `next_funding_txid`. Note that we keep retransmitting our `commit_sig` regardless of the value our peer set in `next_commitment_number`, because we need to wait for them to have an opportunity to upgrade. In a future commit we will stop sending spurious `commit_sig`.
On reconnection in the middle of the dual-funding flow, if both nodes have exchanged the initial
commitment_signed
and node A had sent its (initial)tx_signatures
but node B never received them, both nodes should send achannel_reestablish
withnext_funding_txid
set and anext_commitment_number
of 1 (as they've already received the commitment transaction for commitment number 0).The spec indicates in this case that both nodes should retransmit their
commitment_signed
, however, as this is only gated onnext_funding_txid
and not thenext_commitment_number
field. This may cause implementations which assume that each newcommitment_signed
is for a new state to fail and potentially fail the channel.Instead, we should rely both the presence of
next_funding_txid
andnext_commitment_number
being zero to decide if we need to resend ourcommitment_signed
. Sadly, we cannot rely on justnext_commitment_number
as that is used to request a force-closure in a non-standard way to work around implementations not honoring theerror
message.