Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node 0.10.x / 0.11.x stopped working after restart "requirement failed: public key is invalid" #3009

Open
EagleTM opened this issue Feb 16, 2025 · 14 comments · May be fixed by #3012
Open

Node 0.10.x / 0.11.x stopped working after restart "requirement failed: public key is invalid" #3009

EagleTM opened this issue Feb 16, 2025 · 14 comments · May be fixed by #3012

Comments

@EagleTM
Copy link

EagleTM commented Feb 16, 2025

I'm now on master and Bitcoin Core 28.1, same issue unfortunately.

Discussed in #3008

Originally posted by EagleTM February 15, 2025
Hi,
I'm running into a fatal issue on a bech32 wallet on 0.10.x with core 25.0 the error persists upon upgrading to 0.11.x with bitcoin core 27.2
The node was shutdown cleanly. Upon restart the error message is:

It never binds the ln port, so can't be queried further:

2025-02-15 17:01:17,722 INFO fr.acinq.eclair.Setup - initial address= for bitcoin wallet=eclair
2025-02-15 17:01:17,811 INFO akka.actor.ActorCell - channels listener is ready
2025-02-15 17:01:17,871 ERROR a.a.OneForOneStrategy - requirement failed: public key is invalidjava.lang.IllegalArgumentException: requirement failed: public key is invalid
at scala.Predef$.require(Predef.scala:337)
at fr.acinq.bitcoin.scalacompat.Crypto$PublicKey$.fromBin(Crypto.scala:115)
at fr.acinq.bitcoin.scalacompat.Crypto$PublicKey$.apply(Crypto.scala:111)
at fr.acinq.eclair.wire.protocol.CommonCodecs$.$anonfun$publicKey$4(CommonCodecs.scala:167)
at scodec.DecodeResult.map(DecodeResult.scala:17)
at fr.acinq.eclair.wire.protocol.CommonCodecs$.$anonfun$publicKey$3(CommonCodecs.scala:167)
at scodec.Attempt$Successful.map(Attempt.scala:111)
at fr.acinq.eclair.wire.protocol.CommonCodecs$.$anonfun$publicKey$2(CommonCodecs.scala:167)
at scodec.Codec$$anon$10.decode(Codec.scala:484)
at scodec.Codec$$anon$7.decode(Codec.scala:411)
at scodec.DecoderFunctions.decodeBothCombine(Decoder.scala:137)
at scodec.DecoderFunctions.decodeBothCombine$(Decoder.scala:133)
at scodec.Codec$.decodeBothCombine(Codec.scala:472)
at scodec.codecs.HListCodec$$anon$2.decode(HListCodec.scala:22)
at scodec.Codec$$anon$2.decode(Codec.scala:201)
at scodec.DecoderFunctions.decodeCollect(Decoder.scala:184)
at scodec.DecoderFunctions.decodeCollect$(Decoder.scala:175)
at scodec.Decoder$.decodeCollect(Decoder.scala:228)
at scodec.codecs.ListCodec.decode(ListCodec.scala:16)
at scodec.Decoder$$anon$1.decode(Decoder.scala:48)
at scodec.GenCodec$$anon$2.decode(GenCodec.scala:78)
at scodec.Decoder$$anon$2.$anonfun$decode$2(Decoder.scala:57)
at scodec.Attempt$Successful.flatMap(Attempt.scala:113)
at scodec.Decoder$$anon$2.decode(Decoder.scala:56)

I tried starting with elcair network database moved to backup, no difference
I'm at loss what the issue could be - I'm assuming it has issues parsing bitcoin-core wallet pubkeys rather than ln pubkeys.

Any ideas what I could do to debug this further?

@EagleTM
Copy link
Author

EagleTM commented Feb 16, 2025

Error message on master changed line numbers to:

12:26:19,212 ERROR a.a.OneForOneStrategy - requirement failed: public key is invalidjava.lang.IllegalArgumentException: requirement failed: public key is invalid
at scala.Predef$.require(Predef.scala:337)
at fr.acinq.bitcoin.scalacompat.Crypto$PublicKey$.fromBin(Crypto.scala:115)
at fr.acinq.bitcoin.scalacompat.Crypto$PublicKey$.apply(Crypto.scala:111)
at fr.acinq.eclair.wire.protocol.CommonCodecs$.$anonfun$publicKey$4(CommonCodecs.scala:162)
at scodec.DecodeResult.map(DecodeResult.scala:17)
at fr.acinq.eclair.wire.protocol.CommonCodecs$.$anonfun$publicKey$3(CommonCodecs.scala:162)
at scodec.Attempt$Successful.map(Attempt.scala:111)
at fr.acinq.eclair.wire.protocol.CommonCodecs$.$anonfun$publicKey$2(CommonCodecs.scala:162)
at scodec.Codec$$anon$10.decode(Codec.scala:484)
at scodec.Codec$$anon$7.decode(Codec.scala:411)
at scodec.DecoderFunctions.decodeBothCombine(Decoder.scala:137)
at scodec.DecoderFunctions.decodeBothCombine$(Decoder.scala:133)

@EagleTM
Copy link
Author

EagleTM commented Feb 16, 2025

With the help of some debug logging I've managed to narrow it down to:

2025-02-16 14:29:25,758 ERROR f.a.e.w.p.CommonCodecs - [CommonCodecs] Failed to decode public key: 03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370d0c863e8f0ce56java.lang.IllegalArgumentException: requirement failed: public key is invalid

This looks like it's parsing LN node pub keys?

With the help of GPT4 I've modified to do some logging. As it's now using try{} / catch{} logic it's not crashing the code opening the LN and API port anymore and the node seems function. I'd still like to better understand what's going on:

eclair-core/src/main/scala/fr/acinq/eclair/wire/protocol/CommonCodecs.scala

import org.slf4j.LoggerFactory // Import Logger
object CommonCodecs {
private val logger = LoggerFactory.getLogger(this.getClass) // Define Logger`

val publicKey: Codec[PublicKey] = Codec[PublicKey](
(pub: PublicKey) => bytes(33).encode(pub.value),
(wire: BitVector) => {
val rawBytesAttempt = bytes(33).decode(wire)

  rawBytesAttempt match {
    case Attempt.Successful(decoded) =>
      val hexKey = decoded.value.toHex
      logger.error(s"[CommonCodecs] Decoding public key: $hexKey (length: ${decoded.value.length} bytes)")

      try {
        val pubKey = PublicKey.fromBin(decoded.value)  // This is where it fails
        logger.info(s"[CommonCodecs] Successfully decoded public key: $hexKey")
        Attempt.successful(DecodeResult(pubKey, wire.drop(33 * 8))) // Normal flow
      } catch {
        case e: IllegalArgumentException =>
          logger.error(s"[CommonCodecs] ❌ Failed to decode public key: $hexKey", e)
          Attempt.failure(Err(s"Invalid public key: $hexKey - ${e.getMessage}"))
      }

    case Attempt.Failure(err) =>
      logger.error(s"[CommonCodecs] ❌ Failed to extract public key bytes: ${err.messageWithContext}")
      Attempt.failure(err)
  }
}

)
}

@ecdsa
Copy link

ecdsa commented Feb 16, 2025

With the help of some debug logging I've managed to narrow it down to:

2025-02-16 14:29:25,758 ERROR f.a.e.w.p.CommonCodecs - [CommonCodecs] Failed to decode public key: >03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370d0c863e8f0ce56java.lang.IllegalArgumentException: requirement failed: public key is invalid

The pubkey in your log starts like the acinq node pubkey. Only the last 7 bytes differ:

03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f
03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370d0c863e8f0ce56

@EagleTM
Copy link
Author

EagleTM commented Feb 16, 2025

I've dumped the sqlite databases eclair, audit, network hunting for 03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370d0c863e8f0ce56 but I can't find it there.

The legit ACINQ node and Electrum Trampoline are in there and are also checked for mutiple times (once per channel?) as pub keys on startup.

@t-bast
Copy link
Member

t-bast commented Feb 17, 2025

That's very strange indeed: this public key is not a valid public key (not a point on the secp256k1 curve). The fact that it looks a lot like our node's public key is intriguing. Since this fails at start-up, it is likely that this is data that is either read from the DB or the configuration.

Invalid data should never get into the DB: we always validate data before writing it to the DB, unless we've missed something important. Are you 100% sure this public key doesn't appear in the DB? Note that public keys are not hex-encoded in SQLite but rather encoded as blobs of bytes. A good way to make sure is to read your DB in a unit test and verify that everything can be listed properly: that would guarantee that this invalid data is not found in the DB. I can share some code that does this validation for you if you want.

If it's not coming from the DB, it is likely coming from eclair.conf: did you check that this public key isn't found in your configuration files or in the code itself (if you use any custom plugin)?

It would be useful to have a better stack trace to figure out what data is being decoded that contains this public key. Can you try printing the exception's stack trace explicitly (with e.getStackTrace) to see if it provides a more detailed stack?

@sstone
Copy link
Member

sstone commented Feb 17, 2025

I'm running into a fatal issue on a bech32 wallet on 0.10.x with core 25.0 the error persists upon upgrading to 0.11.x with bitcoin core 27.2

Were you running eclair 0.10.0 or master ? were you upgrading to 0.11.0 or to a commit on master ?

It does look like a db error, can you try starting eclair without the network db ? (just move all the network.sqlite* files to a different location. ⚠️ just move the network db, NOT the other databases!)

@EagleTM
Copy link
Author

EagleTM commented Feb 17, 2025

I was running 0.10.0 then upgraded to 0.11.0 and then master (from yesterday). The issue occurred on restart back on 0.10.0
I tried removing the network.db as a first measure back when i was debugging this after it failed in 0.10.0 but that did not fix it.
I'll try to find ways of hunting down the malformed pubkey in the db or config and will report back.
What are the risks of losing funds with just ignoring the malformed key (assuming the rest of the database is fine)?

@t-bast
Copy link
Member

t-bast commented Feb 17, 2025

Just to make sure we understand you correctly: please confirm that you never downgraded eclair? You were initially at v0.10.0, then switched to v.0.11.0, then to master?

Eclair doesn't support reverting to a previous version if there was a DB migration (we only support migrating data towards a newer version, not towards an older one).

@t-bast
Copy link
Member

t-bast commented Feb 17, 2025

What are the risks of losing funds with just ignoring the malformed key (assuming the rest of the database is fine)?

It's impossible to know yet, that depends on where that public key comes from (channel data or network data).

@EagleTM
Copy link
Author

EagleTM commented Feb 19, 2025

Just to make sure we understand you correctly: please confirm that you never downgraded eclair? You were initially at v0.10.0, then switched to v.0.11.0, then to master?

Eclair doesn't support reverting to a previous version if there was a DB migration (we only support migrating data towards a newer version, not towards an older one).

Yes, I can confirm, I've never downgraded!

What I've tried doing is dumping the sqlite database to text and then grepping through it. I can find all the legit channel public ids but not the malformed one. Still curious to me.

@EagleTM
Copy link
Author

EagleTM commented Feb 19, 2025

A good way to make sure is to read your DB in a unit test and verify that everything can be listed properly: that would guarantee that this invalid data is not found in the DB. I can share some code that does this validation for you if you want.

Yes please, I'd happily run that.

If it's not coming from the DB, it is likely coming from eclair.conf: did you check that this public key isn't found in your configuration files or in the code itself (if you use any custom plugin)?

I've checked the config, it's not there.
The node had been running RideTheLightning which interacted with the API and was used to open channels but I can't find it there either

It would be useful to have a better stack trace to figure out what data is being decoded that contains this public key. Can you try printing the exception's stack trace explicitly (with e.getStackTrace) to see if it provides a more detailed stack?

Here's a getStackTrace'd output of the error

2025-02-19 11:57:55,778 ERROR f.a.e.w.p.CommonCodecs - [CommonCodecs] Failed to decode public key: 03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370d0c863e8f0ce56 - Exception: requirement failed: public key is invalid 2025-02-19 11:57:55,779 ERROR f.a.e.w.p.CommonCodecs - [CommonCodecs] Stack trace: scala.Predef$.require(Predef.scala:337) fr.acinq.bitcoin.scalacompat.Crypto$PublicKey$.fromBin(Crypto.scala:115) fr.acinq.eclair.wire.protocol.CommonCodecs$.$anonfun$publicKey$2(CommonCodecs.scala:181) scodec.Codec$$anon$10.decode(Codec.scala:484) scodec.Codec$$anon$7.decode(Codec.scala:411) scodec.DecoderFunctions.decodeBothCombine(Decoder.scala:137) scodec.DecoderFunctions.decodeBothCombine$(Decoder.scala:133) scodec.Codec$.decodeBothCombine(Codec.scala:472) scodec.codecs.HListCodec$$anon$2.decode(HListCodec.scala:22) scodec.Codec$$anon$2.decode(Codec.scala:201) scodec.DecoderFunctions.decodeCollect(Decoder.scala:184) scodec.DecoderFunctions.decodeCollect$(Decoder.scala:175) scodec.Decoder$.decodeCollect(Decoder.scala:228) scodec.codecs.ListCodec.decode(ListCodec.scala:16) scodec.Decoder$$anon$1.decode(Decoder.scala:48) scodec.GenCodec$$anon$2.decode(GenCodec.scala:78) scodec.Decoder$$anon$2.$anonfun$decode$2(Decoder.scala:57) scodec.Attempt$Successful.flatMap(Attempt.scala:113) scodec.Decoder$$anon$2.decode(Decoder.scala:56) scodec.Codec$$anon$3.decode(Codec.scala:316) scodec.Decoder$$anon$3.decode(Decoder.scala:66) scodec.GenCodec$$anon$2.decode(GenCodec.scala:78) scodec.Codec$$anon$1.decode(Codec.scala:191) scodec.Codec$$anon$8.decode(Codec.scala:422) scodec.DecoderFunctions.decodeCollect(Decoder.scala:184) scodec.DecoderFunctions.decodeCollect$(Decoder.scala:175) scodec.Decoder$.decodeCollect(Decoder.scala:228) scodec.codecs.ListCodec.decode(ListCodec.scala:16) scodec.Codec$$anon$2.decode(Codec.scala:201) scodec.Decoder$$anon$4.decode(Decoder.scala:78) scodec.GenCodec$$anon$2.decode(GenCodec.scala:78) scodec.Codec$$anon$11.decode(Codec.scala:494) scodec.codecs.FixedSizeCodec.decode(FixedSizeCodec.scala:25) scodec.Decoder$$anon$2.$anonfun$decode$2(Decoder.scala:57) scodec.Attempt$Successful.flatMap(Attempt.scala:113) scodec.Decoder$$anon$2.decode(Decoder.scala:56) scodec.codecs.VariableSizeCodec.decode(VariableSizeCodec.scala:30) scodec.codecs.package$$anon$13.decode(package.scala:986) scodec.Decoder$$anon$1.decode(Decoder.scala:48) scodec.GenCodec$$anon$2.decode(GenCodec.scala:78) scodec.Decoder$$anon$2.$anonfun$decode$2(Decoder.scala:57) scodec.Attempt$Successful.flatMap(Attempt.scala:113) scodec.Decoder$$anon$2.decode(Decoder.scala:56) scodec.Decoder$$anon$2.$anonfun$decode$2(Decoder.scala:57) scodec.Attempt$Successful.flatMap(Attempt.scala:113) scodec.Decoder$$anon$2.decode(Decoder.scala:56) scodec.codecs.DiscriminatorCodec.decode(DiscriminatorCodec.scala:417) scodec.codecs.package$$anon$23.decode(package.scala:1946) scodec.DecoderFunctions.decodeCollect(Decoder.scala:184) scodec.DecoderFunctions.decodeCollect$(Decoder.scala:175) scodec.Decoder$.decodeCollect(Decoder.scala:228) scodec.codecs.ListCodec.decode(ListCodec.scala:16) scodec.Decoder$$anon$3.decode(Decoder.scala:66) scodec.GenCodec$$anon$2.decode(GenCodec.scala:78) scodec.Codec$$anon$1.decode(Codec.scala:191) scodec.codecs.FixedSizeCodec.decode(FixedSizeCodec.scala:25) scodec.Decoder$$anon$2.$anonfun$decode$2(Decoder.scala:57) scodec.Attempt$Successful.flatMap(Attempt.scala:113) scodec.Decoder$$anon$2.decode(Decoder.scala:56) scodec.codecs.VariableSizeCodec.decode(VariableSizeCodec.scala:30) scodec.codecs.package$$anon$13.decode(package.scala:986) scodec.Decoder$$anon$4.decode(Decoder.scala:78) scodec.GenCodec$$anon$2.decode(GenCodec.scala:78) scodec.Codec$$anon$11.decode(Codec.scala:494) fr.acinq.eclair.payment.IncomingPaymentPacket$.decryptOnion(PaymentPacket.scala:71) fr.acinq.eclair.payment.IncomingPaymentPacket$.$anonfun$decrypt$1(PaymentPacket.scala:165) scala.util.Either.flatMap(Either.scala:352) fr.acinq.eclair.payment.IncomingPaymentPacket$.decrypt(PaymentPacket.scala:127) fr.acinq.eclair.payment.relay.PostRestartHtlcCleaner$.$anonfun$getIncomingHtlcs$2(PostRestartHtlcCleaner.scala:381) scala.collection.immutable.List.map(List.scala:250) scala.collection.immutable.List.map(List.scala:79) fr.acinq.eclair.payment.relay.PostRestartHtlcCleaner$.fr$acinq$eclair$payment$relay$PostRestartHtlcCleaner$$getIncomingHtlcs(PostRestartHtlcCleaner.scala:381) fr.acinq.eclair.payment.relay.PostRestartHtlcCleaner$$anonfun$receive$1.applyOrElse(PostRestartHtlcCleaner.scala:73) akka.actor.Actor.aroundReceive(Actor.scala:537) akka.actor.Actor.aroundReceive$(Actor.scala:535) fr.acinq.eclair.payment.relay.PostRestartHtlcCleaner.akka$actor$DiagnosticActorLogging$$super$aroundReceive(PostRestartHtlcCleaner.scala:52) akka.actor.DiagnosticActorLogging.aroundReceive(Actor.scala:368) akka.actor.DiagnosticActorLogging.aroundReceive$(Actor.scala:365) fr.acinq.eclair.payment.relay.PostRestartHtlcCleaner.aroundReceive(PostRestartHtlcCleaner.scala:52) akka.actor.ActorCell.receiveMessage(ActorCell.scala:579) akka.actor.ActorCell.invoke(ActorCell.scala:547) akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270) akka.dispatch.Mailbox.run(Mailbox.scala:231) akka.dispatch.Mailbox.exec(Mailbox.scala:243) java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387) java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312) java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843) java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808) java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)

@EagleTM
Copy link
Author

EagleTM commented Feb 19, 2025

Some further info:
Node was shutdown on Friday 14.2.2025 on 19:53:52 CET issuing "stop" with CLI to upgrade to 0.11.0.
0.11.0 required a newer bitcoind, which could only be compiled after dist-upgrade. Several starts failed b/c of connection issues to bitcoind - common "failed to connect to bitcoind (could not connect to bitcoind using json-rpc), retrying" errors.

This was fixed by upgrading to a supported bitcoind on Sat 15th:

2025-02-15 16:06:18,291 INFO f.a.e.b.b.zmq.ZMQActor - connected to tcp://127.0.0.1:29000
2025-02-15 16:06:18,291 INFO f.a.e.b.b.zmq.ZMQActor - connected to tcp://127.0.0.1:29000
2025-02-15 16:06:19,519 INFO f.a.eclair.router.Router - loaded from db: channels=29439 nodes=6108
2025-02-15 16:06:19,520 INFO f.a.eclair.router.Router - 18346 pruned channels at blockHeight=883899
2025-02-15 16:06:19,765 INFO f.a.eclair.router.Router - initialization completed, ready to process messages
2025-02-15 16:06:19,779 INFO fr.acinq.eclair.Setup - initial address=bc1 for bitcoin wallet=eclair
2025-02-15 16:06:19,881 INFO akka.actor.ActorCell - channels listener is ready
2025-02-15 16:06:19,942 ERROR a.a.OneForOneStrategy - requirement failed: public key is invalidjava.lang.IllegalArgumentException: requirement failed: public key is invalid
<...>

The last msg=UpdateFailHtlc before the restart was recorded 2025-02-14 11:50:04,425 to the log.

@t-bast
Copy link
Member

t-bast commented Feb 19, 2025

The stack trace is very helpful! The issue comes from an incoming HTLCs that somehow contains invalid content (which should never crash the node...), I still need to figure out why exactly. This means it's not something coming from your DB, so your temporary patch should be fine and doesn't create a risk of mass channel force-close or loss of funds.

I have what I need to keep investigating and will come back to this issue once I've figured it out.

@t-bast t-bast linked a pull request Feb 19, 2025 that will close this issue
@t-bast
Copy link
Member

t-bast commented Feb 19, 2025

Thanks for reporting this @EagleTM and for your help troubleshooting this issue, it should be fixed in #3012.
We will make an eclair release soon after that PR lands (even though only trampoline nodes with buggy/malicious senders should be affected).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants