getchaintips broken, crashed daemon #883

Open
leto opened this issue Sep 22, 2018 · 8 comments

Comments


leto commented Sep 22, 2018

Reported by siddhartha on Discord:

[Switching to Thread 0x7fffe8ff9700 (LWP 7389)]
std::_Rb_tree<CBlockIndex const*, CBlockIndex const*, std::_Identity<CBlockIndex const*>, CompareBlocksByHeight, std::allocator<CBlockIndex const*> >::_M_get_insert_unique_pos (__k=@0x7fffe8ff7f38: 0x0, 
    this=0x7fffe8ff7f50) at /usr/include/c++/5/bits/stl_tree.h:1810
1810    /usr/include/c++/5/bits/stl_tree.h: No such file or directory.
(gdb) backtrace
#0  std::_Rb_tree<CBlockIndex const*, CBlockIndex const*, std::_Identity<CBlockIndex const*>, CompareBlocksByHeight, std::allocator<CBlockIndex const*> >::_M_get_insert_unique_pos (__k=@0x7fffe8ff7f38: 0x0, 
    this=0x7fffe8ff7f50) at /usr/include/c++/5/bits/stl_tree.h:1810
#1  std::_Rb_tree<CBlockIndex const*, CBlockIndex const*, std::_Identity<CBlockIndex const*>, CompareBlocksByHeight, std::allocator<CBlockIndex const*> >::_M_insert_unique<CBlockIndex const*> (
    this=this@entry=0x7fffe8ff7f50, __v=@0x7fffe8ff7f38: 0x0)
    at /usr/include/c++/5/bits/stl_tree.h:1863
#2  0x0000555555716efe in std::set<CBlockIndex const*, CompareBlocksByHeight, std::allocator<CBlockIndex const*> >::insert (
    __x=@0x7fffe8ff7f38: 0x0, this=0x7fffe8ff7f50)
    at /usr/include/c++/5/bits/stl_set.h:494
#3  getchaintips (params=..., fHelp=<optimized out>)
    at rpcblockchain.cpp:1408
#4  0x000055555576b089 in CRPCTable::execute (this=<optimized out>, 
    strMethod="getchaintips", params=...) at rpcserver.cpp:672

I also tested this on PIRATE and it crashed my daemon. There was nothing interesting in debug.log.
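
Editor's sketch, not part of the original report: frame #2 of the backtrace shows `__x=@0x7fffe8ff7f38: 0x0`, meaning the value handed to `std::set::insert` is a null pointer. The source of `CompareBlocksByHeight` is not shown in this thread; assuming it dereferences both arguments to compare heights, ordering a null key against existing elements would be undefined behavior. The types `BlockIndex` and `ByHeight` and the helper `insertNonNull` below are hypothetical stand-ins, not Komodo code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <string>

// Stand-in for CBlockIndex; only the fields this sketch needs.
struct BlockIndex {
    int nHeight = 0;
    const BlockIndex* pprev = nullptr;
};

// Assumed shape of a height-ordered comparator. It dereferences both
// pointers, so a null key would be undefined behavior; the assert makes
// that failure explicit in debug builds.
struct ByHeight {
    bool operator()(const BlockIndex* a, const BlockIndex* b) const {
        assert(a != nullptr && b != nullptr);
        if (a->nHeight != b->nHeight) return a->nHeight > b->nHeight;
        return std::less<const BlockIndex*>()(a, b);
    }
};

using TipSet = std::set<const BlockIndex*, ByHeight>;

// Hypothetical helper: copy every non-null entry of an index map into the
// set and return how many null entries were skipped.
std::size_t insertNonNull(const std::map<std::string, BlockIndex*>& index,
                          TipSet& out) {
    std::size_t skipped = 0;
    for (const auto& item : index) {
        if (item.second == nullptr) {
            ++skipped;  // a null entry like the 0x0 key in the backtrace
            continue;
        }
        out.insert(item.second);
    }
    return skipped;
}
```

A nonzero return value from a guard like this would indicate that the block index map itself contains null entries, which narrows the search to whatever populates that map.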

Author

leto commented Sep 22, 2018

I ran this on 189117d from the jl777 branch to get a crash:
./src/komodo-cli -ac_name=PIRATE getchaintips

Author

leto commented Oct 8, 2018

FYI EMC2 has the same issue, as well as GAME. I believe this issue is related to notarizations. For some reason, Hush does not have this problem.

Owner

jl777 commented Oct 8, 2018

I wonder if VRSC has it too...
A nice bounty for a fix to this bug

Author

leto commented Oct 9, 2018

I have a VRSC node on master at e59d53d which successfully gave me getchaintips output. It took 9 minutes to complete and generated 1GB of JSON 😕 Perhaps the coredump is due to the size of the returned data in KMD. I already have a getchaintips branch in Hush which adds an optional minBranchLen argument, which greatly cuts down on the returned data. I will see how that patch changes things on KMD and EMC2:

MyHush/hush@4e66ca4
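
Editor's sketch, not the Hush patch itself: the idea of a minimum branch length filter could look roughly like the following. `ChainTip` and `filterByBranchLen` are hypothetical names; only the field names (`height`, `hash`, `branchlen`, `status`) are taken from the getchaintips output quoted later in this thread. Keeping the active tip regardless of its branch length is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record mirroring one entry of the getchaintips JSON array.
struct ChainTip {
    int height;
    std::string hash;
    int branchlen;
    std::string status;
};

// Keep the active tip (assumption) plus every fork whose branch is at
// least minBranchLen blocks long; drop the short forks.
std::vector<ChainTip> filterByBranchLen(const std::vector<ChainTip>& tips,
                                        int minBranchLen) {
    assert(minBranchLen >= 0);  // a negative threshold would be meaningless
    std::vector<ChainTip> kept;
    for (const auto& tip : tips) {
        if (tip.status == "active" || tip.branchlen >= minBranchLen) {
            kept.push_back(tip);
        }
    }
    return kept;
}
```

On a chain with many short forks, a threshold of even 2 would drop every single-block fork from the response, which is where most of the output volume would be expected to come from.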

Author

leto commented Oct 11, 2018

I have an improved getchaintips that will be merged into Hush soon which will help in debugging this issue. It also has passing tests, which can be used to detect if getchaintips worked on older versions and perhaps isolate when getchaintips started to coredump: MyHush/hush#151

Author

leto commented Oct 13, 2018

I have a full backtrace of this coredump on dev branch commit 01ba6d. The error is triggered at rpcserver.cpp line 728, which is the code that tries to invoke the getchaintips RPC method. It doesn't seem to ever correctly begin executing the method; something fails while trying to proxy/dispatch. Will continue investigating: https://gist.github.com/leto/467977023e3357a626fa47a941977961

Owner

jl777 commented Oct 13, 2018

yes that is what I found. I put in printouts to see the failure and sometimes it gets farther. I tried a custom mutex, but nothing worked. It seems the iterator just fails for some reason

Author

leto commented Oct 16, 2018

A few more data points from my debugging: this does not happen in regtest mode by default (which only starts off with a genesis block):

$ ./komodo-cli -ac_name=AXO -regtest getchaintips
[
  {
    "height": 0,
    "hash": "029f11d80ef9765602235e1bc9727e3eb6ba20839319f761fee920d63401e327",
    "branchlen": 0,
    "status": "active"
  }
]

That makes this an even sneakier bug, because a simple regtest chain with a few hundred blocks most likely won't trigger this bug. I will try to make a longer regtest chain to see if I can reproduce there.

My tests with AXO showed that many thousands of blocks are compared by CompareBlocksByHeight before the coredump occurs, and from what I read on similar-looking boost bugs, memory corruption might be happening somehow.
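
Editor's sketch: for context on why a genesis-only regtest chain returns exactly one tip, here is the tip-finding idea in miniature, assuming getchaintips follows the approach inherited from upstream Bitcoin (insert every known block, then erase every block that is some other block's predecessor). `Block` and `findTips` are hypothetical stand-ins, and the set here uses default pointer ordering rather than CompareBlocksByHeight.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Stand-in for a block index entry: a height and a link to the parent.
struct Block {
    int height = 0;
    const Block* prev = nullptr;
};

// Every block starts as a candidate tip; any block that another block
// builds on is then removed. What remains are the chain tips.
std::set<const Block*> findTips(const std::vector<const Block*>& blocks) {
    std::set<const Block*> tips;
    for (const Block* b : blocks) {
        assert(b != nullptr);  // precondition of this sketch: no null entries
        tips.insert(b);
    }
    for (const Block* b : blocks) {
        if (b->prev != nullptr) {
            tips.erase(b->prev);
        }
    }
    return tips;
}
```

With only a genesis block there is nothing to erase, so the set never holds more than one element and the comparator is barely exercised. A chain with thousands of blocks and some forks drives far more insertions and comparisons, which is consistent with the crash only appearing on longer chains.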
