Add a test for concurrent removal of all tiered devices at once #27

Open
wants to merge 1 commit into
base: master

ojab
Contributor

@ojab ojab commented Nov 7, 2023

Quick & dirty test. Before merge, the echo & fs us calls should be removed, join_by and the like should be used, and the dd size & cat count should be tweaked. I don't even know whether it works once the issues below are fixed.

  1. Invalid option metadata_replicas: too big (max 4)
  2. After setting metadata_replicas=3, we get Invalid option data_replicas: too big (max 4)
  3. After setting data_replicas=3, we get either a hang (ssh gets disconnected, output stops) or, less often, a backtrace, for example:
bcachefs (ad5a54fa-d6bc-4b4e-9435-7e90827cf772): going read-write
DONE
------------[ cut here ]------------
------------[ cut here ]------------
btree trans held srcu lock (delaying memory reclaim) for 39 seconds
btree trans held srcu lock (delaying memory reclaim) for 20 seconds
WARNING: CPU: 1 PID: 411 at fs/bcachefs/btree_iter.c:2838 bch2_trans_put+0x4f5/0x550
WARNING: CPU: 0 PID: 53 at fs/bcachefs/btree_iter.c:2838 bch2_trans_srcu_unlock+0x150/0x160
Modules linked in:
Modules linked in:
CPU: 0 PID: 53 Comm: kworker/u17:0 Not tainted 6.6.0-ktest-03573-g6c5850f4860d #3
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
Workqueue: writeback wb_workfn (flush-bcachefs-2)

RIP: 0010:bch2_trans_srcu_unlock+0x150/0x160
CPU: 1 PID: 411 Comm: bch-reclaim/ad5 Not tainted 6.6.0-ktest-03573-g6c5850f4860d #3
Code: 46 9e cd 00 48 c7 c7 f8 28 00 82 48 b8 cf f7 53 e3 a5 9b c4 20 48 29 ca 48 d1 ea 48 f7 e2 48 89 d6 48 c1 ee 04 e8 70 f2 b3 ff <0f> 0b e9 7c ff ff ff 0f 0b 0f 0b 0f 0b eb 87 90 66 0f 1f 00 0f 1f
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RSP: 0018:ffff88810082b9b8 EFLAGS: 00010296
RIP: 0010:bch2_trans_put+0x4f5/0x550

Code: d1 8c cd 00 48 c7 c7 f8 28 00 82 48 b8 cf f7 53 e3 a5 9b c4 20 48 29 ca 48 d1 ea 48 f7 e2 48 89 d6 48 c1 ee 04 e8 fb e0 b3 ff <0f> 0b 8b b3 90 00 00 00 49 8d bc 24 70 35 00 00 83 fe 01 77 3b e8
RAX: 0000000000000043 RBX: ffff888108ba4000 RCX: 0000000000000027
RDX: ffff88817981c3c8 RSI: 0000000000000001 RDI: ffff88817981c3c0
RBP: ffff88810082b9d0 R08: ffffffff81e8bd80 R09: 000000000002cdb0
R10: 0000000000000001 R11: ffff88810007c758 R12: ffff88810af40000
RSP: 0018:ffff888108117d38 EFLAGS: 00010292
R13: ffff888108ba6530 R14: 0000000000000003 R15: 00000000fffff790

FS:  0000000000000000(0000) GS:ffff888179800000(0000) knlGS:0000000000000000
RAX: 0000000000000043 RBX: ffff888105544000 RCX: 0000000000000027
RDX: ffff88817985c3c8 RSI: 0000000000000001 RDI: ffff88817985c3c0
RBP: ffff888108117d68 R08: 0000000000000003 R09: ffff88817dbfe000
R10: 0000000000000001 R11: ffff88817dbfe000 R12: ffff88810af40000
R13: ffff88810af40000 R14: ffff88810af434a0 R15: ffff88811692e170
FS:  0000000000000000(0000) GS:ffff888179840000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f60f6e6ee84 CR3: 000000000222a000 CR4: 00000000003506a0
Call Trace:
 <TASK>
 ? show_regs+0x65/0x70
 ? __warn+0x89/0x130
 ? bch2_trans_put+0x4f5/0x550
 ? report_bug+0x159/0x180
 ? prb_read_valid+0x20/0x30
 ? handle_bug+0x40/0x70
 ? exc_invalid_op+0x1c/0x70
 ? asm_exc_invalid_op+0x1f/0x30
 ? bch2_trans_put+0x4f5/0x550
 ? bch2_trans_put+0x4f5/0x550
 bch2_btree_key_cache_journal_flush+0x1a0/0x240
 ? bch2_btree_key_cache_journal_flush+0x9e/0x240
 journal_flush_pins.constprop.0+0x183/0x2d0
 __bch2_journal_reclaim+0x2d0/0x460
 bch2_journal_reclaim_thread+0x80/0x160
 ? __bch2_journal_reclaim+0x460/0x460
 kthread+0xdb/0x100
 ? kthread_complete_and_exit+0x30/0x30
 ret_from_fork+0x3a/0x60
 ? kthread_complete_and_exit+0x30/0x30
 ret_from_fork_asm+0x11/0x20
 </TASK>
---[ end trace 0000000000000000 ]---
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6c5ed106f4 CR3: 000000000222a000 CR4: 00000000003506b0
Call Trace:
 <TASK>
 ? show_regs+0x65/0x70
 ? __warn+0x89/0x130
 ? bch2_trans_srcu_unlock+0x150/0x160
 ? report_bug+0x159/0x180
 ? handle_bug+0x40/0x70
 ? exc_invalid_op+0x1c/0x70
 ? asm_exc_invalid_op+0x1f/0x30
 ? bch2_trans_srcu_unlock+0x150/0x160
 ? bch2_trans_srcu_unlock+0x150/0x160
 bch2_trans_begin+0x5da/0x6a0
 bch2_write_inode+0x79/0x200
 ? bch2_getattr+0x130/0x130
 ? bch2_inode_peek_nowarn.isra.0+0x92/0x100
 bch2_vfs_write_inode+0x49/0x80
 __writeback_single_inode+0x24d/0x2d0
 writeback_sb_inodes+0x1a4/0x450
 __writeback_inodes_wb+0x54/0xf0
 ? queue_io+0xf1/0x100
 wb_writeback+0x233/0x280
 wb_workfn+0x2dc/0x410
 ? __switch_to+0x131/0x460
 process_one_work+0x138/0x2c0
 worker_thread+0x2ea/0x420
 ? rescuer_thread+0x400/0x400
 kthread+0xdb/0x100
 ? kthread_complete_and_exit+0x30/0x30
 ret_from_fork+0x3a/0x60
 ? kthread_complete_and_exit+0x30/0x30
 ret_from_fork_asm+0x11/0x20
 </TASK>
---[ end trace 0000000000000000 ]---
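
The two warnings above fired on different CPUs, so their lines are interleaved in the output. As a triage aid, here is a minimal Python sketch that pulls out which task hit which warning. It assumes the `WARNING: CPU: … PID: … at file:line symbol+offset` header format seen in this log. The names `WARNING_RE` and `find_warnings` are hypothetical and not part of ktest or bcachefs.

```python
import re

# Matches kernel warning headers in the format seen in the log above, e.g.
# WARNING: CPU: 1 PID: 411 at fs/bcachefs/btree_iter.c:2838 bch2_trans_put+0x4f5/0x550
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) (?P<symbol>[\w.]+)\+"
)


def find_warnings(log_text):
    """Return one dict per WARNING header found in log_text, in order."""
    results = []
    for m in WARNING_RE.finditer(log_text):
        results.append({
            "cpu": int(m["cpu"]),
            "pid": int(m["pid"]),
            "file": m["file"],
            "line": int(m["line"]),
            "symbol": m["symbol"],
        })
    return results


# The two header lines copied from the log in this comment.
SAMPLE = """\
WARNING: CPU: 1 PID: 411 at fs/bcachefs/btree_iter.c:2838 bch2_trans_put+0x4f5/0x550
WARNING: CPU: 0 PID: 53 at fs/bcachefs/btree_iter.c:2838 bch2_trans_srcu_unlock+0x150/0x160
"""

if __name__ == "__main__":
    for w in find_warnings(SAMPLE):
        print(f"CPU {w['cpu']} PID {w['pid']}: {w['symbol']} at {w['file']}:{w['line']}")
```

Both headers point at the same source line, fs/bcachefs/btree_iter.c:2838, reached from two different call paths (journal reclaim and inode writeback).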

@ojab ojab changed the title from "max replicas & concurrent removal issues" to "Add a test for concurrent removal of all tiered devices at once" Nov 7, 2023
@ojab ojab marked this pull request as ready for review November 7, 2023 23:15
@koverstreet koverstreet force-pushed the master branch 11 times, most recently from a0220af to 2b255f9 Compare July 14, 2024 22:55