feat(1808): migrate to linux 6.8.9 #2124

Open
kingluo wants to merge 25 commits into master from jinhua/feat-1808-migrate-to-linux-6.8.9

kingluo
Contributor

@kingluo kingluo commented May 21, 2024

close #1808

@kingluo kingluo force-pushed the jinhua/feat-1808-migrate-to-linux-6.8.9 branch from 669c09b to e6256ea Compare May 25, 2024 09:30
Problems:

1. In the new kernel, assembly functions uniformly return through
   `__x86_return_thunk`. Our assembly code still uses the plain `ret`
   instruction, so objtool in the kernel reports a naked return during
   compilation.

2. `SYM_FUNC_START` in the new kernel adds `endbr64` to the head of
   each assembly function. With IBT enforced, every indirect jump must
   land on an ENDBR instruction, so an indirect jump to any other
   location, such as a code snippet within the same function, fails.
   We use jump tables in our assembly functions to perform exactly such
   indirect jumps, which raises a CET exception:
   https://en.wikipedia.org/wiki/X86_instruction_listings#Added_with_Intel_CET

Solutions:

1. Replace `ret` with `RET`, a macro in the new kernel that ensures the
   correct return.

2. Use `notrack jmp` for the jump-table jumps and enable notrack in the
   CPU settings:
   `wrmsrl(MSR_IA32_S_CET, CET_ENDBR_EN | CET_NO_TRACK_EN)`
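
To make the MSR value in solution 2 concrete, here is a minimal user-space sketch that only computes the bit mask; it does not touch any MSR. The constants are redefined locally and are an assumption based on the Intel SDM layout of `IA32_S_CET` (ENDBR_EN at bit 2, NO_TRACK_EN at bit 4); the helper name `cet_s_value` is hypothetical and not part of this PR.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, mirroring the Intel SDM layout of IA32_S_CET. */
#define MSR_IA32_S_CET   0x6a2u               /* supervisor CET MSR index */
#define CET_ENDBR_EN     (UINT64_C(1) << 2)   /* enforce ENDBR landing pads */
#define CET_NO_TRACK_EN  (UINT64_C(1) << 4)   /* honour the notrack prefix */

/*
 * Hypothetical helper: compute the value that would be handed to
 * wrmsrl(MSR_IA32_S_CET, ...). Nothing is written to hardware here.
 */
uint64_t cet_s_value(int allow_notrack)
{
    uint64_t v = CET_ENDBR_EN;

    if (allow_notrack)
        v |= CET_NO_TRACK_EN;

    /* IBT itself stays enabled in both configurations. */
    assert(v & CET_ENDBR_EN);
    return v;
}
```

With both bits set the value is `0x14`; with IBT alone it is `0x4`. The point of the sketch is that `notrack` is an additional opt-in on top of IBT, not a replacement for it.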

As an aside, if a user-mode C program uses a switch statement that
meets the conditions for generating a jump table (gcc uses
`-fcf-protection=full` by default), the generated jump table uses a
`jmp` with the `notrack` prefix, and IBT is marked as enabled in the
`.note.gnu.property` section of the compiled ELF file, so that
`NO_TRACK_EN` in the MSR is set for user mode when the kernel loads the
program. User mode can therefore use `notrack` to bypass CET without
caring whether `NO_TRACK_EN` is set.
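
The user-mode case described above can be tried with a small sketch like the following. The function name `dispatch` and its cases are hypothetical; whether a jump table is actually emitted depends on the compiler version and optimization level, so treat this as an experiment rather than a guarantee.

```c
#include <assert.h>

/*
 * Dense, consecutive case values with distinct bodies make it likely
 * (not certain) that the compiler lowers this switch to a jump table.
 */
int dispatch(int op, int x)
{
    switch (op) {
    case 0: return x + 1;
    case 1: return x - 1;
    case 2: return x * 2;
    case 3: return x / 2;
    case 4: return -x;
    case 5: return x * x;
    case 6: return x ^ 0x55;
    case 7: return x & 0x0f;
    default: return 0;
    }
}

/* Small sanity check of the dispatch logic itself. */
void dispatch_self_check(void)
{
    assert(dispatch(0, 41) == 42);
    assert(dispatch(9, 1) == 0);
}
```

Compiling with `gcc -O2 -fcf-protection=full -S` and searching the assembly for `notrack jmp` shows whether the prefix was used; `readelf -n` on the linked binary lists the x86 feature properties recorded in `.note.gnu.property`.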
@kingluo kingluo marked this pull request as ready for review June 26, 2024 08:17
@biathlon3
Contributor

This crash sometimes occurs during tests.
It is not tied to one specific test, but this instance happened in forwarding.test_match_host_forwarded_regex.TestMatchLocationsH2.test_host_WorkShop_uri_testwiki from PR#649

[  509.314542] ------------[ cut here ]------------
[  509.321182] [tdb] Close table 'sessions0.tdb'
[  569.337733] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  569.337736] rcu:     0-...!: (0 ticks this GP) idle=01c4/1/0x4000000000000000 softirq=8073/8073 fqs=0
[  569.337740] rcu:     2-...!: (0 ticks this GP) idle=a294/1/0x4000000000000000 softirq=8721/8721 fqs=0
[  569.337741] rcu:     (detected by 3, t=15005 jiffies, g=9857, q=37 ncpus=4)
[  569.337743] Sending NMI from CPU 3 to CPUs 0:
[  566.596664] NMI backtrace for cpu 0
[  566.596664] CPU: 0 PID: 4226 Comm: sysctl Tainted: G           OE      6.8.9+ #1
[  566.596664] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
[  566.596664] RIP: 0010:vprintk_emit+0x258/0x320
[  566.596664] Code: 4d 85 ed 74 57 65 48 8b 04 25 40 41 03 00 49 39 c5 74 49 48 c7 c7 98 dd 39 84 c6 05 c0 de 20 03 01 e8 9c a9 ef 00 eb 02 f3 90 <0f> b6 1d b0 de 20 03 80 fb 01 0f 87 73 6c e7 00 83 e3 01 75 e9 e8
[  566.596664] RSP: 0018:ffffc90001377ac0 EFLAGS: 00000002
[  566.596664] RAX: 0000000000000001 RBX: 0000000000000001 RCX: ffffffff83380788
[  566.596664] RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffffffff8439dd98
[  566.596664] RBP: ffffc90001377b00 R08: 0000000000000021 R09: 00000000843b22d4
[  566.596664] R10: ffffffffffffffff R11: 0000000000000025 R12: 0000000000000246
[  566.596664] R13: ffff888171b119c0 R14: 0000000000000021 R15: ffffffffc0ab915e
[  566.596664] FS:  00007f0ed78e6740(0000) GS:ffff888277c00000(0000) knlGS:0000000000000000
[  566.596664] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  566.596664] CR2: 00007f0d779a1548 CR3: 0000000171b3e000 CR4: 0000000000750ef0
[  566.596664] PKRU: 55555554
[  566.596664] Call Trace:
[  566.596664]  <NMI>
[  566.596664]  ? show_regs+0x6e/0x80
[  566.596664]  ? nmi_cpu_backtrace+0xb1/0x120
[  566.596664]  ? nmi_cpu_backtrace_handler+0x15/0x20
[  566.596664]  ? nmi_handle+0x68/0x180
[  566.596664]  ? default_do_nmi+0x45/0x120
[  566.596664]  ? exc_nmi+0x12e/0x1b0
[  566.596664]  ? end_repeat_nmi+0xf/0x60
[  566.596664]  ? vprintk_emit+0x258/0x320
[  566.596664]  ? vprintk_emit+0x258/0x320
[  566.596664]  ? vprintk_emit+0x258/0x320
[  566.596664]  </NMI>
[  566.596664]  <TASK>
[  566.596664]  ? pcpu_free_area+0x1fd/0x320
[  566.596664]  vprintk_default+0x21/0x30
[  566.596664]  vprintk+0x40/0x70
[  566.596664]  _printk+0x5c/0x80
[  566.596664]  tdb_close+0x4e/0x70 [tempesta_db]
[  566.596664]  tfw_http_sess_stop+0x31/0x40 [tempesta_fw]
[  566.596664]  tfw_mods_stop+0x35/0xc0 [tempesta_fw]
[  566.596664]  tfw_ctlfn_state_io+0x1c3/0x4e0 [tempesta_fw]
[  566.596664]  ? __pfx_tfw_ctlfn_state_io+0x10/0x10 [tempesta_fw]
[  566.596664]  ? kvmalloc_node+0x2a/0x100
[  566.596664]  proc_sys_call_handler+0x1b3/0x2d0
[  566.596664]  proc_sys_write+0x17/0x20
[  566.596664]  vfs_write+0x311/0x430
[  566.596664]  ksys_write+0x6b/0xf0
[  566.596664]  __x64_sys_write+0x1d/0x30
[  566.596664]  x64_sys_call+0x1681/0x20c0
[  566.596664]  do_syscall_64+0x72/0x120
[  566.596664]  ? __count_memcg_events+0x6f/0x110
[  566.596664]  ? count_memcg_events.constprop.0+0x1e/0x40
[  566.596664]  ? handle_mm_fault+0x192/0x2f0
[  566.596664]  ? do_user_addr_fault+0x33f/0x6c0
[  566.596664]  ? irqentry_exit_to_user_mode+0x65/0x180
[  566.596664]  ? irqentry_exit+0x3f/0x50
[  566.596664]  ? clear_bhb_loop+0x25/0x80
[  566.596664]  ? clear_bhb_loop+0x25/0x80
[  566.596664]  ? clear_bhb_loop+0x25/0x80
[  566.596664]  ? clear_bhb_loop+0x25/0x80
[  566.596664]  ? clear_bhb_loop+0x25/0x80
[  566.596664]  entry_SYSCALL_64_after_hwframe+0x78/0x80
[  566.596664] RIP: 0033:0x7f0ed7714887
[  566.596664] Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[  566.596664] RSP: 002b:00007fff0a433748 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  566.596664] RAX: ffffffffffffffda RBX: 000055d0974204a0 RCX: 00007f0ed7714887
[  566.596664] RDX: 0000000000000005 RSI: 000055d0974204e0 RDI: 0000000000000004
[  566.596664] RBP: 000055d097422610 R08: 0000000000000010 R09: 000055d097422610
[  566.596664] R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000005
[  566.596664] R13: 0000000000000005 R14: 00007f0ed7816b80 R15: 00007f0ed7816a00
[  566.596664]  </TASK>
[  569.338739] Sending NMI from CPU 3 to CPUs 2:
[  569.337736] NMI backtrace for cpu 2
[  569.337736] CPU: 2 PID: 994 Comm: kworker/u16:1 Tainted: G           OE      6.8.9+ #1
[  569.337736] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014

Full log
crush2.txt

@kingluo
Contributor Author

kingluo commented Aug 13, 2024

@biathlon3 Please describe which commit of this branch the crash happens at.

@biathlon3
Contributor

There is also a very rare case of OOM during compilation, on a VM with 4 CPUs running `make -j4`:

[ 7715.787707] process 'tempesta/tls/t/tgen_ec256' started with executable stack
[ 7770.509446] cc1 invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
[ 7770.510470] CPU: 2 PID: 26858 Comm: cc1 Tainted: G        W  OE      6.8.9+ #1
[ 7770.511212] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
[ 7770.512200] Call Trace:
[ 7770.512677]  <TASK>
[ 7770.512923]  dump_stack_lvl+0x70/0x90
[ 7770.512923]  dump_stack+0x14/0x20
[ 7770.512923]  dump_header+0x47/0x1c0
[ 7770.512923]  out_of_memory+0x461/0x570
[ 7770.512923]  __alloc_pages+0x101a/0x1230
[ 7770.512923]  alloc_pages_mpol+0x95/0x210
[ 7770.512923]  ? filemap_alloc_folio+0xf9/0x100
[ 7770.512923]  alloc_pages+0x62/0xd0
[ 7770.512923]  folio_alloc+0x1c/0x50
[ 7770.512923]  filemap_alloc_folio+0xf9/0x100
[ 7770.512923]  __filemap_get_folio+0x116/0x2f0
[ 7770.512923]  filemap_fault+0x170/0xcd0
[ 7770.512923]  __do_fault+0x38/0x130
[ 7770.512923]  do_fault+0x279/0x4a0
[ 7770.512923]  __handle_mm_fault+0x8b0/0xed0
[ 7770.512923]  handle_mm_fault+0xc7/0x2f0
[ 7770.512923]  do_user_addr_fault+0x168/0x6c0
[ 7770.512923]  exc_page_fault+0x7d/0x190
[ 7770.512923]  asm_exc_page_fault+0x2b/0x30
[ 7770.512923] RIP: 0033:0x5fed26
[ 7770.512923] Code: Unable to access opcode bytes at 0x5fecfc.
[ 7770.512923] RSP: 002b:00007ffcb2ee4c80 EFLAGS: 00010206
[ 7770.512923] RAX: 00007fe6f2fc2c40 RBX: 0000000000000001 RCX: 0000000000000001
[ 7770.512923] RDX: 00007fe6f2fc3900 RSI: 0000012000000003 RDI: 00007fe6f6100dc8
[ 7770.512923] RBP: 00007fe6f2fc2be0 R08: 000000000000002f R09: 000000000000007f
[ 7770.512923] R10: 00007fe6f2fc2c40 R11: 0000000000000000 R12: 00007fe6f2fc1300
[ 7770.512923] R13: 0000000000000060 R14: 00007fe6f861b930 R15: 00007fe6f2fc2c40
[ 7770.512923]  </TASK>

Full log
crash2.txt

@biathlon3
Contributor

> @biathlon3 Please describe which commit of this branch the crash happens at.

#2161

@kingluo
Contributor Author

kingluo commented Aug 13, 2024

@biathlon3 Have you made changes beyond this PR to adapt #2131? If so, please report the error in #2131 instead of here. I cannot reproduce your error.

Development

Successfully merging this pull request may close these issues.

Migrate to a Linux 6.8 kernel