-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
drm/stm: ltdc: fix pinctrl recovery after sleep #17
base: v5.10-stm32mp
Are you sure you want to change the base?
drm/stm: ltdc: fix pinctrl recovery after sleep #17
Conversation
When the encoder is disabled, the pinctrl goes to sleep state. If the encoder is re-enabled after that, the pinctrl needs to go back to default state. This wasn't happening after the pinctrl setup was moved to ltdc_encoder_mode_set. Without this fix, if the framebuffer is put to sleep and then awakened (e.g. using /sys/class/graphics/fb0/blank), it doesn't recover properly. Signed-off-by: Doug Brown <[email protected]> Fixes: f412af1 ("drm/stm: ltdc: move pinctrl to encoder mode set")
[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] STMicroelectronics#13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] STMicroelectronics#14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] STMicroelectronics#15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] STMicroelectronics#16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] STMicroelectronics#17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 STMicroelectronics#18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 STMicroelectronics#19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 STMicroelectronics#20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf STMicroelectronics#21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 STMicroelectronics#22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 STMicroelectronics#23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 STMicroelectronics#24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff STMicroelectronics#25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f STMicroelectronics#26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f22881d ] In calipso_map_cat_ntoh(), in the for loop, if the return value of netlbl_bitmap_walk() is equal to (net_clen_bits - 1), when netlbl_bitmap_walk() is called next time, out-of-bounds memory accesses of bitmap[byte_offset] occurs. The bug was found during fuzzing. The following is the fuzzing report BUG: KASAN: slab-out-of-bounds in netlbl_bitmap_walk+0x3c/0xd0 Read of size 1 at addr ffffff8107bf6f70 by task err_OH/252 CPU: 7 PID: 252 Comm: err_OH Not tainted 5.17.0-rc7+ STMicroelectronics#17 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x21c/0x230 show_stack+0x1c/0x60 dump_stack_lvl+0x64/0x7c print_address_description.constprop.0+0x70/0x2d0 __kasan_report+0x158/0x16c kasan_report+0x74/0x120 __asan_load1+0x80/0xa0 netlbl_bitmap_walk+0x3c/0xd0 calipso_opt_getattr+0x1a8/0x230 calipso_sock_getattr+0x218/0x340 calipso_sock_getattr+0x44/0x60 netlbl_sock_getattr+0x44/0x80 selinux_netlbl_socket_setsockopt+0x138/0x170 selinux_socket_setsockopt+0x4c/0x60 security_socket_setsockopt+0x4c/0x90 __sys_setsockopt+0xbc/0x2b0 __arm64_sys_setsockopt+0x6c/0x84 invoke_syscall+0x64/0x190 el0_svc_common.constprop.0+0x88/0x200 do_el0_svc+0x88/0xa0 el0_svc+0x128/0x1b0 el0t_64_sync_handler+0x9c/0x120 el0t_64_sync+0x16c/0x170 Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Yufen <[email protected]> Acked-by: Paul Moore <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 4e264be ] When a system with E810 with existing VFs gets rebooted the following hang may be observed. Pid 1 is hung in iavf_remove(), part of a network driver: PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow" #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930 #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf] #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513 #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice] #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice] STMicroelectronics#13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice] STMicroelectronics#14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1 STMicroelectronics#15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386 STMicroelectronics#16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870 STMicroelectronics#17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6 STMicroelectronics#18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159 STMicroelectronics#19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc STMicroelectronics#20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d STMicroelectronics#21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169 STMicroelectronics#22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7 RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90 R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005 R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000 ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b During reboot all drivers PM shutdown callbacks are invoked. In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE. In ice_shutdown() the call chain above is executed, which at some point calls iavf_remove(). However iavf_remove() expects the VF to be in one of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If that's not the case it sleeps forever. So if iavf_shutdown() gets invoked before iavf_remove() the system will hang indefinitely because the adapter is already in state __IAVF_REMOVE. Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE, as we already went through iavf_shutdown(). Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove") Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove") Reported-by: Marius Cornea <[email protected]> Signed-off-by: Stefan Assmann <[email protected]> Reviewed-by: Michal Kubiak <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Is there a realistic world in which this PR gets merged at this point? Or should I try submitting it to the upstream kernel instead? |
Hello Doug, sorry for the delay and thanks for your contribution. I will push internally for rapid analysis and recovery of normal process. |
The fix is already part of last delivery (V4.1). It will be also backported (as your request) on V3.1.3 release (this summer). |
Hi Bernard, Thanks for looking into this. Looking at the latest code in the v5.15-stm32mp branch of this repository it looks like the problem is still there ( |
after review, the first patch was not in V4.1 but is planned to be also delivered for V4.1.1. So a review will be done to check all the use cases are taken into account with the 2 patches for V4.1.1. |
@BernardPuel so there is the same behaviour for DSI too but on different kernel version (5.15.67) with run on board STM32MP157F-DK2, but patch fixed only non-DSI part of code. |
Here is the patch planned for V4.1.1: diff --git a/drivers/gpu/drm/stm/ltdc.c b/drivers/gpu/drm/stm/ltdc.c
index 3dcd63f..f7b9258 100644
--- a/drivers/gpu/drm/stm/ltdc.c
+++ b/drivers/gpu/drm/stm/ltdc.c
@@ -1060,6 +1060,20 @@
}
}
+static int ltdc_crtc_atomic_check(struct drm_crtc *crtc,
+ struct drm_atomic_state *state)
+{
+ struct drm_crtc_state *crtc_state = drm_atomic_get_new_crtc_state(state, crtc);
+
+ DRM_DEBUG_ATOMIC("\n");
+
+ /* force a full mode set if active state changed */
+ if (crtc_state->active_changed)
+ crtc_state->mode_changed = true;
+
+ return 0;
+}
+
static bool ltdc_crtc_get_scanout_position(struct drm_crtc *crtc,
bool in_vblank_irq,
int *vpos, int *hpos,
@@ -1120,6 +1134,7 @@
.atomic_flush = ltdc_crtc_atomic_flush,
.atomic_enable = ltdc_crtc_atomic_enable,
.atomic_disable = ltdc_crtc_atomic_disable,
+ .atomic_check = ltdc_crtc_atomic_check,
.get_scanout_position = ltdc_crtc_get_scanout_position,
}; |
@BernardPuel thank you for patch, it works on manual test, will you have plans to release to this repo? |
Thanks for your feedback. Yes in this repo but not in 5.10 kernel branch. Only 5.15. |
[ Upstream commit e3e82fc ] When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be dereferenced as wrong struct in irdma_free_pending_cqp_request(). PID: 3669 TASK: ffff88aef892c000 CPU: 28 COMMAND: "kworker/28:0" #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34 #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2 #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f #3 [fffffe0000549eb8] do_nmi at ffffffff81079582 #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4 [exception RIP: native_queued_spin_lock_slowpath+1291] RIP: ffffffff8127e72b RSP: ffff88aa841ef778 RFLAGS: 00000046 RAX: 0000000000000000 RBX: ffff88b01f849700 RCX: ffffffff8127e47e RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff83857ec0 RBP: ffff88afe3e4efc8 R8: ffffed15fc7c9dfa R9: ffffed15fc7c9dfa R10: 0000000000000001 R11: ffffed15fc7c9df9 R12: 0000000000740000 R13: ffff88b01f849708 R14: 0000000000000003 R15: ffffed1603f092e1 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 -- <NMI exception stack> -- #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b #6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4 #7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363 #8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma] #9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma] #10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma] #11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma] #12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb STMicroelectronics#13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6 STMicroelectronics#14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278 STMicroelectronics#15 [ffff88aa841efb88] device_del at ffffffff82179d23 STMicroelectronics#16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice] STMicroelectronics#17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice] STMicroelectronics#18 [ffff88aa841efde8] process_one_work at ffffffff811c589a STMicroelectronics#19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff STMicroelectronics#20 [ffff88aa841eff10] kthread at ffffffff811d87a0 STMicroelectronics#21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/[email protected] Suggested-by: "Ismail, Mustafa" <[email protected]> Signed-off-by: Shifeng Li <[email protected]> Reviewed-by: Shiraz Saleem <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
I encountered a problem where an RGB TFT LCD panel attached to an STM32MP1 would stop working after the framebuffer was put to sleep and woken back up. The reason was that the sleep pinctrl was applied when the framebuffer went to sleep, but the default pinctrl wasn't being restored after the LTDC woke up. This is because an older commit moved the LTDC default pinctrl configuration to ltdc_encoder_mode_set, which doesn't get called again after it wakes up.
This fix was tested in the v5.4-stm32mp branch on a custom STM32MP1 board. I tested by blanking and then unblanking the framebuffer. Before this fix, the LCD display got garbled after unblanking. With this fix, it recovers correctly.
echo 1 > /sys/class/graphics/fb0/blank
echo 0 > /sys/class/graphics/fb0/blank