Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quick Charge Doesn't work #3

Open
leskal opened this issue Mar 18, 2017 · 0 comments
Open

Quick Charge Doesn't work #3

leskal opened this issue Mar 18, 2017 · 0 comments

Comments

@leskal
Copy link
Owner

leskal commented Mar 18, 2017

No description provided.

leskal pushed a commit that referenced this issue Mar 27, 2017
[ Upstream commit 293f293 ]

On arm/arm64, we depend on the kvm_unmap_hva* callbacks (via
mmu_notifiers::invalidate_*) to unmap the stage2 pagetables when
the userspace buffer gets unmapped. However, when the Hypervisor
process exits without explicit unmap of the guest buffers, the only
notifier we get is kvm_arch_flush_shadow_all() (via mmu_notifier::release
) which does nothing on arm. Later this causes us to access pages that
were already released [via exit_mmap() -> unmap_vmas()] when we actually
get to unmap the stage2 pagetable [via kvm_arch_destroy_vm() ->
kvm_free_stage2_pgd()]. This triggers crashes with CONFIG_DEBUG_PAGEALLOC,
which unmaps any free'd pages from the linear map.

 [  757.644120] Unable to handle kernel paging request at virtual address
  ffff800661e00000
 [  757.652046] pgd = ffff20000b1a2000
 [  757.655471] [ffff800661e00000] *pgd=00000047fffe3003, *pud=00000047fcd8c003,
  *pmd=00000047fcc7c003, *pte=00e8004661e00712
 [  757.666492] Internal error: Oops: 96000147 [#3] PREEMPT SMP
 [  757.672041] Modules linked in:
 [  757.675100] CPU: 7 PID: 3630 Comm: qemu-system-aar Tainted: G      D
 4.8.0-rc1 #3
 [  757.683240] Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board,
  BIOS 3.06.15 Aug 19 2016
 [  757.692938] task: ffff80069cdd3580 task.stack: ffff8006adb7c000
 [  757.698840] PC is at __flush_dcache_area+0x1c/0x40
 [  757.703613] LR is at kvm_flush_dcache_pmd+0x60/0x70
 [  757.708469] pc : [<ffff20000809dbdc>] lr : [<ffff2000080b4a70>] pstate: 20000145
 ...
 [  758.357249] [<ffff20000809dbdc>] __flush_dcache_area+0x1c/0x40
 [  758.363059] [<ffff2000080b6748>] unmap_stage2_range+0x458/0x5f0
 [  758.368954] [<ffff2000080b708c>] kvm_free_stage2_pgd+0x34/0x60
 [  758.374761] [<ffff2000080b2280>] kvm_arch_destroy_vm+0x20/0x68
 [  758.380570] [<ffff2000080aa330>] kvm_put_kvm+0x210/0x358
 [  758.385860] [<ffff2000080aa524>] kvm_vm_release+0x2c/0x40
 [  758.391239] [<ffff2000082ad234>] __fput+0x114/0x2e8
 [  758.396096] [<ffff2000082ad46c>] ____fput+0xc/0x18
 [  758.400869] [<ffff200008104658>] task_work_run+0x108/0x138
 [  758.406332] [<ffff2000080dc8ec>] do_exit+0x48c/0x10e8
 [  758.411363] [<ffff2000080dd5fc>] do_group_exit+0x6c/0x130
 [  758.416739] [<ffff2000080ed924>] get_signal+0x284/0xa18
 [  758.421943] [<ffff20000808a098>] do_signal+0x158/0x860
 [  758.427060] [<ffff20000808aad4>] do_notify_resume+0x6c/0x88
 [  758.432608] [<ffff200008083624>] work_pending+0x10/0x14
 [  758.437812] Code: 9ac32042 8b010001 d1000443 8a230000 (d50b7e20)

This patch fixes the issue by moving the kvm_free_stage2_pgd() to
kvm_arch_flush_shadow_all().

Cc: <[email protected]> # 3.9+
Tested-by: Itaru Kitayama <[email protected]>
Reported-by: Itaru Kitayama <[email protected]>
Reported-by: James Morse <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoffer Dall <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
leskal pushed a commit that referenced this issue May 22, 2017
commit d5afb6f9b6bb2c57bd0c05e76e12489dc0d037d9 upstream.

The code where sk_clone() came from created a new socket and locked it,
but then, on the error path didn't unlock it.

This problem stayed there for a long while, till b0691c8 ("net:
Unlock sock before calling sk_free()") fixed it, but unfortunately the
callers of sk_clone() (now sk_clone_locked()) were not audited and the
one in dccp_create_openreq_child() remained.

Now in the age of the syskaller fuzzer, this was finally uncovered, as
reported by Dmitry:

 ---- 8< ----

I've got the following report while running syzkaller fuzzer on
86292b33d4b7 ("Merge branch 'akpm' (patches from Andrew)")

  [ BUG: held lock freed! ]
  4.10.0+ #234 Not tainted
  -------------------------
  syz-executor6/6898 is freeing memory
  ffff88006286cac0-ffff88006286d3b7, with a lock still held there!
   (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>] spin_lock
  include/linux/spinlock.h:299 [inline]
   (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>]
  sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504
  5 locks held by syz-executor6/6898:
   #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff839a34b4>] lock_sock
  include/net/sock.h:1460 [inline]
   #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff839a34b4>]
  inet_stream_connect+0x44/0xa0 net/ipv4/af_inet.c:681
   #1:  (rcu_read_lock){......}, at: [<ffffffff83bc1c2a>]
  inet6_csk_xmit+0x12a/0x5d0 net/ipv6/inet6_connection_sock.c:126
   #2:  (rcu_read_lock){......}, at: [<ffffffff8369b424>] __skb_unlink
  include/linux/skbuff.h:1767 [inline]
   #2:  (rcu_read_lock){......}, at: [<ffffffff8369b424>] __skb_dequeue
  include/linux/skbuff.h:1783 [inline]
   #2:  (rcu_read_lock){......}, at: [<ffffffff8369b424>]
  process_backlog+0x264/0x730 net/core/dev.c:4835
   #3:  (rcu_read_lock){......}, at: [<ffffffff83aeb5c0>]
  ip6_input_finish+0x0/0x1700 net/ipv6/ip6_input.c:59
   #4:  (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>] spin_lock
  include/linux/spinlock.h:299 [inline]
   #4:  (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>]
  sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504

Fix it just like was done by b0691c8 ("net: Unlock sock before calling
sk_free()").

Reported-by: Dmitry Vyukov <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Gerrit Renker <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
leskal pushed a commit that referenced this issue Dec 4, 2017
[ Upstream commit 120e9dabaf551c6dc03d3a10a1f026376cb1811c ]

syszkaller team reported another problem in DCCP [1]

Problem here is that the structure holding RTO timer
(ccid2_hc_tx_rto_expire() handler) is freed too soon.

We can not use del_timer_sync() to cancel the timer
since this timer wants to grab socket lock (that would risk a dead lock)

Solution is to defer the freeing of memory when all references to
the socket were released. Socket timers do own a reference, so this
should fix the issue.

[1]

==================================================================
BUG: KASAN: use-after-free in ccid2_hc_tx_rto_expire+0x51c/0x5c0 net/dccp/ccids/ccid2.c:144
Read of size 4 at addr ffff8801d2660540 by task kworker/u4:7/3365

CPU: 1 PID: 3365 Comm: kworker/u4:7 Not tainted 4.13.0-rc4+ #3
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events_unbound call_usermodehelper_exec_work
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 print_address_description+0x73/0x250 mm/kasan/report.c:252
 kasan_report_error mm/kasan/report.c:351 [inline]
 kasan_report+0x24e/0x340 mm/kasan/report.c:409
 __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:429
 ccid2_hc_tx_rto_expire+0x51c/0x5c0 net/dccp/ccids/ccid2.c:144
 call_timer_fn+0x233/0x830 kernel/time/timer.c:1268
 expire_timers kernel/time/timer.c:1307 [inline]
 __run_timers+0x7fd/0xb90 kernel/time/timer.c:1601
 run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
 __do_softirq+0x2f5/0xba3 kernel/softirq.c:284
 invoke_softirq kernel/softirq.c:364 [inline]
 irq_exit+0x1cc/0x200 kernel/softirq.c:405
 exiting_irq arch/x86/include/asm/apic.h:638 [inline]
 smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:1044
 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:702
RIP: 0010:arch_local_irq_enable arch/x86/include/asm/paravirt.h:824 [inline]
RIP: 0010:__raw_write_unlock_irq include/linux/rwlock_api_smp.h:267 [inline]
RIP: 0010:_raw_write_unlock_irq+0x56/0x70 kernel/locking/spinlock.c:343
RSP: 0018:ffff8801cd50eaa8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10
RAX: dffffc0000000000 RBX: ffffffff85a090c0 RCX: 0000000000000006
RDX: 1ffffffff0b595f3 RSI: 1ffff1003962f989 RDI: ffffffff85acaf98
RBP: ffff8801cd50eab0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801cc96ea60
R13: dffffc0000000000 R14: ffff8801cc96e4c0 R15: ffff8801cc96e4c0
 </IRQ>
 release_task+0xe9e/0x1a40 kernel/exit.c:220
 wait_task_zombie kernel/exit.c:1162 [inline]
 wait_consider_task+0x29b8/0x33c0 kernel/exit.c:1389
 do_wait_thread kernel/exit.c:1452 [inline]
 do_wait+0x441/0xa90 kernel/exit.c:1523
 kernel_wait4+0x1f5/0x370 kernel/exit.c:1665
 SYSC_wait4+0x134/0x140 kernel/exit.c:1677
 SyS_wait4+0x2c/0x40 kernel/exit.c:1673
 call_usermodehelper_exec_sync kernel/kmod.c:286 [inline]
 call_usermodehelper_exec_work+0x1a0/0x2c0 kernel/kmod.c:323
 process_one_work+0xbf3/0x1bc0 kernel/workqueue.c:2097
 worker_thread+0x223/0x1860 kernel/workqueue.c:2231
 kthread+0x35e/0x430 kernel/kthread.c:231
 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:425

Allocated by task 21267:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459 [inline]
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:489
 kmem_cache_alloc+0x127/0x750 mm/slab.c:3561
 ccid_new+0x20e/0x390 net/dccp/ccid.c:151
 dccp_hdlr_ccid+0x27/0x140 net/dccp/feat.c:44
 __dccp_feat_activate+0x142/0x2a0 net/dccp/feat.c:344
 dccp_feat_activate_values+0x34e/0xa90 net/dccp/feat.c:1538
 dccp_rcv_request_sent_state_process net/dccp/input.c:472 [inline]
 dccp_rcv_state_process+0xed1/0x1620 net/dccp/input.c:677
 dccp_v4_do_rcv+0xeb/0x160 net/dccp/ipv4.c:679
 sk_backlog_rcv include/net/sock.h:911 [inline]
 __release_sock+0x124/0x360 net/core/sock.c:2269
 release_sock+0xa4/0x2a0 net/core/sock.c:2784
 inet_wait_for_connect net/ipv4/af_inet.c:557 [inline]
 __inet_stream_connect+0x671/0xf00 net/ipv4/af_inet.c:643
 inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:682
 SYSC_connect+0x204/0x470 net/socket.c:1642
 SyS_connect+0x24/0x30 net/socket.c:1623
 entry_SYSCALL_64_fastpath+0x1f/0xbe

Freed by task 3049:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459 [inline]
 kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
 __cache_free mm/slab.c:3503 [inline]
 kmem_cache_free+0x77/0x280 mm/slab.c:3763
 ccid_hc_tx_delete+0xc5/0x100 net/dccp/ccid.c:190
 dccp_destroy_sock+0x1d1/0x2b0 net/dccp/proto.c:225
 inet_csk_destroy_sock+0x166/0x3f0 net/ipv4/inet_connection_sock.c:833
 dccp_done+0xb7/0xd0 net/dccp/proto.c:145
 dccp_time_wait+0x13d/0x300 net/dccp/minisocks.c:72
 dccp_rcv_reset+0x1d1/0x5b0 net/dccp/input.c:160
 dccp_rcv_state_process+0x8fc/0x1620 net/dccp/input.c:663
 dccp_v4_do_rcv+0xeb/0x160 net/dccp/ipv4.c:679
 sk_backlog_rcv include/net/sock.h:911 [inline]
 __sk_receive_skb+0x33e/0xc00 net/core/sock.c:521
 dccp_v4_rcv+0xef1/0x1c00 net/dccp/ipv4.c:871
 ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
 NF_HOOK include/linux/netfilter.h:248 [inline]
 ip_local_deliver+0x1ce/0x6d0 net/ipv4/ip_input.c:257
 dst_input include/net/dst.h:477 [inline]
 ip_rcv_finish+0x8db/0x19c0 net/ipv4/ip_input.c:397
 NF_HOOK include/linux/netfilter.h:248 [inline]
 ip_rcv+0xc3f/0x17d0 net/ipv4/ip_input.c:488
 __netif_receive_skb_core+0x19af/0x33d0 net/core/dev.c:4417
 __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4455
 process_backlog+0x203/0x740 net/core/dev.c:5130
 napi_poll net/core/dev.c:5527 [inline]
 net_rx_action+0x792/0x1910 net/core/dev.c:5593
 __do_softirq+0x2f5/0xba3 kernel/softirq.c:284

The buggy address belongs to the object at ffff8801d2660100
 which belongs to the cache ccid2_hc_tx_sock of size 1240
The buggy address is located 1088 bytes inside of
 1240-byte region [ffff8801d2660100, ffff8801d26605d8)
The buggy address belongs to the page:
page:ffffea0007499800 count:1 mapcount:0 mapping:ffff8801d2660100 index:0x0 compound_mapcount: 0
flags: 0x200000000008100(slab|head)
raw: 0200000000008100 ffff8801d2660100 0000000000000000 0000000100000005
raw: ffffea00075271a0 ffffea0007538820 ffff8801d3aef9c0 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8801d2660400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8801d2660480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8801d2660500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                           ^
 ffff8801d2660580: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
 ffff8801d2660600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================

Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Dmitry Vyukov <[email protected]>
Cc: Gerrit Renker <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
leskal pushed a commit that referenced this issue Dec 4, 2017
commit f5527ff upstream.

This fixes CVE-2016-8650.

If mpi_powm() is given a zero exponent, it wants to immediately return
either 1 or 0, depending on the modulus.  However, if the result was
initalised with zero limb space, no limbs space is allocated and a
NULL-pointer exception ensues.

Fix this by allocating a minimal amount of limb space for the result when
the 0-exponent case when the result is 1 and not touching the limb space
when the result is 0.

This affects the use of RSA keys and X.509 certificates that carry them.

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff8138ce5d>] mpi_powm+0x32/0x7e6
PGD 0
Oops: 0002 [#1] SMP
Modules linked in:
CPU: 3 PID: 3014 Comm: keyctl Not tainted 4.9.0-rc6-fscache+ #278
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
task: ffff8804011944c0 task.stack: ffff880401294000
RIP: 0010:[<ffffffff8138ce5d>]  [<ffffffff8138ce5d>] mpi_powm+0x32/0x7e6
RSP: 0018:ffff880401297ad8  EFLAGS: 00010212
RAX: 0000000000000000 RBX: ffff88040868bec0 RCX: ffff88040868bba0
RDX: ffff88040868b260 RSI: ffff88040868bec0 RDI: ffff88040868bee0
RBP: ffff880401297ba8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000047 R11: ffffffff8183b210 R12: 0000000000000000
R13: ffff8804087c7600 R14: 000000000000001f R15: ffff880401297c50
FS:  00007f7a7918c700(0000) GS:ffff88041fb80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000401250000 CR4: 00000000001406e0
Stack:
 ffff88040868bec0 0000000000000020 ffff880401297b00 ffffffff81376cd4
 0000000000000100 ffff880401297b10 ffffffff81376d12 ffff880401297b30
 ffffffff81376f37 0000000000000100 0000000000000000 ffff880401297ba8
Call Trace:
 [<ffffffff81376cd4>] ? __sg_page_iter_next+0x43/0x66
 [<ffffffff81376d12>] ? sg_miter_get_next_page+0x1b/0x5d
 [<ffffffff81376f37>] ? sg_miter_next+0x17/0xbd
 [<ffffffff8138ba3a>] ? mpi_read_raw_from_sgl+0xf2/0x146
 [<ffffffff8132a95c>] rsa_verify+0x9d/0xee
 [<ffffffff8132acca>] ? pkcs1pad_sg_set_buf+0x2e/0xbb
 [<ffffffff8132af40>] pkcs1pad_verify+0xc0/0xe1
 [<ffffffff8133cb5e>] public_key_verify_signature+0x1b0/0x228
 [<ffffffff8133d974>] x509_check_for_self_signed+0xa1/0xc4
 [<ffffffff8133cdde>] x509_cert_parse+0x167/0x1a1
 [<ffffffff8133d609>] x509_key_preparse+0x21/0x1a1
 [<ffffffff8133c3d7>] asymmetric_key_preparse+0x34/0x61
 [<ffffffff812fc9f3>] key_create_or_update+0x145/0x399
 [<ffffffff812fe227>] SyS_add_key+0x154/0x19e
 [<ffffffff81001c2b>] do_syscall_64+0x80/0x191
 [<ffffffff816825e4>] entry_SYSCALL64_slow_path+0x25/0x25
Code: 56 41 55 41 54 53 48 81 ec a8 00 00 00 44 8b 71 04 8b 42 04 4c 8b 67 18 45 85 f6 89 45 80 0f 84 b4 06 00 00 85 c0 75 2f 41 ff ce <49> c7 04 24 01 00 00 00 b0 01 75 0b 48 8b 41 18 48 83 38 01 0f
RIP  [<ffffffff8138ce5d>] mpi_powm+0x32/0x7e6
 RSP <ffff880401297ad8>
CR2: 0000000000000000
---[ end trace d82015255d4a5d8d ]---

Basically, this is a backport of a libgcrypt patch:

	http://git.gnupg.org/cgi-bin/gitweb.cgi?p=libgcrypt.git;a=patch;h=6e1adb05d290aeeb1c230c763970695f4a538526

Fixes: cdec9cb ("crypto: GnuPG based MPI lib - source files (part 1)")
Signed-off-by: Andrey Ryabinin <[email protected]>
Signed-off-by: David Howells <[email protected]>
cc: Dmitry Kasatkin <[email protected]>
cc: [email protected]
Signed-off-by: James Morris <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
leskal pushed a commit that referenced this issue Dec 4, 2017
commit ab31fd0ce65ec93828b617123792c1bb7c6dcc42 upstream.

v4.10 commit 6f2ce1c ("scsi: zfcp: fix rport unblock race with LUN
recovery") extended accessing parent pointer fields of struct
zfcp_erp_action for tracing.  If an erp_action has never been enqueued
before, these parent pointer fields are uninitialized and NULL. Examples
are zfcp objects freshly added to the parent object's children list,
before enqueueing their first recovery subsequently. In
zfcp_erp_try_rport_unblock(), we iterate such list. Accessing erp_action
fields can cause a NULL pointer dereference.  Since the kernel can read
from lowcore on s390, it does not immediately cause a kernel page
fault. Instead it can cause hangs on trying to acquire the wrong
erp_action->adapter->dbf->rec_lock in zfcp_dbf_rec_action_lvl()
                      ^bogus^
while holding already other locks with IRQs disabled.

Real life example from attaching lots of LUNs in parallel on many CPUs:

crash> bt 17723
PID: 17723  TASK: ...               CPU: 25  COMMAND: "zfcperp0.0.1800"
 LOWCORE INFO:
  -psw      : 0x0404300180000000 0x000000000038e424
  -function : _raw_spin_lock_wait_flags at 38e424
...
 #0 [fdde8fc90] zfcp_dbf_rec_action_lvl at 3e0004e9862 [zfcp]
 #1 [fdde8fce8] zfcp_erp_try_rport_unblock at 3e0004dfddc [zfcp]
 #2 [fdde8fd38] zfcp_erp_strategy at 3e0004e0234 [zfcp]
 #3 [fdde8fda8] zfcp_erp_thread at 3e0004e0a12 [zfcp]
 #4 [fdde8fe60] kthread at 173550
 #5 [fdde8feb8] kernel_thread_starter at 10add2

zfcp_adapter
 zfcp_port
  zfcp_unit <address>, 0x404040d600000000
  scsi_device NULL, returning early!
zfcp_scsi_dev.status = 0x40000000
0x40000000 ZFCP_STATUS_COMMON_RUNNING

crash> zfcp_unit <address>
struct zfcp_unit {
  erp_action = {
    adapter = 0x0,
    port = 0x0,
    unit = 0x0,
  },
}

zfcp_erp_action is always fully embedded into its container object. Such
container object is never moved in its object tree (only add or delete).
Hence, erp_action parent pointers can never change.

To fix the issue, initialize the erp_action parent pointers before
adding the erp_action container to any list and thus before it becomes
accessible from outside of its initializing function.

In order to also close the time window between zfcp_erp_setup_act()
memsetting the entire erp_action to zero and setting the parent pointers
again, drop the memset and instead explicitly initialize individually
all erp_action fields except for parent pointers. To be extra careful
not to introduce any other unintended side effect, even keep zeroing the
erp_action fields for list and timer. Also double-check with
WARN_ON_ONCE that erp_action parent pointers never change, so we get to
know when we would deviate from previous behavior.

Signed-off-by: Steffen Maier <[email protected]>
Fixes: 6f2ce1c ("scsi: zfcp: fix rport unblock race with LUN recovery")
Reviewed-by: Benjamin Block <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant