Cannot boot using vhost-scsi controller. #2

cuinutanix · 2017-05-04T05:32:54Z

After SLOF successfully probes scsi disks on a virtio-scsi controller, it sends a write to the PCI_COMMAND register with value 0x0 which disables the PCI device, which qemu translates to virtio_set_status(vdev, vdev->status & ~VIRTIO_CONFIG_S_DRIVER_OK); which tells the vhost backend that the driver wants the device stopped.

qemu's own implementation of virtio-scsi seems to ignore the fact that the pci device is disabled, and will happily continue to process scsi requests from the ring, and guests are able to boot. However, with a vhost-scsi backend, qemu tells the vhost backend to stop processing the ring, and the guest does not boot.

Here is a gdb stack trace of when qemu disables the vhost-scsi device:

#0  0x00000000202f63e8 in vhost_user_scsi_set_status (vdev=0x2173ffb0, 
    status=11 '\v')
    at /home/mcui/workspace/NXPower/qemu/hw/scsi/vhost-user-scsi.c:43
#1  0x000000002030fd9c in virtio_set_status (vdev=0x2173ffb0, val=11 '\v')
    at /home/mcui/workspace/NXPower/qemu/hw/virtio/virtio.c:898
#2  0x00000000205c4b38 in virtio_write_config (pci_dev=0x21737ba0, address=4, 
    val=1048832, len=4) at hw/virtio/virtio-pci.c:633
#3  0x000000002055bcec in pci_host_config_write_common (pci_dev=0x21737ba0, 
    addr=<optimized out>, limit=<optimized out>, val=<optimized out>, 
    len=<optimized out>) at hw/pci/pci_host.c:66
#4  0x0000000020341810 in finish_write_pci_config (spapr=<optimized out>, 
    buid=<optimized out>, addr=<optimized out>, size=<optimized out>, 
    val=<optimized out>, rets=2121530032)
    at /home/mcui/workspace/NXPower/qemu/hw/ppc/spapr_pci.c:200
#5  0x000000002033c99c in spapr_rtas_call (cpu=<optimized out>, 
    spapr=<optimized out>, token=<optimized out>, nargs=<optimized out>, 
    args=<optimized out>, nret=<optimized out>, rets=<optimized out>)
    at /home/mcui/workspace/NXPower/qemu/hw/ppc/spapr_rtas.c:665
#6  0x0000000020336784 in h_rtas (cpu=0x3fffb6260010, spapr=0x210aee80, 
    opcode=<optimized out>, args=<optimized out>)
    at /home/mcui/workspace/NXPower/qemu/hw/ppc/spapr_hcall.c:667
#7  0x0000000020339338 in spapr_hypercall (cpu=0x3fffb6260010, opcode=61440, 
    args=0x3fffb6200030)
    at /home/mcui/workspace/NXPower/qemu/hw/ppc/spapr_hcall.c:1243
#8  0x000000002040c6b4 in kvm_arch_handle_exit (cs=0x3fffb6260010, 
    run=0x3fffb6200000)
    at /home/mcui/workspace/NXPower/qemu/target-ppc/kvm.c:1817
#9  0x00000000202a25f8 in kvm_cpu_exec (cpu=0x3fffb6260010)
    at /home/mcui/workspace/NXPower/qemu/kvm-all.c:2016
#10 0x0000000020288350 in qemu_kvm_cpu_thread_fn (arg=0x3fffb6260010)
    at /home/mcui/workspace/NXPower/qemu/cpus.c:998
#11 0x00003fffb7488728 in start_thread () from /lib64/libpthread.so.0
#12 0x00003fffb73bd210 in clone () from /lib64/libc.so.6

You would need to set up vhost-scsi in order to reproduce the exact same issue. You can observe that the VIRTIO_CONFIG_S_DRIVER_OK status bit being cleared even with a standard virtio-scsi backend by setting a breakpoint on virtio_set_status() and observe that the status goes from 0 to 0xf (fully functional), then after the bus scan completes, the status is set to 0xb, with VIRTIO_CONFIG_S_DRIVER_OK cleared.

The text was updated successfully, but these errors were encountered:

mdroth · 2017-05-04T14:48:58Z

the register value in the trace seems to be disabling io/mem too? that could be pci-device-disable in board-qemu/slof/pci-device_1af4_1004.fs

i wonder if simply removing the pci-device-disable from that file allow the boot to continue?

cuinutanix · 2017-05-04T15:21:44Z

No it doesn't. It just removes 1 call to pci-device-disable but there are others, from pci-device.fs:

\ prepare the device for subsequent use
\ this word should be overloaded by the device file (if present)
\ the device file can call this file before implementing
\ its own open functionality
: open
        puid >r             \ save the old puid
        my-puid TO puid     \ set up the puid to the devices Hostbridge
        pci-master-enable   \ And enable Bus Master, IO and MEM access again.
        pci-mem-enable      \ enable mem access
        pci-io-enable       \ enable io access
        r> TO puid          \ restore puid
        true
;

\ close the previously opened device
\ this word should be overloaded by the device file (if present)
\ the device file can call this file after its implementation
\ of own close functionality
: close 
        puid >r             \ save the old puid
        my-puid TO puid     \ set up the puid
        pci-device-disable  \ and disable the device
        r> TO puid          \ restore puid
;

Node that the open method enables the device, but that's not getting called.

cuinutanix · 2017-05-04T15:29:30Z

OK, I think I figured it out, clue is in the comment:

\ prepare the device for subsequent use
\ this word should be overloaded by the device file (if present)
\ the device file can call this file before implementing
\ its own open functionality

virtio-scsi.fs does not have the "open" word.

laggarcia · 2017-05-11T16:25:43Z

Closing as per previous comment.

RH-Author: Marc-André Lureau <[email protected]> Message-id: <[email protected]> Patchwork-id: 70942 O-Subject: [RHEV-7.3 qemu-kvm-rhev PATCH v2 5/5] char: do not use atexit cleanup handler Bugzilla: 1347077 RH-Acked-by: Xiao Wang <[email protected]> RH-Acked-by: Laurent Vivier <[email protected]> RH-Acked-by: Victor Kaplansky <[email protected]> RH-Acked-by: Paolo Bonzini <[email protected]> From: Marc-André Lureau <[email protected]> It turns out qemu is calling exit() in various places from various threads without taking much care of resources state. The atexit() cleanup handlers cannot easily destroy resources that are in use (by the same thread or other). Since c1111a2, TCG arm guests run into the following abort() when running tests, the chardev mutex is locked during the write, so qemu_mutex_destroy() returns an error: #0 0x00007fffdbb806f5 in raise () at /lib64/libc.so.6 #1 0x00007fffdbb822fa in abort () at /lib64/libc.so.6 #2 0x00005555557616fe in error_exit (err=<optimized out>, msg=msg@entry=0x555555c38c30 <__func__.14622> "qemu_mutex_destroy") at /home/drjones/code/qemu/util/qemu-thread-posix.c:39 #3 0x0000555555b0be20 in qemu_mutex_destroy (mutex=mutex@entry=0x5555566aa0e0) at /home/drjones/code/qemu/util/qemu-thread-posix.c:57 open-power-host-os#4 0x00005555558aab00 in qemu_chr_free_common (chr=0x5555566aa0e0) at /home/drjones/code/qemu/qemu-char.c:4029 open-power-host-os#5 0x00005555558b05f9 in qemu_chr_delete (chr=<optimized out>) at /home/drjones/code/qemu/qemu-char.c:4038 open-power-host-os#6 0x00005555558b05f9 in qemu_chr_delete (chr=<optimized out>) at /home/drjones/code/qemu/qemu-char.c:4044 open-power-host-os#7 0x00005555558b062c in qemu_chr_cleanup () at /home/drjones/code/qemu/qemu-char.c:4557 open-power-host-os#8 0x00007fffdbb851e8 in __run_exit_handlers () at /lib64/libc.so.6 open-power-host-os#9 0x00007fffdbb85235 in () at /lib64/libc.so.6 open-power-host-os#10 0x00005555558d1b39 in testdev_write (testdev=0x5555566aa0a0) at /home/drjones/code/qemu/backends/testdev.c:71 open-power-host-os#11 0x00005555558d1b39 in testdev_write (chr=<optimized out>, buf=0x7fffc343fd9a "", len=0) at /home/drjones/code/qemu/backends/testdev.c:95 open-power-host-os#12 0x00005555558adced in qemu_chr_fe_write (s=0x5555566aa0e0, buf=buf@entry=0x7fffc343fd98 "0q", len=len@entry=2) at /home/drjones/code/qemu/qemu-char.c:282 Instead of using a atexit() handler, only run the chardev cleanup as initially proposed at the end of main(), where there are less chances (hic) of conflicts or other races. Signed-off-by: Marc-André Lureau <[email protected]> Reported-by: Andrew Jones <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> (cherry picked from commit dd9250d5d99a2671584f65b7d0745355bbbe353f) Signed-off-by: Marc-André Lureau <[email protected]> Signed-off-by: Miroslav Rezanina <[email protected]>

RH-Author: Markus Armbruster <[email protected]> Message-id: <[email protected]> Patchwork-id: 71467 O-Subject: [RHEV-7.3 qemu-kvm-rhev PATCH] block: drop support for using qcow[2] encryption with system emulators Bugzilla: 1336659 RH-Acked-by: Miroslav Rezanina <[email protected]> RH-Acked-by: Laurent Vivier <[email protected]> RH-Acked-by: Daniel P. Berrange <[email protected]> From: "Daniel P. Berrange" <[email protected]> Back in the 2.3.0 release we declared qcow[2] encryption as deprecated, warning people that it would be removed in a future release. commit a1f688f Author: Markus Armbruster <[email protected]> Date: Fri Mar 13 21:09:40 2015 +0100 block: Deprecate QCOW/QCOW2 encryption The code still exists today, but by a (happy?) accident we entirely broke the ability to use qcow[2] encryption in the system emulators in the 2.4.0 release due to commit 8336aaf Author: Daniel P. Berrange <[email protected]> Date: Tue May 12 17:09:18 2015 +0100 qcow2/qcow: protect against uninitialized encryption key This commit was designed to prevent future coding bugs which might cause QEMU to read/write data on an encrypted block device in plain text mode before a decryption key is set. It turns out this preventative measure was a little too good, because we already had a long standing bug where QEMU read encrypted data in plain text mode during system emulator startup, in order to guess disk geometry: Thread 10 (Thread 0x7fffd3fff700 (LWP 30373)): #0 0x00007fffe90b1a28 in raise () at /lib64/libc.so.6 #1 0x00007fffe90b362a in abort () at /lib64/libc.so.6 #2 0x00007fffe90aa227 in __assert_fail_base () at /lib64/libc.so.6 #3 0x00007fffe90aa2d2 in () at /lib64/libc.so.6 open-power-host-os#4 0x000055555587ae19 in qcow2_co_readv (bs=0x5555562accb0, sector_num=0, remaining_sectors=1, qiov=0x7fffffffd260) at block/qcow2.c:1229 open-power-host-os#5 0x000055555589b60d in bdrv_aligned_preadv (bs=bs@entry=0x5555562accb0, req=req@entry=0x7fffd3ffea50, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=512, qiov=qiov@entry=0x7fffffffd260, flags=0) at block/io.c:908 open-power-host-os#6 0x000055555589b8bc in bdrv_co_do_preadv (bs=0x5555562accb0, offset=0, bytes=512, qiov=0x7fffffffd260, flags=<optimized out>) at block/io.c:999 open-power-host-os#7 0x000055555589c375 in bdrv_rw_co_entry (opaque=0x7fffffffd210) at block/io.c:544 open-power-host-os#8 0x000055555586933b in coroutine_thread (opaque=0x555557876310) at coroutine-gthread.c:134 open-power-host-os#9 0x00007ffff64e1835 in g_thread_proxy (data=0x5555562b5590) at gthread.c:778 open-power-host-os#10 0x00007ffff6bb760a in start_thread () at /lib64/libpthread.so.0 open-power-host-os#11 0x00007fffe917f59d in clone () at /lib64/libc.so.6 Thread 1 (Thread 0x7ffff7ecab40 (LWP 30343)): #0 0x00007fffe91797a9 in syscall () at /lib64/libc.so.6 #1 0x00007ffff64ff87f in g_cond_wait (cond=cond@entry=0x555555e085f0 <coroutine_cond>, mutex=mutex@entry=0x555555e08600 <coroutine_lock>) at gthread-posix.c:1397 #2 0x00005555558692c3 in qemu_coroutine_switch (co=<optimized out>) at coroutine-gthread.c:117 #3 0x00005555558692c3 in qemu_coroutine_switch (from_=0x5555562b5e30, to_=to_@entry=0x555557876310, action=action@entry=COROUTINE_ENTER) at coroutine-gthread.c:175 open-power-host-os#4 0x0000555555868a90 in qemu_coroutine_enter (co=0x555557876310, opaque=0x0) at qemu-coroutine.c:116 open-power-host-os#5 0x0000555555859b84 in thread_pool_completion_bh (opaque=0x7fffd40010e0) at thread-pool.c:187 open-power-host-os#6 0x0000555555859514 in aio_bh_poll (ctx=ctx@entry=0x5555562953b0) at async.c:85 open-power-host-os#7 0x0000555555864d10 in aio_dispatch (ctx=ctx@entry=0x5555562953b0) at aio-posix.c:135 open-power-host-os#8 0x0000555555864f75 in aio_poll (ctx=ctx@entry=0x5555562953b0, blocking=blocking@entry=true) at aio-posix.c:291 open-power-host-os#9 0x000055555589c40d in bdrv_prwv_co (bs=bs@entry=0x5555562accb0, offset=offset@entry=0, qiov=qiov@entry=0x7fffffffd260, is_write=is_write@entry=false, flags=flags@entry=(unknown: 0)) at block/io.c:591 open-power-host-os#10 0x000055555589c503 in bdrv_rw_co (bs=bs@entry=0x5555562accb0, sector_num=sector_num@entry=0, buf=buf@entry=0x7fffffffd2e0 "\321,", nb_sectors=nb_sectors@entry=21845, is_write=is_write@entry=false, flags=flags@entry=(unknown: 0)) at block/io.c:614 open-power-host-os#11 0x000055555589c562 in bdrv_read_unthrottled (nb_sectors=21845, buf=0x7fffffffd2e0 "\321,", sector_num=0, bs=0x5555562accb0) at block/io.c:622 open-power-host-os#12 0x000055555589c562 in bdrv_read_unthrottled (bs=0x5555562accb0, sector_num=sector_num@entry=0, buf=buf@entry=0x7fffffffd2e0 "\321,", nb_sectors=nb_sectors@entry=21845) at block/io.c:634 nb_sectors@entry=1) at block/block-backend.c:504 open-power-host-os#14 0x0000555555752e9f in guess_disk_lchs (blk=blk@entry=0x5555562a5290, pcylinders=pcylinders@entry=0x7fffffffd52c, pheads=pheads@entry=0x7fffffffd530, psectors=psectors@entry=0x7fffffffd534) at hw/block/hd-geometry.c:68 open-power-host-os#15 0x0000555555752ff7 in hd_geometry_guess (blk=0x5555562a5290, pcyls=pcyls@entry=0x555557875d1c, pheads=pheads@entry=0x555557875d20, psecs=psecs@entry=0x555557875d24, ptrans=ptrans@entry=0x555557875d28) at hw/block/hd-geometry.c:133 open-power-host-os#16 0x0000555555752b87 in blkconf_geometry (conf=conf@entry=0x555557875d00, ptrans=ptrans@entry=0x555557875d28, cyls_max=cyls_max@entry=65536, heads_max=heads_max@entry=16, secs_max=secs_max@entry=255, errp=errp@entry=0x7fffffffd5e0) at hw/block/block.c:71 open-power-host-os#17 0x0000555555799bc4 in ide_dev_initfn (dev=0x555557875c80, kind=IDE_HD) at hw/ide/qdev.c:174 open-power-host-os#18 0x0000555555768394 in device_realize (dev=0x555557875c80, errp=0x7fffffffd640) at hw/core/qdev.c:247 open-power-host-os#19 0x0000555555769a81 in device_set_realized (obj=0x555557875c80, value=<optimized out>, errp=0x7fffffffd730) at hw/core/qdev.c:1058 open-power-host-os#20 0x00005555558240ce in property_set_bool (obj=0x555557875c80, v=<optimized out>, opaque=0x555557875de0, name=<optimized out>, errp=0x7fffffffd730) at qom/object.c:1514 open-power-host-os#21 0x0000555555826c87 in object_property_set_qobject (obj=obj@entry=0x555557875c80, value=value@entry=0x55555784bcb0, name=name@entry=0x55555591cb3d "realized", errp=errp@entry=0x7fffffffd730) at qom/qom-qobject.c:24 open-power-host-os#22 0x0000555555825760 in object_property_set_bool (obj=obj@entry=0x555557875c80, value=value@entry=true, name=name@entry=0x55555591cb3d "realized", errp=errp@entry=0x7fffffffd730) at qom/object.c:905 open-power-host-os#23 0x000055555576897b in qdev_init_nofail (dev=dev@entry=0x555557875c80) at hw/core/qdev.c:380 open-power-host-os#24 0x0000555555799ead in ide_create_drive (bus=bus@entry=0x555557629630, unit=unit@entry=0, drive=0x5555562b77e0) at hw/ide/qdev.c:122 open-power-host-os#25 0x000055555579a746 in pci_ide_create_devs (dev=dev@entry=0x555557628db0, hd_table=hd_table@entry=0x7fffffffd830) at hw/ide/pci.c:440 open-power-host-os#26 0x000055555579b165 in pci_piix3_ide_init (bus=<optimized out>, hd_table=0x7fffffffd830, devfn=<optimized out>) at hw/ide/piix.c:218 open-power-host-os#27 0x000055555568ca55 in pc_init1 (machine=0x5555562960a0, pci_enabled=1, kvmclock_enabled=<optimized out>) at /home/berrange/src/virt/qemu/hw/i386/pc_piix.c:256 open-power-host-os#28 0x0000555555603ab2 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4249 So the safety net is correctly preventing QEMU reading cipher text as if it were plain text, during startup and aborting QEMU to avoid bad usage of this data. For added fun this bug only happens if the encrypted qcow2 file happens to have data written to the first cluster, otherwise the cluster won't be allocated and so qcow2 would not try the decryption routines at all, just return all 0's. That no one even noticed, let alone reported, this bug that has shipped in 2.4.0, 2.5.0 and 2.6.0 shows that the number of actual users of encrypted qcow2 is approximately zero. So rather than fix the crash, and backport it to stable releases, just go ahead with what we have warned users about and disable any use of qcow2 encryption in the system emulators. qemu-img/qemu-io/qemu-nbd are still able to access qcow2 encrypted images for the sake of data conversion. In the future, qcow2 will gain support for the alternative luks format, but when this happens it'll be using the '-object secret' infrastructure for getting keys, which avoids this problematic scenario entirely. Signed-off-by: Daniel P. Berrange <[email protected]> Reviewed-by: Eric Blake <[email protected]> Signed-off-by: Kevin Wolf <[email protected]> (cherry picked from commit 8c0dcbc) Signed-off-by: Markus Armbruster <[email protected]> Signed-off-by: Miroslav Rezanina <[email protected]>

RH-Author: Marc-André Lureau <[email protected]> Message-id: <[email protected]> Patchwork-id: 71914 O-Subject: [RHEV-7.3 qemu-kvm-rhev PATCH 1/2] monitor: fix crash when leaving qemu with spice audio Bugzilla: 1355704 RH-Acked-by: Thomas Huth <[email protected]> RH-Acked-by: Markus Armbruster <[email protected]> RH-Acked-by: Miroslav Rezanina <[email protected]> Since aa5cb7f, the chardevs are being cleaned up when leaving qemu. However, the monitor has still references to them, which may lead to crashes when running atexit() and trying to send monitor events: #0 0x00007fffdb18f6f5 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54 #1 0x00007fffdb1912fa in __GI_abort () at abort.c:89 #2 0x0000555555c263e7 in error_exit (err=22, msg=0x555555d47980 <__func__.13537> "qemu_mutex_lock") at util/qemu-thread-posix.c:39 #3 0x0000555555c26488 in qemu_mutex_lock (mutex=0x5555567a2420) at util/qemu-thread-posix.c:66 open-power-host-os#4 0x00005555558c52db in qemu_chr_fe_write (s=0x5555567a2420, buf=0x55555740dc40 "{\"timestamp\": {\"seconds\": 1470041716, \"microseconds\": 989699}, \"event\": \"SPICE_DISCONNECTED\", \"data\": {\"server\": {\"port\": \"5900\", \"family\": \"ipv4\", \"host\": \"127.0.0.1\"}, \"client\": {\"port\": \"40272\", \"f"..., len=240) at qemu-char.c:280 open-power-host-os#5 0x0000555555787cad in monitor_flush_locked (mon=0x5555567bd9e0) at /home/elmarco/src/qemu/monitor.c:311 open-power-host-os#6 0x0000555555787e46 in monitor_puts (mon=0x5555567bd9e0, str=0x5555567a44ef "") at /home/elmarco/src/qemu/monitor.c:353 open-power-host-os#7 0x00005555557880fe in monitor_json_emitter (mon=0x5555567bd9e0, data=0x5555567c73a0) at /home/elmarco/src/qemu/monitor.c:401 open-power-host-os#8 0x00005555557882d2 in monitor_qapi_event_emit (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x5555567c73a0) at /home/elmarco/src/qemu/monitor.c:472 open-power-host-os#9 0x000055555578838f in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x5555567c73a0, errp=0x7fffffffca88) at /home/elmarco/src/qemu/monitor.c:497 open-power-host-os#10 0x0000555555c15541 in qapi_event_send_spice_disconnected (server=0x5555571139d0, client=0x5555570d0db0, errp=0x5555566c0428 <error_abort>) at qapi-event.c:1038 open-power-host-os#11 0x0000555555b11bc6 in channel_event (event=3, info=0x5555570d6c00) at ui/spice-core.c:248 open-power-host-os#12 0x00007fffdcc9983a in adapter_channel_event (event=3, info=0x5555570d6c00) at reds.c:120 open-power-host-os#13 0x00007fffdcc99a25 in reds_handle_channel_event (reds=0x5555567a9d60, event=3, info=0x5555570d6c00) at reds.c:324 open-power-host-os#14 0x00007fffdcc7d4c4 in main_dispatcher_self_handle_channel_event (self=0x5555567b28b0, event=3, info=0x5555570d6c00) at main-dispatcher.c:175 open-power-host-os#15 0x00007fffdcc7d5b1 in main_dispatcher_channel_event (self=0x5555567b28b0, event=3, info=0x5555570d6c00) at main-dispatcher.c:194 open-power-host-os#16 0x00007fffdcca7674 in reds_stream_push_channel_event (s=0x5555570d9910, event=3) at reds-stream.c:354 open-power-host-os#17 0x00007fffdcca749b in reds_stream_free (s=0x5555570d9910) at reds-stream.c:323 open-power-host-os#18 0x00007fffdccb5dad in snd_disconnect_channel (channel=0x5555576a89a0) at sound.c:229 open-power-host-os#19 0x00007fffdccb9e57 in snd_detach_common (worker=0x555557739720) at sound.c:1589 open-power-host-os#20 0x00007fffdccb9f0e in snd_detach_playback (sin=0x5555569fe3f8) at sound.c:1602 open-power-host-os#21 0x00007fffdcca3373 in spice_server_remove_interface (sin=0x5555569fe3f8) at reds.c:3387 open-power-host-os#22 0x00005555558ff6e2 in line_out_fini (hw=0x5555569fe370) at audio/spiceaudio.c:152 open-power-host-os#23 0x00005555558f909e in audio_atexit () at audio/audio.c:1754 open-power-host-os#24 0x00007fffdb1941e8 in __run_exit_handlers (status=0, listp=0x7fffdb5175d8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82 open-power-host-os#25 0x00007fffdb194235 in __GI_exit (status=<optimized out>) at exit.c:104 open-power-host-os#26 0x00007fffdb17b738 in __libc_start_main (main=0x5555558d7874 <main>, argc=67, argv=0x7fffffffcf48, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffcf38) at ../csu/libc-start.c:323 Add a monitor_cleanup() functions to remove all the monitors before cleaning up the chardev. Note that we are "losing" some events that used to be sent during atexit(). Signed-off-by: Marc-André Lureau <[email protected]> Message-Id: <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Markus Armbruster <[email protected]> Signed-off-by: Markus Armbruster <[email protected]> (cherry picked from commit 2ef4571) BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1355704 Signed-off-by: Marc-André Lureau <[email protected]> Signed-off-by: Miroslav Rezanina <[email protected]>

@entry

RH-Author: Gerd Hoffmann <[email protected]> Message-id: <[email protected]> Patchwork-id: 71946 O-Subject: [RHEL-7.3 qemu-kvm-rhev PATCH 1/3] vnc: don't crash getting server info if lsock is NULL Bugzilla: 1359655 RH-Acked-by: Thomas Huth <[email protected]> RH-Acked-by: Marcel Apfelbaum <[email protected]> RH-Acked-by: Markus Armbruster <[email protected]> From: "Daniel P. Berrange" <[email protected]> When VNC is started with '-vnc none' there will be no listener socket present. When we try to populate the VncServerInfo we'll crash accessing a NULL 'lsock' field. #0 qio_channel_socket_get_local_address (ioc=0x0, errp=errp@entry=0x7ffd5b8aa0f0) at io/channel-socket.c:33 #1 0x00007f4b9a297d6f in vnc_init_basic_info_from_server_addr (errp=0x7ffd5b8aa0f0, info=0x7f4b9d425460, ioc=<optimized out>) at ui/vnc.c:146 #2 vnc_server_info_get (vd=0x7f4b9e858000) at ui/vnc.c:223 #3 0x00007f4b9a29d318 in vnc_qmp_event (vs=0x7f4b9ef82000, vs=0x7f4b9ef82000, event=QAPI_EVENT_VNC_CONNECTED) at ui/vnc.c:279 open-power-host-os#4 vnc_connect (vd=vd@entry=0x7f4b9e858000, sioc=sioc@entry=0x7f4b9e8b3a20, skipauth=skipauth@entry=true, websocket=websocket @entry=false) at ui/vnc.c:2994 open-power-host-os#5 0x00007f4b9a29e8c8 in vnc_display_add_client (id=<optimized out>, csock=<optimized out>, skipauth=<optimized out>) at ui/v nc.c:3825 open-power-host-os#6 0x00007f4b9a18d8a1 in qmp_marshal_add_client (args=<optimized out>, ret=<optimized out>, errp=0x7ffd5b8aa230) at qmp-marsh al.c:123 open-power-host-os#7 0x00007f4b9a0b53f5 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>) at /usr/src/debug/qemu-2.6.0/mon itor.c:3922 open-power-host-os#8 0x00007f4b9a348580 in json_message_process_token (lexer=0x7f4b9c78dfe8, input=0x7f4b9c7350e0, type=JSON_RCURLY, x=111, y=5 9) at qobject/json-streamer.c:94 open-power-host-os#9 0x00007f4b9a35cfeb in json_lexer_feed_char (lexer=lexer@entry=0x7f4b9c78dfe8, ch=125 '}', flush=flush@entry=false) at qobj ect/json-lexer.c:310 open-power-host-os#10 0x00007f4b9a35d0ae in json_lexer_feed (lexer=0x7f4b9c78dfe8, buffer=<optimized out>, size=<optimized out>) at qobject/json -lexer.c:360 open-power-host-os#11 0x00007f4b9a348679 in json_message_parser_feed (parser=<optimized out>, buffer=<optimized out>, size=<optimized out>) at q object/json-streamer.c:114 open-power-host-os#12 0x00007f4b9a0b3a1b in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>) at /usr/src/deb ug/qemu-2.6.0/monitor.c:3938 open-power-host-os#13 0x00007f4b9a186751 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=0x7f4b9c7add40) at qemu-char.c:2895 open-power-host-os#14 0x00007f4b92b5c79a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0 open-power-host-os#15 0x00007f4b9a2bb0c0 in glib_pollfds_poll () at main-loop.c:213 open-power-host-os#16 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:258 open-power-host-os#17 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:506 open-power-host-os#18 0x00007f4b9a0835cf in main_loop () at vl.c:1934 open-power-host-os#19 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4667 Do an upfront check for a NULL lsock and report an error to the caller, which matches behaviour from before commit 04d2529 Author: Daniel P. Berrange <[email protected]> Date: Fri Feb 27 16:20:57 2015 +0000 ui: convert VNC server to use QIOChannelSocket where getsockname() would be given a FD value -1 and thus report an error to the caller. Signed-off-by: Daniel P. Berrange <[email protected]> Message-id: [email protected] Signed-off-by: Gerd Hoffmann <[email protected]> (cherry picked from commit 624cdd4) Signed-off-by: Miroslav Rezanina <[email protected]>

RH-Author: Gerd Hoffmann <[email protected]> Message-id: <[email protected]> Patchwork-id: 71947 O-Subject: [RHEL-7.3 qemu-kvm-rhev PATCH 2/3] vnc: fix crash when vnc_server_info_get has an error Bugzilla: 1359655 RH-Acked-by: Thomas Huth <[email protected]> RH-Acked-by: Marcel Apfelbaum <[email protected]> RH-Acked-by: Markus Armbruster <[email protected]> From: "Daniel P. Berrange" <[email protected]> The vnc_server_info_get will allocate the VncServerInfo struct and then call vnc_init_basic_info_from_server_addr to populate the basic fields. If this returns an error though, the qapi_free_VncServerInfo call will then crash because the VncServerInfo struct instance was not properly NULL-initialized and thus contains random stack garbage. #0 0x00007f1987c8e6f5 in raise () at /lib64/libc.so.6 #1 0x00007f1987c902fa in abort () at /lib64/libc.so.6 #2 0x00007f1987ccf600 in __libc_message () at /lib64/libc.so.6 #3 0x00007f1987cd7d4a in _int_free () at /lib64/libc.so.6 open-power-host-os#4 0x00007f1987cdb2ac in free () at /lib64/libc.so.6 open-power-host-os#5 0x00007f198b654f6e in g_free () at /lib64/libglib-2.0.so.0 open-power-host-os#6 0x0000559193cdcf54 in visit_type_str (v=v@entry= 0x5591972f14b0, name=name@entry=0x559193de1e29 "host", obj=obj@entry=0x5591961dbfa0, errp=errp@entry=0x7fffd7899d80) at qapi/qapi-visit-core.c:255 open-power-host-os#7 0x0000559193cca8f3 in visit_type_VncBasicInfo_members (v=v@entry= 0x5591972f14b0, obj=obj@entry=0x5591961dbfa0, errp=errp@entry=0x7fffd7899dc0) at qapi-visit.c:12307 open-power-host-os#8 0x0000559193ccb523 in visit_type_VncServerInfo_members (v=v@entry= 0x5591972f14b0, obj=0x5591961dbfa0, errp=errp@entry=0x7fffd7899e00) at qapi-visit.c:12632 open-power-host-os#9 0x0000559193ccb60b in visit_type_VncServerInfo (v=v@entry= 0x5591972f14b0, name=name@entry=0x0, obj=obj@entry=0x7fffd7899e48, errp=errp@entry=0x0) at qapi-visit.c:12658 open-power-host-os#10 0x0000559193cb53d8 in qapi_free_VncServerInfo (obj=<optimized out>) at qapi-types.c:3970 open-power-host-os#11 0x0000559193c1e6ba in vnc_server_info_get (vd=0x7f1951498010) at ui/vnc.c:233 open-power-host-os#12 0x0000559193c24275 in vnc_connect (vs=0x559197b2f200, vs=0x559197b2f200, event=QAPI_EVENT_VNC_CONNECTED) at ui/vnc.c:284 open-power-host-os#13 0x0000559193c24275 in vnc_connect (vd=vd@entry=0x7f1951498010, sioc=sioc@entry=0x559196bf9c00, skipauth=skipauth@entry=tru e, websocket=websocket@entry=false) at ui/vnc.c:3039 open-power-host-os#14 0x0000559193c25806 in vnc_display_add_client (id=<optimized out>, csock=<optimized out>, skipauth=<optimized out>) at ui/vnc.c:3877 open-power-host-os#15 0x0000559193a90c28 in qmp_marshal_add_client (args=<optimized out>, ret=<optimized out>, errp=0x7fffd7899f90) at qmp-marshal.c:105 open-power-host-os#16 0x000055919399c2b7 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>) at /home/berrange/src/virt/qemu/monitor.c:3971 open-power-host-os#17 0x0000559193ce3307 in json_message_process_token (lexer=0x559194ab0838, input=0x559194a6d940, type=JSON_RCURLY, x=111, y=1 2) at qobject/json-streamer.c:105 open-power-host-os#18 0x0000559193cfa90d in json_lexer_feed_char (lexer=lexer@entry=0x559194ab0838, ch=125 '}', flush=flush@entry=false) at qobject/json-lexer.c:319 open-power-host-os#19 0x0000559193cfaa1e in json_lexer_feed (lexer=0x559194ab0838, buffer=<optimized out>, size=<optimized out>) at qobject/json-lexer.c:369 open-power-host-os#20 0x0000559193ce33c9 in json_message_parser_feed (parser=<optimized out>, buffer=<optimized out>, size=<optimized out>) at qobject/json-streamer.c:124 open-power-host-os#21 0x000055919399a85b in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>) at /home/berrange/src/virt/qemu/monitor.c:3987 open-power-host-os#22 0x0000559193a87d00 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=0x559194a7d900) at qemu-char.c:2895 open-power-host-os#23 0x00007f198b64f703 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0 open-power-host-os#24 0x0000559193c484b3 in main_loop_wait () at main-loop.c:213 open-power-host-os#25 0x0000559193c484b3 in main_loop_wait (timeout=<optimized out>) at main-loop.c:258 open-power-host-os#26 0x0000559193c484b3 in main_loop_wait (nonblocking=<optimized out>) at main-loop.c:506 open-power-host-os#27 0x0000559193964c55 in main () at vl.c:1908 open-power-host-os#28 0x0000559193964c55 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4603 This was introduced in commit 98481bf Author: Eric Blake <[email protected]> Date: Mon Oct 26 16:34:45 2015 -0600 vnc: Hoist allocation of VncBasicInfo to callers which added error reporting for vnc_init_basic_info_from_server_addr but didn't change the g_malloc calls to g_malloc0. Signed-off-by: Daniel P. Berrange <[email protected]> Message-id: [email protected] Signed-off-by: Gerd Hoffmann <[email protected]> (cherry picked from commit 3e7f136) Signed-off-by: Miroslav Rezanina <[email protected]>

RH-Author: Marc-André Lureau <[email protected]> Message-id: <[email protected]> Patchwork-id: 72914 O-Subject: [RHEV-7.3.z qemu-kvm-rhev PATCH 10/10] ahci: fix sglist leak on retry Bugzilla: 1397745 RH-Acked-by: Stefan Hajnoczi <[email protected]> RH-Acked-by: Laurent Vivier <[email protected]> RH-Acked-by: Miroslav Rezanina <[email protected]> ahci-test /x86_64/ahci/io/dma/lba28/retry triggers the following leak: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7fc4b2a25e20 in malloc (/lib64/libasan.so.3+0xc6e20) #1 0x7fc4993bce58 in g_malloc (/lib64/libglib-2.0.so.0+0x4ee58) #2 0x556a187d4b34 in ahci_populate_sglist hw/ide/ahci.c:896 #3 0x556a187d8237 in ahci_dma_prepare_buf hw/ide/ahci.c:1367 open-power-host-os#4 0x556a187b5a1a in ide_dma_cb hw/ide/core.c:844 open-power-host-os#5 0x556a187d7eec in ahci_start_dma hw/ide/ahci.c:1333 open-power-host-os#6 0x556a187b650b in ide_start_dma hw/ide/core.c:921 open-power-host-os#7 0x556a187b61e6 in ide_sector_start_dma hw/ide/core.c:911 open-power-host-os#8 0x556a187b9e26 in cmd_write_dma hw/ide/core.c:1486 open-power-host-os#9 0x556a187bd519 in ide_exec_cmd hw/ide/core.c:2027 open-power-host-os#10 0x556a187d71c5 in handle_reg_h2d_fis hw/ide/ahci.c:1204 open-power-host-os#11 0x556a187d7681 in handle_cmd hw/ide/ahci.c:1254 open-power-host-os#12 0x556a187d168a in check_cmd hw/ide/ahci.c:510 open-power-host-os#13 0x556a187d0afc in ahci_port_write hw/ide/ahci.c:314 open-power-host-os#14 0x556a187d105d in ahci_mem_write hw/ide/ahci.c:435 open-power-host-os#15 0x556a1831d959 in memory_region_write_accessor /home/elmarco/src/qemu/memory.c:525 open-power-host-os#16 0x556a1831dc35 in access_with_adjusted_size /home/elmarco/src/qemu/memory.c:591 open-power-host-os#17 0x556a18323ce3 in memory_region_dispatch_write /home/elmarco/src/qemu/memory.c:1262 open-power-host-os#18 0x556a1828cf67 in address_space_write_continue /home/elmarco/src/qemu/exec.c:2578 open-power-host-os#19 0x556a1828d20b in address_space_write /home/elmarco/src/qemu/exec.c:2635 open-power-host-os#20 0x556a1828d92b in address_space_rw /home/elmarco/src/qemu/exec.c:2737 open-power-host-os#21 0x556a1828daf7 in cpu_physical_memory_rw /home/elmarco/src/qemu/exec.c:2746 open-power-host-os#22 0x556a183068d3 in cpu_physical_memory_write /home/elmarco/src/qemu/include/exec/cpu-common.h:72 open-power-host-os#23 0x556a18308194 in qtest_process_command /home/elmarco/src/qemu/qtest.c:382 open-power-host-os#24 0x556a18309999 in qtest_process_inbuf /home/elmarco/src/qemu/qtest.c:573 open-power-host-os#25 0x556a18309a4a in qtest_read /home/elmarco/src/qemu/qtest.c:585 open-power-host-os#26 0x556a18598b85 in qemu_chr_be_write_impl /home/elmarco/src/qemu/qemu-char.c:387 open-power-host-os#27 0x556a18598c52 in qemu_chr_be_write /home/elmarco/src/qemu/qemu-char.c:399 open-power-host-os#28 0x556a185a2afa in tcp_chr_read /home/elmarco/src/qemu/qemu-char.c:2902 open-power-host-os#29 0x556a18cbaf52 in qio_channel_fd_source_dispatch io/channel-watch.c:84 Follow John Snow recommendation: Everywhere else ncq_err is used, it is accompanied by a list cleanup except for ncq_cb, which is the case you are fixing here. Move the sglist destruction inside of ncq_err and then delete it from the other two locations to keep it tidy. Call dma_buf_commit in ide_dma_cb after the early return. Though, this is also a little wonky because this routine does more than clear the list, but it is at the moment the centralized "we're done with the sglist" function and none of the other side effects that occur in dma_buf_commit will interfere with the reset that occurs from ide_restart_bh, I think Signed-off-by: Marc-André Lureau <[email protected]> Reviewed-by: John Snow <[email protected]> (cherry picked from commit 5839df7) Signed-off-by: Marc-André Lureau <[email protected]> Signed-off-by: Miroslav Rezanina <[email protected]>

RH-Author: Marc-André Lureau <[email protected]> Message-id: <[email protected]> Patchwork-id: 73171 O-Subject: [RHEV-7.3.z qemu-kvm-rhev PATCH] net: don't poke at chardev internal QemuOpts Bugzilla: 1410200 RH-Acked-by: Michael S. Tsirkin <[email protected]> RH-Acked-by: Stefan Hajnoczi <[email protected]> RH-Acked-by: Maxime Coquelin <[email protected]> RH-Acked-by: Victor Kaplansky <[email protected]> From: "Daniel P. Berrange" <[email protected]> The vhost-user & colo code is poking at the QemuOpts instance in the CharDriverState struct, not realizing that it is valid for this to be NULL. e.g. the following crash shows a codepath where it will be NULL: Program terminated with signal SIGSEGV, Segmentation fault. #0 0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at util/qemu-option.c:617 617 QTAILQ_FOREACH(opt, &opts->head, next) { [Current thread is 1 (Thread 0x7f1d4970bb40 (LWP 6603))] (gdb) bt #0 0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at util/qemu-option.c:617 #1 0x000055baf696b7da in net_vhost_parse_chardev (opts=0x55baf8ff9260, errp=0x7ffc51368e48) at net/vhost-user.c:314 #2 0x000055baf696b985 in net_init_vhost_user (netdev=0x55baf8ff9250, name=0x55baf879d270 "hostnet2", peer=0x0, errp=0x7ffc51368e48) at net/vhost-user.c:360 #3 0x000055baf6960216 in net_client_init1 (object=0x55baf8ff9250, is_netdev=true, errp=0x7ffc51368e48) at net/net.c:1051 open-power-host-os#4 0x000055baf6960518 in net_client_init (opts=0x55baf776e7e0, is_netdev=true, errp=0x7ffc51368f00) at net/net.c:1108 open-power-host-os#5 0x000055baf696083f in netdev_add (opts=0x55baf776e7e0, errp=0x7ffc51368f00) at net/net.c:1186 open-power-host-os#6 0x000055baf69608c7 in qmp_netdev_add (qdict=0x55baf7afaf60, ret=0x7ffc51368f50, errp=0x7ffc51368f48) at net/net.c:1205 open-power-host-os#7 0x000055baf6622135 in handle_qmp_command (parser=0x55baf77fb590, tokens=0x7f1d24011960) at /path/to/qemu.git/monitor.c:3978 open-power-host-os#8 0x000055baf6a9d099 in json_message_process_token (lexer=0x55baf77fb598, input=0x55baf75acd20, type=JSON_RCURLY, x=113, y=19) at qobject/json-streamer.c:105 open-power-host-os#9 0x000055baf6abf7aa in json_lexer_feed_char (lexer=0x55baf77fb598, ch=125 '}', flush=false) at qobject/json-lexer.c:319 open-power-host-os#10 0x000055baf6abf8f2 in json_lexer_feed (lexer=0x55baf77fb598, buffer=0x7ffc51369170 "}R\204\367\272U", size=1) at qobject/json-lexer.c:369 open-power-host-os#11 0x000055baf6a9d13c in json_message_parser_feed (parser=0x55baf77fb590, buffer=0x7ffc51369170 "}R\204\367\272U", size=1) at qobject/json-streamer.c:124 open-power-host-os#12 0x000055baf66221f7 in monitor_qmp_read (opaque=0x55baf77fb530, buf=0x7ffc51369170 "}R\204\367\272U", size=1) at /path/to/qemu.git/monitor.c:3994 open-power-host-os#13 0x000055baf6757014 in qemu_chr_be_write_impl (s=0x55baf7610a40, buf=0x7ffc51369170 "}R\204\367\272U", len=1) at qemu-char.c:387 open-power-host-os#14 0x000055baf6757076 in qemu_chr_be_write (s=0x55baf7610a40, buf=0x7ffc51369170 "}R\204\367\272U", len=1) at qemu-char.c:399 open-power-host-os#15 0x000055baf675b3b0 in tcp_chr_read (chan=0x55baf90244b0, cond=G_IO_IN, opaque=0x55baf7610a40) at qemu-char.c:2927 open-power-host-os#16 0x000055baf6a5d655 in qio_channel_fd_source_dispatch (source=0x55baf7610df0, callback=0x55baf675b25a <tcp_chr_read>, user_data=0x55baf7610a40) at io/channel-watch.c:84 open-power-host-os#17 0x00007f1d3e80cbbd in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 open-power-host-os#18 0x000055baf69d3720 in glib_pollfds_poll () at main-loop.c:213 open-power-host-os#19 0x000055baf69d37fd in os_host_main_loop_wait (timeout=126000000) at main-loop.c:258 open-power-host-os#20 0x000055baf69d38ad in main_loop_wait (nonblocking=0) at main-loop.c:506 open-power-host-os#21 0x000055baf676587b in main_loop () at vl.c:1908 open-power-host-os#22 0x000055baf676d3bf in main (argc=101, argv=0x7ffc5136a6c8, envp=0x7ffc5136a9f8) at vl.c:4604 (gdb) p opts $1 = (QemuOpts *) 0x0 The crash occurred when attaching vhost-user net via QMP: { "execute": "chardev-add", "arguments": { "id": "charnet2", "backend": { "type": "socket", "data": { "addr": { "type": "unix", "data": { "path": "/var/run/openvswitch/vhost-user1" } }, "wait": false, "server": false } } }, "id": "libvirt-19" } { "return": { }, "id": "libvirt-19" } { "execute": "netdev_add", "arguments": { "type": "vhost-user", "chardev": "charnet2", "id": "hostnet2" }, "id": "libvirt-20" } Code using chardevs should not be poking at the internals of the CharDriverState struct. What vhost-user wants is a chardev that is operating as reconnectable network service, along with the ability to do FD passing over the connection. The colo code simply wants a network service. Add a feature concept to the char drivers so that chardev users can query the actual features they wish to have supported. The QemuOpts member is removed to prevent future mistakes in this area. Signed-off-by: Daniel P. Berrange <[email protected]> Reviewed-by: Marc-André Lureau <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> (cherry picked from commit 0a73336) BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1394140 Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=12298577 [ Marc-André - backport: drop colo-compare bits ] Signed-off-by: Marc-André Lureau <[email protected]> Signed-off-by: Miroslav Rezanina <[email protected]>

"nc" is freed after hotplug vhost-user, but the watcher is not removed. The QEMU crash when the watcher access the "nc" when socket disconnects. Program received signal SIGSEGV, Segmentation fault. #0 object_get_class (obj=obj@entry=0x2) at qom/object.c:750 #1 0x00007f9bb4180da1 in qemu_chr_fe_disconnect (be=<optimized out>) at chardev/char-fe.c:372 #2 0x00007f9bb40d1100 in net_vhost_user_watch (chan=<optimized out>, cond=<optimized out>, opaque=<optimized out>) at net/vhost-user.c:188 #3 0x00007f9baf97f99a in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 #4 0x00007f9bb41d7ebc in glib_pollfds_poll () at util/main-loop.c:213 #5 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261 #6 main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:515 #7 0x00007f9bb3e266a7 in main_loop () at vl.c:1917 #8 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4786 Signed-off-by: Yunjian Wang <[email protected]> Reviewed-by: Marc-André Lureau <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>

currently for sh4 cpu_model argument for '-cpu' option could be either 'cpu model' name or cpu_typename. however typically '-cpu' takes 'cpu model' name and cpu type for sh4 target isn't advertised publicly ('-cpu help' prints only 'cpu model' names) so we shouldn't care about this use case (it's more of a bug). 1. Drop '-cpu cpu_typename' to align with the rest of targets. 2. Compose searched for typename from cpu model and use it with object_class_by_name() directly instead of over-complicated object_class_get_list() g_slist_find_custom() + superh_cpu_name_compare() With #1 droped, #2 could be used for both lookups which simplifies superh_cpu_class_by_name() quite a bit. Signed-off-by: Igor Mammedov <[email protected]> Acked-by: Philippe Mathieu-Daudé <[email protected]> Message-Id: <[email protected]> [ehabkost: Include fixup sent by Igor] Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Signed-off-by: Eduardo Habkost <[email protected]>

Migration of pseries is broken with TCG because QEMU tries to restore KVM MMU state unconditionally. The result is a SIGSEGV in kvm_vm_ioctl(): #0 kvm_vm_ioctl (s=0x0, type=-2146390353) at qemu/accel/kvm/kvm-all.c:2032 #1 0x00000001003e3e2c in kvmppc_configure_v3_mmu (cpu=<optimized out>, radix=<optimized out>, gtse=<optimized out>, proc_tbl=<optimized out>) at qemu/target/ppc/kvm.c:396 #2 0x00000001002f8b88 in spapr_post_load (opaque=0x1019103c0, version_id=<optimized out>) at qemu/hw/ppc/spapr.c:1578 #3 0x000000010059e4cc in vmstate_load_state (f=0x106230000, vmsd=0x1009479e0 <vmstate_spapr>, opaque=0x1019103c0, version_id=<optimized out>) at qemu/migration/vmstate.c:165 #4 0x00000001005987e0 in vmstate_load (f=<optimized out>, se=<optimized out>) at qemu/migration/savevm.c:748 This patch fixes the problem by not calling the KVM function with the TCG mode. Fixes: d39c90f ("spapr: Fix migration of Radix guests") Signed-off-by: Laurent Vivier <[email protected]> Reviewed-by: Suraj Jitindar Singh <[email protected]> Signed-off-by: David Gibson <[email protected]>

when qemu is started with '-no-acpi' CLI option, an attempt to unplug a CPU using device_del results in null pointer dereference at: #0 object_get_class #1 pc_machine_device_unplug_request_cb #2 qmp_marshal_device_del which is caused by pcms->acpi_dev == NULL due to ACPI support being disabled. Considering that ACPI support is necessary for unplug to work, check that it's enabled and fail unplug request gracefully if no acpi device were found. Signed-off-by: Igor Mammedov <[email protected]> Reviewed-by: Eduardo Habkost <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>

The passed-through device might be an express device. In this case the old code allocated a too small emulated config space in pci_config_alloc() since pci_config_size() returned the size for a non-express device. This leads to an out-of-bound write in xen_pt_config_reg_init(), which sometimes results in crashes. So set is_express as already done for KVM in vfio-pci. Shortened ASan report: ==17512==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x611000041648 at pc 0x55e0fdac51ff bp 0x7ffe4af07410 sp 0x7ffe4af07408 WRITE of size 2 at 0x611000041648 thread T0 #0 0x55e0fdac51fe in memcpy /usr/include/x86_64-linux-gnu/bits/string3.h:53 #1 0x55e0fdac51fe in stw_he_p include/qemu/bswap.h:330 #2 0x55e0fdac51fe in stw_le_p include/qemu/bswap.h:379 #3 0x55e0fdac51fe in pci_set_word include/hw/pci/pci.h:490 #4 0x55e0fdac51fe in xen_pt_config_reg_init hw/xen/xen_pt_config_init.c:1991 #5 0x55e0fdac51fe in xen_pt_config_init hw/xen/xen_pt_config_init.c:2067 #6 0x55e0fdabcf4d in xen_pt_realize hw/xen/xen_pt.c:830 #7 0x55e0fdf59666 in pci_qdev_realize hw/pci/pci.c:2034 #8 0x55e0fdda7d3d in device_set_realized hw/core/qdev.c:914 [...] 0x611000041648 is located 8 bytes to the right of 256-byte region [0x611000041540,0x611000041640) allocated by thread T0 here: #0 0x7ff596a94bb8 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xd9bb8) #1 0x7ff57da66580 in g_malloc0 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x50580) #2 0x55e0fdda7d3d in device_set_realized hw/core/qdev.c:914 [...] Signed-off-by: Simon Gaiser <[email protected]> Acked-by: Stefano Stabellini <[email protected]> Signed-off-by: Stefano Stabellini <[email protected]>

bdrv_unref() requires the AioContext lock because bdrv_flush() uses BDRV_POLL_WHILE(), which assumes the AioContext is currently held. If BDRV_POLL_WHILE() runs without AioContext held the pthread_mutex_unlock() call in aio_context_release() fails. This patch moves bdrv_unref() into the AioContext locked region to solve the following pthread_mutex_unlock() failure: #0 0x00007f566181969b in raise () at /lib64/libc.so.6 #1 0x00007f566181b3b1 in abort () at /lib64/libc.so.6 #2 0x00005592cd590458 in error_exit (err=<optimized out>, msg=msg@entry=0x5592cdaf6d60 <__func__.23977> "qemu_mutex_unlock") at util/qemu-thread-posix.c:36 #3 0x00005592cd96e738 in qemu_mutex_unlock (mutex=mutex@entry=0x5592ce9505e0) at util/qemu-thread-posix.c:96 #4 0x00005592cd969b69 in aio_context_release (ctx=ctx@entry=0x5592ce950580) at util/async.c:507 #5 0x00005592cd8ead78 in bdrv_flush (bs=bs@entry=0x5592cfa87210) at block/io.c:2478 #6 0x00005592cd89df30 in bdrv_close (bs=0x5592cfa87210) at block.c:3207 #7 0x00005592cd89df30 in bdrv_delete (bs=0x5592cfa87210) at block.c:3395 #8 0x00005592cd89df30 in bdrv_unref (bs=0x5592cfa87210) at block.c:4418 #9 0x00005592cd6b7f86 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=<optimized out>, errp=errp@entry=0x7ffe4a1fc9d8) at blockdev.c:2308 Signed-off-by: Stefan Hajnoczi <[email protected]> Reviewed-by: Kevin Wolf <[email protected]> Reviewed-by: Eric Blake <[email protected]> Message-id: [email protected] Signed-off-by: Stefan Hajnoczi <[email protected]>

There is a small chance that iothread_stop() hangs as follows: Thread 3 (Thread 0x7f63eba5f700 (LWP 16105)): #0 0x00007f64012c09b6 in ppoll () at /lib64/libc.so.6 #1 0x000055959992eac9 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77 #2 0x000055959992eac9 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:322 #3 0x0000559599930711 in aio_poll (ctx=0x55959bdb83c0, blocking=blocking@entry=true) at util/aio-posix.c:629 #4 0x00005595996806fe in iothread_run (opaque=0x55959bd78400) at iothread.c:59 #5 0x00007f640159f609 in start_thread () at /lib64/libpthread.so.0 #6 0x00007f64012cce6f in clone () at /lib64/libc.so.6 Thread 1 (Thread 0x7f640b45b280 (LWP 16103)): #0 0x00007f64015a0b6d in pthread_join () at /lib64/libpthread.so.0 #1 0x00005595999332ef in qemu_thread_join (thread=<optimized out>) at util/qemu-thread-posix.c:547 #2 0x00005595996808ae in iothread_stop (iothread=<optimized out>) at iothread.c:91 #3 0x000055959968094d in iothread_stop_iter (object=<optimized out>, opaque=<optimized out>) at iothread.c:102 #4 0x0000559599857d97 in do_object_child_foreach (obj=obj@entry=0x55959bdb8100, fn=fn@entry=0x559599680930 <iothread_stop_iter>, opaque=opaque@entry=0x0, recurse=recurse@entry=false) at qom/object.c:852 #5 0x0000559599859477 in object_child_foreach (obj=obj@entry=0x55959bdb8100, fn=fn@entry=0x559599680930 <iothread_stop_iter>, opaque=opaque@entry=0x0) at qom/object.c:867 #6 0x0000559599680a6e in iothread_stop_all () at iothread.c:341 #7 0x000055959955b1d5 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4913 The relevant code from iothread_run() is: while (!atomic_read(&iothread->stopping)) { aio_poll(iothread->ctx, true); and iothread_stop(): iothread->stopping = true; aio_notify(iothread->ctx); ... qemu_thread_join(&iothread->thread); The following scenario can occur: 1. IOThread: while (!atomic_read(&iothread->stopping)) -> stopping=false 2. Main loop: iothread->stopping = true; aio_notify(iothread->ctx); 3. IOThread: aio_poll(iothread->ctx, true); -> hang The bug is explained by the AioContext->notify_me doc comments: "If this field is 0, everything (file descriptors, bottom halves, timers) will be re-evaluated before the next blocking poll(), thus the event_notifier_set call can be skipped." The problem is that "everything" does not include checking iothread->stopping. This means iothread_run() will block in aio_poll() if aio_notify() was called just before aio_poll(). This patch fixes the hang by replacing aio_notify() with aio_bh_schedule_oneshot(). This makes aio_poll() or g_main_loop_run() to return. Implementing this properly required a new bool running flag. The new flag prevents races that are tricky if we try to use iothread->stopping. Now iothread->stopping is purely for iothread_stop() and iothread->running is purely for the iothread_run() thread. Signed-off-by: Stefan Hajnoczi <[email protected]> Reviewed-by: Eric Blake <[email protected]> Message-id: [email protected] Signed-off-by: Stefan Hajnoczi <[email protected]>

If we create a thread with QEMU_THREAD_DETACHED mode, QEMU may get a segfault with low probability. The backtrace is: #0 0x00007f46c60291d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007f46c602a8c8 in __GI_abort () at abort.c:90 #2 0x00000000008543c9 in PAT_abort () #3 0x000000000085140d in patchIllInsHandler () #4 <signal handler called> #5 pthread_detach (th=139933037614848) at pthread_detach.c:50 #6 0x0000000000829759 in qemu_thread_create (thread=thread@entry=0x7ffdaa8205e0, name=name@entry=0x94d94a "io-task-worker", start_routine=start_routine@entry=0x7eb9a0 <qio_task_thread_worker>, arg=arg@entry=0x3f5cf70, mode=mode@entry=1) at util/qemu_thread_posix.c:512 #7 0x00000000007ebc96 in qio_task_run_in_thread (task=0x31db2c0, worker=worker@entry=0x7e7e40 <qio_channel_socket_connect_worker>, opaque=0xcd23380, destroy=0x7f1180 <qapi_free_SocketAddress>) at io/task.c:141 #8 0x00000000007e7f33 in qio_channel_socket_connect_async (ioc=ioc@entry=0x626c0b0, addr=<optimized out>, callback=callback@entry=0x55e080 <qemu_chr_socket_connected>, opaque=opaque@entry=0x42862c0, destroy=destroy@entry=0x0) at io/channel_socket.c:194 #9 0x000000000055bdd1 in socket_reconnect_timeout (opaque=0x42862c0) at qemu_char.c:4744 #10 0x00007f46c72483b3 in g_timeout_dispatch () from /usr/lib64/libglib-2.0.so.0 #11 0x00007f46c724799a in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 #12 0x000000000076c646 in glib_pollfds_poll () at main_loop.c:228 #13 0x000000000076c6eb in os_host_main_loop_wait (timeout=348000000) at main_loop.c:273 #14 0x000000000076c815 in main_loop_wait (nonblocking=nonblocking@entry=0) at main_loop.c:521 #15 0x000000000056a511 in main_loop () at vl.c:2076 #16 0x0000000000420705 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4940 The cause of this problem is a glibc bug; for more information, see https://sourceware.org/bugzilla/show_bug.cgi?id=19951. The solution for this bug is to use pthread_attr_setdetachstate. There is a similar issue with pthread_setname_np, which is moved from creating thread to created thread. Signed-off-by: linzhecheng <[email protected]> Message-Id: <[email protected]> Reviewed-by: Fam Zheng <[email protected]> [Simplify the code by removing qemu_thread_set_name, and free the arguments before invoking the start routine. - Paolo] Signed-off-by: Paolo Bonzini <[email protected]>

The find_desc_by_name() from util/qemu-option.c relies on the .name not being NULL to call strcmp(). This check becomes unsafe when the list is not NULL-terminated, which is the case of nbd_runtime_opts in block/nbd.c, and can result in segmentation fault when strcmp() tries to access an invalid memory: #0 0x00007fff8c75f7d4 in __strcmp_power9 () from /lib64/libc.so.6 #1 0x00000000102d3ec8 in find_desc_by_name (desc=0x1036d6f0, name=0x28e46670 "server.path") at util/qemu-option.c:166 #2 0x00000000102d93e0 in qemu_opts_absorb_qdict (opts=0x28e47a80, qdict=0x28e469a0, errp=0x7fffec247c98) at util/qemu-option.c:1026 #3 0x000000001012a2e4 in nbd_open (bs=0x28e42290, options=0x28e469a0, flags=24578, errp=0x7fffec247d80) at block/nbd.c:406 #4 0x00000000100144e8 in bdrv_open_driver (bs=0x28e42290, drv=0x1036e070 <bdrv_nbd_unix>, node_name=0x0, options=0x28e469a0, open_flags=24578, errp=0x7fffec247f50) at block.c:1135 #5 0x0000000010015b04 in bdrv_open_common (bs=0x28e42290, file=0x0, options=0x28e469a0, errp=0x7fffec247f50) at block.c:1395 >From gdb, the desc[i].name was not NULL and resulted in strcmp() accessing an invalid memory: >>> p desc[5] $8 = { name = 0x1037f098 "R27A", type = 1561964883, help = 0xc0bbb23e <error: Cannot access memory at address 0xc0bbb23e>, def_value_str = 0x2 <error: Cannot access memory at address 0x2> } >>> p desc[6] $9 = { name = 0x103dac78 <__gcov0.do_qemu_init_bdrv_nbd_init> "\001", type = 272101528, help = 0x29ec0b754403e31f <error: Cannot access memory at address 0x29ec0b754403e31f>, def_value_str = 0x81f343b9 <error: Cannot access memory at address 0x81f343b9> } This patch fixes the segmentation fault in strcmp() by adding a NULL element at the end of nbd_runtime_opts.desc list, which is the common practice to most of other structs like runtime_opts in block/null.c. Thus, the desc[i].name != NULL check becomes safe because it will not evaluate to true when .desc list reached its end. Reported-by: R. Nageswara Sastry <[email protected]> Buglink: https://bugs.launchpad.net/qemu/+bug/1727259 Signed-off-by: Murilo Opsfelder Araujo <[email protected]> Message-Id: <[email protected]> CC: [email protected] Fixes: 7ccc44f Signed-off-by: Eric Blake <[email protected]>

/public/qobject_is_equal_conversion: OK ================================================================= ==14396==ERROR: LeakSanitizer: detected memory leaks Direct leak of 56 byte(s) in 1 object(s) allocated from: #0 0x7f07682c5850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7f0767d12f0c in g_malloc ../glib/gmem.c:94 #2 0x7f0767d131cf in g_malloc_n ../glib/gmem.c:331 #3 0x562bd767371f in do_test_equality /home/elmarco/src/qq/tests/check-qobject.c:49 #4 0x562bd7674a35 in qobject_is_equal_dict_test /home/elmarco/src/qq/tests/check-qobject.c:267 #5 0x7f0767d37b04 in test_case_run ../glib/gtestutils.c:2237 #6 0x7f0767d37ec4 in g_test_run_suite_internal ../glib/gtestutils.c:2321 #7 0x7f0767d37f6d in g_test_run_suite_internal ../glib/gtestutils.c:2333 #8 0x7f0767d38184 in g_test_run_suite ../glib/gtestutils.c:2408 #9 0x7f0767d36e0d in g_test_run ../glib/gtestutils.c:1674 #10 0x562bd7674e75 in main /home/elmarco/src/qq/tests/check-qobject.c:327 #11 0x7f0766009039 in __libc_start_main (/lib64/libc.so.6+0x21039) Signed-off-by: Marc-André Lureau <[email protected]> Reviewed-by: Markus Armbruster <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>

Note that data_dir[] will now point to allocated strings. Fixes: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7f1448181850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7f1446ed8f0c in g_malloc ../glib/gmem.c:94 #2 0x7f1446ed91cf in g_malloc_n ../glib/gmem.c:331 #3 0x7f1446ef739a in g_strsplit ../glib/gstrfuncs.c:2364 #4 0x55cf276439d7 in main /home/elmarco/src/qq/vl.c:4311 #5 0x7f143dfad039 in __libc_start_main (/lib64/libc.so.6+0x21039) Signed-off-by: Marc-André Lureau <[email protected]> Reviewed-by: Eric Blake <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>

Fixes leaks such as: Direct leak of 2 byte(s) in 1 object(s) allocated from: #0 0x7eff58beb850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7eff57942f0c in g_malloc ../glib/gmem.c:94 #2 0x7eff579431cf in g_malloc_n ../glib/gmem.c:331 #3 0x7eff5795f6eb in g_strdup ../glib/gstrfuncs.c:363 #4 0x55db720f1d46 in readline_hist_add /home/elmarco/src/qq/util/readline.c:258 #5 0x55db720f2d34 in readline_handle_byte /home/elmarco/src/qq/util/readline.c:387 #6 0x55db71539d00 in monitor_read /home/elmarco/src/qq/monitor.c:3896 #7 0x55db71f9be35 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:167 #8 0x55db71f9bed3 in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:179 #9 0x55db71fa013c in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66 #10 0x55db71fe18a8 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84 #11 0x7eff5793a90b in g_main_dispatch ../glib/gmain.c:3182 #12 0x7eff5793b7ac in g_main_context_dispatch ../glib/gmain.c:3847 #13 0x55db720af3bd in glib_pollfds_poll /home/elmarco/src/qq/util/main-loop.c:214 #14 0x55db720af505 in os_host_main_loop_wait /home/elmarco/src/qq/util/main-loop.c:261 #15 0x55db720af6d6 in main_loop_wait /home/elmarco/src/qq/util/main-loop.c:515 #16 0x55db7184e0de in main_loop /home/elmarco/src/qq/vl.c:1995 #17 0x55db7185e956 in main /home/elmarco/src/qq/vl.c:4914 #18 0x7eff4ea17039 in __libc_start_main (/lib64/libc.so.6+0x21039) (while at it, use g_new0(ReadLineState), it's a bit easier to read) Signed-off-by: Marc-André Lureau <[email protected]> Reviewed-by: Dr. David Alan Gilbert <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>

ASAN complains about: ==8856==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd8a1fe168 at pc 0x561136cb4451 bp 0x7ffd8a1fe130 sp 0x7ffd8a1fd8e0 READ of size 16 at 0x7ffd8a1fe168 thread T0 #0 0x561136cb4450 in __asan_memcpy (/home/elmarco/src/qq/build/tests/test-crypto-ivgen+0x110450) #1 0x561136d2a6a7 in qcrypto_ivgen_essiv_calculate /home/elmarco/src/qq/crypto/ivgen-essiv.c:83:5 #2 0x561136d29af8 in qcrypto_ivgen_calculate /home/elmarco/src/qq/crypto/ivgen.c:72:12 #3 0x561136d07c8e in test_ivgen /home/elmarco/src/qq/tests/test-crypto-ivgen.c:148:5 #4 0x7f77772c3b04 in test_case_run /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2237 #5 0x7f77772c3ec4 in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2321 #6 0x7f77772c3f6d in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2333 #7 0x7f77772c3f6d in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2333 #8 0x7f77772c3f6d in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2333 #9 0x7f77772c4184 in g_test_run_suite /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2408 #10 0x7f77772c2e0d in g_test_run /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:1674 #11 0x561136d0799b in main /home/elmarco/src/qq/tests/test-crypto-ivgen.c:173:12 #12 0x7f77756e6039 in __libc_start_main (/lib64/libc.so.6+0x21039) #13 0x561136c13d89 in _start (/home/elmarco/src/qq/build/tests/test-crypto-ivgen+0x6fd89) Address 0x7ffd8a1fe168 is located in stack of thread T0 at offset 40 in frame #0 0x561136d2a40f in qcrypto_ivgen_essiv_calculate /home/elmarco/src/qq/crypto/ivgen-essiv.c:76 This frame has 1 object(s): [32, 40) 'sector.addr' <== Memory access at offset 40 overflows this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext (longjmp and C++ exceptions *are* supported) SUMMARY: AddressSanitizer: stack-buffer-overflow (/home/elmarco/src/qq/build/tests/test-crypto-ivgen+0x110450) in __asan_memcpy Shadow bytes around the buggy address: 0x100031437bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>0x100031437c20: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00[f3]f3 f3 0x100031437c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb It looks like the rest of the code copes with ndata being larger than sizeof(sector), so limit the memcpy() range. Signed-off-by: Marc-André Lureau <[email protected]> Reviewed-by: Daniel P. Berrange <[email protected]> Message-Id: <[email protected]> Tested-by: Thomas Huth <[email protected]> Reviewed-by: Thomas Huth <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>

Direct leak of 160 byte(s) in 4 object(s) allocated from: #0 0x55ed7678cda8 in calloc (/home/elmarco/src/qq/build/x86_64-softmmu/qemu-system-x86_64+0x797da8) #1 0x7f3f5e725f75 in g_malloc0 /home/elmarco/src/gnome/glib/builddir/../glib/gmem.c:124 #2 0x55ed778aa3a7 in query_option_descs /home/elmarco/src/qq/util/qemu-config.c:60:16 #3 0x55ed778aa307 in get_drive_infolist /home/elmarco/src/qq/util/qemu-config.c:140:19 #4 0x55ed778a9f40 in qmp_query_command_line_options /home/elmarco/src/qq/util/qemu-config.c:254:36 #5 0x55ed76d4868c in qmp_marshal_query_command_line_options /home/elmarco/src/qq/build/qmp-marshal.c:3078:14 #6 0x55ed77855dd5 in do_qmp_dispatch /home/elmarco/src/qq/qapi/qmp-dispatch.c:104:5 #7 0x55ed778558cc in qmp_dispatch /home/elmarco/src/qq/qapi/qmp-dispatch.c:131:11 #8 0x55ed768b592f in handle_qmp_command /home/elmarco/src/qq/monitor.c:3840:11 #9 0x55ed7786ccfe in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:105:5 #10 0x55ed778fe37c in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:323:13 #11 0x55ed778fdde6 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:373:15 #12 0x55ed7786cd83 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:124:12 #13 0x55ed768b559e in monitor_qmp_read /home/elmarco/src/qq/monitor.c:3882:5 #14 0x55ed77714f29 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:167:9 #15 0x55ed77714fde in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:179:9 #16 0x55ed7772ffad in tcp_chr_read /home/elmarco/src/qq/chardev/char-socket.c:440:13 #17 0x55ed7777113b in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84:12 #18 0x7f3f5e71d90b in g_main_dispatch /home/elmarco/src/gnome/glib/builddir/../glib/gmain.c:3182 #19 0x7f3f5e71e7ac in g_main_context_dispatch /home/elmarco/src/gnome/glib/builddir/../glib/gmain.c:3847 #20 0x55ed77886ffc in glib_pollfds_poll /home/elmarco/src/qq/util/main-loop.c:214:9 #21 0x55ed778865fd in os_host_main_loop_wait /home/elmarco/src/qq/util/main-loop.c:261:5 #22 0x55ed77886222 in main_loop_wait /home/elmarco/src/qq/util/main-loop.c:515:11 #23 0x55ed76d2a4df in main_loop /home/elmarco/src/qq/vl.c:1995:9 #24 0x55ed76d1cb4a in main /home/elmarco/src/qq/vl.c:4914:5 #25 0x7f3f555f6039 in __libc_start_main (/lib64/libc.so.6+0x21039) Signed-off-by: Marc-André Lureau <[email protected]> Reviewed-by: Eric Blake <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>

The coroutine is not finished by the time the test ends, resulting in ASAN warning: ==7005==ERROR: LeakSanitizer: detected memory leaks Direct leak of 312 byte(s) in 1 object(s) allocated from: #0 0x7fd35290fa38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7fd3506c5f75 in g_malloc0 ../glib/gmem.c:124 #2 0x55994af03e47 in qemu_coroutine_new /home/elmarco/src/qemu/util/coroutine-ucontext.c:144 #3 0x55994aefed99 in qemu_coroutine_create /home/elmarco/src/qemu/util/qemu-coroutine.c:76 #4 0x55994ac1eb50 in verify_entered_step_1 /home/elmarco/src/qemu/tests/test-coroutine.c:80 #5 0x55994af03c75 in coroutine_trampoline /home/elmarco/src/qemu/util/coroutine-ucontext.c:119 #6 0x7fd34ec02bef (/lib64/libc.so.6+0x50bef) Do not yield() to let the coroutine terminate. Signed-off-by: Marc-André Lureau <[email protected]> Reviewed-by: Stefan Hajnoczi <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>

Spotted thanks to ASAN: ==25226==ERROR: AddressSanitizer: global-buffer-overflow on address 0x556715a1f120 at pc 0x556714b6f6b1 bp 0x7ffcdfac1360 sp 0x7ffcdfac1350 READ of size 1 at 0x556715a1f120 thread T0 #0 0x556714b6f6b0 in init_disasm /home/elmarco/src/qemu/disas/s390.c:219 #1 0x556714b6fa6a in print_insn_s390 /home/elmarco/src/qemu/disas/s390.c:294 #2 0x55671484d031 in monitor_disas /home/elmarco/src/qemu/disas.c:635 #3 0x556714862ec0 in memory_dump /home/elmarco/src/qemu/monitor.c:1324 #4 0x55671486342a in hmp_memory_dump /home/elmarco/src/qemu/monitor.c:1418 #5 0x5567148670be in handle_hmp_command /home/elmarco/src/qemu/monitor.c:3109 #6 0x5567148674ed in qmp_human_monitor_command /home/elmarco/src/qemu/monitor.c:613 #7 0x556714b00918 in qmp_marshal_human_monitor_command /home/elmarco/src/qemu/build/qmp-marshal.c:1704 #8 0x556715138a3e in do_qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:104 #9 0x556715138f83 in qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:131 #10 0x55671485cf88 in handle_qmp_command /home/elmarco/src/qemu/monitor.c:3839 #11 0x55671514e80b in json_message_process_token /home/elmarco/src/qemu/qobject/json-streamer.c:105 #12 0x5567151bf2dc in json_lexer_feed_char /home/elmarco/src/qemu/qobject/json-lexer.c:323 #13 0x5567151bf827 in json_lexer_feed /home/elmarco/src/qemu/qobject/json-lexer.c:373 #14 0x55671514ee62 in json_message_parser_feed /home/elmarco/src/qemu/qobject/json-streamer.c:124 #15 0x556714854b1f in monitor_qmp_read /home/elmarco/src/qemu/monitor.c:3881 #16 0x556715045440 in qemu_chr_be_write_impl /home/elmarco/src/qemu/chardev/char.c:172 #17 0x556715047184 in qemu_chr_be_write /home/elmarco/src/qemu/chardev/char.c:184 #18 0x55671505a8e6 in tcp_chr_read /home/elmarco/src/qemu/chardev/char-socket.c:440 #19 0x5567150943c3 in qio_channel_fd_source_dispatch /home/elmarco/src/qemu/io/channel-watch.c:84 #20 0x7fb90292b90b in g_main_dispatch ../glib/gmain.c:3182 #21 0x7fb90292c7ac in g_main_context_dispatch ../glib/gmain.c:3847 #22 0x556715162eca in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214 #23 0x556715163001 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261 #24 0x5567151631fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515 #25 0x556714ad6d3b in main_loop /home/elmarco/src/qemu/vl.c:1950 #26 0x556714ade329 in main /home/elmarco/src/qemu/vl.c:4865 #27 0x7fb8fe5c9009 in __libc_start_main (/lib64/libc.so.6+0x21009) #28 0x5567147af4d9 in _start (/home/elmarco/src/qemu/build/s390x-softmmu/qemu-system-s390x+0xf674d9) 0x556715a1f120 is located 32 bytes to the left of global variable 'char_hci_type_info' defined in '/home/elmarco/src/qemu/hw/bt/hci-csr.c:493:23' (0x556715a1f140) of size 104 0x556715a1f120 is located 8 bytes to the right of global variable 's390_opcodes' defined in '/home/elmarco/src/qemu/disas/s390.c:860:33' (0x556715a15280) of size 40600 This fix is based on Andreas Arnez <[email protected]> upstream commit: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=9ace48f3d7d80ce09c5df60cccb433470410b11b 2014-08-19 Andreas Arnez <[email protected]> * s390-dis.c (init_disasm): Simplify initialization of opc_index[]. This also fixes an access after the last element of s390_opcodes[]. Signed-off-by: Marc-André Lureau <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>

… into staging x86 queue, 2018-01-17 Highlight: new CPU models that expose CPU features that guests can use to mitigate CVE-2017-5715 (Spectre variant #2). # gpg: Signature made Thu 18 Jan 2018 02:00:03 GMT # gpg: using RSA key 0x2807936F984DC5A6 # gpg: Good signature from "Eduardo Habkost <[email protected]>" # Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6 * remotes/ehabkost/tags/x86-pull-request: i386: Add EPYC-IBPB CPU model i386: Add new -IBRS versions of Intel CPU models i386: Add FEAT_8000_0008_EBX CPUID feature word i386: Add spec-ctrl CPUID bit i386: Add support for SPEC_CTRL MSR i386: Change X86CPUDefinition::model_id to const char* target/i386: add clflushopt to "Skylake-Server" cpu model pc: add 2.12 machine types Signed-off-by: Peter Maydell <[email protected]>

…e object missed in 60765b6. Thread 1 "qemu-system-aarch64" received signal SIGSEGV, Segmentation fault. address_space_init (as=0x0, root=0x55555726e410, name=name@entry=0x555555e3f0a7 "sdhci-dma") at memory.c:3050 3050 as->root = root; (gdb) bt #0 address_space_init (as=0x0, root=0x55555726e410, name=name@entry=0x555555e3f0a7 "sdhci-dma") at memory.c:3050 #1 0x0000555555af62c3 in sdhci_sysbus_realize (dev=<optimized out>, errp=0x7fff7f931150) at hw/sd/sdhci.c:1564 #2 0x00005555558b25e5 in zynqmp_sdhci_realize (dev=0x555557051520, errp=0x7fff7f931150) at hw/sd/zynqmp-sdhci.c:151 #3 0x0000555555a2e7f3 in device_set_realized (obj=0x555557051520, value=<optimized out>, errp=0x7fff7f931270) at hw/core/qdev.c:966 #4 0x0000555555ba3f74 in property_set_bool (obj=0x555557051520, v=<optimized out>, name=<optimized out>, opaque=0x555556e04a20, errp=0x7fff7f931270) at qom/object.c:1906 #5 0x0000555555ba51f4 in object_property_set (obj=obj@entry=0x555557051520, v=v@entry=0x5555576dbd60, name=name@entry=0x555555dd6306 "realized", errp=errp@entry=0x7fff7f931270) at qom/object.c:1102 Suggested-by: Peter Maydell <[email protected]> Signed-off-by: Philippe Mathieu-Daudé <[email protected]> Message-id: [email protected] Reviewed-by: Peter Maydell <[email protected]> Signed-off-by: Peter Maydell <[email protected]>

Fixes the following ASAN report: Direct leak of 128 byte(s) in 8 object(s) allocated from: #0 0x7fefce311850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7fefcdd5ef0c in g_malloc ../glib/gmem.c:94 #2 0x559b976faff0 in create_ahci_io_test /home/elmarco/src/qemu/tests/ahci-test.c:1810 Signed-off-by: Marc-André Lureau <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>

Fix the following ASAN reports: ==20125==ERROR: LeakSanitizer: detected memory leaks Direct leak of 24 byte(s) in 1 object(s) allocated from: #0 0x7f0faea03a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7f0fae450f75 in g_malloc0 ../glib/gmem.c:124 #2 0x562fffd526fc in machine_start /home/elmarco/src/qemu/tests/sdhci-test.c:180 Indirect leak of 152 byte(s) in 1 object(s) allocated from: #0 0x7f0faea03850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7f0fae450f0c in g_malloc ../glib/gmem.c:94 #2 0x562fffd5d21d in qpci_init_pc /home/elmarco/src/qemu/tests/libqos/pci-pc.c:122 Signed-off-by: Marc-André Lureau <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>

Spotted by ASAN: QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64 tests/boot-serial-test Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7ff8a9b0ca38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7ff8a8ea7f75 in g_malloc0 ../glib/gmem.c:124 #2 0x55fef3d99129 in error_setv /home/elmarco/src/qemu/util/error.c:59 #3 0x55fef3d99738 in error_setg_internal /home/elmarco/src/qemu/util/error.c:95 #4 0x55fef323acb2 in load_elf_hdr /home/elmarco/src/qemu/hw/core/loader.c:393 #5 0x55fef2d15776 in arm_load_elf /home/elmarco/src/qemu/hw/arm/boot.c:830 #6 0x55fef2d16d39 in arm_load_kernel_notify /home/elmarco/src/qemu/hw/arm/boot.c:1022 #7 0x55fef3dc634d in notifier_list_notify /home/elmarco/src/qemu/util/notify.c:40 #8 0x55fef2fc3182 in qemu_run_machine_init_done_notifiers /home/elmarco/src/qemu/vl.c:2716 #9 0x55fef2fcbbd1 in main /home/elmarco/src/qemu/vl.c:4679 #10 0x7ff89dfed009 in __libc_start_main (/lib64/libc.so.6+0x21009) Signed-off-by: Marc-André Lureau <[email protected]> Reviewed-by: Peter Maydell <[email protected]> Signed-off-by: Peter Maydell <[email protected]>

Spotted by ASAN: elmarco@boraha:~/src/qemu/build (master *%)$ QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64 tests/boot-serial-test /aarch64/boot-serial/virt: ** (process:19740): DEBUG: 18:39:30.275: foo /tmp/qtest-boot-serial-cXaS94D ================================================================= ==19740==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000069648 at pc 0x7f1d2201cc54 bp 0x7fff331f6a40 sp 0x7fff331f61e8 READ of size 4 at 0x603000069648 thread T0 #0 0x7f1d2201cc53 (/lib64/libasan.so.4+0xafc53) #1 0x55bc86685ee3 in load_aarch64_image /home/elmarco/src/qemu/hw/arm/boot.c:894 #2 0x55bc86687217 in arm_load_kernel_notify /home/elmarco/src/qemu/hw/arm/boot.c:1047 #3 0x55bc877363b5 in notifier_list_notify /home/elmarco/src/qemu/util/notify.c:40 #4 0x55bc869331ea in qemu_run_machine_init_done_notifiers /home/elmarco/src/qemu/vl.c:2716 #5 0x55bc8693bc39 in main /home/elmarco/src/qemu/vl.c:4679 #6 0x7f1d1652c009 in __libc_start_main (/lib64/libc.so.6+0x21009) #7 0x55bc86255cc9 in _start (/home/elmarco/src/qemu/build/aarch64-softmmu/qemu-system-aarch64+0x1ae5cc9) Signed-off-by: Marc-André Lureau <[email protected]> Reviewed-by: Peter Maydell <[email protected]> Signed-off-by: Peter Maydell <[email protected]>

Spotted thanks to ASAN: QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/migration-test -p /x86_64/migration/bad_dest ==30302==ERROR: LeakSanitizer: detected memory leaks Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7f60efba1a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7f60eef3cf75 in g_malloc0 ../glib/gmem.c:124 #2 0x55ca9094702c in error_copy /home/elmarco/src/qemu/util/error.c:203 #3 0x55ca9037a30f in migrate_set_error /home/elmarco/src/qemu/migration/migration.c:1139 #4 0x55ca9037a462 in migrate_fd_error /home/elmarco/src/qemu/migration/migration.c:1150 #5 0x55ca9038162b in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:2411 #6 0x55ca90386e41 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:81 #7 0x55ca9038335e in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:85 #8 0x55ca9083dd3a in qio_task_complete /home/elmarco/src/qemu/io/task.c:142 #9 0x55ca9083d6cc in gio_task_thread_result /home/elmarco/src/qemu/io/task.c:88 #10 0x7f60eef37317 in g_idle_dispatch ../glib/gmain.c:5552 #11 0x7f60eef3490b in g_main_dispatch ../glib/gmain.c:3182 #12 0x7f60eef357ac in g_main_context_dispatch ../glib/gmain.c:3847 #13 0x55ca90927231 in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214 #14 0x55ca90927420 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261 #15 0x55ca909275fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515 #16 0x55ca8fc1c2a4 in main_loop /home/elmarco/src/qemu/vl.c:1942 #17 0x55ca8fc2eb3a in main /home/elmarco/src/qemu/vl.c:4724 #18 0x7f60e4082009 in __libc_start_main (/lib64/libc.so.6+0x21009) Indirect leak of 45 byte(s) in 1 object(s) allocated from: #0 0x7f60efba1850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7f60eef3cf0c in g_malloc ../glib/gmem.c:94 #2 0x7f60eef3d1cf in g_malloc_n ../glib/gmem.c:331 #3 0x7f60eef596eb in g_strdup ../glib/gstrfuncs.c:363 #4 0x55ca90947085 in error_copy /home/elmarco/src/qemu/util/error.c:204 #5 0x55ca9037a30f in migrate_set_error /home/elmarco/src/qemu/migration/migration.c:1139 #6 0x55ca9037a462 in migrate_fd_error /home/elmarco/src/qemu/migration/migration.c:1150 #7 0x55ca9038162b in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:2411 #8 0x55ca90386e41 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:81 #9 0x55ca9038335e in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:85 #10 0x55ca9083dd3a in qio_task_complete /home/elmarco/src/qemu/io/task.c:142 #11 0x55ca9083d6cc in gio_task_thread_result /home/elmarco/src/qemu/io/task.c:88 #12 0x7f60eef37317 in g_idle_dispatch ../glib/gmain.c:5552 #13 0x7f60eef3490b in g_main_dispatch ../glib/gmain.c:3182 #14 0x7f60eef357ac in g_main_context_dispatch ../glib/gmain.c:3847 #15 0x55ca90927231 in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214 #16 0x55ca90927420 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261 #17 0x55ca909275fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515 #18 0x55ca8fc1c2a4 in main_loop /home/elmarco/src/qemu/vl.c:1942 #19 0x55ca8fc2eb3a in main /home/elmarco/src/qemu/vl.c:4724 #20 0x7f60e4082009 in __libc_start_main (/lib64/libc.so.6+0x21009) Signed-off-by: Marc-André Lureau <[email protected]> Message-Id: <[email protected]> Reviewed-by: Dr. David Alan Gilbert <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Signed-off-by: Dr. David Alan Gilbert <[email protected]>

Found thanks to ASAN: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7efe20417a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7efe1f7b2f75 in g_malloc0 ../glib/gmem.c:124 #2 0x7efe1f7b3249 in g_malloc0_n ../glib/gmem.c:355 #3 0x558272879162 in sev_get_info /home/elmarco/src/qemu/target/i386/sev.c:414 #4 0x55827285113b in hmp_info_sev /home/elmarco/src/qemu/target/i386/monitor.c:684 #5 0x5582724043b8 in handle_hmp_command /home/elmarco/src/qemu/monitor.c:3333 Fixes: 6303631 Signed-off-by: Marc-André Lureau <[email protected]> Message-Id: <[email protected]> Reviewed-by: Eric Blake <[email protected]> Signed-off-by: Dr. David Alan Gilbert <[email protected]>

Fix leak spotted by ASAN: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7fe1abb80a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7fe1aaf1bf75 in g_malloc0 ../glib/gmem.c:124 #2 0x7fe1aaf1c249 in g_malloc0_n ../glib/gmem.c:355 #3 0x55f4841cfaa9 in postcopy_ram_fault_thread /home/elmarco/src/qemu/migration/postcopy-ram.c:596 #4 0x55f48479447b in qemu_thread_start /home/elmarco/src/qemu/util/qemu-thread-posix.c:504 #5 0x7fe1a043550a in start_thread (/lib64/libpthread.so.0+0x750a) Regression introduced with commit 00fa4fc. Signed-off-by: Marc-André Lureau <[email protected]> Message-Id: <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Reviewed-by: Peter Xu <[email protected]> Signed-off-by: Dr. David Alan Gilbert <[email protected]>

This fixes leaks found by ASAN such as: GTESTER tests/test-blockjob ================================================================= ==31442==ERROR: LeakSanitizer: detected memory leaks Direct leak of 24 byte(s) in 1 object(s) allocated from: #0 0x7f88483cba38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7f8845e1bd77 in g_malloc0 ../glib/gmem.c:129 #2 0x7f8845e1c04b in g_malloc0_n ../glib/gmem.c:360 #3 0x5584d2732498 in block_job_txn_new /home/elmarco/src/qemu/blockjob.c:172 #4 0x5584d2739b28 in block_job_create /home/elmarco/src/qemu/blockjob.c:973 #5 0x5584d270ae31 in mk_job /home/elmarco/src/qemu/tests/test-blockjob.c:34 #6 0x5584d270b1c1 in do_test_id /home/elmarco/src/qemu/tests/test-blockjob.c:57 #7 0x5584d270b65c in test_job_ids /home/elmarco/src/qemu/tests/test-blockjob.c:118 #8 0x7f8845e40b69 in test_case_run ../glib/gtestutils.c:2255 #9 0x7f8845e40f29 in g_test_run_suite_internal ../glib/gtestutils.c:2339 #10 0x7f8845e40fd2 in g_test_run_suite_internal ../glib/gtestutils.c:2351 #11 0x7f8845e411e9 in g_test_run_suite ../glib/gtestutils.c:2426 #12 0x7f8845e3fe72 in g_test_run ../glib/gtestutils.c:1692 #13 0x5584d270d6e2 in main /home/elmarco/src/qemu/tests/test-blockjob.c:377 #14 0x7f8843641f29 in __libc_start_main (/lib64/libc.so.6+0x20f29) Add an assert to make sure that the job doesn't have associated txn before free(). [Jeff Cody: N.B., used updated patch provided by John Snow] Signed-off-by: Marc-André Lureau <[email protected]> Signed-off-by: Jeff Cody <[email protected]>

Free the AIO context earlier than the GMainContext (if we have) to workaround a glib2 bug that GSource context pointer is not cleared even if the context has already been destroyed (while it should). The patch itself only changed the order to destroy the objects, no functional change at all. Without this workaround, we can encounter qmp-test hang with oob (and possibly any other use case when iothread is used with GMainContexts): #0 0x00007f35ffe45334 in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x00007f35ffe405d8 in _L_lock_854 () from /lib64/libpthread.so.0 #2 0x00007f35ffe404a7 in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x00007f35fc5b9c9d in g_source_unref_internal (source=0x24f0600, context=0x7f35f0000960, have_lock=0) at gmain.c:1685 #4 0x0000000000aa6672 in aio_context_unref (ctx=0x24f0600) at /root/qemu/util/async.c:497 #5 0x000000000065851c in iothread_instance_finalize (obj=0x24f0380) at /root/qemu/iothread.c:129 #6 0x0000000000962d79 in object_deinit (obj=0x24f0380, type=0x242e960) at /root/qemu/qom/object.c:462 #7 0x0000000000962e0d in object_finalize (data=0x24f0380) at /root/qemu/qom/object.c:476 #8 0x0000000000964146 in object_unref (obj=0x24f0380) at /root/qemu/qom/object.c:924 #9 0x0000000000965880 in object_finalize_child_property (obj=0x24ec640, name=0x24efca0 "mon_iothread", opaque=0x24f0380) at /root/qemu/qom/object.c:1436 #10 0x0000000000962c33 in object_property_del_child (obj=0x24ec640, child=0x24f0380, errp=0x0) at /root/qemu/qom/object.c:436 #11 0x0000000000962d26 in object_unparent (obj=0x24f0380) at /root/qemu/qom/object.c:455 #12 0x0000000000658f00 in iothread_destroy (iothread=0x24f0380) at /root/qemu/iothread.c:365 #13 0x00000000004c67a8 in monitor_cleanup () at /root/qemu/monitor.c:4663 #14 0x0000000000669e27 in main (argc=16, argv=0x7ffc8b1ae2f8, envp=0x7ffc8b1ae380) at /root/qemu/vl.c:4749 The glib2 bug is fixed in commit 26056558b ("gmain: allow g_source_get_context() on destroyed sources", 2012-07-30), so the first good version is glib2 2.33.10. But we still support building with glib as old as 2.28, so we need the workaround. Let's make sure we destroy the GSources first before its owner context until we drop support for glib older than 2.33.10. Signed-off-by: Peter Xu <[email protected]> Message-Id: <[email protected]> Reviewed-by: Eric Blake <[email protected]> Reviewed-by: Stefan Hajnoczi <[email protected]> Signed-off-by: Eric Blake <[email protected]>

Eric Auger reported the problem days ago that OOB broke ARM when running with libvirt: http://lists.gnu.org/archive/html/qemu-devel/2018-03/msg06231.html The problem was that the monitor dispatcher bottom half was bound to qemu_aio_context now, which could be polled unexpectedly in block code. We should keep the dispatchers run in iohandler_ctx just like what we did before the Out-Of-Band series (chardev uses qio, and qio binds everything with iohandler_ctx). If without this change, QMP dispatcher might be run even before reaching main loop in block IO path, for example, in a stack like (the ARM case, "cont" command handler run even during machine init phase): #0 qmp_cont () #1 0x00000000006bd210 in qmp_marshal_cont () #2 0x0000000000ac05c4 in do_qmp_dispatch () #3 0x0000000000ac07a0 in qmp_dispatch () #4 0x0000000000472d60 in monitor_qmp_dispatch_one () #5 0x000000000047302c in monitor_qmp_bh_dispatcher () #6 0x0000000000acf374 in aio_bh_call () #7 0x0000000000acf428 in aio_bh_poll () #8 0x0000000000ad5110 in aio_poll () #9 0x0000000000a08ab8 in blk_prw () #10 0x0000000000a091c4 in blk_pread () #11 0x0000000000734f94 in pflash_cfi01_realize () #12 0x000000000075a3a4 in device_set_realized () #13 0x00000000009a26cc in property_set_bool () #14 0x00000000009a0a40 in object_property_set () #15 0x00000000009a3a08 in object_property_set_qobject () #16 0x00000000009a0c8c in object_property_set_bool () #17 0x0000000000758f94 in qdev_init_nofail () #18 0x000000000058e190 in create_one_flash () #19 0x000000000058e2f4 in create_flash () #20 0x00000000005902f0 in machvirt_init () #21 0x00000000007635cc in machine_run_board_init () #22 0x00000000006b135c in main () Actually the problem is more severe than that. After we switched to the qemu AIO handler it means the monitor dispatcher code can even be called with nested aio_poll(), then it can be an explicit aio_poll() inside another main loop aio_poll() which could be racy too; breaking code like TPM and 9p that use nested event loops. Switch to use the iohandler_ctx for monitor dispatchers. My sincere thanks to Eric Auger who offered great help during both debugging and verifying the problem. The ARM test was carried out by applying this patch upon QEMU 2.12.0-rc0 and problem is gone after the patch. A quick test of mine shows that after this patch applied we can pass all raw iotests even with OOB on by default. CC: Eric Blake <[email protected]> CC: Markus Armbruster <[email protected]> CC: Stefan Hajnoczi <[email protected]> CC: Fam Zheng <[email protected]> Reported-by: Eric Auger <[email protected]> Tested-by: Eric Auger <[email protected]> Signed-off-by: Peter Xu <[email protected]> Message-Id: <[email protected]> Reviewed-by: Eric Blake <[email protected]> Reviewed-by: Stefan Hajnoczi <[email protected]> Signed-off-by: Eric Blake <[email protected]>

cuinutanix mentioned this issue May 4, 2017

Two fixes for virtio-scsi open-power-host-os/slof#1

Open

rmatinata-ibm assigned mdroth May 4, 2017

laggarcia closed this as completed May 11, 2017

nasastry mentioned this issue Oct 12, 2017

qemu crashes with qemu_event_set: Assertion `ev->initialized' failed error #20

Closed

This was referenced Oct 17, 2017

qemu savevm overwrites snapshots when tags used 0 and 1 consecutively #23

Closed

Guest shuts-off with call trace after migrating between Boston and ZZ systems #19

Closed

nasastry mentioned this issue Oct 30, 2017

qemu-io crashes with SIGABRT and Assertion `c->entries[i].offset != 0' failed #25

Closed

cdeadmin mentioned this issue Feb 8, 2018

qemu does not default to host capabilities for cap-{sbbc,cfpc,ibs} and needs clarity on usecase scenarios of these machine property #35

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cannot boot using vhost-scsi controller. #2

Cannot boot using vhost-scsi controller. #2

cuinutanix commented May 4, 2017 •

edited

Loading

mdroth commented May 4, 2017

cuinutanix commented May 4, 2017

cuinutanix commented May 4, 2017

laggarcia commented May 11, 2017

Cannot boot using vhost-scsi controller. #2

Cannot boot using vhost-scsi controller. #2

Comments

cuinutanix commented May 4, 2017 • edited Loading

mdroth commented May 4, 2017

cuinutanix commented May 4, 2017

cuinutanix commented May 4, 2017

laggarcia commented May 11, 2017

cuinutanix commented May 4, 2017 •

edited

Loading