Align RAMBlocks to page size alignment, and adjust the merging code
to deal with partial overlap due to that alignment.
This is needed for postcopy so that we can place/fetch whole hugepages
when under userfault.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Wire up a call to VHOST_USER_POSTCOPY_END message to the vhost clients
right before we ask the listener thread to shutdown.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This message is sent just before the end of postcopy to get the
client to stop using userfault since we wont respond to any more
requests. It should close userfaultfd so that any other pages
get mapped to the backing file automatically by the kernel, since
at this point we know we've received everything.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Register a waker function in vhost-user code to be notified when
pages arrive or requests to previously mapped pages get requested.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Resolve fault addresses read off the clients UFD into RAMBlock
and offset, and call back to the postcopy code to ask for the page.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Stash the RAMBlock and offset for later use looking up
addresses.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We need a better way, but at the moment we need the address of the
mappings sent back to qemu so it can interpret the messages on the
userfaultfd it reads.
This is done as a 3 stage set:
QEMU -> client
set_mem_table
mmap stuff, get addresses
client -> qemu
here are the addresses
qemu -> client
OK - now you can use them
That ensures that qemu has registered the new addresses in it's
userfault code before the client starts accessing them.
Note: We don't ask for the default 'ack' reply since we've got our own.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Split the set_mem_table routines in both qemu and libvhost-user
because the postcopy versions are going to be quite different
once changes in the later patches are added. However, this patch
doesn't produce any functional change, just the split.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Notify the vhost-user slave on reception of the 'postcopy-listen'
event from the source.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Register the UFD that comes in as the response to the 'advise' method
with the postcopy code.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Wire up a notifier to send a VHOST_USER_POSTCOPY_ADVISE
message on an incoming advise.
Later patches will fill in the behaviour/contents of the
message.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a vhost feature flag for postcopy support, and
use the postcopy notifier to check it before allowing postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
ACPI 6.2A Table 5-129 "SPA Range Structure" requires the proximity
domain of a NVDIMM SPA range must match with corresponding entry in
SRAT table.
The address ranges of vNVDIMM in QEMU are allocated from the
hot-pluggable address space, which is entirely covered by one SRAT
memory affinity structure. However, users can set the vNVDIMM
proximity domain in NFIT SPA range structure by the 'node' property of
'-device nvdimm' to a value different than the one in the above SRAT
memory affinity structure.
In order to solve such proximity domain mismatch, this patch builds
one SRAT memory affinity structure for each DIMM device present at
boot time, including both PC-DIMM and NVDIMM, with the proximity
domain specified in '-device pc-dimm' or '-device nvdimm'.
The remaining hot-pluggable address space is covered by one or multiple
SRAT memory affinity structures with the proximity domain of the last
node as before.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It may need to treat PC-DIMM and NVDIMM differently, e.g., when
deciding the necessity of non-volatile flag bit in SRAT memory
affinity structures.
A new field 'nvdimm' is added to the union type MemoryDeviceInfo for
such purpose. Its type is currently PCDIMMDeviceInfo and will be
updated when necessary in the future.
It also fixes "info memory-devices"/query-memory-devices which
currently show nvdimm devices as dimm devices since
object_dynamic_cast(obj, TYPE_PC_DIMM) happily cast nvdimm to
TYPE_PC_DIMM which it's been inherited from.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make qmp_pc_dimm_device_list() return sorted by start address
list of devices so that it could be reused in places that
would need sorted list*. Reuse existing pc_dimm_built_list()
to get sorted list.
While at it hide recursive callbacks from callers, so that:
qmp_pc_dimm_device_list(qdev_get_machine(), &list);
could be replaced with simpler:
list = qmp_pc_dimm_device_list();
* follow up patch will use it in build_srat()
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au> for ppc part
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
All PCI devices are now QOM'ified.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Extend generic build_fadt() to support rev5.1 FADT
and reuse it for 'virt' board, it would allow to
phase out usage of AcpiFadtDescriptorRev5_1 and
later ACPI_FADT_COMMON_DEF.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It will be extended and reused by follow up patch for ARM target.
PS:
Since it's generic function now, don't patch FIRMWARE_CTRL, DSDT
fields if they don't point to tables since platform might not
provide them and use X_ variants instead if applicable.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
build_append_foo() API doesn't need explicit endianness
conversions which eliminates a source of errors and
it makes build_fadt() look like declarative definition of
FADT table in ACPI spec, which makes it easy to review.
Also it allows easily extending FADT to support other
revisions which will be used by follow up patches
where build_fadt() will be reused for ARM target.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
move FADT data initialization out of fadt_setup() into dedicated
init_fadt_data() that will set common for pc/q35 values in
AcpiFadtData structure and acpi_get_pm_info() will complement
it with pc/q35 specific values initialization.
That will allow to get rid of fadt_setup() and generalize
build_fadt() so it could be easily extended for rev5 and
reused by ARM target.
While at it also move facs/dsdt/xdsdt offsets from build_fadt()
arg list into AcpiFadtData, as they belong to the same dataset.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
ACPI_PORT_SMI_CMD is alias for APM_CNT_IOPORT,
so make it really one instead of duplicating its value.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it will help to add Generic Address Structure to ACPI tables
without using packed C structures and avoid endianness
issues as API doesn't need an explicit conversion.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
next patch will need it before it gets to piix4/lpc branches
that initializes 'obj' now.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Although linkspeed and duplex can be set in a linux guest via 'ethtool -s',
this requires custom ethtool commands for virtio-net by default.
Introduce a new feature flag, VIRTIO_NET_F_SPEED_DUPLEX, which allows
the hypervisor to export a linkspeed and duplex setting. The user can
subsequently overwrite it later if desired via: 'ethtool -s'.
Linkspeed and duplex settings can be set as:
'-device virtio-net,speed=10000,duplex=full'
where speed is [0...INT_MAX], and duplex is ["half"|"full"].
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtio-dev@lists.oasis-open.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In prepartion for using some of the high order feature bits, make sure that
virtio-net uses 64-bit values everywhere.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtio-dev@lists.oasis-open.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Typically the scanline length and the line offset are identical. But
in case they are not our calculation for region_end is incorrect. Using
line_offset is fine for all scanlines, except the last one where we have
to use the actual scanline length.
Fixes: CVE-2018-7550
Reported-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Tested-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Message-id: 20180309143704.13420-1-kraxel@redhat.com
Changing the current ordering saves 8 bytes per entry in x86_64.
Signed-off-by: zhenwei.pi <zhenwei.pi@youruncloud.com>
Message-id: 1520318781-22644-1-git-send-email-zhenwei.pi@youruncloud.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180309153654.13518-8-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
From the "Physical Layer Simplified Specification Version 3.01":
A known data block ("Tuning block") can be used to tune sampling
point for tuning required hosts. [...]
This procedure gives the system optimal timing for each specific
host and card combination and compensates for static delays in
the timing budget including process, voltage and different PCB
loads and skews. [...]
Data block, carried by DAT[3:0], contains a pattern for tuning
sampling position to receive data on the CMD and DAT[3:0] line.
[based on a patch from Alistair Francis <alistair.francis@xilinx.com>
from qemu/xilinx tag xilinx-v2015.2]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 20180309153654.13518-5-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180309153654.13518-4-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The SDBus will reuse these functions, so we put them in a new source file.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180309153654.13518-3-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: slight wordsmithing of comments, added note that string
returned does not need to be freed]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 20180309153654.13518-2-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add support for passing 'max' to -machine gic-version. By analogy
with the -cpu max option, this picks the "best available" GIC version
whether you're using KVM or TCG, so it behaves like 'host' when
using KVM, and gives you GICv3 when using TCG.
Also like '-cpu host', using -machine gic-version=max' means there
is no guarantee of migration compatibility between QEMU versions;
in future 'max' might mean '4'.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180308130626.12393-7-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Allow the virt board to support '-cpu max' in the same way
it already handles '-cpu host'.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180308130626.12393-6-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Spotted by ASAN:
QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64 tests/boot-serial-test
Direct leak of 48 byte(s) in 1 object(s) allocated from:
#0 0x7ff8a9b0ca38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
#1 0x7ff8a8ea7f75 in g_malloc0 ../glib/gmem.c:124
#2 0x55fef3d99129 in error_setv /home/elmarco/src/qemu/util/error.c:59
#3 0x55fef3d99738 in error_setg_internal /home/elmarco/src/qemu/util/error.c:95
#4 0x55fef323acb2 in load_elf_hdr /home/elmarco/src/qemu/hw/core/loader.c:393
#5 0x55fef2d15776 in arm_load_elf /home/elmarco/src/qemu/hw/arm/boot.c:830
#6 0x55fef2d16d39 in arm_load_kernel_notify /home/elmarco/src/qemu/hw/arm/boot.c:1022
#7 0x55fef3dc634d in notifier_list_notify /home/elmarco/src/qemu/util/notify.c:40
#8 0x55fef2fc3182 in qemu_run_machine_init_done_notifiers /home/elmarco/src/qemu/vl.c:2716
#9 0x55fef2fcbbd1 in main /home/elmarco/src/qemu/vl.c:4679
#10 0x7ff89dfed009 in __libc_start_main (/lib64/libc.so.6+0x21009)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A lot of ARM object files are linked into the executable unconditionally,
even though we have corresponding CONFIG switches like CONFIG_PXA2XX or
CONFIG_OMAP. We should make sure to use these switches in the Makefile so
that the users can disable certain unwanted boards and devices more easily.
While we're at it, also add some new switches for the boards that do not
have a CONFIG option yet.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1520266949-29817-1-git-send-email-thuth@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add code needed to get a functional PCI subsytem when using in
conjunction with upstream Linux guest (4.13+). Tested to work against
"e1000e" (network adapter, using MSI interrupts) as well as
"usb-ehci" (USB controller, using legacy PCI interrupts).
Based on "i.MX6 Applications Processor Reference Manual" (Document
Number: IMX6DQRM Rev. 4) as well as corresponding dirver in Linux
kernel (circa 4.13 - 4.16 found in drivers/pci/dwc/*)
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Set the ARM CPU core count property for the A53's attached to the Xilnx
ZynqMP machine.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: fe0dd90b85ac73f9fc9548c253bededa70a07006.1520018138.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If the main loop thread invokes .ioeventfd_stop() just as the vq handler
function begins in the IOThread then the handler may lose the race for
the AioContext lock. By the time the vq handler is able to acquire the
AioContext lock the ioeventfd has already been removed and the handler
isn't supposed to run anymore!
Use the new aio_wait_bh_oneshot() function to perform ioeventfd removal
from within the IOThread. This way no races with the vq handler are
possible.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180307144205.20619-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If the main loop thread invokes .ioeventfd_stop() just as the vq handler
function begins in the IOThread then the handler may lose the race for
the AioContext lock. By the time the vq handler is able to acquire the
AioContext lock the ioeventfd has already been removed and the handler
isn't supposed to run anymore!
Use the new aio_wait_bh_oneshot() function to perform ioeventfd removal
from within the IOThread. This way no races with the vq handler are
possible.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180307144205.20619-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Commit 5b2ffbe4d9 ("virtio-blk: dataplane:
notify guest as a batch") deferred guest notification to a BH in order
batch notifications, with purpose of avoiding flooding the guest with
interruptions.
This optimization came with a cost. The average latency perceived in the
guest is increased by a few microseconds, but also when multiple IO
operations finish at the same time, the guest won't be notified until
all completions from each operation has been run. On the contrary,
virtio-scsi issues the notification at the end of each completion.
On the other hand, nowadays we have the EVENT_IDX feature that allows a
better coordination between QEMU and the Guest OS to avoid sending
unnecessary interruptions.
With this change, virtio-blk/dataplane only batches notifications if the
EVENT_IDX feature is not present.
Some numbers obtained with fio (ioengine=sync, iodepth=1, direct=1):
- Test specs:
* fio-3.4 (ioengine=sync, iodepth=1, direct=1)
* qemu master
* virtio-blk with a dedicated iothread (default poll-max-ns)
* backend: null_blk nr_devices=1 irqmode=2 completion_nsec=280000
* 8 vCPUs pinned to isolated physical cores
* Emulator and iothread also pinned to separate isolated cores
* variance between runs < 1%
- Not patched
* numjobs=1: lat_avg=327.32 irqs=29998
* numjobs=4: lat_avg=337.89 irqs=29073
* numjobs=8: lat_avg=342.98 irqs=28643
- Patched:
* numjobs=1: lat_avg=323.92 irqs=30262
* numjobs=4: lat_avg=332.65 irqs=29520
* numjobs=8: lat_avg=335.54 irqs=29323
Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-id: 20180307114459.26636-1-slp@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Introduce an sccb_mask_t to be used for SCLP event masks instead of just
unsigned int or uint32_t. This will allow later to extend the mask with
more ease.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Message-Id: <1519407778-23095-3-git-send-email-imbrenda@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Until 67915de9f0 ("s390x/event-facility: variable-length event masks")
we only supported sclp event masks with a size of exactly 4 bytes, even
though the architecture allows the guests to set up sclp event masks
from 1 to 1021 bytes in length.
After that patch, the behaviour was almost compliant, but some issues
were still remaining, in particular regarding the handling of selective
reads and migration.
When setting the sclp event mask, a mask size is also specified. Until
now we only considered the size in order to decide which bits to save
in the internal state. On the other hand, when a guest performs a
selective read, it sends a mask, but it does not specify a size; the
implied size is the size of the last mask that has been set.
Specifying bits in the mask of selective read that are not available in
the internal mask should return an error, and bits past the end of the
mask should obviously be ignored. This can only be achieved by keeping
track of the lenght of the mask.
The mask length is thus now part of the internal state that needs to be
migrated.
This patch fixes the handling of selective reads, whose size will now
match the length of the event mask, as per architecture.
While the default behaviour is to be compliant with the architecture,
when using older machine models the old broken behaviour is selected
(allowing only masks of size exactly 4), in order to be able to migrate
toward older versions.
Fixes: 67915de9f0 ("s390x/event-facility: variable-length event masks")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Message-Id: <1519407778-23095-2-git-send-email-imbrenda@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The license text currently specifies "any version" of the GPL. It
is unlikely that GPL v1 was ever intended; change this to the
standard "or any later version" text.
Cc: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Cc: Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
Cc: Pierre Morel <pmorel@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Acked-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The other event handlers (quiesce and cpu) do not define these
handlers, and this one does nothing, so it can be removed.
Signed-off-by: Nia Alarie <nia.alarie@gmail.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20180306100721.19419-1-nia.alarie@gmail.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Right now it is possible to crash QEMU for s390x by providing e.g.
-numa node,nodeid=0,cpus=0-1
Problem is, that numa.c uses mc->cpu_index_to_instance_props as an
indicator whether NUMA is supported by a machine type. We don't
implement NUMA for s390x ("topology") yet. However we need
mc->cpu_index_to_instance_props for query-cpus.
So let's fix this case by also checking for mc->get_default_cpu_node_id,
which will be needed by machine_set_cpu_numa_node().
qemu-system-s390x: -numa node,nodeid=0,cpus=0-1: NUMA is not supported by
this machine-type
While at it, make s390_cpu_index_to_props() look like on other
architectures.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180227110255.20999-1-david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>