This change removes userland code that worked around a restriction
in the mshv driver in the 6.18 kernel: regions from userland
couldn't be mapped to multiple regions in the kernel. We maintained a
shadow mapping table in qemu and used a heuristic to swap in a requested
region in case of UNMAPPED_GPA exits.
However, this heuristic wasn't reliable in all cases, since HyperV
behaviour is not 100% reliable across versions. HyperV itself doesn't
prohibit to map regions at multiple places into the guest, so the
restriction has been removed in the mshv driver.
Hence we can remove the remapping code. Effectively this will mandate a
6.19 kernel, if the workload attempt to map e.g. BIOS to multiple
reagions. I still think it's the right call to remove this logic:
- The workaround only seems to work reliably with a certain revision
of HyperV as a nested hypervisor.
- We expect Direct Virtualization (L1VH) to be the main platform for
the mshv accelerator, which also requires a 6.19 kernel
This reverts commit efc4093358.
Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com>
Acked-by: Wei Liu (Microsoft) <wei.liu@kernel.org>
Tested-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260113153708.448968-1-magnuskulke@linux.microsoft.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before this patch, users of the property array would free the
array themselves in their cleanup functions. This causes
inconsistencies where some users leak the array and some free them.
This patch makes it so that the property array's release function
frees the property array (instead of just its elements). It fixes any
leaks and requires less code.
DEFINE_PROP_ARRAY leakers that are fixed in this patch:
ebpf-rss_fds in hw/net/virtio-net.c
rnmi_irqvec, rnmi_excpvec in hw/riscv/riscv_hart.c
common.display_modes in hw/display/apple-gfx-mmio.m
common.display_modes in hw/display/apple-gfx-pci.m
Signed-off-by: Chandan Somani <csomani@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20260108230311.584141-2-csomani@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since perfmon-v2, the AMD PMU supports additional registers. This update
includes get/put functionality for these extra registers.
Similar to the implementation in KVM:
- MSR_CORE_PERF_GLOBAL_STATUS and MSR_AMD64_PERF_CNTR_GLOBAL_STATUS both
use env->msr_global_status.
- MSR_CORE_PERF_GLOBAL_CTRL and MSR_AMD64_PERF_CNTR_GLOBAL_CTL both use
env->msr_global_ctrl.
- MSR_CORE_PERF_GLOBAL_OVF_CTRL and MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR
both use env->msr_global_ovf_ctrl.
No changes are needed for vmstate_msr_architectural_pmu or
pmu_enable_needed().
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20260109075508.113097-6-dongli.zhang@oracle.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU uses the kvm_get_msrs() function to save Intel PMU registers from KVM
and kvm_put_msrs() to restore them to KVM. However, there is no support for
AMD PMU registers. Currently, pmu_version and num_pmu_gp_counters are
initialized based on cpuid(0xa), which does not apply to AMD processors.
For AMD CPUs, prior to PerfMonV2, the number of general-purpose registers
is determined based on the CPU version.
To address this issue, we need to add support for AMD PMU registers.
Without this support, the following problems can arise:
1. If the VM is reset (e.g., via QEMU system_reset or VM kdump/kexec) while
running "perf top", the PMU registers are not disabled properly.
2. Despite x86_cpu_reset() resetting many registers to zero, kvm_put_msrs()
does not handle AMD PMU registers, causing some PMU events to remain
enabled in KVM.
3. The KVM kvm_pmc_speculative_in_use() function consistently returns true,
preventing the reclamation of these events. Consequently, the
kvm_pmc->perf_event remains active.
4. After a reboot, the VM kernel may report the following error:
[ 0.092011] Performance Events: Fam17h+ core perfctr, Broken BIOS detected, complain to your hardware vendor.
[ 0.092023] [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR c0010200 is 530076)
5. In the worst case, the active kvm_pmc->perf_event may inject unknown
NMIs randomly into the VM kernel:
[...] Uhhuh. NMI received for unknown reason 30 on CPU 0.
To resolve these issues, we propose resetting AMD PMU registers during the
VM reset process.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260109075508.113097-5-dongli.zhang@oracle.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
AMD does not have what is commonly referred to as an architectural PMU.
Therefore, we need to rename the following variables to be applicable for
both Intel and AMD:
- has_architectural_pmu_version
- num_architectural_pmu_gp_counters
- num_architectural_pmu_fixed_counters
For Intel processors, the meaning of pmu_version remains unchanged.
For AMD processors:
pmu_version == 1 corresponds to versions before AMD PerfMonV2.
pmu_version == 2 corresponds to AMD PerfMonV2.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20260109075508.113097-4-dongli.zhang@oracle.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The initialization of 'has_architectural_pmu_version',
'num_architectural_pmu_gp_counters', and
'num_architectural_pmu_fixed_counters' is unrelated to the process of
building the CPUID.
Extract them out of kvm_x86_build_cpuid().
In addition, use cpuid_find_entry() instead of cpu_x86_cpuid(), because
CPUID has already been filled at this stage.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20260109075508.113097-3-dongli.zhang@oracle.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Although AMD PERFCORE and PerfMonV2 are removed when "-pmu" is configured,
there is no way to fully disable KVM AMD PMU virtualization. Neither
"-cpu host,-pmu" nor "-cpu EPYC" achieves this.
As a result, the following message still appears in the VM dmesg:
[ 0.263615] Performance Events: AMD PMU driver.
However, the expected output should be:
[ 0.596381] Performance Events: PMU not available due to virtualization, using software events only.
[ 0.600972] NMI watchdog: Perf NMI watchdog permanently disabled
This occurs because AMD does not use any CPUID bit to indicate PMU
availability.
To address this, KVM_CAP_PMU_CAPABILITY is used to set KVM_PMU_CAP_DISABLE
when "-pmu" is configured.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20260109075508.113097-2-dongli.zhang@oracle.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new Diamond Rapids x86 cpu model definition that was added in 7a6dd8bde1
has an unexpected comma in the `.features[FEAT_VMX_EXIT_CTLS]` subobject
initializer, causing the prior initialization to be overridden. For this
reason `VMX_VM_EXIT_SAVE_DEBUG_CONTROLS | VMX_VM_EXIT_HOST_ADDR_SPACE_SIZE`
is not included.
Fix this by replacing the comma with the missing bitwise OR to properly
combine all the flags into a single bitmask value.
Fixes: 7a6dd8bde1 ("i386/cpu: Add CPU model for Diamond Rapids")
Signed-off-by: Aidan Khoury <aidan@aktech.ai>
The VIRTIO_CONSOLE_F_EMERG_WRITE feature bit was only set
in the hw_compat_2_7[] array, via the 'emergency-write=off'
property. We removed all machines using that array, lets remove
that property. All instances have this feature bit set and
it can not be disabled. VirtIOSerial::host_features mask is
now unused, remove it.
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-28-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The VirtIOPCIProxy::ignore_backend_features boolean was only set
in the hw_compat_2_7[] array, via the 'x-ignore-backend-features=on'
property. We removed all machines using that array, lets remove
that property, simplify by only using the default version.
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-27-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The IntelIOMMUState::buggy_eim boolean was only set in
the hw_compat_2_7[] array, via the 'x-buggy-eim=true'
property. We removed all machines using that array, lets
remove that property, simplifying vtd_decide_config().
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-26-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The hw_compat_2_7[] array was only used by the pc-q35-2.7 and
pc-i440fx-2.7 machines, which got removed. Remove it.
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-25-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The PCSpkState::migrate boolean was only set in the
pc_compat_2_7[] array, via the 'migrate=off' property.
We removed all machines using that array, lets remove
that property, simplifying vmstate_spk[].
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-24-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The CPUX86State::full_cpuid_auto_level boolean was only
disabled for the pc-q35-2.7 and pc-i440fx-2.7 machines,
which got removed. Being now always %true, we can remove
it and simplify x86_cpu_expand_features().
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-23-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The pc_compat_2_7[] array was only used by the pc-q35-2.7
and pc-i440fx-2.7 machines, which got removed. Remove it.
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-22-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These machines has been supported for a period of more than 6 years.
According to our versioned machine support policy (see commit
ce80c4fa6f "docs: document special exception for machine type
deprecation & removal") they can now be removed. Remove the qtest
in test-x86-cpuid-compat.c file.
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-21-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The VirtIOMMIOProxy::format_transport_address boolean was only set
in the hw_compat_2_6[] array, via the 'format_transport_address=off'
property. We removed all machines using that array, lets remove
that property, simplifying virtio_mmio_bus_get_dev_path().
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-20-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The hw_compat_2_6[] array was only used by the pc-q35-2.6 and
pc-i440fx-2.6 machines, which got removed. Remove it.
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-19-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The APICCommonState::legacy_instance_id boolean was only set
in the pc_compat_2_6[] array, via the 'legacy-instance-id=on'
property. We removed all machines using that array, lets remove
that property, simplifying apic_common_realize().
Because instance_id is initialized as initial_apic_id, we can
not register vmstate_apic_common directly via dc->vmsd.
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-18-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The pc_compat_2_6[] array was only used by the pc-q35-2.6
and pc-i440fx-2.6 machines, which got removed. Remove it.
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-17-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All machines now use the linuxboot_dma.bin binary, so it's safe to
remove the non-DMA version (linuxboot.bin).
Suggested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20260108033051.777361-16-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now all calls of x86 machines to fw_cfg_init_io_dma() pass DMA
arguments, so the FWCfgState (FWCfgIoState) created by x86 machines
enables DMA by default.
Although other callers of fw_cfg_init_io_dma() besides x86 also pass
DMA arguments to create DMA-enabled FwCfgIoState, the "dma_enabled"
property of FwCfgIoState cannot yet be removed, because Sun4u and Sun4v
still create DMA-disabled FwCfgIoState (bypass fw_cfg_init_io_dma()) in
sun4uv_init() (hw/sparc64/sun4u.c).
Maybe reusing fw_cfg_init_io_dma() for them would be a better choice, or
adding fw_cfg_init_io_nodma(). However, before that, first simplify the
handling of FwCfgState in x86.
Considering that FwCfgIoState in x86 enables DMA by default, remove the
handling for DMA-disabled cases and replace DMA checks with assertions
to ensure that the default DMA-enabled setting is not broken.
Then 'linuxboot.bin' isn't used anymore, and it will be removed in the
next commit.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20260108033051.777361-15-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All PC machines now use the multiboot_dma.bin binary,
we can remove the non-DMA version (multiboot.bin).
This doesn't change multiboot_dma binary file.
Suggested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-14-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The X86MachineClass::fwcfg_dma_enabled boolean was only used
by the pc-q35-2.6 and pc-i440fx-2.6 machines, which got
removed. Remove it and simplify.
'multiboot.bin' isn't used anymore, we'll remove it in the
next commit.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-13-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
"wide" in fw_cfg_init_mem_wide() means "DMA support".
Rename for clarity.
Suggested-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-12-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Factor fw_cfg_init_mem_internal() out of fw_cfg_init_mem_wide().
In fw_cfg_init_mem_wide(), assert DMA arguments are provided.
Callers without DMA have to use the fw_cfg_init_mem() helper.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-11-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
fw_cfg_init_mem_wide() is prefered to initialize fw_cfg
with DMA support. Without DMA, use fw_cfg_init_mem_nodma().
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-10-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rename fw_cfg_init_mem() as fw_cfg_init_mem_nodma()
to distinct with the DMA version (currently named
fw_cfg_init_mem_wide).
Suggested-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-9-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now the legacy cpu hotplug way has gone away, and there's no _INIT
method in DSDT table for modern cpu hotplug support.
Update DSDT tables for pc machine, and_INIT methods are removed from
DSDT tables:
- Method (_INI, 0, Serialized) // _INI: Initialize
- {
- CSEL = Zero
- }
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Link: https://lore.kernel.org/r/20260108033051.777361-8-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Legacy cpu hotplug has been removed totally and machines start with
modern cpu hotplug interface directly.
Therefore, update the documentation to describe current QEMU cpu hotplug
logic.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-7-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The cpu_hotplug.h and cpu_hotplug.c contain legacy cpu hotplug
utilities. Now there's no use case of legacy cpu hotplug, so it's safe
to drop legacy cpu hotplug support totally.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Link: https://lore.kernel.org/r/20260108033051.777361-6-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now all PC & Q35 machiens are using modern hotplug from the beginning,
and all legacy_cpu_hotplug flags keep false during runtime.
So it's safe to remove legacy_cpu_hotplug flags and related properties,
with unused gpe_cpu field.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Link: https://lore.kernel.org/r/20260108033051.777361-5-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For compatibility reasons PC/Q35 will start with legacy CPU hotplug
interface by default but with new CPU hotplug AML code since 2.7
machine type (in commit 679dd1a957 ("pc: use new CPU hotplug interface
since 2.7 machine type")). In that way, legacy firmware that doesn't use
QEMU generated ACPI tables was able to continue using legacy CPU hotplug
interface.
While later machine types, with firmware supporting QEMU provided ACPI
tables, generate new CPU hotplug AML, which will switch to new CPU
hotplug interface when guest OS executes its _INI method on ACPI tables
loading.
Since 2.6 machine type is now gone, and consider that the legacy BIOS
(based on QEMU ACPI prior to v2.7) should be no longer in use, previous
compatibility requirements are no longer necessary. So initialize
'modern' hotplug directly from the very beginning for PC/Q35 machines
with cpu_hotplug_hw_init(), and drop _INIT method.
Additionally, remove the checks and settings around cpu_hotplug_legacy
in cpuhp VMState (for piix4 & ich9), to eliminate the risk of
segmentation faults, as gpe_cpu no longer has the opportunity to be
initialized. This is safe because all hotplug now start with the modern
way, and it's impossible to switch to legacy way at runtime (even the
"cpu-hotplug-legacy" properties does not allow it either).
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Link: https://lore.kernel.org/r/20260108033051.777361-4-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before dropping legacy CPU hotplug code, mark and allow the affected
ACPI tables, to avoid breaking ACPI table testing.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These machines has been supported for a period of more than 6 years.
According to our versioned machine support policy (see commit
ce80c4fa6f "docs: document special exception for machine type
deprecation & removal") they can now be removed.
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
intel_iommu:
SVM support
vhost:
support for indirect descriptors in shadow virtqueue
vhost-user:
vhost-user-spi support
vhost-user-blk inflight migration support
vhost-user-blk inflight migration support
misc fixes in pci, vhost, virtio, acpi, cxl
cleanups in acpi/ghes
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmmEa9APHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpqj8H/iBqAHZSTmAdBJgoLnmgoTLB01J9aUTrQU2H
BHKyrd+G3m54pwjgUNN5ieZARtlXscigf6fr0Gq2wrc8/kV/O5G5jViw9+1Bo8nW
OkLDW45nDzZGhap4oUedV+PJ3fCuW2fC8Jyb1n8OGlkadbhq0NU6GtqiEx6/7QIh
hk5WUDE/3LH4cTp8qNtr0/nYfM4FZk2sjq7aRyg4cz/uC7rIAFRq7BCZ/dfRqMh/
T+rLnizSSAg9PFMd8slWqoxOGF9NzT9LIoDSkAlso1L9lUekUSNoUblhlWDrRlLn
DEEqqGCVounfBzA95WrTRmvWs6JodppjjAjI0M4isrMKGXXg8dg=
=HdgY
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pci,pc: features, fixes
intel_iommu:
SVM support
vhost:
support for indirect descriptors in shadow virtqueue
vhost-user:
vhost-user-spi support
vhost-user-blk inflight migration support
vhost-user-blk inflight migration support
misc fixes in pci, vhost, virtio, acpi, cxl
cleanups in acpi/ghes
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmmEa9APHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpqj8H/iBqAHZSTmAdBJgoLnmgoTLB01J9aUTrQU2H
# BHKyrd+G3m54pwjgUNN5ieZARtlXscigf6fr0Gq2wrc8/kV/O5G5jViw9+1Bo8nW
# OkLDW45nDzZGhap4oUedV+PJ3fCuW2fC8Jyb1n8OGlkadbhq0NU6GtqiEx6/7QIh
# hk5WUDE/3LH4cTp8qNtr0/nYfM4FZk2sjq7aRyg4cz/uC7rIAFRq7BCZ/dfRqMh/
# T+rLnizSSAg9PFMd8slWqoxOGF9NzT9LIoDSkAlso1L9lUekUSNoUblhlWDrRlLn
# DEEqqGCVounfBzA95WrTRmvWs6JodppjjAjI0M4isrMKGXXg8dg=
# =HdgY
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu Feb 5 10:07:12 2026 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (38 commits)
hw/cxl: Take into account how many media operations are requested for param check
hw/cxl: Check for overflow on santize media as both base and offset 64bit.
vhost-user-blk: support inter-host inflight migration
vhost: add vmstate for inflight region with inner buffer
vmstate: introduce VMSTATE_VBUFFER_UINT64
vhost-user: introduce protocol feature for skip drain on GET_VRING_BASE
vhost-user.rst: specify vhost-user back-end action on GET_VRING_BASE
virtio-gpu: use consistent error checking for virtio_gpu_create_mapping_iov
virtio-gpu: fix error handling in virgl_cmd_resource_create_blob
virtio-pmem: ignore empty queue notifications
virtio-gpu-virgl: correct parent for blob memory region
MAINTAINERS: Update VIOT maintainer
cryptodev-builtin: Limit the maximum size
hw/virtio/virtio-crypto: verify asym request size
virtio-spi: Add vhost-user-spi device support
standard-headers: Update virtio_spi.h from Linux v6.18-rc3
q35: Fix migration of SMRAM state
pcie_sriov: Fix PCI_SRIOV_* accesses in pcie_sriov_pf_exit()
virtio: Fix crash when sriov-pf is set for non-PCI-Express device
virtio-dmabuf: Ensure UUID persistence for hash table insertion
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Whilst the spec doesn't speak to it directly my assumption is that
a request for more operations than exist should result in an invalid
input error return.
Fixes: 77a8e9fe0e ("hw/cxl/cxl-mailbox-utils: Add support for Media operations discovery commands cxl r3.2 (8.2.10.9.5.3)")
Closes: https://lore.kernel.org/qemu-devel/CAFEAcA-p5wZkNxK7wNVq_3PAzEE-muOd1Def-0O-FSpck4DrBQ@mail.gmail.com/
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260102154731.474859-3-Jonathan.Cameron@huawei.com>
The both the size and base of a media sanitize operation are both provided
by the VM, an overflow is possible which may result in checks on valid
range passing when they should not. Close that by checking for overflow
on the addition.
Fixes: 40ab4ed107 ("hw/cxl/cxl-mailbox-utils: Media operations Sanitize and Write Zeros commands CXL r3.2(8.2.10.9.5.3)")
Closes: https://lore.kernel.org/qemu-devel/CAFEAcA8Rqop+ju0fuxN+0T57NBG+bep80z45f6pY0ci2fz_G3A@mail.gmail.com/
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260102154731.474859-2-Jonathan.Cameron@huawei.com>
During inter-host migration, waiting for disk requests to be drained
in the vhost-user backend can incur significant downtime.
This can be avoided if QEMU migrates the inflight region in
vhost-user-blk.
Thus, during the qemu migration, with feature flag the vhost-user
back-end can immediately stop vrings, so all in-flight requests will be
migrated to another host.
Signed-off-by: Alexandr Moshkov <dtalexundeer@yandex-team.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.s.norwitz@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260115081103.655749-6-dtalexundeer@yandex-team.ru>
Prepare for future inflight region migration for vhost-user-blk.
We need to migrate size, queue_size, and inner buffer.
So firstly it migrate size and queue_size fields, then allocate memory
for buffer with
migrated size, then migrate inner buffer itself.
Signed-off-by: Alexandr Moshkov <dtalexundeer@yandex-team.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260115081103.655749-5-dtalexundeer@yandex-team.ru>
This is an analog of VMSTATE_VBUFFER_UINT32 macro, but for uint64 type.
Signed-off-by: Alexandr Moshkov <dtalexundeer@yandex-team.ru>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260115081103.655749-4-dtalexundeer@yandex-team.ru>
Add vhost-user protocol feature
VHOST_USER_PROTOCOL_F_GET_VRING_BASE_INFLIGHT
Now on GET_VRING_BASE this feature can control whether to wait for
in-flight requests to complete or not.
Also we have to validate that this feature will be enabled only when
qemu and back-end supports in-flight buffer and in-flight migration
It will be helpfull in future for in-flight requests migration in
vhost-user devices.
Update docs, add ref to label for inflight-io-tracking
Signed-off-by: Alexandr Moshkov <dtalexundeer@yandex-team.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260115081103.655749-3-dtalexundeer@yandex-team.ru>
By default, we assume that server need to wait all inflight IO on
GET_VRING_BASE. However, this fact is not recorded anywhere in the
documentation.
So, add this info in rst.
Signed-off-by: Alexandr Moshkov <dtalexundeer@yandex-team.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260115081103.655749-2-dtalexundeer@yandex-team.ru>
Unify error checking style for virtio_gpu_create_mapping_iov() across the
codebase to improve consistency and readability.
virtio_gpu_create_mapping_iov() returns 0 on success and negative values
on error. The original code used inconsistent patterns for checking errors:
- Some used 'if (ret != 0)' in virtio-gpu-virgl.c and virtio-gpu.c
- Some used 'CHECK(!ret, cmd)' in virtio-gpu-rutabaga.c
For if-statement checks, change to 'if (ret < 0)' which is the preferred
QEMU coding convention for functions that return 0 on success and negative
on error. This makes the return value convention immediately clear to code
readers.
For CHECK macro usage in virtio-gpu-rutabaga.c, keep the original
'CHECK(!ret, cmd)' pattern as it is more concise and consistent with other
error checks in the same file.
Updated locations:
- hw/display/virtio-gpu-virgl.c: virgl_resource_attach_backing()
- hw/display/virtio-gpu-virgl.c: virgl_cmd_resource_create_blob()
- hw/display/virtio-gpu.c: virtio_gpu_resource_create_blob()
- hw/display/virtio-gpu.c: virtio_gpu_resource_attach_backing()
Signed-off-by: Honglei Huang <honghuan@amd.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260113015203.3643608-3-honghuan@amd.com>
Fix inverted error check in virgl_cmd_resource_create_blob() that causes
the function to return error when virtio_gpu_create_mapping_iov() succeeds.
virtio_gpu_create_mapping_iov() returns 0 on success and negative values
on error. The check 'if (!ret)' incorrectly treats success (ret=0) as an
error condition, causing the function to fail when it should succeed.
Change the condition to 'if (ret != 0)' to properly detect errors.
Fixes: 7c092f17cc ("virtio-gpu: Handle resource blob commands")
Signed-off-by: Honglei Huang <honghuan@amd.com>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260113015203.3643608-2-honghuan@amd.com>