CXL 3.2 8.2.4 Table 8-22 defines which capabilities are mandatory, not
permitted, or optional for each type of device.
cxl_component_register_init_common() uses a rather odd 'fall through'
mechanism to define each component register set. This assumes that any
device or capability being added builds on the previous devices
capabilities. This is not true as there are mutually exclusive
capabilities defined. For example, downstream ports can not have snoop
but it can have Back Invalidate capable decoders.
Refactor this code to make it easier to add individual capabilities as
defined by a device type. Any capability which is not specified by the
type is left NULL'ed out which complies with the packed nature of the
register array.
Update all spec references to 3.2.
No functional changes should be seen with this patch.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Dongjoo Seo <dongjoo.seo1@samsung.com>
[rebased, no RAS for HBs, r3.2 references]
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260204170936.43959-3-Jonathan.Cameron@huawei.com>
PCIe Flit Mode, introduced with the PCIe 6.0 specification, is a
fundamental change in how data is transmitted over the bus to
improve transfer rates. It shifts from variable-sized Transaction
Layer Packets (TLPs) to fixed 256-byte Flow Control Units (FLITs).
As with the link speed and width training, have ad-hoc property for
setting the flit mode and allow CXL components to make use of it.
For the CXL root port and dsp cases, always report flit mode but
the actual value after 'training' will depend on the downstream
device configuration.
Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Dongjoo Seo <dongjoo.seo1@samsung.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260204170936.43959-2-Jonathan.Cameron@huawei.com>
Memory sparing is defined as a repair function that replaces a portion of
memory with a portion of functional memory at that same DPA. The
subclasses for this operation vary in terms of the scope of the sparing
being performed. The Cacheline sparing subclass refers to a sparing
action that can replace a full cacheline. Row sparing is provided as an
alternative to PPR sparing functions and its scope is that of a single
DDR row. Bank sparing allows an entire bank to be replaced. Rank sparing
is defined as an operation in which an entire DDR rank is replaced.
Memory sparing maintenance operations may be supported by CXL devices
that implement CXL.mem protocol. A sparing maintenance operation requests
the CXL device to perform a repair operation on its media.
For example, a CXL device with DRAM components that support memory sparing
features may implement sparing Maintenance operations.
The host may issue a query command by setting Query Resources flag in the
Input Payload (CXL Spec 3.2 Table 8-120) to determine availability of
sparing resources for a given address. In response to a query request,
the device shall report the resource availability by producing the Memory
Sparing Event Record (CXL Spec 3.2 Table 8-60) in which the Channel, Rank,
Nibble Mask, Bank Group, Bank, Row, Column, Sub-Channel fields are a copy
of the values specified in the request.
During the execution of a sparing maintenance operation, a CXL memory
device:
- May or may not retain data
- May or may not be able to process CXL.mem requests correctly.
These CXL memory device capabilities are specified by restriction flags
in the memory sparing feature readable attributes.
When a CXL device identifies error on a memory component, the device
may inform the host about the need for a memory sparing maintenance
operation by using DRAM event record, where the 'maintenance needed' flag
may set. The event record contains some of the DPA, Channel, Rank,
Nibble Mask, Bank Group, Bank, Row, Column, Sub-Channel fields that
should be repaired. The userspace tool requests for maintenance operation
if the 'maintenance needed' flag set in the CXL DRAM error record.
CXL spec 3.2 section 8.2.10.7.2.3 describes the memory sparing feature
discovery and configuration.
CXL spec 3.2 section 8.2.10.7.1.4 describes the device's memory sparing
maintenance operation feature.
Add emulation for CXL memory device memory sparing control feature
and memory sparing maintenance operation command.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20250917143330.294698-4-Jonathan.Cameron@huawei.com>
This adds initial support for the Maintenance command, specifically
the soft and hard PPR operations on a dpa. The implementation allows
to be executed at runtime, therefore semantically, data is retained
and CXL.mem requests are correctly processed.
Keep track of the requests upon a general media or DRAM event.
Post Package Repair (PPR) maintenance operations may be supported by CXL
devices that implement CXL.mem protocol. A PPR maintenance operation
requests the CXL device to perform a repair operation on its media.
For example, a CXL device with DRAM components that support PPR features
may implement PPR Maintenance operations. DRAM components may support two
types of PPR, hard PPR (hPPR), for a permanent row repair, and Soft PPR
(sPPR), for a temporary row repair. Soft PPR is much faster than hPPR,
but the repair is lost with a power cycle.
CXL spec 3.2 section 8.2.10.7.1.2 describes the device's sPPR (soft PPR)
maintenance operation and section 8.2.10.7.1.3 describes the device's
hPPR (hard PPR) maintenance operation feature.
CXL spec 3.2 section 8.2.10.7.2.1 describes the sPPR feature discovery and
configuration.
CXL spec 3.2 section 8.2.10.7.2.2 describes the hPPR feature discovery and
configuration.
CXL spec 3.2 section 8.2.10.2.1.4 Table 8-60 describes the Memory Sparing
Event Record.
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Co-developed-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20250917143330.294698-3-Jonathan.Cameron@huawei.com>
CXL spec rev3.2 section 8.2.10.2.1.3 Table 8-59, memory module
event record has updated with following new fields.
1. Validity Flags
2. Component Identifier
3. Device Event Sub-Type
Add updates for the above spec changes in the CXL memory module
event reporting and QMP command to inject memory module event.
Updated all references for this command to the CXL r3.2
specification.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260205112350.60681-6-Jonathan.Cameron@huawei.com>
CXL spec rev3.2 section 8.2.10.2.1.2 Table 8-58, DRAM event record
has updated with following new fields.
1. Component Identifier
2. Sub-channel of the memory event location
3. Advanced Programmable Corrected Memory Error Threshold Event Flags
4. Corrected Volatile Memory Error Count at Event
5. Memory Event Sub-Type
Add updates for the above spec changes in the CXL DRAM event
reporting and QMP command to inject DRAM event.
In order to ensure consistency update all specification references
for this command to CXL r3.2.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260205112350.60681-5-Jonathan.Cameron@huawei.com>
CXL spec rev3.2 section 8.2.10.2.1.1 Table 8-57, general media event
table has updated with following new fields.
1. Advanced Programmable Corrected Memory Error Threshold Event Flags
2. Corrected Memory Error Count at Event
3. Memory Event Sub-Type
4. Support for component ID in the PLDM format.
Add updates for the above spec changes in the CXL general media event
reporting and QMP command to inject general media event.
In order to have one consistent source of references, update all to
references for this command to CXL r3.2.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260205112350.60681-4-Jonathan.Cameron@huawei.com>
CXL spec 3.2 section 8.2.9.2.1 Table 8-55, Common Event Record
format has updated with optional Maintenance Operation Subclass,
LD ID and ID of the device head information.
Add updates for the above optional parameters in the related
CXL events reporting and in the QMP commands to inject CXL events.
Update all related specification references to CXL r3.2 to ensure
one consistent source.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Ravi Jonnalagadda <ravis.opensrc@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260205112350.60681-3-Jonathan.Cameron@huawei.com>
virtio_reset() expects a VirtIODevice pointer, which
is what the single caller - virtio_bus_reset - passes.
Promote the opaque argument to a plain VirtIODevice.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260201223929.78790-1-philmd@linaro.org>
Media players needs meaningful latency_bytes update but it is
zero now most of the time. This adds stream-wise latency_bytes
calculation so that to improve the situation.
Signed-off-by: Yanfeng Liu <p-liuyanfeng9@xiaomi.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <tencent_66E8C146EA79CD00E966DEDAEF8CACD97D05@qq.com>
This commit removes TARGET_INSN_START_EXTRA_WORDS and force all arch to
call the same version of tcg_gen_insn_start, with additional 0 arguments
if needed. Since all arch have a single call site (in translate.c), this
is as good documentation as having a single define.
The notable exception is target/arm, which has two different translate
files for 32/64 bits. Since it's the only one, we accept to have two
call sites for this.
As well, we update parameter type to use uint64_t instead of
target_ulong, so it can be called from common code.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-id: 20260219040150.2098396-15-pierrick.bouvier@linaro.org
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In next commit, we'll apply same helper pattern for base helpers
remaining.
Our new helper pattern always include helper-*-common.h, which ends up
including include/tcg/tcg.h, which contains one occurrence of
CONFIG_USER_ONLY.
Thus, common files not being duplicated between system and target
relying on helpers will fail to compile. Existing occurrences are:
- target/arm/tcg/arith_helper.c
- target/arm/tcg/crypto_helper.c
This occurrence of CONFIG_USER_ONLY is for defining variable
tcg_use_softmmu, and we rely on dead code elimination with it in various
tcg-target.c.inc.
Thus, move its definition to tcg/tcg-internal.h, so helpers can be
included by common files. Also, change it to a define, as it has fixed
values for now.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-id: 20260219040150.2098396-8-pierrick.bouvier@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add documentation for vfio_device_get_region_info() and clarify the
expectations around its usage.
Cc: Alex Williamson <alex@shazbot.org>
Cc: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Link: https://lore.kernel.org/qemu-devel/20260210070155.1176081-8-vivek.kasireddy@intel.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
All three events are shared between precopy and postcopy, rather than
precopy specific.
For example, both precopy and postcopy will go through a SETUP process.
Meanwhile, both FAILED and DONE notifiers will be notified for either
precopy or postcopy on completions / failures.
Rename them to make them match what they do, and shorter.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20260126213614.3815900-6-peterx@redhat.com
[fixed-up entry in scsi-disk.c that got merged first]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Migration notifiers will notify at any of three places: (1) SETUP
phase, (2) migration completes, (3) migration fails.
There's actually a special case for spice: one can refer to
b82fc321bf ("Postcopy+spice: Pass spice migration data earlier"). It
doesn't need another 4th event because in commit 9d9babf78d ("migration:
MigrationEvent for notifiers") we merged it together with the DONE event.
The merge makes some sense if we treat "switchover" of postcopy as "DONE",
however that also means for postcopy we'll notify DONE twice.. The other
one at the end of postcopy when migration_cleanup().
In reality, the current code base will also notify FAILED for postcopy
twice. It's because an (maybe accidental) change in commit
4af667f87c ("migration: notifier error checking").
First of all, we still need that notification when switchover as stated in
Dave's commit, however that's only needed for spice. To fix it, introduce
POSTCOPY_START event to differenciate it from DONE. Use that instead in
postcopy_start(). Then spice will need to capture this event too.
Then we remove the extra FAILED notification in postcopy_start().
If one wonder if other DONE users should also monitor POSTCOPY_START
event.. We have two more DONE users:
- kvm_arm_gicv3_notifier
- cpr_exec_notifier
Both of them do not need a notification for POSTCOPY_START, but only when
migration completed. Actually, both of them are used in CPR, which doesn't
support postcopy.
When at this, update the notifier transition graph in the comment, and move
it from migration_add_notifier() to be closer to where the enum is defined.
I didn't attach Fixes: because I am not aware of any real bug on such
double reporting. I'm wildly guessing the 2nd notify might be silently
ignored in many cases. However this is still worth fixing.
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Dr. David Alan Gilbert <dave@treblig.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/qemu-devel/20260126213614.3815900-3-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
This change removes userland code that worked around a restriction
in the mshv driver in the 6.18 kernel: regions from userland
couldn't be mapped to multiple regions in the kernel. We maintained a
shadow mapping table in qemu and used a heuristic to swap in a requested
region in case of UNMAPPED_GPA exits.
However, this heuristic wasn't reliable in all cases, since HyperV
behaviour is not 100% reliable across versions. HyperV itself doesn't
prohibit to map regions at multiple places into the guest, so the
restriction has been removed in the mshv driver.
Hence we can remove the remapping code. Effectively this will mandate a
6.19 kernel, if the workload attempt to map e.g. BIOS to multiple
reagions. I still think it's the right call to remove this logic:
- The workaround only seems to work reliably with a certain revision
of HyperV as a nested hypervisor.
- We expect Direct Virtualization (L1VH) to be the main platform for
the mshv accelerator, which also requires a 6.19 kernel
This reverts commit efc4093358.
Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com>
Acked-by: Wei Liu (Microsoft) <wei.liu@kernel.org>
Tested-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260113153708.448968-1-magnuskulke@linux.microsoft.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add an option to inject timestamps into serial log file.
That simplifies debugging a lot, when you can simply compare
QEMU logs with guest console logs.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20260201173633.413934-4-vsementsov@yandex-team.ru>
To be reused in the following commit.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20260201173633.413934-3-vsementsov@yandex-team.ru>
Instead of checking, did backend set the filename state or not, let's
be stateless: filename is needed rarely, so, let's just have a generic
function (with optional implementation by backends) to get it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[ Marc-André - fix leak in ivshmem-pci.c ]
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20260115144606.233252-10-vsementsov@yandex-team.ru>
Currently we do two wrong things:
1. Abuse s->filename to get pty_name from it
2. Violate layering with help of CHARDEV_IS_PTY()
Let's get rid of both, and introduce correct way to get pty name in
generic code, if available.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20260115144606.233252-9-vsementsov@yandex-team.ru>
Add boolean return value to follow common recommendations for functions
with errrp in include/qapi/error.h
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20260115144606.233252-7-vsementsov@yandex-team.ru>
The logic around the parameter is rather tricky. Let's instead
explicitly send CHR_EVENT_OPENED in all backends where needed.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[ Marc-André - add CHR_EVENT_OPENED in udp_chr_open() ]
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20260115144606.233252-6-vsementsov@yandex-team.ru>
Most handlers have name prefixed with "chr_". That's a good practice
which helps to grep them. Convert the rest: .parse, .open,
get/set_msgfds.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20260115144606.233252-4-vsementsov@yandex-team.ru>
Since previous commit it is always 1. Let's just drop it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20260115144606.233252-3-vsementsov@yandex-team.ru>
For major distributions we have now:
Debian 13: 0.15.2
Ubuntu 22.04: 0.15.0
RHEL-9/CentOS Stream 9: SPICE is removed
Fedora 42: 0.15.1
OpenSUSE Leap 15.4: 0.15.0
Time to update the dependancy in QEMU and drop almost all
SPICE_SERVER_VERSION checks.
Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20260115144606.233252-2-vsementsov@yandex-team.ru>
Add silicon revision definitions for AST2700 A2, and include
them in the list of supported Aspeed silicon revisions.
This allows newer AST27x0 A2 silicon to be correctly identified via
the SCU silicon revision register.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20260211021527.119674-3-jamin_lin@aspeedtech.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Several legacy Aspeed SoC silicon revision definitions are no longer
used by any machine models or runtime logic.
Remove unused silicon revision macros and corresponding entries from
the silicon revision table to reduce dead code and improve
maintainability.
No functional change intended.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20260211021527.119674-2-jamin_lin@aspeedtech.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
According to the AST2700 A1 datasheet, the register space for each I2C
device instance has been expanded from 0x80 bytes to 0xA0 bytes.
Update the AST2700 I2C controller configuration to reflect the new
register layout by increasing the per-device register size to 0xA0
and adjusting the register gap size accordingly.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Fixes: 4f53de2f10 ("hw/arm/aspeed_ast27x0: Remove ast2700-a0 SOC")
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20260210024331.3984696-3-jamin_lin@aspeedtech.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The ASPEED I2C controller exposes a per-bus MMIO window of 0x80 bytes on
AST2600/AST1030/AST2700, but the backing regs[] array was sized for only
28 dwords (0x70 bytes). This allows guest reads in the range [0x70..0x7f]
to index past the end of regs[].
Fix this by:
- Sizing ASPEED_I2C_NEW_NUM_REG to match the 0x80-byte window
(0x80 >> 2 = 32 dwords).
- Avoiding an unconditional pre-read from regs[] in the legacy/new read
handlers. Initialize the return value to -1 and only read regs[] for
offsets that are explicitly handled/valid, leaving invalid offsets to
return -1 with a guest error log.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3290
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20260210024331.3984696-2-jamin_lin@aspeedtech.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
DIVIDE TO INTEGER computes floating point remainder and is used by
LuaJIT, so add it to QEMU.
Put the main logic into fpu/, because it is way more convenient to
operate on FloatParts than to convert floats back-and-forth.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20260210214044.1174699-5-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This resets non-architectural state to allow for reboots to succeed.
Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Code taken from HVF and adapted for WHPX use. Note that WHPX doesn't
have a default vs maximum IPA distinction.
Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Change terminology to match the KVM one, as APIC is x86-specific.
And move out whpx_irqchip_in_kernel() to make it usable from common
code even when not compiling with WHPX support.
Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
As of why: WHPX on arm64 doesn't have debug trap support as of today.
Keep the exception bitmap interface for now - despite that being entirely unavailable on arm64 too.
Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
winhvemulation is x86_64 only.
In the future, we might want to get rid of winhvemulation usage
entirely.
Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some code can be shared between x86_64 and arm64 WHPX. Do so as much as reasonable.
Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Switch to a design where we can share whpx code between x86 and AArch64 when it makes sense to do so.
Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Introduce a -M msi= argument to be able to control MSI-X support independently
from ITS, as part of supporting GICv3 + GICv2m platforms.
Remove vms->its as it's no longer needed after that change.
Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is the minimal change beginning with TARGET_ARCH in
configs/targets/or1k-* from openrisc to or1k, then adjust
TARGET_OR1K, QEMU_ARCH_OR1K, directory names,
and meson.build to match.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20260205030244.266447-2-richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Add a vmstate subsection to SCSIDiskState so that scsi-block devices can
transfer their reservation state during live migration. Upon loading the
subsection, the destination QEMU invokes the PERSISTENT RESERVE OUT
command's PREEMPT service action to atomically move the reservation from
the source I_T nexus to the destination I_T nexus. This results in
transparent live migration of SCSI reservations.
This approach is incomplete since SCSI reservations are cooperative and
other hosts could interfere. Neither the source QEMU nor the destination
QEMU are aware of changes made by other hosts. The assumption is that
reservation is not taken over by a third host without cooperation from
the source host.
I considered adding the vmstate subsection to SCSIDevice instead of
SCSIDiskState, since reservations are part of the SCSI Primary Commands
that other devices apart from disks could support. However, due to
fragility of migrating reservations, we will probably limit support to
scsi-block and maybe scsi-disk in the future. In the end, I think it
makes sense to place this within scsi-disk.c.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20260129212035.219676-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
SCSI Persistent Reservations are stateful and external to the guest. In
order to transparently move reservations to the destination host during
live migration, it is necessary to track the state built up on the
source host before migration. Only then can the destination host ensure
an equivalent state is restored upon migration.
Snoop on successful PERSISTENT RESERVE OUT commands and save the
reservation key and reservation type. This will allow registered keys
and reservations to be migrated.
Also patch PERSISTENT RESERVE IN replies with the REPORT CAPABILITIES
service action since features that involve the physical SCSI bus target
ports must not be exposed to the guest (it sees a virtual SCSI bus).
Usually this plays out as follows:
1. The guest invokes the REGISTER service action to register a
reservation key on its I_T nexus.
2. The guest invokes the RESERVE service action to create a reservation
using the previously-registered key.
This commit implements the snooping and stores the reservation key and
type (if any) for each LUN. The snooped PR state and the migrate_pr flag
to enable PR migration will be used in later commits.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20260129212035.219676-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Report the details of the SG_IO ioctl failure if an Error pointer is
provided. This information aids troubleshooting and will be used by the
SCSI Persistent Reservations migration code.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20260129212035.219676-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add a direction argument so that scsi_SG_IO() can be used for
SG_DXFER_FROM_DEV and SG_DXFER_TO_DEV transfers.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20260129212035.219676-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The VIRTIO_CONSOLE_F_EMERG_WRITE feature bit was only set
in the hw_compat_2_7[] array, via the 'emergency-write=off'
property. We removed all machines using that array, lets remove
that property. All instances have this feature bit set and
it can not be disabled. VirtIOSerial::host_features mask is
now unused, remove it.
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-28-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The VirtIOPCIProxy::ignore_backend_features boolean was only set
in the hw_compat_2_7[] array, via the 'x-ignore-backend-features=on'
property. We removed all machines using that array, lets remove
that property, simplify by only using the default version.
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-27-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The IntelIOMMUState::buggy_eim boolean was only set in
the hw_compat_2_7[] array, via the 'x-buggy-eim=true'
property. We removed all machines using that array, lets
remove that property, simplifying vtd_decide_config().
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-26-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The hw_compat_2_7[] array was only used by the pc-q35-2.7 and
pc-i440fx-2.7 machines, which got removed. Remove it.
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20260108033051.777361-25-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>