Add EGPRs in monitor_defs[] to allow HMP to access EGPRs.
For example,
(qemu) print $r16
Since monitor_defs[] is only used for read-only access, there is no need
to consider the xstate synchronization issues that modifying EGPRs might
cause (as gdbstub has to).
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211070942.3612547-7-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add i386-64bit-apx.xml from gdb to allow the QEMU gdbstub to parse APX
EGPRs, and implement the callbacks that let gdbstub access the guest's EGPRs.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211070942.3612547-5-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before expanding the number of elements in the CPUX86State.regs array,
first use VMSTATE_UINTTL_SUB_ARRAY for the regs' vmstate to avoid the
type_check_array failure.
VMSTATE_UINTTL_SUB_ARRAY will also be used for subsequently added elements
in regs array.
Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211070942.3612547-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The APX feature bit is CPUID_7_1_EDX[21], and APX has an EGPR component
with index 19 in the xstate area; the EGPR component holds 16 64-bit
registers. Add the EGPR component to the xstate area.
Note that APX re-uses the 128-byte XSAVE area previously allocated to
MPX, which has been deprecated on Intel processors. So check whether APX
and MPX are both set for the guest; if so, mask both of them off to avoid
a conflict in the xsave area.
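The conflict check described above can be sketched as follows. This is only an illustration, not the code from this patch: `GuestCpuid` and `resolve_apx_mpx_conflict()` are hypothetical names, and the MPX bit position (CPUID.(EAX=7,ECX=0):EBX[14]) is taken from the architecture manuals rather than from this message.

```rust
/// CPUID.(EAX=7,ECX=0):EBX[14] enumerates MPX (not stated in this message).
pub const CPUID_7_0_EBX_MPX: u32 = 1 << 14;
/// CPUID.(EAX=7,ECX=1):EDX[21] enumerates APX, as stated above.
pub const CPUID_7_1_EDX_APX: u32 = 1 << 21;

/// Hypothetical container for the two feature words involved.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct GuestCpuid {
    pub leaf7_0_ebx: u32,
    pub leaf7_1_edx: u32,
}

/// If the guest configuration enables both APX and MPX, clear both bits,
/// because the two features would claim the same bytes of the XSAVE area.
/// Any other combination is returned unchanged.
pub fn resolve_apx_mpx_conflict(mut cpuid: GuestCpuid) -> GuestCpuid {
    let has_mpx = (cpuid.leaf7_0_ebx & CPUID_7_0_EBX_MPX) != 0;
    let has_apx = (cpuid.leaf7_1_edx & CPUID_7_1_EDX_APX) != 0;
    if has_mpx && has_apx {
        cpuid.leaf7_0_ebx &= !CPUID_7_0_EBX_MPX;
        cpuid.leaf7_1_edx &= !CPUID_7_1_EDX_APX;
    }
    cpuid
}
```

Unrelated bits in either feature word are left alone; only the two conflicting bits are touched.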
Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211070942.3612547-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
So that it can be configured in TD guest.
Considering that the CET_U and CET_S bits are always the same in the
supported XFAM reported by the TDX module (i.e., either 00 or 11), only
one of them needs to be chosen.
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-23-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The checkpatch.pl always complains: "ERROR: space required after that
close brace '}'".
Fix this issue.
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-22-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add new versioned CPU models for Sapphire Rapids, Sierra Forest, Granite
Rapids and Clearwater Forest, to enable shadow stack and indirect branch
tracking.
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-21-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add SHSTK and IBT flags in feature words with entry/exit
control flags.
The CET SHSTK and IBT features are enumerated via CPUID(EAX=7,ECX=0)
ECX[bit 7] and EDX[bit 20]. Loading/restoring of CET states at vmentry/
vmexit is controlled by VMX_ENTRY_CTLS[bit 20] and VMX_EXIT_CTLS[bit 28].
Enable these flags so that KVM can enumerate the features properly.
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-20-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cet-u and cet-s are supervisor xstates. Their states are saved/loaded by
saving/loading the related CET MSRs, and "vmstate_cet" and
"vmstate_pl0_ssp" already exist to migrate these MSRs.
Thus, it's safe to mark them as migratable.
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-19-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Both FRED and CET-SHSTK need MSR_IA32_PL0_SSP, so add the vmstate for
this MSR.
When CET-SHSTK is not supported, MSR_IA32_PL0_SSP remains accessible, but
its value doesn't take effect. Therefore, treat this vmstate as a
subsection rather than a fix for the previous FRED vmstate.
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-17-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CET provides a new architectural register, shadow stack pointer (SSP),
which cannot be directly encoded as a source, destination or memory
operand in instructions. But Intel VMCS & VMCB provide fields to
save/load guest & host's ssp.
It's necessary to save & restore Guest's ssp before & after migration.
To support this, KVM implements Guest's SSP as a special KVM internal
register - KVM_REG_GUEST_SSP, and allows QEMU to save & load it via
KVM_GET_ONE_REG/KVM_SET_ONE_REG.
Cache KVM_REG_GUEST_SSP in X86CPUState.
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-16-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CET (architectural) MSRs include:
MSR_IA32_U_CET - user mode CET control bits.
MSR_IA32_S_CET - supervisor mode CET control bits.
MSR_IA32_PL{0,1,2,3}_SSP - linear addresses of SSPs for user/kernel modes.
MSR_IA32_INT_SSP_TAB - linear address of interrupt SSP table
Since FRED also needs to save/restore MSR_IA32_PL0_SSP, to avoid duplicate
operations, make FRED only save/restore MSR_IA32_PL0_SSP when CET-SHSTK
is not enumerated.
And since MSR_IA32_SSP_TBL_ADDR is only present on 64-bit processors,
wrap it with the TARGET_X86_64 macro.
For other MSRs, add save/restore support directly.
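The rule for who saves/restores MSR_IA32_PL0_SSP can be pictured as below. This is a minimal sketch under assumptions: `Pl0SspOwner` and `pl0_ssp_owner()` are hypothetical names used only for illustration and do not appear in the patch.

```rust
/// Hypothetical tag for which code path saves/restores MSR_IA32_PL0_SSP.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Pl0SspOwner {
    /// Handled together with the other CET MSRs.
    Cet,
    /// Handled by the FRED path, only when CET-SHSTK is not enumerated.
    Fred,
    /// Neither feature is enumerated, so the MSR is not handled.
    Nobody,
}

/// Decide which path handles the MSR so it is never saved/restored twice.
pub fn pl0_ssp_owner(has_shstk: bool, has_fred: bool) -> Pl0SspOwner {
    if has_shstk {
        Pl0SspOwner::Cet
    } else if has_fred {
        Pl0SspOwner::Fred
    } else {
        Pl0SspOwner::Nobody
    }
}
```

With both features enumerated the CET path wins, which is what keeps the FRED path from duplicating the operation.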
Tested-by: Farrah Chen <farrah.chen@intel.com>
Suggested-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-15-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Both FRED and CET shadow stack define the MSR MSR_IA32_PL0_SSP (aka
MSR_IA32_FRED_SSP0 in FRED spec).
MSR_IA32_PL0_SSP is a FRED SSP MSR, so that if a processor doesn't
support CET shadow stack, FRED transitions won't use MSR_IA32_PL0_SSP,
but this MSR would still be accessible using MSR-access instructions
(e.g., RDMSR, WRMSR).
Therefore, save/restore SSP0 MSR for FRED.
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-14-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The CR4.CET bit (bit 23) is the master enable for CET.
Check and adjust CR4.CET bit based on CET CPUIDs.
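One way to picture the adjustment is the sketch below. It assumes the bit is simply cleared when neither CET feature is enumerated; the actual patch may express this differently (for example through a reserved-bits mask). `adjust_cr4_cet()` is a hypothetical helper name.

```rust
/// CR4.CET is bit 23, as stated above.
pub const CR4_CET_MASK: u64 = 1 << 23;

/// Keep CR4.CET only if at least one CET feature (SHSTK or IBT) is
/// enumerated in CPUID; otherwise clear it. All other bits pass through.
pub fn adjust_cr4_cet(cr4: u64, has_shstk: bool, has_ibt: bool) -> u64 {
    if has_shstk || has_ibt {
        cr4
    } else {
        cr4 & !CR4_CET_MASK
    }
}
```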
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-13-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add CET_U/S bits in xstate area and report support in xstate
feature mask.
MSR_XSS[bit 11] corresponds to CET user mode states.
MSR_XSS[bit 12] corresponds to CET supervisor mode states.
CET Shadow Stack(SHSTK) and Indirect Branch Tracking(IBT) features
are enumerated via CPUID.(EAX=07H,ECX=0H):ECX[7] and EDX[20]
respectively, two features share the same state bits in XSS, so
if either of the features is enabled, set CET_U and CET_S bits
together.
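The "either feature sets both bits" rule above can be sketched as follows. The constant and function names are illustrative assumptions; only the bit positions come from this message.

```rust
/// CPUID.(EAX=07H,ECX=0H):ECX[7] enumerates CET shadow stack.
pub const CPUID_7_0_ECX_CET_SHSTK: u32 = 1 << 7;
/// CPUID.(EAX=07H,ECX=0H):EDX[20] enumerates CET indirect branch tracking.
pub const CPUID_7_0_EDX_CET_IBT: u32 = 1 << 20;
/// MSR_XSS[11]: CET user mode states.
pub const XSTATE_CET_U_MASK: u64 = 1 << 11;
/// MSR_XSS[12]: CET supervisor mode states.
pub const XSTATE_CET_S_MASK: u64 = 1 << 12;

/// Return the CET bits to report in the XSS feature mask: both bits if
/// either SHSTK or IBT is enumerated, none otherwise.
pub fn cet_xss_bits(leaf7_ecx: u32, leaf7_edx: u32) -> u64 {
    let shstk = (leaf7_ecx & CPUID_7_0_ECX_CET_SHSTK) != 0;
    let ibt = (leaf7_edx & CPUID_7_0_EDX_CET_IBT) != 0;
    if shstk || ibt {
        XSTATE_CET_U_MASK | XSTATE_CET_S_MASK
    } else {
        0
    }
}
```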
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-12-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xtile-cfg & xtile-data are both user xstates. Their xstates are cached
in X86CPUState, and there's a related vmsd "vmstate_amx_xtile", so that
it's safe to mark them as migratable.
Arch lbr xstate is a supervisor xstate, and it is saved & loaded by saving
& loading the related arch lbr MSRs, which are cached in X86CPUState, and
there's a related vmsd "vmstate_arch_lbr". So it should be migratable.
PT is still unmigratable since KVM disabled it and there's no vmsd and
no other emulation/simulation support.
Note that although the migratable_flags get fixed,
x86_cpu_enable_xsave_components() still overrides the supported xstates
bitmaps regardless of the masking of migratable_flags. This is a separate
issue, and will be fixed in a follow-up refactoring.
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-11-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Arch lbr is a supervisor xstate, but its area is not covered in
x86_cpu_init_xsave().
Fix it by checking supported xss bitmap.
In addition, drop the (uint64_t) type casts for supported_xcr0 since
x86_cpu_get_supported_feature_word() returns uint64_t so that the cast
is not needed. Then ensure line length is within 90 characters.
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-10-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since CPUID_7_0_EDX_ARCH_LBR will be masked off if pmu is disabled,
there's no need to check CPUID_7_0_EDX_ARCH_LBR feature with pmu.
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-9-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The arch lbr state has 2 dependencies:
* Arch lbr feature bit (CPUID 0x7.0x0:EDX[bit 19]):
This bit also depends on the pmu property. Mask it off if pmu is disabled
in x86_cpu_expand_features(), so that there is no need to repeatedly
check both that this bit is set and that pmu is enabled.
Note this doesn't need a compat option, since even KVM doesn't support
arch lbr yet.
The supported xstate is constructed based on this dependency in
cpuid_has_xsave_feature(), so if pmu is disabled and arch lbr bit is
masked off, then arch lbr state won't be included in supported
xstates.
Thus it's safe to drop the check on arch lbr bit in CPUID 0xD
encoding.
* XSAVES feature bit (CPUID 0xD.0x1.EAX[bit 3]):
Arch lbr state is a supervisor state, which requires the XSAVES
feature support. Enumerate supported supervisor state based on XSAVES
feature bit in x86_cpu_enable_xsave_components().
Then it's safe to drop the check on XSAVES feature support during
CPUID 0XD encoding.
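The XSAVES dependency can be sketched as a single gate applied when the supported supervisor states are computed, so later CPUID 0xD encoding does not need to re-check it. This is an illustration under assumptions: `supported_supervisor_states()` is a hypothetical helper, not the body of x86_cpu_enable_xsave_components().

```rust
/// CPUID 0xD.0x1.EAX[bit 3] enumerates XSAVES, as stated above.
pub const CPUID_D_1_EAX_XSAVES: u32 = 1 << 3;
/// Arch LBR is xstate component 15.
pub const XSTATE_ARCH_LBR_MASK: u64 = 1 << 15;

/// Supervisor states are only usable with XSAVES, so report none of the
/// candidate supervisor components when XSAVES is not enumerated.
pub fn supported_supervisor_states(candidate_xss: u64, leaf_d_1_eax: u32) -> u64 {
    if (leaf_d_1_eax & CPUID_D_1_EAX_XSAVES) != 0 {
        candidate_xss
    } else {
        0
    }
}
```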
Suggested-by: Zide Chen <zide.chen@intel.com>
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-8-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The x86_ext_save_areas[] is expected to be well initialized by
accelerators and its xstate detail information cannot be changed by
user. So use x86_ext_save_areas[] to encode CPUID.0XD subleaves directly
without other hardcoding & masking.
And for arch LBR, KVM fills its xstate in x86_ext_save_areas[] via
host_cpuid(). The info obtained this way matches what would be retrieved
from x86_cpu_get_supported_cpuid() (since KVM just fills CPUID with the
host xstate info directly anyway). So just use the initialized
x86_ext_save_areas[] instead of calling x86_cpu_get_supported_cpuid().
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-7-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With feature array in ExtSaveArea, add avx10 as the second dependency
for Opmask/ZMM_Hi256/Hi16_ZMM xsave components, and drop the special
check in cpuid_has_xsave_feature().
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-6-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some XSAVE components depend on multiple features. For example, Opmask/
ZMM_Hi256/Hi16_ZMM depend on avx512f OR avx10, and for CET (which will
be supported later), cet_u/cet_s will depend on shstk OR ibt.
Although previously there's the special check for the dependencies of
AVX512F OR AVX10 on their respective XSAVE components (in
cpuid_has_xsave_feature()), to make the code more general and avoid
adding more special cases, make ExtSaveArea store a features array
instead of a single feature, so that it can describe multiple
dependencies.
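The idea of a features array with OR semantics can be sketched like this. The layout is an assumption made for illustration: `FeatureBit`, the word indices and `area_is_supported()` are hypothetical, and the real ExtSaveArea carries more fields.

```rust
/// Hypothetical (feature word index, bit mask) pair naming one dependency.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FeatureBit {
    pub word: usize,
    pub mask: u64,
}

/// Simplified stand-in for ExtSaveArea: only the dependency list is kept.
pub struct ExtSaveArea {
    pub features: &'static [FeatureBit],
}

/// Illustrative entry for a component that depends on "avx512f OR avx10".
/// Word 0 / word 1 are made-up indices into the caller's feature words.
pub static OPMASK_AREA: ExtSaveArea = ExtSaveArea {
    features: &[
        FeatureBit { word: 0, mask: 1 << 16 }, // stands in for avx512f
        FeatureBit { word: 1, mask: 1 << 19 }, // stands in for avx10
    ],
};

/// A component is supported if ANY of its listed features is enabled.
pub fn area_is_supported(area: &ExtSaveArea, feature_words: &[u64]) -> bool {
    area.features.iter().any(|f| {
        feature_words
            .get(f.word)
            .map_or(false, |w| (*w & f.mask) != 0)
    })
}
```

Adding a new multi-dependency component (such as cet_u/cet_s depending on shstk OR ibt) then only means adding entries to the array, with no new special case in the lookup.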
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-5-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- Move ARCH_LBR_NR_ENTRIES macro and LBREntry definition before XSAVE
areas definitions.
- Reorder XSavesArchLBR (area 15) between XSavePKRU (area 9) and
XSaveXTILECFG (area 17), and reorder the related QEMU_BUILD_BUG_ON
check to keep the same ordering.
This keeps the xsave structures organized together and makes them
clearer.
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-4-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Arch LBR state is area 15, not 19. Fix this comment. And since other
areas don't mention user or supervisor state, remove "Supervisor mode"
from its comment for a consistent style.
Moreover, rename XSavesArchLBR to XSaveArchLBR since there's no need to
emphasize XSAVES in naming; the XSAVE related structure is mainly
used to represent memory layout.
In addition, arch lbr specifies the offset of its xsave component as 0,
but this has no effect. The offset of ExtSaveArea is initialized
by accelerators (e.g., hvf_cpu_xsave_init(), kvm_cpu_xsave_init() and
x86_tcg_cpu_xsave_init()), so explicitly setting the offset doesn't
work, and CPUID 0xD encoding already ensures supervisor states won't
have non-zero offsets. Drop the offset initialization and its comment
from the xsave area of arch lbr.
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The indentation style in `x86_ext_save_areas[]` is extremely
inconsistent. Clean it up to ensure a uniform style.
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-M is used heavily in documentation and scripts, but isn't actually
documented anywhere.
Document it as equivalent to -machine, also moving to -machine the sole
suboption that was documented under -M.
Reported-by: Julian Andres Klode <jak@jak-linux.org>
Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Link: https://lore.kernel.org/r/20251203131511.153460-1-dave@treblig.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Implement DTrace/SystemTap SDT by emitting the following:
- The probe crate's probe!() macro is used to emit a DTrace/SystemTap
SDT probe.
- Every trace event gets a corresponding trace_<name>_enabled() -> bool
generated function that Rust code can use to avoid expensive
computation when a trace event is disabled. This API works for other
trace backends too.
`#[allow(dead_code)]` additions are necessary for QEMU's dstate in
generated trace-<dir>.rs files since they are unused by the dtrace
backend. `./configure --enable-trace-backends=` can enable multiple
backends, so keep it simple and just silence the warning instead of
trying to detect the condition when generating the dstate code can be
skipped.
The tracetool tests are updated. Take a look at
tests/tracetool/dtrace.rs to see what the new generated code looks like.
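How Rust code might use a trace_<name>_enabled() function to skip expensive work is sketched below. Everything here is a hypothetical stand-in: the event name `demo_event`, the AtomicBool state and the function bodies are not the generated code (see tests/tracetool/dtrace.rs for that); only the guard pattern is the point.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical per-event state; real generated code has its own mechanism.
static TRACE_DEMO_EVENT_DSTATE: AtomicBool = AtomicBool::new(false);

/// Test hook for this sketch only: switch the hypothetical event on or off.
pub fn set_trace_demo_event(enabled: bool) {
    TRACE_DEMO_EVENT_DSTATE.store(enabled, Ordering::Relaxed);
}

/// Stand-in for a generated trace_<name>_enabled() -> bool function.
pub fn trace_demo_event_enabled() -> bool {
    TRACE_DEMO_EVENT_DSTATE.load(Ordering::Relaxed)
}

/// Stand-in for a generated trace function. A real backend would emit the
/// event here; this sketch just returns the argument length.
pub fn trace_demo_event(summary: &str) -> usize {
    summary.len()
}

/// Deliberately "expensive" formatting that should be skipped when the
/// event is disabled.
pub fn expensive_summary(data: &[u8]) -> String {
    data.iter()
        .map(|b| format!("{:02x}", b))
        .collect::<Vec<_>>()
        .join(" ")
}

/// Caller-side pattern: compute the argument only if the event is enabled.
pub fn process(data: &[u8]) -> Option<usize> {
    if trace_demo_event_enabled() {
        Some(trace_demo_event(&expensive_summary(data)))
    } else {
        None
    }
}
```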
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20251119205200.173170-5-stefanha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The probe crate (https://crates.io/crates/probe) provides a probe!()
macro that defines SystemTap SDT probes on Linux hosts or does nothing
on other host OSes.
This crate will be used to implement DTrace support for Rust.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20251119205200.173170-4-stefanha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before using Mutex<> to protect HPETRegisters, it's necessary to apply
Migratable<> wrapper and ToMigrationState first since there's no
pre-defined VMState for Mutex<>.
In addition, this allows moving data from HPETRegisters' vmstate
to HPETTimer's, so as to preserve the original migration format of the C
implementation. To do that, HPETTimer is wrapped with Migratable<>
as well, but its implementation of ToMigrationStateShared is
hand-written.
Note that even though the HPETRegistersMigration struct is
generated by ToMigrationState macro, its VMState still needs to be
implemented by hand.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-21-zhao1.liu@intel.com
[Added HPETTimer implementation and restored compatible migration format. - Paolo]
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Timer is a complex struct. Allow adding it to a struct that
uses #[derive(ToMigrationState)]; as with vmstate_timer, only
the expiration time has to be preserved.
In fact, because it is thread-safe, ToMigrationStateShared can
also be implemented without needing a cell or mutex that wraps
the timer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are needed to implement ToMigrationStateShared for timers,
and thus allow including them in Migratable<> structs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Likewise, do not separate hpet_offset from the other registers.
However, because it is migrated in a subsection it is necessary
to copy it out of HPETRegisters and into a BqlCell<>.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
HPETTimer now has all of its state stored in HPETRegisters, so it does not
need its own BqlRefCell anymore.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do not separate visible and hidden state; both of them are used in the
same circumstances and it's easiest to place both of them under the
same BqlRefCell.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Lockless IO requires locking the registers during MMIO access. So it's
necessary to get (or borrow) the register data at the top level, and not
to borrow again in child function calls.
Change the argument types from BqlRefCell<HPETRegisters> to
&HPETRegisters/&mut HPETRegisters in child methods, and do borrow the
data once at top level.
This allows BqlRefCell<HPETRegisters> to be directly replaced with
Mutex<HPETRegisters> in subsequent steps without causing lock reentrancy
issues.
Note that references are passed instead of BqlRef/BqlRefMut because
BqlRefMut cannot be re-borrowed as BqlRef, even though BqlRef/BqlRefMut
themselves act as the "guard". Passing a reference is more direct, and the
extra bql::is_locked check could help to consolidate the safety guarantee.
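The borrow-once pattern can be sketched with std types. This is only an analogy: std::cell::RefCell stands in for BqlRefCell, and `Regs`, `Device` and their methods are hypothetical names, not the HPET code.

```rust
use std::cell::RefCell;

/// Hypothetical register block, standing in for HPETRegisters.
#[derive(Default)]
pub struct Regs {
    pub config: u64,
    pub counter: u64,
}

/// Hypothetical device; RefCell stands in for BqlRefCell here.
pub struct Device {
    regs: RefCell<Regs>,
}

impl Device {
    pub fn new() -> Self {
        Device { regs: RefCell::new(Regs::default()) }
    }

    /// Top-level MMIO entry point: the only place that borrows the cell.
    /// The guard lives for the whole access and children get plain
    /// references, so no child can trigger a second borrow.
    pub fn write(&self, value: u64) -> u64 {
        let mut regs = self.regs.borrow_mut();
        Self::set_config(&mut regs, value);
        Self::read_counter(&regs)
    }

    /// Read back the config register (borrows on its own; not called
    /// while `write` holds the guard).
    pub fn config(&self) -> u64 {
        self.regs.borrow().config
    }

    /// Child helper taking `&mut Regs` instead of the cell.
    fn set_config(regs: &mut Regs, value: u64) {
        regs.config = value;
        regs.counter = regs.counter.wrapping_add(1);
    }

    /// Child helper taking `&Regs`; a `&mut` guard re-borrows as shared.
    fn read_counter(regs: &Regs) -> u64 {
        regs.counter
    }
}
```

Because the children never see the cell, swapping RefCell for a Mutex in this sketch would only change `new()` and the first line of `write()`, which mirrors the motivation given above.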
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-19-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Lockless IO requires holding a single lock during MMIO access, so it's
necessary to maintain timer N's registers (HPETTimerRegisters) together
with the global registers in one place.
Therefore, move HPETTimerRegisters to HPETRegisters from HPETTimer, and
access timer registers from HPETRegisters struct for the whole HPET
code.
This changes HPETTimer and HPETRegisters, and the layout of VMState has
changed, which makes it incompatible to migrate with previous versions.
Thus, bump up the version IDs in VMStates of HPETState and HPETTimer.
The VMState version IDs of HPETRegisters don't need to change since
it's a newly added struct and its version IDs don't affect the
compatibility of HPETState's VMState.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-18-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, in HPETTimer context, the global registers are accessed by
dereferencing a HPETState raw pointer stored in NonNull<>, and then
borrowing the BqlRefCell<>.
This blocks borrowing HPETRegisters once during MMIO access, and
furthermore prevents replacing BqlRefCell<> with Mutex<>.
Therefore, do not access global registers through NonNull<HPETState>
and instead pass &BqlRefCell<HPETRegisters> as an argument in
function calls within MMIO access.
There is one special case, the timer handler, which still needs
to access HPETRegisters through NonNull<HPETState>. That's okay for now
since this case doesn't have any repeated borrow() or lock reentrancy
issues.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-17-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Explicitly initialize more fields which are complex structures.
Simple types (bool/u32/usize) can be omitted since C has
already initialized memory to all zeros and this is a valid
initialization for those simple types.
Previously such complex fields (InterruptSource/BqlCell/BqlRefCell) were
not explicitly initialized in init() and it's fine, because simply
setting all memory to zero aligns with their default initialization
behavior. However, this behavior is not robust. When adding new complex
struct or modifying the initial values of existing structs, this default
behavior can easily be broken.
Thus, do explicit initialization for HPET to become a good example.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-16-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Timers in post_load() access the same HPETState, which is the "self"
HPETState.
So there's no need to access HPETState from child HPETTimer again and
again. Instead, just cache and borrow HPETState.regs at the beginning,
and this could save some CPU cycles and reduce borrow() calls.
It's safe, because post_load() is called with BQL protection, so that
there's no other chance to modify the regs.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-15-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Implement helper accessors as methods of HPETRegisters. Then
HPETRegisters can be accessed without going through HPETState.
In subsequent refactoring, coarser-grained BQL lock protection will be
implemented. Specifically, BqlRefCell<HPETRegisters> will be borrowed
only once during MMIO accesses, and the scope of borrowed `regs` will
be extended to cover the entire MMIO access. Consequently, repeated
borrow() attempts within function calls will no longer be allowed.
Therefore, refactor the accessors of HPETRegisters to bypass HPETState,
which helps to reduce borrow() calls in deep function calls.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-14-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Place all HPET (global) timer block registers in a HPETRegisters struct,
and wrap the whole register struct with a BqlRefCell<>.
This allows elevating the BQL check from individual register access to
register structure access, making the BQL check more coarse-grained. But
in the current step, just treat BqlRefCell as BqlCell while maintaining
fine-grained BQL protection. This approach makes it possible to use the
HPETRegisters struct cleanly without introducing "already borrowed"
errors around BqlRefCell.
The HPETRegisters struct makes it possible to use a Mutex<> in place of
BqlRefCell<>, as the C side did.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-13-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Implement helper accessors as methods of HPETTimerRegisters. Then
HPETTimerRegisters can be accessed without going through HPETTimer or
HPETState.
In subsequent refactoring, HPETTimerRegisters will be maintained at the
HPETState level. However, accessing it through HPETState requires the
lock (lock BQL or mutex), which would cause troublesome nested locks or
reentrancy issues.
Therefore, refactor the accessors of HPETTimerRegisters to bypass
HPETTimer or HPETState.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-12-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Place all timer N's registers in a HPETTimerRegisters struct.
This allows all Timer N registers to be grouped together with global
registers and managed using a single lock (BqlRefCell or Mutex) in
future. And this makes it easier to apply ToMigrationState macro.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-11-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
HPETAddrDecode has a `reg` field, so there are many variables named
"reg" in MMIO read/write/decode functions.
In the future, there'll be other HPETRegisters/HPETTimerRegisters
structs containing values of HPET registers, and related variables or
arguments will be named as "regs".
To avoid potential confusion between many "reg" and "regs", rename
HPETAddrDecode::reg to HPETAddrDecode::target, and rename decoding
related variables from "reg" to "target".
"target" is picked as the name since this clearly reflects the field or
variable is the target decoded register.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-10-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
HPETRegister represents the layout of register spaces of HPET timer
block and timer N, and is used to decode register address into register
enumeration.
To avoid confusion with the subsequently introduced HPETRegisters (that
is used to maintain values of HPET registers), rename HPETRegister to
DecodedRegister.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-9-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>