riscv/qemu - qemu

Commit Graph

Author	SHA1	Message	Date
Paolo Bonzini	7f548b8f23	include: reorganize memory API headers Move RAMBlock functions out of ram_addr.h and cpu-common.h; move memory API headers out of include/exec and into include/system. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	4 months ago
Paolo Bonzini	1942b61b74	include: move hw/boards.h to hw/core/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	4 months ago
Chuang Xu	e8a6d158db	migration: merge fragmented clear_dirty ioctls In our long-term experience in Bytedance, we've found that under the same load, live migration of larger VMs with more devices is often more difficult to converge (requiring a larger downtime limit). Through some testing and calculations, we conclude that bitmap sync time affects the calculation of live migration bandwidth. When the addresses processed are not aligned, a large number of clear_dirty ioctls occur (e.g. a 4MB misaligned memory can generate 2048 clear_dirty ioctls from two different memory_listener), which increases the time required for bitmap_sync and makes it more difficult for dirty pages to converge. For a 64C256G vm with 8 vhost-user-net(32 queue per nic) and 16 vhost-user-blk(4 queue per blk), the sync time is as high as 73ms (tested with 10GBps dirty rate, the sync time increases as the dirty page rate increases). Here is each part of the sync time: - sync from kvm to ram_list: 2.5ms - vhost_log_sync: 3ms - sync aligned memory from ram_list to RAMBlock: 5ms - sync misaligned memory from ram_list to RAMBlock: 61ms After merging those fragmented clear_dirty ioctls, syncing misaligned memory from ram_list to RAMBlock takes only about 1ms, and the total sync time is only 12ms. Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20251218114220.83354-1-xuchuangxclwt@bytedance.com [peterx: drop var "offset" in physical_memory_sync_dirty_bitmap] Signed-off-by: Peter Xu <peterx@redhat.com>	4 months ago
Fabiano Rosas	1a739d3012	migration: Do away with usage of QERR_INVALID_PARAMETER_VALUE The QERR_INVALID_PARAMETER_VALUE macro is documented as not to be used in new code. Remove the usage from migration/options.c. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20251215220041.12657-12-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	4 months ago
Fabiano Rosas	6ab968d5e9	migration: Run a post update routine after setting parameters Some migration parameters are updated immediately once they are set via migrate-set-parameters. Move that work outside of migrate_params_apply() and leave that function with the single responsibility of setting s->parameters and not doing any side-effects. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20251215220041.12657-9-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	4 months ago
Peter Xu	d4fd83c9b5	migration: Replace migrate_set_error() with migrate_error_propagate() migrate_set_error() currently doesn't take ownership of the error being passed in. It's not aligned with the error API and meanwhile it also makes most of the caller free the error explicitly. Change the API to take the ownership of the Error object instead. This should save a lot of error_copy() invocations. Reviewed-by: Markus Armbruster <armbru@redhat.com> Link: https://lore.kernel.org/r/20251201194510.1121221-8-peterx@redhat.com [peterx: break line for qemu_savevm_send_packaged, per markus] Signed-off-by: Peter Xu <peterx@redhat.com>	4 months ago
Pawel Zmarzly	0b510b51b6	migration: Fix writing mapped_ram + ignore_shared snapshots Currently if you set these flags and have any shared memory object, saving a snapshot will fail with: Failed to write bitmap to file: Unable to write to file: Bad address We need to skip writing RAMBlocks that are backed by shared objects. Also, we should mark these RAMBlocks as skipped, so the snapshot format stays readable to tools that later don't know QEMU's command line (for example scripts/analyze-migration.py). I used bitmap_offset=0 pages_offset=0 for this. This minor change to snapshot format should be safe, as offset=0 should not have ever been possible. Signed-off-by: Pawel Zmarzly <pzmarzly0@gmail.com> Link: https://lore.kernel.org/r/20251126154734.940066-1-pzmarzly0@gmail.com Signed-off-by: Peter Xu <peterx@redhat.com>	4 months ago
Pawel Zmarzly	b043f6df27	migration: fix parsing snapshots with x-ignore-shared flag Snapshots made with mapped-ram and x-ignore-shared flags are not parsed properly. The ignore-shared feature adds an extra field in the stream, which needs to be consumed on the destination side. Even though mapped-ram has a fixed header format, the ignore-shared is part of the "generic" stream information so the mapped-ram code is currently skipping that be64 read which incorrectly offsets every subsequent read from the stream. The current ignore-shared handling can simply be moved earlier in the code to encompass mapped-ram as well since the ignore-shared doubleword is the first one read when parsing the ramblock section of the stream. Co-authored-by: Peter Xu <peterx@redhat.com> Signed-off-by: Pawel Zmarzly <pzmarzly0@gmail.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20251126121233.542473-1-pzmarzly0@gmail.com [peterx: enhance commit log per fabiano] Signed-off-by: Peter Xu <peterx@redhat.com>	4 months ago
Marco Cavenati	0ecd285824	migration: mapped-ram: handle zero pages Make mapped-ram compatible with loadvm snapshot restoring by explicitly zeroing memory pages in this case. Skip zeroing for -incoming and -loadvm migrations to preserve performance. Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr> Link: https://lore.kernel.org/r/20251010115954.1995298-3-Marco.Cavenati@eurecom.fr Signed-off-by: Peter Xu <peterx@redhat.com>	6 months ago
Marco Cavenati	e5423828d6	migration/ram: fix docs of ram_handle_zero Remove outdated 'ch' parameter from the function documentation. Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20251001161823.2032399-3-Marco.Cavenati@eurecom.fr Signed-off-by: Peter Xu <peterx@redhat.com>	6 months ago
Philippe Mathieu-Daudé	4db362f68c	system/physmem: Extract API out of 'system/ram_addr.h' header Very few files use the Physical Memory API. Declare its methods in their own header: "system/physmem.h". Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Message-Id: <20251001175448.18933-19-philmd@linaro.org>	6 months ago
Philippe Mathieu-Daudé	aa60bdb700	system/physmem: Drop 'cpu_' prefix in Physical Memory API The functions related to the Physical Memory API declared in "system/ram_addr.h" do not operate on vCPU. Remove the 'cpu_' prefix. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Message-Id: <20251001175448.18933-18-philmd@linaro.org>	6 months ago
Philippe Mathieu-Daudé	8bf3a88308	system/physmem: Reduce cpu_physical_memory_sync_dirty_bitmap() scope cpu_physical_memory_sync_dirty_bitmap() is now only called within system/physmem.c, by ramblock_sync_dirty_bitmap(). Reduce its scope by making it internal to this file. Since it doesn't involve any CPU, remove the 'cpu_' prefix. Remove the now unneeded "qemu/rcu.h" and "system/memory.h" headers. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20251001175448.18933-17-philmd@linaro.org>	6 months ago
Philippe Mathieu-Daudé	34f9b0ad08	system/ramblock: Move ram_block_is_pmem() declaration Move ramblock_is_pmem() along with the RAM Block API exposed by the "system/ramblock.h" header. Rename as ram_block_is_pmem() to keep API prefix consistency. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: Peter Xu <peterx@redhat.com> Message-Id: <20251002032812.26069-3-philmd@linaro.org>	6 months ago
Steve Sistare	a3eae205c6	migration: cpr-exec mode Add the cpr-exec migration mode. Usage: qemu-system-$arch -machine aux-ram-share=on ... migrate_set_parameter mode cpr-exec migrate_set_parameter cpr-exec-command \ <arg1> <arg2> ... -incoming <uri-1> \ migrate -d <uri-1> The migrate command stops the VM, saves state to uri-1, directly exec's a new version of QEMU on the same host, replacing the original process while retaining its PID, and loads state from uri-1. Guest RAM is preserved in place, albeit with new virtual addresses. The new QEMU process is started by exec'ing the command specified by the @cpr-exec-command parameter. The first word of the command is the binary, and the remaining words are its arguments. The command may be a direct invocation of new QEMU, or may be a non-QEMU command that exec's the new QEMU binary. This mode creates a second migration channel that is not visible to the user. At the start of migration, old QEMU saves CPR state to the second channel, and at the end of migration, it tells the main loop to call cpr_exec. New QEMU loads CPR state early, before objects are created. Because old QEMU terminates when new QEMU starts, one cannot stream data between the two, so uri-1 must be a type, such as a file, that accepts all data before old QEMU exits. Otherwise, old QEMU may quietly block writing to the channel. Memory-backend objects must have the share=on attribute, but memory-backend-epc is not supported. The VM must be started with the '-machine aux-ram-share=on' option, which allows anonymous memory to be transferred in place to the new process. The memfds are kept open across exec by clearing the close-on-exec flag, their values are saved in CPR state, and they are mmap'd in new QEMU. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Acked-by: Markus Armbruster <armbru@redhat.com> Link: https://lore.kernel.org/r/1759332851-370353-7-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	6 months ago
Arun Menon	d865e4aabd	migration: push Error **errp into loadvm_process_enable_colo() This is an incremental step in converting vmstate loading code to report error via Error objects instead of directly printing it to console/monitor. It is ensured that loadvm_process_enable_colo() must report an error in errp, in case of failure. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Arun Menon <armenon@redhat.com> Tested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-21-36f11a6fb9d3@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	7 months ago
Arun Menon	d9d7c8d813	migration: Return -1 on memory allocation failure in ram.c The function colo_init_ram_cache() currently returns -errno if qemu_anon_ram_alloc() fails. However, the subsequent cleanup loop that calls qemu_anon_ram_free() could potentially alter the value of errno. This would cause the function to return a value that does not accurately represent the original allocation failure. This commit changes the return value to -1 on memory allocation failure. This ensures that the return value is consistent and is not affected by any errno changes that may occur during the free process. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Arun Menon <armenon@redhat.com> Tested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-20-36f11a6fb9d3@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	7 months ago
Arun Menon	44cdbaa98e	migration: push Error **errp into ram_postcopy_incoming_init() This is an incremental step in converting vmstate loading code to report error via Error objects instead of directly printing it to console/monitor. It is ensured that ram_postcopy_incoming_init() must report an error in errp, in case of failure. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Arun Menon <armenon@redhat.com> Tested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-14-36f11a6fb9d3@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	7 months ago
Peter Xu	adb13d6e42	migration/postcopy: Avoid clearing dirty bitmap for postcopy too This is a follow up on the other commit "migration/ram: avoid to do log clear in the last round" but for postcopy. https://lore.kernel.org/r/20250514115827.3216082-1-yanfei.xu@bytedance.com I can observe more than 10% reduction of average page fault latency during postcopy phase with this optimization: Before: 268.00us (+-1.87%) After: 232.67us (+-2.01%) The test was done with a 16GB VM with 80 vCPUs, running a workload that busy random writes to 13GB memory. Cc: Yanfei Xu <yanfei.xu@bytedance.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20250613140801.474264-12-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	10 months ago
Peter Xu	f1549da610	migration/ram: Add tracepoints for ram_save_complete() Take notes on start/end state of dirty pages for the whole system. Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250613140801.474264-10-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	10 months ago
Peter Xu	ff9dfc41d9	migration/ram: One less indent for ram_find_and_save_block() The check over PAGE_DIRTY_FOUND isn't necessary. We could indent one less and assert that instead. Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250613140801.474264-9-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	10 months ago
Peter Xu	57c43e52bd	migration: Rename save_live_complete_precopy to save_complete Now after merging the precopy and postcopy version of complete() hook, rename the precopy version from save_live_complete_precopy() to save_complete(). Dropping the "live" when at it, because it's in most cases not live when happening (in precopy). No functional change intended. Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20250613140801.474264-7-peterx@redhat.com [peterx: squash the fixup that covers a few more doc spots, per Juraj] Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	10 months ago
Peter Xu	d7530a9682	migration: Drop save_live_complete_postcopy hook The hook is only defined in two vmstate users ("ram" and "block dirty bitmap"), meanwhile both of them define the hook exactly the same as the precopy version. Hence, this postcopy version isn't needed. No functional change intended. Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20250613140801.474264-6-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	10 months ago
Chenyi Qiang	2205b84667	memory: Unify the definition of ReplayRamPopulate() and ReplayRamDiscard() Update ReplayRamDiscard() function to return the result and unify the ReplayRamPopulate() and ReplayRamDiscard() to ReplayRamDiscardState() at the same time due to their identical definitions. This unification simplifies related structures, such as VirtIOMEMReplayData, which makes it cleaner. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Link: https://lore.kernel.org/r/20250612082747.51539-4-chenyi.qiang@intel.com Signed-off-by: Peter Xu <peterx@redhat.com>	10 months ago
Chaney, Ben	983899eab4	migration: Don't sync volatile memory after migration completes Syncing volatile memory provides no benefit, instead it can cause performance issues in some cases. Only sync memory that is marked as non-volatile after migration completes on destination. Signed-off-by: Ben Chaney <bchaney@akamai.com> Fixes: `bd108a44bc` (migration: ram: Switch to ram block writeback) Link: https://lore.kernel.org/r/1CC43F59-336F-4A12-84AD-DB89E0A17A95@akamai.com Signed-off-by: Peter Xu <peterx@redhat.com>	10 months ago
Yanfei Xu	fd0377150d	migration/ram: avoid to do log clear in the last round There won't be any ram sync after the stage of save_complete, therefore it's unnecessary to do manually protect for dirty pages being sent. Skip to do this in last round can reduce noticeable downtime. Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com> Tested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20250514115827.3216082-1-yanfei.xu@bytedance.com [peterx: add comments] Signed-off-by: Peter Xu <peterx@redhat.com>	11 months ago
Prasad Pandit	e274188612	migration: enable multifd and postcopy together Enable Multifd and Postcopy migration together. The migration_ioc_process_incoming() routine checks magic value sent on each channel and helps to properly setup multifd and postcopy channels. The Precopy and Multifd threads work during the initial guest RAM transfer. When migration moves to the Postcopy phase, the multifd threads cease to send data on multifd channels and Postcopy threads on the destination request/pull data from the source side. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> Link: https://lore.kernel.org/r/20250512125124.147064-3-ppandit@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	11 months ago
Peter Xu	20d8262281	migration/postcopy: Spatial locality page hint for preempt mode The preempt mode postcopy has been introduced for a while. From latency POV, it should always win the vanilla postcopy. However there's one thing missing when preempt mode is enabled right now, which is the spatial locality hint when there're page requests from the destination side. In vanilla postcopy, as long as a page request was unqueued, it will update the PSS of the precopy background stream, so that after a page request the background thread will move the pages after whatever was requested. It's pretty much a natural behavior when there's only one channel anyway, and one scanner to send the pages. Preempt mode didn't follow that, because preempt mode has its own channel and its own PSS (which doesn't linearly scan the guest memory, but dedicated to resolve page requested from destination). So the page request process and the background migration process are completely separate. This patch adds the hint explicitly for preempt mode. With that, whenever the preempt mode receives a page request on the source, it will service the remote page fault in the return path, then it'll provide a hint to the background thread so that we'll start sending the pages right after the requested ones in the background, assuming the follow up pages have a higher chance to be accessed later. NOTE: since the background migration thread and return path thread run completely concurrently, it doesn't always mean the hint will be applied every single time. For example, it's possible that the return path thread receives multiple page requests in a row without the background thread getting the chance to consume one. In such case, the preempt thread only provides the hint if the previous hint has been consumed. After all, there's no point queuing hints when we only have one linear scanner. This could measurably improve the simple sequential memory access pattern during postcopy (when preempt is on). For random accesses, I can measure a slight increase of remote page fault latency from ~500us -> ~600us, that could be a trade-off to have such hint mechanism, and after all that's still greatly improved comparing to vanilla postcopy on random (~10ms). The patch is verified by our QE team in a video streaming test case, to reduce the pause of the video from ~1min to a few seconds when switching over to postcopy with preempt mode. Reported-by: Xiaohui Li <xiaohli@redhat.com> Tested-by: Xiaohui Li <xiaohli@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250424220705.195544-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	11 months ago
Peter Xu	ad8d82ffbb	migration/ram: Implement save_postcopy_prepare() Implement save_postcopy_prepare(), preparing for the enablement of both multifd and postcopy. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> Reviewed-by: Fabiano Rosas <farosas@suse.de> Message-ID: <20250411114534.3370816-5-ppandit@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	12 months ago
Li Zhijian	5e7ca4a7d7	migration: Unfold control_save_page() control_save_page() is for RDMA only, unfold it to make the code more clear. In addition: - Similar to other branches style in ram_save_target_page(), involve RDMA only if the condition 'migrate_rdma()' is true. - Further simplify the code by removing the RAM_SAVE_CONTROL_NOT_SUPP. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Message-ID: <20250305062825.772629-6-lizhijian@fujitsu.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	1 year ago
Markus Armbruster	8a2b516ba2	cleanup: Drop pointless return at end of function A few functions now end with a label. The next commit will clean them up. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20250407082643.2310002-3-armbru@redhat.com> [Straightforward conflict with commit `988ad4cceb` (hw/loongarch/virt: Fix cpuslot::cpu set at last in virt_cpu_plug()) resolved]	1 year ago
Richard Henderson	4705a71db5	include/system: Move exec/ram_addr.h to system/ram_addr.h Convert the existing includes with sed. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	1 year ago
Li Zhijian	baa41af1c0	migration: Prioritize RDMA in ram_save_target_page() Address an error in RDMA-based migration by ensuring RDMA is prioritized when saving pages in `ram_save_target_page()`. Previously, the RDMA protocol's page-saving step was placed after other protocols due to a refactoring in commit `bc38dc2f5f`. This led to migration failures characterized by unknown control messages and state loading errors destination: (qemu) qemu-system-x86_64: Unknown control message QEMU FILE qemu-system-x86_64: error while loading state section id 1(ram) qemu-system-x86_64: load of migration failed: Operation not permitted source: (qemu) qemu-system-x86_64: RDMA is in an error state waiting migration to abort! qemu-system-x86_64: failed to save SaveStateEntry with id(name): 1(ram): -1 qemu-system-x86_64: rdma migration: recv polling control error! qemu-system-x86_64: warning: Early error. Sending error. qemu-system-x86_64: warning: rdma migration: send polling control error RDMA migration implemented its own protocol/method to send pages to destination side, hand over to RDMA first to prevent pages being saved by other protocol. Fixes: `bc38dc2f5f` ("migration: refactor ram_save_target_page functions") Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Message-ID: <20250305062825.772629-2-lizhijian@fujitsu.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	1 year ago
Fabiano Rosas	a47f0cfba8	migration: Set migration error outside of migrate_cancel There's no point passing the error into migration cancel only for it to call migrate_set_error(). Reviewed-by: Peter Xu <peterx@redhat.com> Message-ID: <20250213175927.19642-2-farosas@suse.de> Signed-off-by: Fabiano Rosas <farosas@suse.de>	1 year ago
Prasad Pandit	bc38dc2f5f	migration: refactor ram_save_target_page functions Refactor ram_save_target_page legacy and multifd functions into one. Other than simplifying it, it frees 'migration_ops' object from usage, so it is expunged. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> Reviewed-by: Fabiano Rosas <farosas@suse.de> Message-ID: <20250127120823.144949-3-ppandit@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	1 year ago
Steve Sistare	624e6e654e	migration: cpr-transfer mode Add the cpr-transfer migration mode, which allows the user to transfer a guest to a new QEMU instance on the same host with minimal guest pause time, by preserving guest RAM in place, albeit with new virtual addresses in new QEMU, and by preserving device file descriptors. Pages that were locked in memory for DMA in old QEMU remain locked in new QEMU, because the descriptor of the device that locked them remains open. cpr-transfer preserves memory and devices descriptors by sending them to new QEMU over a unix domain socket using SCM_RIGHTS. Such CPR state cannot be sent over the normal migration channel, because devices and backends are created prior to reading the channel, so this mode sends CPR state over a second "cpr" migration channel. New QEMU reads the cpr channel prior to creating devices or backends. The user specifies the cpr channel in the channel arguments on the outgoing side, and in a second -incoming command-line parameter on the incoming side. The user must start old QEMU with the '-machine aux-ram-share=on' option, which allows anonymous memory to be transferred in place to the new process by transferring a memory descriptor for each ram block. Memory-backend objects must have the share=on attribute, but memory-backend-epc is not supported. The user starts new QEMU on the same host as old QEMU, with command-line arguments to create the same machine, plus the -incoming option for the main migration channel, like normal live migration. In addition, the user adds a second -incoming option with channel type "cpr". This CPR channel must support file descriptor transfer with SCM_RIGHTS, i.e. it must be a UNIX domain socket. To initiate CPR, the user issues a migrate command to old QEMU, adding a second migration channel of type "cpr" in the channels argument. Old QEMU stops the VM, saves state to the migration channels, and enters the postmigrate state. New QEMU mmap's memory descriptors, and execution resumes. The implementation splits qmp_migrate into start and finish functions. Start sends CPR state to new QEMU, which responds by closing the CPR channel. Old QEMU detects the HUP then calls finish, which connects the main migration channel. In summary, the usage is: qemu-system-$arch -machine aux-ram-share=on ... start new QEMU with "-incoming <main-uri> -incoming <cpr-channel>" Issue commands to old QEMU: migrate_set_parameter mode cpr-transfer {"execute": "migrate", ... {"channel-type": "main"...}, {"channel-type": "cpr"...} ... } Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-17-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>	1 year ago
Peter Xu	baab4473db	migration/multifd: Document the reason to sync for save_setup() It's not straightforward to see why src QEMU needs to sync multifd during setup() phase. After all, there's no page queued at that point. For old QEMUs, there's a solid reason: EOS requires it to work. While it's clueless on the new QEMUs which do not take EOS message as sync requests. One will figure that out only when this is conditionally removed. In fact, the author did try it out. Logically we could still avoid doing this on new machine types, however that needs a separate compat field and that can be an overkill in some trivial overhead in setup() phase. Let's instead document it completely, to avoid someone else tries this again and do the debug one more time, or anyone confused on why this ever existed. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Message-Id: <20241206224755.1108686-8-peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	1 year ago
Peter Xu	1aa81c3098	migration/multifd: Cleanup src flushes on condition check The src flush condition check is over complicated, and it's getting more out of control if postcopy will be involved. In general, we have two modes to do the sync: legacy or modern ways. Legacy uses per-section flush, modern uses per-round flush. Mapped-ram always uses the modern, which is per-round. Introduce two helpers, which can greatly simplify the code, and hopefully make it readable again. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Message-Id: <20241206224755.1108686-7-peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	1 year ago
Peter Xu	de695b1399	migration/multifd: Remove sync processing on postcopy Multifd never worked with postcopy, at least yet so far. Remove the sync processing there, because it's confusing, and they should never appear. Now if RAM_SAVE_FLAG_MULTIFD_FLUSH is observed, we fail hard instead of trying to invoke multifd code. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20241206224755.1108686-6-peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	1 year ago
Peter Xu	e5f14aa5fe	migration/multifd: Unify RAM_SAVE_FLAG_MULTIFD_FLUSH messages RAM_SAVE_FLAG_MULTIFD_FLUSH message should always be correlated to a sync request on src. Unify such message into one place, and conditionally send the message only if necessary. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20241206224755.1108686-5-peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	1 year ago
Peter Xu	604b4749c5	migration/ram: Move RAM_SAVE_FLAG* into ram.h Firstly, we're going to use the multifd flag soon in multifd code, so ram.c isn't gonna work. Secondly, we have a separate RDMA flag dangling around, which is definitely not obvious. There's one comment that helps, but not too much. Put all RAM save flags altogether, so nothing will get overlooked. Add a section explain why we can't use bits over 0x200. Remove RAM_SAVE_FLAG_FULL as it's already not used in QEMU, as the comment explained. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20241206224755.1108686-4-peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	1 year ago
Peter Xu	1d457daf86	migration/multifd: Further remove the SYNC on complete Commit `637280aeb2` ("migration/multifd: Avoid the final FLUSH in complete()") stopped sending the RAM_SAVE_FLAG_MULTIFD_FLUSH flag at ram_save_complete(), because the sync on the destination side is not needed due to the last iteration of find_dirty_block() having already done it. However, that commit overlooked that multifd_ram_flush_and_sync() on the source side is also not needed at ram_save_complete(), for the same reason. Moreover, removing the RAM_SAVE_FLAG_MULTIFD_FLUSH but keeping the multifd_ram_flush_and_sync() means that currently the recv threads will hang when receiving the MULTIFD_FLAG_SYNC message, waiting for the destination sync which only happens when RAM_SAVE_FLAG_MULTIFD_FLUSH is received. Luckily, multifd is still all working fine because recv side cleanup code (mostly multifd_recv_sync_main()) is smart enough to make sure even if recv threads are stuck at SYNC it'll get kicked out. And since this is the completion phase of migration, nothing else will be sent after the SYNCs. This needs to be fixed because in the future VFIO will have data to push after ram_save_complete() and we don't want the recv thread to be stuck in the MULTIFD_FLAG_SYNC message. Remove the unnecessary (and buggy) invocation of multifd_ram_flush_and_sync(). For very old binaries (multifd_flush_after_each_section==true), the flush_and_sync is still needed because each EOS received on destination will enforce all-channel sync once. Stable branches do not need this patch, as no real bug I can think of that will go wrong there.. so not attaching Fixes to be clear on the backport not needed. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20241206224755.1108686-2-peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	1 year ago
Philippe Mathieu-Daudé	32cad1ffb8	include: Rename sysemu/ -> system/ Headers in include/sysemu/ are not only related to system emulation, they are also used by virtualization. Rename as system/ which is clearer. Files renamed manually then mechanical change using sed tool. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Message-Id: <20241203172445.28576-1-philmd@linaro.org>	1 year ago
Maciej S. Szmigiero	b0350c5195	migration/ram: Add load start trace event There's a RAM load complete trace event, but there was no equivalent for its start. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/94ddfa7ecb83a78f73b82867dd30c8767592d257.1730203967.git.maciej.szmigiero@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	1 year ago
Peter Xu	34a8892dec	migration: Drop migration_is_idle() Now with the current migration_is_running(), it will report exactly the opposite of what will be reported by migration_is_idle(). Drop migration_is_idle(), instead use "!migration_is_running()" which should be identical in functionality. In reality, most of the idle checks are inverted, so it's even easier to write them with a "migration_is_running()" check. Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20241024213056.1395400-6-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	1 year ago
Peter Xu	f018eb62b2	migration: Drop migration_is_setup_or_active() This helper is mostly the same as migration_is_running(), except that one has COLO reported as true, the other has CANCELLING reported as true. In my past years of experience with the state changes, neither of them should matter. To make it slightly safer, report both COLO || CANCELLING to be true in migration_is_running(), then drop the other one. We kept the first only because the name is simpler, and clear enough. Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20241024213056.1395400-5-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	1 year ago
Hyman Huang	52ac968ab2	migration: Support periodic RAMBlock dirty bitmap sync When a VM is configured with huge memory, the current throttle logic does not appear to scale, because migration_trigger_throttle() is only called once per iteration, so it won't be invoked for a long time if one iteration takes a long time. The periodic dirty sync aims to fix the above issue by synchronizing the ramblock from the remote dirty bitmap and, when necessary, triggering the CPU throttle multiple times during a long iteration. This is a trade-off between synchronization overhead and CPU throttle impact. Signed-off-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/f61f1b3653f2acf026901103e1c73d157d38b08f.1729146786.git.yong.huang@smartx.com [peterx: make prev_cnt global, and reset for each migration] Signed-off-by: Peter Xu <peterx@redhat.com>	1 year ago
Hyman Huang	6a39ba7cab	migration: Remove "rs" parameter in migration_bitmap_sync_precopy The global static variable ram_state in fact is referred to by the "rs" parameter in migration_bitmap_sync_precopy. For ease of calling by the callees, use the global variable directly in migration_bitmap_sync_precopy and remove "rs" parameter. The migration_bitmap_sync_precopy will be exported in the next commit. Signed-off-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/283c335d61463bf477160da91b24da45cdaf3e43.1729146786.git.yong.huang@smartx.com Signed-off-by: Peter Xu <peterx@redhat.com>	1 year ago
Marc-André Lureau	85f99eb2cb	migration: fix -Werror=maybe-uninitialized false-positive ../migration/ram.c:1873:23: error: ‘dirty’ may be used uninitialized [-Werror=maybe-uninitialized] When 'block' != NULL, 'dirty' is initialized. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Acked-by: Peter Xu <peterx@redhat.com>	2 years ago
Pierrick Bouvier	d13526f77a	migration: remove return after g_assert_not_reached() This patch is part of a series that moves towards a consistent use of g_assert_not_reached() rather than an ad hoc mix of different assertion mechanisms. Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20240919044641.386068-31-pierrick.bouvier@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2 years ago

690 Commits (7d7654a643fad89610cd70142a73e4b9df7700ad)