Tree:
1a1787ab6b
10.1-testing
99888-virtio-zero-init-c9s
block
coverity
master
stable-0.10
stable-0.11
stable-0.12
stable-0.13
stable-0.14
stable-0.15
stable-1.0
stable-1.1
stable-1.2
stable-1.3
stable-1.4
stable-1.5
stable-1.6
stable-1.7
stable-10.0
stable-10.1
stable-10.2
stable-2.0
stable-2.1
stable-2.10
stable-2.11
stable-2.12
stable-2.2
stable-2.3
stable-2.4
stable-2.5
stable-2.6
stable-2.7
stable-2.8
stable-2.9
stable-3.0
stable-3.1
stable-4.0
stable-4.1
stable-4.2
stable-5.0
stable-6.0
stable-6.0-staging
stable-6.1
stable-7.2
stable-7.2-staging
stable-8.0
stable-8.0-staging
stable-8.1
stable-8.2
stable-9.0
stable-9.1
stable-9.2
staging
staging-0.0
staging-10.0
staging-10.1
staging-10.2
staging-7.2
staging-8.0
staging-8.1
staging-8.2
staging-9.0
staging-9.1
staging-9.2
staging-mjt-test
stsquad-hotfix
tracing
initial
release_0_10_0
release_0_10_1
release_0_10_2
release_0_5_1
release_0_6_0
release_0_6_1
release_0_7_0
release_0_7_1
release_0_8_1
release_0_8_2
release_0_9_0
release_0_9_1
staging-mjt-test
trivial-patches-pull-request
v0.1.0
v0.1.1
v0.1.3
v0.1.4
v0.1.5
v0.1.6
v0.10.0
v0.10.1
v0.10.2
v0.10.3
v0.10.4
v0.10.5
v0.10.6
v0.11.0
v0.11.0-rc0
v0.11.0-rc1
v0.11.0-rc2
v0.11.1
v0.12.0
v0.12.0-rc0
v0.12.0-rc1
v0.12.0-rc2
v0.12.1
v0.12.2
v0.12.3
v0.12.4
v0.12.5
v0.13.0
v0.13.0-rc0
v0.13.0-rc1
v0.13.0-rc2
v0.13.0-rc3
v0.14.0
v0.14.0-rc0
v0.14.0-rc1
v0.14.0-rc2
v0.14.1
v0.15.0
v0.15.0-rc0
v0.15.0-rc1
v0.15.0-rc2
v0.15.1
v0.2.0
v0.3.0
v0.4.0
v0.4.1
v0.4.2
v0.4.3
v0.4.4
v0.5.0
v0.5.1
v0.6.0
v0.6.1
v0.7.0
v0.7.1
v0.8.1
v0.8.2
v0.9.0
v0.9.1
v1.0
v1.0-rc0
v1.0-rc1
v1.0-rc2
v1.0-rc3
v1.0-rc4
v1.0.1
v1.1-rc0
v1.1-rc1
v1.1-rc2
v1.1.0
v1.1.0-rc2
v1.1.0-rc3
v1.1.0-rc4
v1.1.1
v1.1.2
v1.2.0
v1.2.0-rc0
v1.2.0-rc1
v1.2.0-rc2
v1.2.0-rc3
v1.2.1
v1.2.2
v1.3.0
v1.3.0-rc0
v1.3.0-rc1
v1.3.0-rc2
v1.3.1
v1.4.0
v1.4.0-rc0
v1.4.0-rc1
v1.4.0-rc2
v1.4.1
v1.4.2
v1.5.0
v1.5.0-rc0
v1.5.0-rc1
v1.5.0-rc2
v1.5.0-rc3
v1.5.1
v1.5.2
v1.5.3
v1.6.0
v1.6.0-rc0
v1.6.0-rc1
v1.6.0-rc2
v1.6.0-rc3
v1.6.1
v1.6.2
v1.7.0
v1.7.0-rc0
v1.7.0-rc1
v1.7.0-rc2
v1.7.1
v1.7.2
v10.0.0
v10.0.0-rc0
v10.0.0-rc1
v10.0.0-rc2
v10.0.0-rc3
v10.0.0-rc4
v10.0.1
v10.0.2
v10.0.3
v10.0.4
v10.0.5
v10.0.6
v10.0.7
v10.0.8
v10.1.0
v10.1.0-rc0
v10.1.0-rc1
v10.1.0-rc2
v10.1.0-rc3
v10.1.0-rc4
v10.1.1
v10.1.2
v10.1.3
v10.1.4
v10.2.0
v10.2.0-rc1
v10.2.0-rc2
v10.2.0-rc3
v10.2.0-rc4
v10.2.1
v2.0.0
v2.0.0-rc0
v2.0.0-rc1
v2.0.0-rc2
v2.0.0-rc3
v2.0.1
v2.0.2
v2.1.0
v2.1.0-rc0
v2.1.0-rc1
v2.1.0-rc2
v2.1.0-rc3
v2.1.0-rc4
v2.1.0-rc5
v2.1.1
v2.1.2
v2.1.3
v2.10.0
v2.10.0-rc0
v2.10.0-rc1
v2.10.0-rc2
v2.10.0-rc3
v2.10.0-rc4
v2.10.1
v2.10.2
v2.11.0
v2.11.0-rc0
v2.11.0-rc1
v2.11.0-rc2
v2.11.0-rc3
v2.11.0-rc4
v2.11.0-rc5
v2.11.1
v2.11.2
v2.12.0
v2.12.0-rc0
v2.12.0-rc1
v2.12.0-rc2
v2.12.0-rc3
v2.12.0-rc4
v2.12.1
v2.2.0
v2.2.0-rc0
v2.2.0-rc1
v2.2.0-rc2
v2.2.0-rc3
v2.2.0-rc4
v2.2.0-rc5
v2.2.1
v2.3.0
v2.3.0-rc0
v2.3.0-rc1
v2.3.0-rc2
v2.3.0-rc3
v2.3.0-rc4
v2.3.1
v2.4.0
v2.4.0-rc0
v2.4.0-rc1
v2.4.0-rc2
v2.4.0-rc3
v2.4.0-rc4
v2.4.0.1
v2.4.1
v2.5.0
v2.5.0-rc0
v2.5.0-rc1
v2.5.0-rc2
v2.5.0-rc3
v2.5.0-rc4
v2.5.1
v2.5.1.1
v2.6.0
v2.6.0-rc0
v2.6.0-rc1
v2.6.0-rc2
v2.6.0-rc3
v2.6.0-rc4
v2.6.0-rc5
v2.6.1
v2.6.2
v2.7.0
v2.7.0-rc0
v2.7.0-rc1
v2.7.0-rc2
v2.7.0-rc3
v2.7.0-rc4
v2.7.0-rc5
v2.7.1
v2.8.0
v2.8.0-rc0
v2.8.0-rc1
v2.8.0-rc2
v2.8.0-rc3
v2.8.0-rc4
v2.8.1
v2.8.1.1
v2.9.0
v2.9.0-rc0
v2.9.0-rc1
v2.9.0-rc2
v2.9.0-rc3
v2.9.0-rc4
v2.9.0-rc5
v2.9.1
v3.0.0
v3.0.0-rc0
v3.0.0-rc1
v3.0.0-rc2
v3.0.0-rc3
v3.0.0-rc4
v3.0.1
v3.1.0
v3.1.0-rc0
v3.1.0-rc1
v3.1.0-rc2
v3.1.0-rc3
v3.1.0-rc4
v3.1.0-rc5
v3.1.1
v3.1.1.1
v4.0.0
v4.0.0-rc0
v4.0.0-rc1
v4.0.0-rc2
v4.0.0-rc3
v4.0.0-rc4
v4.0.1
v4.1.0
v4.1.0-rc0
v4.1.0-rc1
v4.1.0-rc2
v4.1.0-rc3
v4.1.0-rc4
v4.1.0-rc5
v4.1.1
v4.2.0
v4.2.0-rc0
v4.2.0-rc1
v4.2.0-rc2
v4.2.0-rc3
v4.2.0-rc4
v4.2.0-rc5
v4.2.1
v5.0.0
v5.0.0-rc0
v5.0.0-rc1
v5.0.0-rc2
v5.0.0-rc3
v5.0.0-rc4
v5.0.1
v5.1.0
v5.1.0-rc0
v5.1.0-rc1
v5.1.0-rc2
v5.1.0-rc3
v5.2.0
v5.2.0-rc0
v5.2.0-rc1
v5.2.0-rc2
v5.2.0-rc3
v5.2.0-rc4
v6.0.0
v6.0.0-rc0
v6.0.0-rc1
v6.0.0-rc2
v6.0.0-rc3
v6.0.0-rc4
v6.0.0-rc5
v6.0.1
v6.1.0
v6.1.0-rc0
v6.1.0-rc1
v6.1.0-rc2
v6.1.0-rc3
v6.1.0-rc4
v6.1.1
v6.2.0
v6.2.0-rc0
v6.2.0-rc1
v6.2.0-rc2
v6.2.0-rc3
v6.2.0-rc4
v7.0.0
v7.0.0-rc0
v7.0.0-rc1
v7.0.0-rc2
v7.0.0-rc3
v7.0.0-rc4
v7.1.0
v7.1.0-rc0
v7.1.0-rc1
v7.1.0-rc2
v7.1.0-rc3
v7.1.0-rc4
v7.2.0
v7.2.0-rc0
v7.2.0-rc1
v7.2.0-rc2
v7.2.0-rc3
v7.2.0-rc4
v7.2.1
v7.2.10
v7.2.11
v7.2.12
v7.2.13
v7.2.14
v7.2.15
v7.2.16
v7.2.17
v7.2.18
v7.2.19
v7.2.2
v7.2.20
v7.2.21
v7.2.22
v7.2.3
v7.2.4
v7.2.5
v7.2.6
v7.2.7
v7.2.8
v7.2.9
v8.0.0
v8.0.0-rc0
v8.0.0-rc1
v8.0.0-rc2
v8.0.0-rc3
v8.0.0-rc4
v8.0.1
v8.0.2
v8.0.3
v8.0.4
v8.0.5
v8.1.0
v8.1.0-rc0
v8.1.0-rc1
v8.1.0-rc2
v8.1.0-rc3
v8.1.0-rc4
v8.1.1
v8.1.2
v8.1.3
v8.1.4
v8.1.5
v8.2.0
v8.2.0-rc0
v8.2.0-rc1
v8.2.0-rc2
v8.2.0-rc3
v8.2.0-rc4
v8.2.1
v8.2.10
v8.2.2
v8.2.3
v8.2.4
v8.2.5
v8.2.6
v8.2.7
v8.2.8
v8.2.9
v9.0.0
v9.0.0-rc0
v9.0.0-rc1
v9.0.0-rc2
v9.0.0-rc3
v9.0.0-rc4
v9.0.1
v9.0.2
v9.0.3
v9.0.4
v9.1.0
v9.1.0-rc0
v9.1.0-rc1
v9.1.0-rc2
v9.1.0-rc3
v9.1.0-rc4
v9.1.1
v9.1.2
v9.1.3
v9.2.0
v9.2.0-rc0
v9.2.0-rc1
v9.2.0-rc2
v9.2.0-rc3
v9.2.1
v9.2.2
v9.2.3
v9.2.4
${ noResults }
235 Commits (1a1787ab6bbc3b329b81e14f25dfbc11e44ff2e0)
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
a4cda04107 |
error: error_free(NULL) is safe, drop unnecessary conditionals
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20251119130855.105479-5-armbru@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> |
4 months ago |
|
|
12e50722e4 |
block: rename block/aio-wait.h to qemu/aio-wait.h
AIO_WAIT_WHILE is used even outside the block layer; move the header file out of block/ just like the implementation is in util/. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
4 months ago |
|
|
de252f7993 |
qio: Add QIONetListener API for using AioContext
The user calling himself "John Doe" reported a deadlock when attempting to use qemu-storage-daemon to serve both a base file over NBD, and a qcow2 file with that NBD export as its backing file, from the same process, even though it worked just fine when there were two q-s-d processes. The bulk of the NBD server code properly uses coroutines to make progress in an event-driven manner, but the code for spawning a new coroutine at the point when listen(2) detects a new client was hard-coded to use the global GMainContext; in other words, the callback that triggers nbd_client_new to let the server start the negotiation sequence with the client requires the main loop to be making progress. However, the code for bdrv_open of a qcow2 image with an NBD backing file uses an AIO_WAIT_WHILE nested event loop to ensure that the entire qcow2 backing chain is either fully loaded or rejected, without any side effects from the main loop causing unwanted changes to the disk being loaded (in short, an AioContext represents the set of actions that are known to be safe while handling block layer I/O, while excluding any other pending actions in the global main loop with potentially larger risk of unwanted side effects). This creates a classic case of deadlock: the server can't progress to the point of accept(2)ing the client to write to the NBD socket because the main loop is being starved until the AIO_WAIT_WHILE completes the bdrv_open, but the AIO_WAIT_WHILE can't progress because it is blocked on the client coroutine stuck in a read() of the expected magic number from the server side of the socket. This patch adds a new API to allow clients to opt in to listening via an AioContext rather than a GMainContext. This will allow NBD to fix the deadlock by performing all actions during bdrv_open in the main loop AioContext. Technical debt warning: I would have loved to utilize a notify function with AioContext to guarantee that we don't finalize listener due to an object_unref if there is any callback still running (the way GSource does), but wiring up notify functions into AioContext is a bigger task that will be deferred to a later QEMU release. But for solving the NBD deadlock, it is sufficient to note that the QMP commands for enabling and disabling the NBD server are really the only points where we want to change the listener's callback. Furthermore, those commands are serviced in the main loop, which is the same AioContext that is also listening for connections. Since a thread cannot interrupt itself, we are ensured that at the point where we are changing the watch, there are no callbacks active. This is NOT as powerful as the GSource cross-thread safety, but sufficient for the needs of today. An upcoming patch will then add a unit test (kept separate to make it easier to rearrange the series to demonstrate the deadlock without this patch). Fixes: https://gitlab.com/qemu-project/qemu/-/issues/3169 Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20251113011625.878876-26-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> |
5 months ago |
|
|
cc0faf8273 |
qio: Prepare NetListener to use AioContext
For ease of review, this patch adds an AioContext pointer to the QIONetListener struct, the code to trace it, and refactors listener->io_source to instead be an array of utility structs; but the aio_context pointer is always NULL until the next patch adds an API to set it. There should be no semantic change in this patch. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20251113011625.878876-25-eblake@redhat.com> |
5 months ago |
|
|
ec59a65a4d |
qio: Provide accessor around QIONetListener->sioc
An upcoming patch needs to pass more than just sioc as the opaque pointer to an AioContext; but since our AioContext code in general (and its QIO Channel wrapper code) lacks a notify callback present with GSource, we do not have the trivial option of just g_malloc'ing a small struct to hold all that data coupled with a notify of g_free. Instead, the data pointer must outlive the registered handler; in fact, having the data pointer have the same lifetime as QIONetListener is adequate. But the cleanest way to stick such a helper struct in QIONetListener will be to rearrange internal struct members. And that in turn means that all existing code that currently directly accesses listener->nsioc and listener->sioc[] should instead go through accessor functions, to be immune to the upcoming struct layout changes. So this patch adds accessor methods qio_net_listener_nsioc() and qio_net_listener_sioc(), and puts them to use. While at it, notice that the pattern of grabbing an sioc from the listener only to turn around can call qio_channel_socket_get_local_address is common enough to also warrant the helper of qio_net_listener_get_local_address, and fix a copy-paste error in the corresponding documentation. Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20251113011625.878876-24-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> |
5 months ago |
|
|
e685dd26c7 |
qio: Factor out helpers qio_net_listener_[un]watch
The code had three similar repetitions of an iteration over one or all of nsiocs to set up a GSource, and likewise for teardown. Since an upcoming patch wants to tweak whether GSource or AioContext is used, it's better to consolidate that into one helper function for fewer places to edit later. Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20251113011625.878876-22-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> |
5 months ago |
|
|
36769aeedf |
qio: Minor optimization when callback function is unchanged
In qemu-nbd and other NBD server setups where parallel clients are supported, it is common that the caller will re-register the same callback function as long as it has not reached its limit on simultaneous clients. In that case, there is no need to tear down and reinstall GSource watches in the GMainContext. In practice, all existing callers currently pass NULL for notify, and no caller ever changes context across calls (for async uses, either the caller consistently uses qio_net_listener_set_client_func_full with the same context, or the caller consistently uses only qio_net_listener_set_client_func which always uses the global context); but the time spent checking these two fields in addition to the more important func and data is still less than the savings of not churning through extra GSource manipulations when the result will be unchanged. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20251113011625.878876-21-eblake@redhat.com> |
5 months ago |
|
|
9d86181874 |
qio: Protect NetListener callback with mutex
Without a mutex, NetListener can run into this data race between a
thread changing the async callback callback function to use when a
client connects, and the thread servicing polling of the listening
sockets:
Thread 1:
qio_net_listener_set_client_func(lstnr, f1, ...);
=> foreach sock: socket
=> object_ref(lstnr)
=> sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref);
Thread 2:
poll()
=> event POLLIN on socket
=> ref(GSourceCallback)
=> if (lstnr->io_func) // while lstnr->io_func is f1
...interrupt..
Thread 1:
qio_net_listener_set_client_func(lstnr, f2, ...);
=> foreach sock: socket
=> g_source_unref(sock_src)
=> foreach sock: socket
=> object_ref(lstnr)
=> sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref);
Thread 2:
=> call lstnr->io_func(lstnr->io_data) // now sees f2
=> return dispatch(sock)
=> unref(GSourceCallback)
=> destroy-notify
=> object_unref
Found by inspection; I did not spend the time trying to add sleeps or
execute under gdb to try and actually trigger the race in practice.
This is a SEGFAULT waiting to happen if f2 can become NULL because
thread 1 deregisters the user's callback while thread 2 is trying to
service the callback. Other messes are also theoretically possible,
such as running callback f1 with an opaque pointer that should only be
passed to f2 (if the client code were to use more than just a binary
choice between a single async function or NULL).
Mitigating factor: if the code that modifies the QIONetListener can
only be reached by the same thread that is executing the polling and
async callbacks, then we are not in a two-thread race documented above
(even though poll can see two clients trying to connect in the same
window of time, any changes made to the listener by the first async
callback will be completed before the thread moves on to the second
client). However, QEMU is complex enough that this is hard to
generically analyze. If QMP commands (like nbd-server-stop) are run
in the main loop and the listener uses the main loop, things should be
okay. But when a client uses an alternative GMainContext, or if
servicing a QMP command hands off to a coroutine to avoid blocking, I
am unable to state with certainty whether a given net listener can be
modified by a thread different from the polling thread running
callbacks.
At any rate, it is worth having the API be robust. To ensure that
modifying a NetListener can be safely done from any thread, add a
mutex that guarantees atomicity to all members of a listener object
related to callbacks. This problem has been present since
QIONetListener was introduced.
Note that this does NOT prevent the case of a second round of the
user's old async callback being invoked with the old opaque data, even
when the user has already tried to change the async callback during
the first async callback; it is only about ensuring that there is no
sharding (the eventual io_func(io_data) call that does get made will
correspond to a particular combination that the user had requested at
some point in time, and not be sharded to a combination that never
existed in practice). In other words, this patch maintains the status
quo that a user's async callback function already needs to be robust
to parallel clients landing in the same window of poll servicing, even
when only one client is desired, if that particular listener can be
amended in a thread other than the one doing the polling.
CC: qemu-stable@nongnu.org
Fixes:
|
5 months ago |
|
|
b5676493a0 |
qio: Remember context of qio_net_listener_set_client_func_full
io/net-listener.c has two modes of use: asynchronous (the user calls qio_net_listener_set_client_func to wake up the callback via the global GMainContext, or qio_net_listener_set_client_func_full to wake up the callback via the caller's own alternative GMainContext), and synchronous (the user calls qio_net_listener_wait_client which creates its own GMainContext and waits for the first client connection before returning, with no need for a user's callback). But commit |
5 months ago |
|
|
6e03d5cdc9 |
qio: Unwatch before notify in QIONetListener
When changing the callback registered with QIONetListener, the code
was calling notify on the old opaque data prior to actually removing
the old GSource objects still pointing to that data. Similarly,
during finalize, it called notify before tearing down the various
GSource objects tied to the data.
In practice, a grep of the QEMU code base found that every existing
client of QIONetListener passes in a NULL notifier (the opaque data,
if non-NULL, outlives the NetListener and so does not need cleanup
when the NetListener is torn down), so this patch has no impact. And
even if a caller had passed in a reference-counted object with a
notifier of object_unref but kept its own reference on the data, then
the early notify would merely reduce a refcount from (say) 2 to 1, but
not free the object. However, it is a latent bug waiting to bite any
future caller that passes in data where the notifier actually frees
the object, because the GSource could then trigger a use-after-free if
it loses the race on a last-minute client connection resulting in the
data being passed to one final use of the async callback.
Better is to delay the notify call until after all GSource that have
been given a copy of the opaque data are torn down.
CC: qemu-stable@nongnu.org
Fixes:
|
5 months ago |
|
|
59506e59e0 |
qio: Add trace points to net_listener
Upcoming patches will adjust how net_listener watches for new client connections; adding trace points now makes it easier to debug that the changes work as intended. For example, adding --trace='qio_net_listener*' to the qemu-storage-daemon command line before --nbd-server will track when the server first starts listening for clients. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20251113011625.878876-17-eblake@redhat.com> |
5 months ago |
|
|
1edf0df284 |
io: Add qio_channel_wait_cond() helper
Add the helper to wait for QIO channel's IO availability in any context (coroutine, or non-coroutine). Use it tree-wide for three occurences. Cc: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Link: https://lore.kernel.org/r/20251022192612.2737648-2-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> |
5 months ago |
|
|
84005f4a2b |
io: flush zerocopy socket error queue on sendmsg failure due to ENOBUF
The kernel allocates extra metadata SKBs in case of a zerocopy send,
eventually used for zerocopy's notification mechanism. This metadata
memory is accounted for in the OPTMEM limit. The kernel queues
completion notifications on the socket error queue and this error queue
is freed when userspace reads it.
Usually, in the case of in-order processing, the kernel will batch the
notifications and merge the metadata into a single SKB and free the
rest. As a result, it never exceeds the OPTMEM limit. However, if there
is any out-of-order processing or intermittent zerocopy failures, this
error chain can grow significantly, exhausting the OPTMEM limit. As a
result, all new sendmsg requests fail to allocate any new SKB, leading
to an ENOBUF error. Depending on the amount of data queued before the
flush (i.e., large live migration iterations), even large OPTMEM limits
are prone to failure.
To work around this, if we encounter an ENOBUF error with a zerocopy
sendmsg, flush the error queue and retry once more.
Co-authored-by: Manish Mishra <manish.mishra@nutanix.com>
Signed-off-by: Tejus GK <tejus.gk@nutanix.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[DB: change TRUE/FALSE to true/false for 'bool' type;
add more #ifdef QEMU_MSG_ZEROCOPY blocks]
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
|
5 months ago |
|
|
e52d822716 |
io: add a "blocking" field to QIOChannelSocket
Add a 'blocking' boolean field to QIOChannelSocket to track whether the underlying socket is in blocking or non-blocking mode. Signed-off-by: Tejus GK <tejus.gk@nutanix.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> |
5 months ago |
|
|
989221c0c7 |
io/channel: Have read/write functions take void * buffer argument
I/O channel read/write functions can operate on any area of memory, regardless of the content their represent. Do not restrict to array of char, use the void* type, which is also the type of the underlying iovec::iov_base field. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> [DB: also adapt test-crypto-tlssession.c func signatures] Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> |
5 months ago |
|
|
b7a1f2ca45 |
io: fix use after free in websocket handshake code
If the QIOChannelWebsock object is freed while it is waiting to
complete a handshake, a GSource is leaked. This can lead to the
callback firing later on and triggering a use-after-free in the
use of the channel. This was observed in the VNC server with the
following trace from valgrind:
==2523108== Invalid read of size 4
==2523108== at 0x4054A24: vnc_disconnect_start (vnc.c:1296)
==2523108== by 0x4054A24: vnc_client_error (vnc.c:1392)
==2523108== by 0x4068A09: vncws_handshake_done (vnc-ws.c:105)
==2523108== by 0x44863B4: qio_task_complete (task.c:197)
==2523108== by 0x448343D: qio_channel_websock_handshake_io (channel-websock.c:588)
==2523108== by 0x6EDB862: UnknownInlinedFun (gmain.c:3398)
==2523108== by 0x6EDB862: g_main_context_dispatch_unlocked.lto_priv.0 (gmain.c:4249)
==2523108== by 0x6EDBAE4: g_main_context_dispatch (gmain.c:4237)
==2523108== by 0x45EC79F: glib_pollfds_poll (main-loop.c:287)
==2523108== by 0x45EC79F: os_host_main_loop_wait (main-loop.c:310)
==2523108== by 0x45EC79F: main_loop_wait (main-loop.c:589)
==2523108== by 0x423A56D: qemu_main_loop (runstate.c:835)
==2523108== by 0x454F300: qemu_default_main (main.c:37)
==2523108== by 0x73D6574: (below main) (libc_start_call_main.h:58)
==2523108== Address 0x57a6e0dc is 28 bytes inside a block of size 103,608 free'd
==2523108== at 0x5F2FE43: free (vg_replace_malloc.c:989)
==2523108== by 0x6EDC444: g_free (gmem.c:208)
==2523108== by 0x4053F23: vnc_update_client (vnc.c:1153)
==2523108== by 0x4053F23: vnc_refresh (vnc.c:3225)
==2523108== by 0x4042881: dpy_refresh (console.c:880)
==2523108== by 0x4042881: gui_update (console.c:90)
==2523108== by 0x45EFA1B: timerlist_run_timers.part.0 (qemu-timer.c:562)
==2523108== by 0x45EFC8F: timerlist_run_timers (qemu-timer.c:495)
==2523108== by 0x45EFC8F: qemu_clock_run_timers (qemu-timer.c:576)
==2523108== by 0x45EFC8F: qemu_clock_run_all_timers (qemu-timer.c:663)
==2523108== by 0x45EC765: main_loop_wait (main-loop.c:600)
==2523108== by 0x423A56D: qemu_main_loop (runstate.c:835)
==2523108== by 0x454F300: qemu_default_main (main.c:37)
==2523108== by 0x73D6574: (below main) (libc_start_call_main.h:58)
==2523108== Block was alloc'd at
==2523108== at 0x5F343F3: calloc (vg_replace_malloc.c:1675)
==2523108== by 0x6EE2F81: g_malloc0 (gmem.c:133)
==2523108== by 0x4057DA3: vnc_connect (vnc.c:3245)
==2523108== by 0x448591B: qio_net_listener_channel_func (net-listener.c:54)
==2523108== by 0x6EDB862: UnknownInlinedFun (gmain.c:3398)
==2523108== by 0x6EDB862: g_main_context_dispatch_unlocked.lto_priv.0 (gmain.c:4249)
==2523108== by 0x6EDBAE4: g_main_context_dispatch (gmain.c:4237)
==2523108== by 0x45EC79F: glib_pollfds_poll (main-loop.c:287)
==2523108== by 0x45EC79F: os_host_main_loop_wait (main-loop.c:310)
==2523108== by 0x45EC79F: main_loop_wait (main-loop.c:589)
==2523108== by 0x423A56D: qemu_main_loop (runstate.c:835)
==2523108== by 0x454F300: qemu_default_main (main.c:37)
==2523108== by 0x73D6574: (below main) (libc_start_call_main.h:58)
==2523108==
The above can be reproduced by launching QEMU with
$ qemu-system-x86_64 -vnc localhost:0,websocket=5700
and then repeatedly running:
for i in {1..100}; do
(echo -n "GET / HTTP/1.1" && sleep 0.05) | nc -w 1 localhost 5700 &
done
CVE-2025-11234
Reported-by: Grant Millar | Cylo <rid@cylo.io>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
|
6 months ago |
|
|
322c3c4f3a |
io: move websock resource release to close method
The QIOChannelWebsock object releases all its resources in the finalize callback. This is later than desired, as callers expect to be able to call qio_channel_close() to fully close a channel and release resources related to I/O. The logic in the finalize method is at most a failsafe to handle cases where a consumer forgets to call qio_channel_close. This adds equivalent logic to the close method to release the resources, using g_clear_handle_id/g_clear_pointer to be robust against repeated invocations. The finalize method is tweaked so that the GSource is removed before releasing the underlying channel. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> |
6 months ago |
|
|
2c147611cf |
io: release active GSource in TLS channel finalizer
While code is supposed to call qio_channel_close() before releasing the last reference on an QIOChannel, this is not guaranteed. QIOChannelFile and QIOChannelSocket both cleanup resources in their finalizer if the close operation was missed. This ensures the TLS channel will do the same failsafe cleanup. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> |
6 months ago |
|
|
6a9e81b705 |
crypto: propagate Error object on premature termination
The way that premature termination was handled in TLS connections was
changed to handle an ordering problem during graceful shutdown in the
migration code.
Unfortunately one of the codepaths returned -1 to indicate an error
condition, but failed to set the 'errp' parameter.
This broke error handling in the qio_channel_tls_handshake function,
as the QTask callback would no longer see that an error was raised.
As a result, the client will go on to try to use the already closed
TLS connection, resulting in misleading errors.
This was evidenced in the I/O test 233 which showed changes such as
-qemu-nbd: Certificate does not match the hostname localhost
+qemu-nbd: Failed to read initial magic: Unable to read from socket: Connection reset by peer
Fixes:
|
6 months ago |
|
|
7e0c22d585 |
io/crypto: Move tls premature termination handling into QIO layer
QCryptoTLSSession allows TLS premature termination in two cases, one of the case is when the channel shutdown() is invoked on READ side. It's possible the shutdown() happened after the read thread blocked at gnutls_record_recv(). In this case, we should allow the premature termination to happen. The problem is by the time qcrypto_tls_session_read() was invoked, tioc->shutdown may not have been set, so this may instead be treated as an error if there is concurrent shutdown() calls. To allow the flag to reflect the latest status of tioc->shutdown, move the check upper into the QIOChannel level, so as to read the flag only after QEMU gets an GNUTLS_E_PREMATURE_TERMINATION. When at it, introduce qio_channel_tls_allow_premature_termination() helper to make the condition checks easier to read. When doing so, change the qatomic_load_acquire() to qatomic_read(): here we don't need any ordering of memory accesses, but reading a flag. qatomic_read() would suffice because it guarantees fetching from memory. Nothing else we should need to order on memory access. This patch will fix a qemu qtest warning when running the preempt tls test, reporting premature termination: QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test --full -r /x86_64/migration/postcopy/preempt/tls/psk ... qemu-kvm: Cannot read from TLS channel: The TLS connection was non-properly terminated. ... In this specific case, the error was set by postcopy_preempt_thread, which normally will be concurrently shutdown()ed by the main thread. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20250918203937.200833-2-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> |
6 months ago |
|
|
bcb536cabe |
error: Kill @error_warn
We added @error_warn some two years ago in commit
|
6 months ago |
|
|
5bd58f04b8 |
util/oslib-win32: Do not treat null @errp as &error_warn
qemu_socket_select() and its wrapper qemu_socket_unselect() treat a
null @errp as &error_warn. This is wildly inappropriate. A caller
passing null @errp specifies that errors are to be ignored. If
warnings are wanted, the caller must pass &error_warn.
Change callers to do that, and drop the inappropriate treatment of
null @errp.
This assumes that warnings are wanted. I'm not familiar with the
calling code, so I can't say whether it will work when the socket is
invalid, or WSAEventSelect() fails. If it doesn't, then this should
be an error instead of a warning. Invalid socket might even be a
programming error.
These warnings were introduced in commit
|
6 months ago |
|
|
6f607941b1 |
treewide: use qemu_set_blocking instead of g_unix_set_fd_nonblocking
Instead of open-coded g_unix_set_fd_nonblocking() calls, use QEMU wrapper qemu_set_blocking(). Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> [DB: fix missing closing ) in tap-bsd.c, remove now unused GError var] Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> |
7 months ago |
|
|
d14c8cc69d |
io/channel-socket: rework qio_channel_socket_copy_fds()
We want to switch from qemu_socket_set_block() to newer qemu_set_blocking(), which provides return status of operation, to handle errors. Still, we want to keep qio_channel_socket_readv() interface clean, as currently it allocate @fds only on success. So, in case of error, we should close all incoming fds and keep user's @fds untouched or zero. Let's make separate functions qio_channel_handle_fds() and qio_channel_cleanup_fds(), to achieve what we want. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> |
7 months ago |
|
|
8cb17f9c36 |
util: drop qemu_socket_set_nonblock()
Use common qemu_set_blocking() instead. Note that pre-patch the behavior of Win32 and Linux realizations are inconsistent: we ignore failure for Win32, and assert success for Linux. How do we convert the callers? 1. Most of callers call qemu_socket_set_nonblock() on a freshly created socket fd, in conditions when we may simply report an error. Seems correct switching to error handling both for Windows (pre-patch error is ignored) and Linux (pre-patch we assert success). Anyway, we normally don't expect errors in these cases. Still in tests let's use &error_abort for simplicity. What are exclusions? 2. hw/virtio/vhost-user.c - we are inside #ifdef CONFIG_LINUX, so no damage in switching to error handling from assertion. 3. io/channel-socket.c: here we convert both old calls to qemu_socket_set_nonblock() and qemu_socket_set_block() to one new call. Pre-patch we assert success for Linux in qemu_socket_set_nonblock(), and ignore all other errors here. So, for Windows switch is a bit dangerous: we may get new errors or crashes(when error_abort is passed) in cases where we have silently ignored the error before (was it correct in all such cases, if they were?) Still, there is no other way to stricter API than take this risk. 4. util/vhost-user-server - compiled only for Linux (see util/meson.build), so we are safe, switching from assertion to &error_abort. Note: In qga/channel-posix.c we use g_warning(), where g_printerr() would actually be a better choice. Still let's for now follow common style of qga, where g_warning() is commonly used to print such messages, and no call to g_printerr(). Converting everything to use g_printerr() should better be another series. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> |
7 months ago |
|
|
1ed8903916 |
treewide: handle result of qio_channel_set_blocking()
Currently, we just always pass NULL as errp argument. That doesn't
look good.
Some realizations of interface may actually report errors.
Channel-socket realization actually either ignore or crash on
errors, but we are going to straighten it out to always reporting
an errp in further commits.
So, convert all callers to either handle the error (where environment
allows) or explicitly use &error_abort.
Take also a chance to change the return value to more convenient
bool (keeping also in mind, that underlying realizations may
return -1 on failure, not -errno).
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[DB: fix return type mismatch in TLS/websocket channel
impls for qio_channel_set_blocking]
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
|
7 months ago |
|
|
7bc2cbe330 |
migration/qemu-file: don't make incoming fds blocking again
In migration we want to pass fd "as is", not changing its blocking status. The only current user of these fds is CPR state (through VMSTATE_FD), which of-course doesn't want to modify fds on target when source is still running and use these fds. Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> |
7 months ago |
|
|
edea818371 |
io: add support for activating TLS thread safety workaround
Add a QIO_CHANNEL_FEATURE_CONCURRENT_IO feature flag. If this is set on a QIOChannelTLS session object, the TLS session will be marked as requiring thread safety, which will activate the workaround for GNUTLS bug 1717 if needed. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/qemu-devel/20250718150514.2635338-3-berrange@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de> |
9 months ago |
|
|
e2e360db7a |
io: Add helper for setting socket send buffer size
Testing reading and writing from qemu-nbd using a unix domain socket shows that the platform default send buffer size is too low, leading to poor performance and hight cpu usage. Add a helper for setting socket send buffer size to be used in NBD code. It can also be used in other contexts. We don't need a helper for receive buffer size since it is not used with unix domain sockets. This is documented for Linux, and not documented for macOS. Failing to set the socket buffer size is not a fatal error, but the caller may want to warn about the failure. Signed-off-by: Nir Soffer <nirsof@gmail.com> Message-ID: <20250517201154.88456-2-nirsof@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> |
11 months ago |
|
|
0dc051aa85 |
io: Fix partial struct copy in qio_dns_resolver_lookup_sync_inet()
Commit |
10 months ago |
|
|
12d1a768bd |
qom: Have class_init() take a const data argument
Mechanical change using gsed, then style manually adapted to pass checkpatch.pl script. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20250424194905.82506-4-philmd@linaro.org> |
1 year ago |
|
|
322d873b63 |
io: Add a read flag for relaxed EOF
Add a read flag that can inform a channel that it's ok to receive an EOF at any moment. Channels that have some form of strict EOF tracking, such as TLS session termination, may choose to ignore EOF errors with the use of this flag. This is being added for compatibility with older migration streams that do not include a TLS termination step. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> |
1 year ago |
|
|
a25b013019 |
io: Add flags argument to qio_channel_readv_full_all_eof
We want to pass flags into qio_channel_tls_readv() but qio_channel_readv_full_all_eof() doesn't take a flags argument. No functional change. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Acked-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> |
1 year ago |
|
|
0b8a70d70f |
crypto: Remove qcrypto_tls_session_get_handshake_status
The correct way of calling qcrypto_tls_session_handshake() requires calling qcrypto_tls_session_get_handshake_status() right after it so there's no reason to have a separate method. Refactor qcrypto_tls_session_handshake() to inform the status in its own return value and alter the callers accordingly. No functional change. Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Acked-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> |
1 year ago |
|
|
30ee88622e |
io: tls: Add qio_channel_tls_bye
Add a task dispatcher for gnutls_bye similar to the qio_channel_tls_handshake_task(). The gnutls_bye() call might be interrupted and so it needs to be rescheduled. The migration code will make use of this to help the migration destination identify a premature EOF. Once the session termination is in place, any EOF that happens before the source issued gnutls_bye() will be considered an error. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Acked-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> |
1 year ago |
|
|
ef834aa2b2 |
qapi/crypto: Rename QCryptoHashAlgorithm to *Algo, and drop prefix
QAPI's 'prefix' feature can make the connection between enumeration
type and its constants less than obvious. It's best used with
restraint.
QCryptoHashAlgorithm has a 'prefix' that overrides the generated
enumeration constants' prefix to QCRYPTO_HASH_ALG.
We could simply drop 'prefix', but then the prefix becomes
QCRYPTO_HASH_ALGORITHM, which is rather long.
We could additionally rename the type to QCryptoHashAlg, but I think
the abbreviation "alg" is less than clear.
Rename the type to QCryptoHashAlgo instead. The prefix becomes to
QCRYPTO_HASH_ALGO.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240904111836.3273842-12-armbru@redhat.com>
[Conflicts with merge commit
|
2 years ago |
|
|
97f7bf113e |
crypto: propagate errors from TLS session I/O callbacks
GNUTLS doesn't know how to perform I/O on anything other than plain FDs, so the TLS session provides it with some I/O callbacks. The GNUTLS API design requires these callbacks to return a unix errno value, which means we're currently loosing the useful QEMU "Error" object. This changes the I/O callbacks in QEMU to stash the "Error" object in the QCryptoTLSSession class, and fetch it when seeing an I/O error returned from GNUTLS, thus preserving useful error messages. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> |
2 years ago |
|
|
57941c9c86 |
crypto: push error reporting into TLS session I/O APIs
The current TLS session I/O APIs just return a synthetic errno value on error, which has been translated from a gnutls error value. This looses a large amount of valuable information that distinguishes different scenarios. Pushing population of the "Error *errp" object into the TLS session I/O APIs gives more detailed error information. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> |
2 years ago |
|
|
95fa0c79a0 |
qio: add support for SO_PEERCRED for socket channel
The function qio_channel_get_peercred() returns a pointer to the
credentials of the peer process connected to this socket.
This credentials structure is defined in <sys/socket.h> as follows:
struct ucred {
pid_t pid; /* Process ID of the sending process */
uid_t uid; /* User ID of the sending process */
gid_t gid; /* Group ID of the sending process */
};
The use of this function is possible only for connected AF_UNIX stream
sockets and for AF_UNIX stream and datagram socket pairs.
On platform other than Linux, the function return 0.
Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240522153453.1230389-2-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
2 years ago |
|
|
7b1070a7e1 |
Revert "meson: Propagate gnutls dependency"
This reverts commit
|
2 years ago |
|
|
46cec74c1b |
io: Stop using qemu_open_old in channel-file
We want to make use of the Error object to report fdset errors from qemu_open_internal() and passing the error pointer to qemu_open_old() would require changing all callers. Move the file channel to the new API instead. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> |
2 years ago |
|
|
199e84de1c |
qio: Inherit follow_coroutine_ctx across TLS
Since qemu 8.2, the combination of NBD + TLS + iothread crashes on an
assertion failure:
qemu-kvm: ../io/channel.c:534: void qio_channel_restart_read(void *): Assertion `qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)' failed.
It turns out that when we removed AioContext locking, we did so by
having NBD tell its qio channels that it wanted to opt in to
qio_channel_set_follow_coroutine_ctx(); but while we opted in on the
main channel, we did not opt in on the TLS wrapper channel.
qemu-iotests has coverage of NBD+iothread and NBD+TLS, but apparently
no coverage of NBD+TLS+iothread, or we would have noticed this
regression sooner. (I'll add that in the next patch)
But while we could manually opt in to the TLS channel in nbd/server.c
(a one-line change), it is more generic if all qio channels that wrap
other channels inherit the follow status, in the same way that they
inherit feature bits.
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Daniel P. Berrangé <berrange@redhat.com>
CC: qemu-stable@nongnu.org
Fixes: https://issues.redhat.com/browse/RHEL-34786
Fixes:
|
2 years ago |
|
|
4760cedc61 |
io: Introduce qio_channel_file_new_dupfd
Add a new helper function for creating a QIOChannelFile channel with a duplicated file descriptor. This saves the calling code from having to do error checking on the dup() call. Suggested-by: "Daniel P. Berrangé" <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com> Link: https://lore.kernel.org/r/20240311233335.17299-2-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com> |
2 years ago |
|
|
61dec06082 |
migration/multifd: Don't fsync when closing QIOChannelFile
Commit bc38feddeb ("io: fsync before closing a file channel") added a
fsync/fdatasync at the closing point of the QIOChannelFile to ensure
integrity of the migration stream in case of QEMU crash.
The decision to do the sync at qio_channel_close() was not the best
since that function runs in the main thread and the fsync can cause
QEMU to hang for several minutes, depending on the migration size and
disk speed.
To fix the hang, remove the fsync from qio_channel_file_close().
At this moment, the migration code is the only user of the fsync and
we're taking the tradeoff of not having a sync at all, leaving the
responsibility to the upper layers.
Fixes: bc38feddeb ("io: fsync before closing a file channel")
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240305195629.9922-1-farosas@suse.de
Link: https://lore.kernel.org/r/20240305174332.2553-1-farosas@suse.de
[peterx: add more comment to the qio_channel_close()]
Signed-off-by: Peter Xu <peterx@redhat.com>
|
2 years ago |
|
|
c05dfcb7f2 |
io: fsync before closing a file channel
Make sure the data is flushed to disk before closing file channels. This is to ensure data is on disk and not lost in the event of a host crash. This is currently being implemented to affect the migration code when migrating to a file, but all QIOChannelFile users should benefit from the change. Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com> Acked-by: "Daniel P. Berrangé" <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-6-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com> |
2 years ago |
|
|
0478b030fa |
io: implement io_pwritev/preadv for QIOChannelFile
The upcoming 'mapped-ram' feature will require qemu to write data to (and restore from) specific offsets of the migration file. Add a minimal implementation of pwritev/preadv and expose them via the io_pwritev and io_preadv interfaces. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-5-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com> |
2 years ago |
|
|
f1cfe39418 |
io: Add generic pwritev/preadv interface
Introduce basic pwritev/preadv support in the generic channel layer. Specific implementation will follow for the file channel as this is required in order to support migration streams with fixed location of each ram page. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-4-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com> |
2 years ago |
|
|
401e311ff7 |
io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file
Add a generic QIOChannel feature SEEKABLE which would be used by the qemu_file* apis. For the time being this will be only implemented for file channels. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-3-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com> |
2 years ago |
|
|
003f15369d |
io: add trace event when cancelling TLS handshake
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> |
2 years ago |
|
|
9c636e0f96 |
io: Stop appending -listen to net listeners
All callers of qio_net_listener_set_name() already add some sort of "listen" or "listener" suffix. For intance, we currently have "migration-socket-listener-listen" and "vnc-listen-listen" as ioc names. Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> |
3 years ago |