Browse Source

block/nfs: Do not enter coroutine from CB

The reasoning I gave for why it would be safe to call aio_co_wake()
despite holding the mutex was wrong: It is true that the current request
will not re-acquire the mutex, but a subsequent request in the same
coroutine can.  Because the mutex is a non-coroutine mutex, this will
result in a deadlock.

Therefore, we must either not enter the coroutine here (only scheduling
it), or release the mutex around aio_co_wake().  I opt for the former,
as it is the behavior prior to the offending commit, and so seems safe
to do.

Fixes: deb35c129b
       ("nfs: Run co BH CB in the coroutine’s AioContext")
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/2622#note_2965097035
Cc: qemu-stable@nongnu.org
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20260102153246.154207-1-hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
master
Hanna Czenczek 3 months ago
committed by Kevin Wolf
parent
commit
1d6610099b
  1. 19
      block/nfs.c

19
block/nfs.c

@ -249,14 +249,15 @@ nfs_co_generic_cb(int ret, struct nfs_context *nfs, void *data,
}
/*
* Safe to call: nfs_service(), which called us, is only run from the FD
* handlers, never from the request coroutine. The request coroutine in
* turn will yield unconditionally.
* No need to release the lock, even if we directly enter the coroutine, as
* the lock is never re-taken after yielding. (Note: If we do enter the
* coroutine, @task will probably be dangling once aio_co_wake() returns.)
* Using aio_co_wake() here could re-enter the coroutine directly, while we
* still hold the mutex. The current request will not attempt to re-take
* the mutex, so that is fine; but if the same coroutine then goes on to
* submit another request, that new request will try to re-take the mutex,
* resulting in a deadlock.
* To prevent that, only schedule the coroutine so it will be entered later,
* with the mutex released.
*/
aio_co_wake(task->co);
aio_co_schedule(qemu_coroutine_get_aio_context(task->co), task->co);
}
static int coroutine_fn nfs_co_preadv(BlockDriverState *bs, int64_t offset,
@ -716,8 +717,8 @@ nfs_get_allocated_file_size_cb(int ret, struct nfs_context *nfs, void *data,
if (task->ret < 0) {
error_report("NFS Error: %s", nfs_get_error(nfs));
}
/* Safe to call, see nfs_co_generic_cb() */
aio_co_wake(task->co);
/* Must not use aio_co_wake(), see nfs_co_generic_cb() */
aio_co_schedule(qemu_coroutine_get_aio_context(task->co), task->co);
}
static int64_t coroutine_fn nfs_co_get_allocated_file_size(BlockDriverState *bs)

Loading…
Cancel
Save