IOU_OK means that the request ownership is now handed back to core
io_uring and it has to complete it using the result provided in
req->cqe. Same is true for multishot and IOU_STOP_MULTISHOT.
Rename it into IOU_COMPLETE to avoid confusion and use for both modes.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e6a5b2edb0eb9558acb1c8f1db38ac45fee95491.1741453534.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Multishot errors can be mapped 1:1 to normal errors, but there are not
identical. It leads to a peculiar situation where all multishot requests
has to check in what context they're run and return different codes.
Unify them starting with EAGAIN / IOU_ISSUE_SKIP_COMPLETE(EIOCBQUEUED)
pair, which mean that core io_uring still owns the request and it should
be retried. In case of multishot it's naturally just continues to poll,
otherwise it might poll, use iowq or do any other kind of allowed
blocking. Introduce IOU_RETRY aliased to -EAGAIN for that.
Apart from obvious upsides, multishot can now also check for misuse of
IOU_ISSUE_SKIP_COMPLETE.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/da117b79ce72ecc3ab488c744e29fae9ba54e23b.1741453534.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add io_import_reg_vec(), which will be responsible for importing
vectored registered buffers. The function might reallocate the vector,
but it'd try to do the conversion in place first, which is why it's
required of the user to pad the iovec to the right border of the cache.
Overlapping also depends on struct iovec being larger than bvec, which
is not the case on e.g. 32 bit architectures. Don't try to complicate
this case and make sure vectors never overlap, it'll be improved later.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/60bd246b1249476a6996407c1dbc38ef6febad14.1741362889.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
* for-6.15/io_uring: (80 commits)
io_uring: introduce io_cache_free() helper
io_uring/rsrc: skip NULL file/buffer checks in io_free_rsrc_node()
io_uring/rsrc: avoid NULL node check on io_sqe_buffer_register() failure
io_uring/rsrc: call io_free_node() on io_sqe_buffer_register() failure
io_uring/rsrc: free io_rsrc_node using kfree()
io_uring/rsrc: split out io_free_node() helper
io_uring/rsrc: include io_uring_types.h in rsrc.h
ublk: don't cast registered buffer index to int
io_uring/nop: use io_find_buf_node()
io_uring/rsrc: declare io_find_buf_node() in header file
io_uring/ublk: report error when unregister operation fails
io_uring: convert cmd_to_io_kiocb() macro to function
io_uring/uring_cmd: specify io_uring_cmd_import_fixed() pointer type
io_uring/rsrc: use rq_data_dir() to compute bvec dir
selftests: ublk: add ublk zero copy test
selftests: ublk: add file backed ublk
selftests: ublk: add kernel selftests for ublk
io_uring: cache nodes and mapped buffers
ublk: zc register/unregister bvec
io_uring: add support for kernel registered bvecs
...
Add a helper function io_cache_free() that returns an allocation to a
io_alloc_cache, falling back on kfree() if the io_alloc_cache is full.
This is the inverse of io_cache_alloc(), which takes an allocation from
an io_alloc_cache and falls back on kmalloc() if the cache is empty.
Convert 4 callers to use the helper.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Suggested-by: Li Zetao <lizetao1@huawei.com>
Link: https://lore.kernel.org/r/20250304194814.2346705-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_rsrc_node's of type IORING_RSRC_FILE always have a file attached
immediately after they are allocated. IORING_RSRC_BUFFER nodes won't be
returned from io_sqe_buffer_register()/io_buffer_register_bvec() until
they have a io_mapped_ubuf attached.
So remove the checks for a NULL file/buffer in io_free_rsrc_node().
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228235916.670437-5-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_sqe_buffer_register() currently calls io_put_rsrc_node() if it fails
to fully set up the io_rsrc_node. io_put_rsrc_node() is more involved
than necessary, since we already know the reference count will reach 0
and no io_mapped_ubuf has been attached to the node yet.
So just call io_free_node() to release the node's memory. This also
avoids the need to temporarily set the node's buf pointer to NULL.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228235916.670437-3-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_rsrc_node_alloc() calls io_cache_alloc(), which uses kmalloc() to
allocate the node. So it can be freed with kfree() instead of kvfree().
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228235916.670437-2-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Split the freeing of the io_rsrc_node from io_free_rsrc_node(), for use
with nodes that haven't been fully initialized.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228235916.670437-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_uring/rsrc.h uses several types from include/linux/io_uring_types.h.
Include io_uring_types.h explicitly in rsrc.h to avoid depending on
users of rsrc.h including io_uring_types.h first.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Li Zetao <lizetao1@huawei.com>
Link: https://lore.kernel.org/r/20250301183612.937529-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_buffer_register_bvec() takes index as an unsigned int argument, but
ublk_register_io_buf() casts ub_cmd->addr (a u64) to int. Remove the
misleading cast and instead pass index as an unsigned value to
ublk_register_io_buf() and ublk_unregister_io_buf().
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250301190317.950208-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Driver fixes for:
- tegra210 adma div_u64 divison and max page fixes
- Qualcomm Revert of unavailable register workaround which is causing
regression, fixes have been proposed but still gaps are present so revert
this for now
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmfElFMACgkQfBQHDyUj
g0e98A//d9ArTTSCR0LlQOVW1EZXxSY8UpT5KP3RmmIdQ4tYHNrFb+BCpXDJtK9E
PtRS9Oja3/1uHAfHi7Kmw6Fux8kVnqPE0irs6s6ULrKgqfYM9I0pBljzV2vJHQYr
sHQvSc6Gtc9iA5L6XtQC8a8u09I9uKciWikfypc//tZXyvgKpDTFEpcd5pJkIpAM
pmQNJJuuTmbEbmxbkax1GJigs++qrBjMDuhBOFDZbQjR7xa+vpmNd1HsThppHTYR
u9AFXh0rl4CnkQddWDukTkexg8/G8OlBKSjO8JacdlMOdcrnC5rl4w4DXA80FLpX
HYawyPANqk5w/x1olraBdkpsbnZIbc+GiDFOiML4B6jNPCb9CEqBWE9/X8/QYIkY
JW4/ERQ+xM4LRUDfdIPZKtHhUkQZW8tu1ewYIzQEqEl8HXRHskKTbEWsluPiLCg1
h360MXGfPPvBoI22uQGjLna567bfPq0pvzbPdvmHcq2MZ2iW1bY4T70qI7qlKukr
BOCVTOBz+idwvrMPJGeOJyYrXaAFl2NT1oPl8e/lmDrIKjDX+ZMuqY/utVXTmosw
w7/2moTLP6dKhH2JGZfRY9SEekAM6LmfHBUevd1Z7yz/5/8FOXplUGDlrzPWz3RB
Dc4M+NzOZlT3uUXsVB1WZQEWChROpp3ijJkaC/bBtjaSh9xGAWc=
=FdVA
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-fix-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine fixes from Vinod Koul:
- tegra210 div_u64 divison and max page fixes
- revert Qualcomm unavailable register workaround which is causing
regression, fixes have been proposed but still gaps are present so
revert this for now
* tag 'dmaengine-fix-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
dmaengine: Revert "dmaengine: qcom: bam_dma: Avoid writing unavailable register"
dmaengine: tegra210-adma: check for adma max page
dmaengine: tegra210-adma: Use div_u64 for 64 bit division
- rockchip phy kconfig dependency fix with USB_COMMON and regression fix
for old DT
- stm32 phy overflow assertion fix
- exonysfs phy refclk masks fix and power gate on exit fix
- freescale fix for clock dividor valid range
- TI regmap syscon register fix
- tegra reset registers on init fix
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmfEkhsACgkQfBQHDyUj
g0empg//Q+LYrbQbrmEWBPV7Ku5cOlWk3cGXHmzZyohr76NKDNci936gjEpjIkuG
77wuCil5dYw05Blbmo5oJySfsf0YFsIwOQPN5t+tEgxT3g+5KYddlhLRCKNEyg8l
6/LF/HdaZxFv6Ell9VSDMjlVLe63xpL2ZtYO9mJco22TNkDSQlutiMcq6GLDhsnk
exv15A25oiGZMIJ9BgpdfAa096Ze47KHVhof/WaH7q8rAgXoO+3wxJORzCuCqEEz
4ff/EVL+AZrBvOlEBDtN06a2Uj1KQkMNpdYvfrlWWhO5xGMBOnot4yZXTpGwHnre
j3g860vl1G1XjFStkgHxnnhbtIlqyepTEMgj+SShoD8oiG6eP9jeNJ5dtEzKDGNf
RxbH8Cf7tt0Va8Inibg5HzgFLfR5JMKQTKkPDpBErZlYEEnPdjUoJKavb6tKzvMY
i6/AeVfJGKabi5mPFEPOz007qbLW2a8wAXqJh/ynIanU/QwQFDpec/pavPY9MNax
//Zh6SQzaIcmVSmQop1sXzHCx/n0oBFFMod14aTaHRBGxx9tlxlwHt5suizzxVY6
ltfFh+iAOF1DzB0luCHKmlLk4HphpU5hq4ypEgmI9RjMoXFj6x6vajBSb4nq5zJm
064+0rKM4olEdWYEtnEYPPjQufQ+JQHnwBY8U03v2RK3QdhG+R4=
=fMZ6
-----END PGP SIGNATURE-----
Merge tag 'phy-fixes-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy fixes from Vinod Koul:
- rockchip phy kconfig dependency fix with USB_COMMON and regression
fix for old DT
- stm32 phy overflow assertion fix
- exonysfs phy refclk masks fix and power gate on exit fix
- freescale fix for clock dividor valid range
- TI regmap syscon register fix
- tegra reset registers on init fix
* tag 'phy-fixes-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
phy: tegra: xusb: reset VBUS & ID OVERRIDE
phy: ti: gmii-sel: Do not use syscon helper to build regmap
phy: exynos5-usbdrd: gs101: ensure power is gated to SS phy in phy_exit()
phy: freescale: fsl-samsung-hdmi: Limit PLL lock detection clock divider to valid range
phy: exynos5-usbdrd: fix MPLL_MULTIPLIER and SSC_REFCLKSEL masks in refclk
phy: stm32: Fix constant-value overflow assertion
phy: rockchip: naneng-combphy: compatible reset with old DT
phy: rockchip: fix Kconfig dependency more
- Fix a sporadic boot failure due to incorrect randomization of the
linear map on systems that support it
- Fix the zapping (both clearing the entries *and* invalidating the TLB)
of hugetlb PTEs constructed using the contiguous bit
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmfDdBIQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNN0GB/9gmEOX1GwMU6wFjPYqvjWlkGCFDwrldO84
uF9jEUbPaw3P4xHTOFyPCfEWidktqa+yDVbe90mB7GVOM+1eEZ81em1k1hYBEXbz
Q73Nl5VrNzxX4BjOrdxxoTSaR/TKklUh5mqWfIzy1RxEnBfpr/GuDPtUn1GViCAs
sU16Ju12UdYXn3tyHFDHpjZS9WYZskfnrvS0QvXinz0LahZrCkeaH+ptYHrTjMFx
hxyrRQwOlqLnZWvjLOegH9AC6uyRkKDinXKhXqHYvUfcfEkQsKwM7Fpc6cviUD0Q
X2npLNegnYxPniwmLpXfNXazPDnKVMzxb9lpqw1fZS3nAuh8XOde
=RqDZ
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Ryan's been hard at work finding and fixing mm bugs in the arm64 code,
so here's a small crop of fixes for -rc5.
The main changes are to fix our zapping of non-present PTEs for
hugetlb entries created using the contiguous bit in the page-table
rather than a block entry at the level above. Prior to these fixes, we
were pulling the contiguous bit back out of the PTE in order to
determine the size of the hugetlb page but this is clearly bogus if
the thing isn't present and consequently both the clearing of the
PTE(s) and the TLB invalidation were unreliable.
Although the problem was found by code inspection, we really don't
want this sitting around waiting to trigger and the changes are CC'd
to stable accordingly.
Note that the diffstat looks a lot worse than it really is;
huge_ptep_get_and_clear() now takes a size argument from the core code
and so all the arch implementations of that have been updated in a
pretty mechanical fashion.
- Fix a sporadic boot failure due to incorrect randomization of the
linear map on systems that support it
- Fix the zapping (both clearing the entries *and* invalidating the
TLB) of hugetlb PTEs constructed using the contiguous bit"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: hugetlb: Fix flush_hugetlb_tlb_range() invalidation level
arm64: hugetlb: Fix huge_ptep_get_and_clear() for non-present ptes
mm: hugetlb: Add huge page size param to huge_ptep_get_and_clear()
arm64/mm: Fix Boot panic on Ampere Altra
- Fix a regression where the enablement of the PHYs would be skipped
for device trees without any port child nodes. (me)
- Revert ATA_QUIRK_NOLPM for Samsung SSD 870 QVO drives, as it stops
systems from entering lower package states. LPM works on newer
firmware versions. We will need a more refined quirk that only
targets the older firmware versions. (me)
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRN+ES/c4tHlMch3DzJZDGjmcZNcgUCZ8LWcAAKCRDJZDGjmcZN
cgnDAP4gp/4Rly/E09WeSCFDtysqa6EriaUliSeNBZBCtZVIfgEAkTk/MjLxa4SR
qmfUe0XtjqZlFs/WyKvqwD+lSSxOKwA=
=LZ2Z
-----END PGP SIGNATURE-----
Merge tag 'ata-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fixes from Niklas Cassel:
- Fix a regression where the enablement of the PHYs would be skipped
for device trees without any port child nodes (me)
- Revert ATA_QUIRK_NOLPM for Samsung SSD 870 QVO drives, as it stops
systems from entering lower package states. LPM works on newer
firmware versions. We will need a more refined quirk that only
targets the older firmware versions (me)
* tag 'ata-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
Revert "ata: libata-core: Add ATA_QUIRK_NOLPM for Samsung SSD 870 QVO drives"
ata: ahci: Make ahci_ignore_port() handle empty mask_port_map
* Fix TCR_EL2 configuration to not use the ASID in TTBR1_EL2
and not mess-up T1SZ/PS by using the HCR_EL2.E2H==0 layout.
* Bring back the VMID allocation to the vcpu_load phase, ensuring
that we only setup VTTBR_EL2 once on VHE. This cures an ugly
race that would lead to running with an unallocated VMID.
RISC-V:
* Fix hart status check in SBI HSM extension
* Fix hart suspend_type usage in SBI HSM extension
* Fix error returned by SBI IPI and TIME extensions for
unsupported function IDs
* Fix suspend_type usage in SBI SUSP extension
* Remove unnecessary vcpu kick after injecting interrupt
via IMSIC guest file
x86:
* Fix an nVMX bug where KVM fails to detect that, after nested
VM-Exit, L1 has a pending IRQ (or NMI).
* To avoid freeing the PIC while vCPUs are still around, which
would cause a NULL pointer access with the previous patch,
destroy vCPUs before any VM-level destruction.
* Handle failures to create vhost_tasks
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmfCvVsUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPqGwf9FOWQRd/yCKHiufjPDefD1Og0DmgB
Dgk0nmHxaxbyPw+5vYlhn/J3vZ54sNngBpmUekE5OuBMZ9EsxXAK/myByHkzNnV9
cyLm4vYwpb9OQmbQ5MMdDlptYsjV40EmSfwwIJpBxjdkwAI3f7NgeHvG8EwkJgch
C+X4JMrLu2+BGo7BUhuE/xrB8h0CBRnhalB5aK1wuF+ey8v06zcU0zdQCRLUpOsx
mW9S0OpSpSlecvcblr0AhuajjHjwFaTFOQofaXaQFBW6kv3dXmSq/JRABEfx0TBb
MTUDQtnnaYvPy/RWwZIzBpgfASLQNQNxSJ7DIw9C8IG7k6rK25BSRwTmSw==
=afMB
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Fix TCR_EL2 configuration to not use the ASID in TTBR1_EL2 and not
mess-up T1SZ/PS by using the HCR_EL2.E2H==0 layout.
- Bring back the VMID allocation to the vcpu_load phase, ensuring
that we only setup VTTBR_EL2 once on VHE. This cures an ugly race
that would lead to running with an unallocated VMID.
RISC-V:
- Fix hart status check in SBI HSM extension
- Fix hart suspend_type usage in SBI HSM extension
- Fix error returned by SBI IPI and TIME extensions for unsupported
function IDs
- Fix suspend_type usage in SBI SUSP extension
- Remove unnecessary vcpu kick after injecting interrupt via IMSIC
guest file
x86:
- Fix an nVMX bug where KVM fails to detect that, after nested
VM-Exit, L1 has a pending IRQ (or NMI).
- To avoid freeing the PIC while vCPUs are still around, which would
cause a NULL pointer access with the previous patch, destroy vCPUs
before any VM-level destruction.
- Handle failures to create vhost_tasks"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: retry nx_huge_page_recovery_thread creation
vhost: return task creation error instead of NULL
KVM: nVMX: Process events on nested VM-Exit if injectable IRQ or NMI is pending
KVM: x86: Free vCPUs before freeing VM state
riscv: KVM: Remove unnecessary vcpu kick
KVM: arm64: Ensure a VMID is allocated before programming VTTBR_EL2
KVM: arm64: Fix tcr_el2 initialisation in hVHE mode
riscv: KVM: Fix SBI sleep_type use
riscv: KVM: Fix SBI TIME error generation
riscv: KVM: Fix SBI IPI error generation
riscv: KVM: Fix hart suspend_type use
riscv: KVM: Fix hart suspend status check
This reverts commit cc77e2ce18.
It was reported that adding ATA_QUIRK_NOLPM for Samsung SSD 870 QVO drives
breaks entering lower package states for certain systems.
It turns out that Samsung SSD 870 QVO actually has working LPM when using
a recent SSD firmware version.
The author of commit cc77e2ce18 ("ata: libata-core: Add ATA_QUIRK_NOLPM
for Samsung SSD 870 QVO drives") reported himself that only older SSD
firmware versions have broken LPM:
https://lore.kernel.org/stable/93c10d38-718c-459d-84a5-4d87680b4da7@debian.org/
Unfortunately, he did not specify which older firmware version he was using
which had broken LPM.
Let's revert this quirk, which has FW version field specified as NULL
(which means that it applies for all Samsung SSD 870 QVO firmware versions)
for now. Once the author reports which older firmware version(s) that are
broken, we can create a more fine grained quirk, which populates the FW
version field accordingly.
Fixes: cc77e2ce18 ("ata: libata-core: Add ATA_QUIRK_NOLPM for Samsung SSD 870 QVO drives")
Reported-by: Dieter Mummenschanz <dmummenschanz@web.de>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219747
Link: https://lore.kernel.org/r/20250228122603.91814-2-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
A VMM may send a non-fatal signal to its threads, including vCPU tasks,
at any time, and thus may signal vCPU tasks during KVM_RUN. If a vCPU
task receives the signal while its trying to spawn the huge page recovery
vhost task, then KVM_RUN will fail due to copy_process() returning
-ERESTARTNOINTR.
Rework call_once() to mark the call complete if and only if the called
function succeeds, and plumb the function's true error code back to the
call_once() invoker. This provides userspace with the correct, non-fatal
error code so that the VMM doesn't terminate the VM on -ENOMEM, and allows
subsequent KVM_RUN a succeed by virtue of retrying creation of the NX huge
page task.
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
[implemented the kvm user side]
Signed-off-by: Keith Busch <kbusch@kernel.org>
Message-ID: <20250227230631.303431-3-kbusch@meta.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Lets callers distinguish why the vhost task creation failed. No one
currently cares why it failed, so no real runtime change from this
patch, but that will not be the case for long.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Message-ID: <20250227230631.303431-2-kbusch@meta.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Declare io_find_buf_node() in io_uring/rsrc.h so it can be called from
other files.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250301001610.678223-1-csander@purestorage.com
[axboe: keep the inline for local hot path usage]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Indicate to userspace applications if a UBLK_IO_UNREGISTER_IO_BUF
command specifies an invalid buffer index by returning an error code.
Return -EINVAL if no buffer is registered with the given index, and
-EBUSY if the registered buffer is not a kernel bvec.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228231432.642417-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The cmd_to_io_kiocb() macro applies a pointer cast to its input without
parenthesizing it. Currently all inputs are variable names, so this has
the intended effect. But since casts have relatively high precedence,
the macro would apply the cast to the wrong value if the input was a
pointer addition, for example.
Turn the macro into a static inline function to ensure the pointer cast
is applied to the full input value.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228230305.630885-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_uring_cmd_import_fixed() takes a struct io_uring_cmd *, but the type
of the ioucmd parameter is void *. Make the pointer type explicit so the
compiler can type check it.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228221514.604350-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The macro rq_data_dir() already computes a request's data direction.
Use it in place of the if-else to set imu->dir.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228223057.615284-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
- Fix parsing cooling-maps in DT for trip points with more than one
cooling device (Rafael Wysocki).
- Fix granted_power computation in the Power Allocator thermal
governor and make it update total_weight on configuration changes
after the thermal zone has been registered (Yu-Che Cheng).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmfCI+wSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxiaQP/jo3HmmpvrN8ZNURzAFKa5lx8i1eNHdp
npEN9FBL22IcTjoZKjoFUoniC1PBU1NRGz3v9Wz0jXLDQNiAx/apYwuUTFFafWIN
hDjssxANylbuEIe4yyyiGDo+YoZ7ANUCutJJL53Y2f4BqD4x3iQbAnTY8HGdEKdB
WmRMYiC/1JnBW+LaFKl0d8n7Q7Yk39FvgECiELT1HPyZmLGh3pkEVzURjEyzRsgF
Sb85SWO2YOd5Vth2Ujwe2cbUkNBN+jUnLaMqY+NxQG5Ee57gKgLF9XvUaV2X1WzG
ceH5gy+P7/eYHeosjacCyCtVKrgVzVz4RLhoYsvJMvWcEUYmhmaB4eqUmSEuQuCX
W/WsXxvAsks+ZlzGlIASksxfvkNwegXAOOUdPuUocrc3Ft1nK4w0UOnrGHBHy/lJ
BrDRvIt6a4vtQrLGugvB96rUVRrx3TqjePrU/DZ3vrpLqVyXSdADtf8eRdmV9Aow
53iQNf3Q1u4UTzomKB0ilPEo79PAfdOwZIYQ6KF+nF/wf/lgllrlaWvQhx0EG7QX
Dcyg/Z0cZSOfRr/06cWd/cXJVOqL8lb+nTn8wWKE0KHOM59sBHhlN1Nk6LdSSSMq
Duj7t0fDT1F9kN9UUM9CZtw+vZmL2igouAscugq81Czuai4hvXUibFygAraMJ995
tQKsbZvhlgQC
=0H8e
-----END PGP SIGNATURE-----
Merge tag 'thermal-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fixes from Rafael Wysocki:
"These fix the processing of DT thermal properties and the Power
Allocator thermal governor:
- Fix parsing cooling-maps in DT for trip points with more than one
cooling device (Rafael Wysocki)
- Fix granted_power computation in the Power Allocator thermal
governor and make it update total_weight on configuration changes
after the thermal zone has been registered (Yu-Che Cheng)"
* tag 'thermal-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: gov_power_allocator: Update total_weight on bind and cdev updates
thermal/of: Fix cdev lookup in thermal_of_should_bind()
thermal: gov_power_allocator: Fix incorrect calculation in divvy_up_power()
Fix the handling of processors that stop the TSC in deeper C-states in
the intel_idle driver (Thomas Gleixner).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmfCJR4SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxVtYP+wVfduKYmvsgbsgqTjx/jnt8HA7WaYCk
CgYp/D3sWPk5RHNpse1fpx1Ls4FPAxu/KV1bp3PHOc0AAU9dIvw5aIaKF4BPaFUN
zQhcIPD0HDAEkYBsevCReKrjB+Woo40locdwjDhrGoH1EQ5vsGfhEN3aj/c9GxcU
KUFxMoxReMWxcDb6mQNuGayXaCZm+eIb2EotFqG/Upq2o9JGARurl/gPArbbpdO6
wgP65CVZK5xEnmLYGf2dz3Zxl+IwyUGEaThr4ssx5UzTiWpMNKsduKyCdJXMBgSG
3Z0wVG/l4RIcYTYW9rbTfHSnfKS017KJQwc6HyhY5wYNQfC3636OFADCtG24+qzP
FChmnlujfTBo1BBEbd6rLIciw07VJBLdunK5gYcOMs7Ess0SCy38fSdL8tJyQ8gK
Hvil52mOdwsQKkXL5tXaXy9kFbnafLtqoETF3PKCdMzcBDH6KTgB6GXhFDJbKuKK
S2gRaHjxhoZMsOkpuWFFEGr/mGF8mzX6u+ewUxlgpsIHUnx5LQFYuq/i3rX2h923
eGGj/LFRNOh2H1lq8CjYWpohFtIjnw1ATLpE9fOjrQi9GRICsCPXAKfUTN4QKKpY
bcyR/DfnQ1AeIJbT4Dfo4IHPl067hGZ2r84AKPmn1WsGmSXjruzxYxh4C4wGw6n+
JBLB/9Oh8Y9W
=Qqmz
-----END PGP SIGNATURE-----
Merge tag 'pm-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Fix the handling of processors that stop the TSC in deeper C-states in
the intel_idle driver (Thomas Gleixner)"
* tag 'pm-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
intel_idle: Handle older CPUs, which stop the TSC in deeper C states, correctly
- Fix conflicts between devicetree and ACPI SMP discovery & setup
- Fix a warm-boot lockup on AMD SC1100 SoC systems
- Fix a W=1 build warning related to x86 IRQ trace event setup
- Fix a kernel-doc warning
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmfCDUQRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1ggUw/8Dh5RMxsOyqlZSoBeUTLpeN/+A31TlPZZ
QlwvkiJUpcfh2ySg6fIBSotVtaq4/6sHBwbskZBLsYfwxDJOt4/y+pijH1lr9Slg
5hAU199PS+snZxmaBFeCzhsBjp+iEK5HWQmmYuDLb6M7To6fpjFKDkTpBaRHpPhb
jiZxE19/xv50W+DrVc1f3WJliPT/+t6i87riH5R0Cdd9rdc8deI4vHlrUNF30d5C
Lq/akzq1tzvj15YO6LcEUiEtyu+05KMym74kCPWlCFo9ZqLgcrsA6I+jDiR111rv
XyXj7XutrFGFvm/CkbzqF9CmvImRvZWYWssIaZ4v6ealZZhUFEHzK76UVBd+Nv1t
WmJ427UjoJc4zAD3wHa4acsBZFOagea4I0pwumpZa6dnUxPYgIYbuhvvbUQK6De0
RBgk1oI4rMQE3+W6uW9xec7m3UP997r/uu7EivWR5eb4R0x/x0Foo9mwoMu0krbE
bk0lQ1D0BN5gFSOa7mFrHziqOJGRHIOdPzejElp6FqS+50jsYMzHp7uXuf2gf/zX
RfzHyclMPGzAr0oL9Nk9/edEP9uaHgR2Nwecs0lcAVVn5nuBCuZ8GcD7Mrk5hIJc
Vp97vhJC4+3cRRAdQUFks/PEQs+dBUdRhY5k2ZpT3rEe2IMi7k53CFdXRqQtwga4
TCHglg7rIxk=
=ueF7
-----END PGP SIGNATURE-----
Merge tag 'x86-urgent-2025-02-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix conflicts between devicetree and ACPI SMP discovery & setup
- Fix a warm-boot lockup on AMD SC1100 SoC systems
- Fix a W=1 build warning related to x86 IRQ trace event setup
- Fix a kernel-doc warning
* tag 'x86-urgent-2025-02-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/entry: Fix kernel-doc warning
x86/irq: Define trace events conditionally
x86/CPU: Fix warm boot hang regression on AMD SC1100 SoC systems
x86/of: Don't use DTB for SMP setup if ACPI is enabled
on PREEMPT_NONE and PREEMPT_VOLUNTARY kernels.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmfCDDMRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1gf+RAAvFNXelLgrNbILZ6ckp/ikWnjCbf2QOIk
aCm6JMQm7WrFvgo1u6CM4vQQYZdEqf8+KiEjJJnoq2P4jvYzhO1/1pLfEDNaHeiH
GneosmKAwSMR8lgDlw5DXxhXsfeuYYhG5VMe2ia+kyiIA83TUF6hl9jpawWB3dsw
+xB6CAg3JLoR2v44E/Mf1PdGaGrF90fYxp+X5RNSqxVXcN54cgVx2G9lHeTIWcnp
SjIiWo5mply50de+dxD5dNUB9mj/k+yLQaiuPfUDGo/ZOjFyBnsP5VlD+ySbhkIa
Rwdw6olLqXLcX5D5RsPIuePm/XdmAQXr6GXxJjdhtV1oWTP3Bejev3upQ/kxHQ50
DQa+aSTqNx9bNlwphUafCmVo1OZap4mViOSWP7r96HhFwehLGGmkjEaU9eFuUl0P
kG+qGq28U+Nnz0r6/pEkwic1B6wbq2x1XRbtJqxXnBcQvMxMgDWNrTIj1ytDcSBb
3Qo0shRrtjH7DN1ly8IBllLQ0wXXI5O6GwjI7absEyEjpdoxFyMsHpaFONlTWRdi
NgR2+5MWTxExeWaDRPAJM+THzwucfWVTeZVXJFMRfQnNIBj7TpO3X3Y4xzP9Vl/Y
2HEz8voSDZUVN6Ejxx/am7kb68WpWw46xmj59wWT7nf9SVEEm+R4Pfe3O9+0yvQV
V4l6tN4yfEU=
=RknP
-----END PGP SIGNATURE-----
Merge tag 'sched-urgent-2025-02-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
"Prevent cond_resched() based preemption when interrupts are disabled,
on PREEMPT_NONE and PREEMPT_VOLUNTARY kernels"
* tag 'sched-urgent-2025-02-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Prevent rescheduling when interrupts are disabled
- Fix missing RCU protection in perf_iterate_ctx()
- Fix pmu_ctx_list ordering bug
- Reject the zero page in uprobes
- Fix a family of bugs related to low frequency sampling
- Add Intel Arrow Lake U CPUs to the generic Arrow Lake
RAPL support table
- Fix a lockdep-assert false positive in uretprobes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmfCCxQRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1iT1RAAtG0sbai0gJ2OMEOIAZROKKwkFMfx67Vl
ZrcnPCkXK7mL6WQNErFTjdi7Cv2fsMP2WfMHv92ddgrlQZ02EosMKKro1q8ZEd18
DpJmfujNGDkM0VDHd1Lg4/vPQw1RuEY85kqxRKIr5xtoKFR1sxNNNFwsWWeECbKW
QnfzJsk1nFQUPHcD+FlLeyTnb6MgnLdcPMnXWLC4qVRBTHxCi/TS1XXhFHah31Rv
kiCdEHMVUA1WXrPl+1I0DW/EjugcTWTB6cXat9YBZpsR2ZsVNrfNgdBtjCn0zEuf
U3g8gQ/jm9GaZ1Q0ozTsklZlcH8JtOskYOaYiinN7lh5QWYlI2AWTnl6EZxrIKmV
sw2LCl1BQLQocCr9GC+99Golv3U5FvxvRgTIBTzJs2t2WZtjF5Ceg1gwy12zLTKw
VSGlLQZz55uHsgl3g37oNhNA0q4BbtuINlZWU6hHWjUEEeogVTjbSucv+8zFI+Dk
0tupuNF5xQB55D5KZ2EhCFgmSFWvjq1K9piM0HuHk8yrFYhHWoSPp5rg4XyYFpBC
o3nJfkOL5hEVGJoeV2wo1CTs6SZNgWBNuV+9MyCS/sTDM2Ggj0x8Vl+d/ewVi7iO
WE2Xksp5awRPv/m+a/XIPc+xQMecnOELVj3RrQZ8AzNUSvfKmv01BqjqcOa0wdgW
9EJeG6U2msQ=
=e7a/
-----END PGP SIGNATURE-----
Merge tag 'perf-urgent-2025-02-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event fixes from Ingo Molnar:
"Miscellaneous perf events fixes and a minor HW enablement change:
- Fix missing RCU protection in perf_iterate_ctx()
- Fix pmu_ctx_list ordering bug
- Reject the zero page in uprobes
- Fix a family of bugs related to low frequency sampling
- Add Intel Arrow Lake U CPUs to the generic Arrow Lake RAPL support
table
- Fix a lockdep-assert false positive in uretprobes"
* tag 'perf-urgent-2025-02-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
uprobes: Remove too strict lockdep_assert() condition in hprobe_expire()
perf/x86/rapl: Add support for Intel Arrow Lake U
perf/x86/intel: Use better start period for frequency mode
perf/core: Fix low freq setting via IOC_PERIOD
perf/x86: Fix low freqency setting issue
uprobes: Reject the shared zeropage in uprobe_write_opcode()
perf/core: Order the PMU list to fix warning about unordered pmu_ctx_list
perf/core: Add RCU read lock protection to perf_iterate_ctx()
build warnings that happens on PIE-enabled architectures
such as LoongArch.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmfCCV0RHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1jHsA//RLrKlu/nvMO0l03s9ndWaunz0Nih/Z54
JJm4Ig+NLYY8VWLH+q6xfII8LT0oUeLh4OtmRj1vMhDbi2DnWozzDA1cvh6Fhyfj
YG9Xgzej7ZPU43AVlItl80TbG/xfEdD5yxZx8WC6/WUI17XcpTkDC9/NG/NfrUjY
2a+A8nJk/qa/M+qCHf7ugDxMQFrOVRDDtdBSRAUbifqgygKO+AtXCIyeBfbtVkyV
GEWusjO4lsyiY1dzG911h+ROIl/hp6Y3B940PSXyiwjs8JFJJplECnrnwROvRhTm
xUDFeszEh40i6zHu2dzFf0up9vFCFZGpjzvMTc0Dt50i4Z3OxbxXW+Ust2YjpSms
Ux+MHsH850UoXS/4QB7R2RJndqLTvsYkcu8uhOc2izVkWvTORba0+SMguKuTu1xI
MDKozZxtH1DPCtMcZtNIgVlQEwsioKaHUS/PgXhdqw8+fg3ur9FEXp8Q2ubUm7XJ
VJm6nF0RvFZNiIIDm81O1Z1RmmxUuAlAHWIOREaNzTSHu3ptBjBOtRrBTem2WAMa
9g1n4GoB7HI96TG36ik0/1oZhEAhawEwT/HhehVpPJyKNQooADYqhhPpfysyGJzq
mp+oOdf0QYfD0+M4oqmEGN0fhIlNobK7ap2O+t7HPcJwO1Om3h6WCkyijixj6xPt
cOuqhG+GKSk=
=16p4
-----END PGP SIGNATURE-----
Merge tag 'objtool-urgent-2025-02-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Ingo Molnar:
"Fix an objtool false positive, and objtool related build warnings that
happens on PIE-enabled architectures such as LoongArch"
* tag 'objtool-urgent-2025-02-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Add bch2_trans_unlocked_or_in_restart_error() to bcachefs noreturns
objtool: Fix C jump table annotations for Clang
vmlinux.lds: Ensure that const vars with relocations are mapped R/O
- Fix crash from bad histogram entry
An error path in the histogram creation could leave an entry
in a link list that gets freed. Then when a new entry is added
it can cause a u-a-f bug. This is fixed by restructuring the code
so that the histogram is consistent on failure and everything is
cleaned up appropriately.
- Fix fprobe self test
The fprobe self test relies on no function being attached by ftrace.
BPF programs can attach to functions via ftrace and systemd now
does so. This causes those functions to appear in the enabled_functions
list which holds all functions attached by ftrace. The selftest also
uses that file to see if functions are being connected correctly.
It counts the functions in the file, but if there's already functions
in the file, it fails. Instead, add the number of functions in the file
at the start of the test to all the calculations during the test.
- Fix potential division by zero of the function profiler stddev
The calculated divisor that calculates the standard deviation of
the function times can overflow. If the overflow happens to land
on zero, that can cause a division by zero. Check for zero from
the calculation before doing the division.
TODO: Catch when it ever overflows and report it accordingly.
For now, just prevent the system from crashing.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ8HqYBQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qpoXAP90gvO2LfjItjZVjBYudr4GOzcsjAAK
cZ2vL2LJp3hT4QD+Kud2YaZqzrV8tvFFBikO7FvEV3zZpnw48895pIgcoww=
=NLe0
-----END PGP SIGNATURE-----
Merge tag 'trace-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix crash from bad histogram entry
An error path in the histogram creation could leave an entry in a
link list that gets freed. Then when a new entry is added it can
cause a u-a-f bug. This is fixed by restructuring the code so that
the histogram is consistent on failure and everything is cleaned up
appropriately.
- Fix fprobe self test
The fprobe self test relies on no function being attached by ftrace.
BPF programs can attach to functions via ftrace and systemd now does
so. This causes those functions to appear in the enabled_functions
list which holds all functions attached by ftrace. The selftest also
uses that file to see if functions are being connected correctly. It
counts the functions in the file, but if there's already functions in
the file, it fails. Instead, add the number of functions in the file
at the start of the test to all the calculations during the test.
- Fix potential division by zero of the function profiler stddev
The calculated divisor that calculates the standard deviation of the
function times can overflow. If the overflow happens to land on zero,
that can cause a division by zero. Check for zero from the
calculation before doing the division.
TODO: Catch when it ever overflows and report it accordingly. For
now, just prevent the system from crashing.
* tag 'trace-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ftrace: Avoid potential division by zero in function_stat_show()
selftests/ftrace: Let fprobe test consider already enabled functions
tracing: Fix bad hist from corrupting named_triggers list