Documentation/core-api: add swiotlb documentation

There's currently no documentation for the swiotlb. Add documentation
describing usage scenarios, the key APIs, and implementation details. Group
the new documentation with other DMA-related documentation.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Petr Tesarik <petr@tesarici.cz>
Signed-off-by: Christoph Hellwig <hch@lst.de>

Parent commit: e67572cd22
This commit:   c93f261dfc

2 changed files with 322 additions and 0 deletions
Documentation/core-api/index.rst:

@@ -102,6 +102,7 @@ more memory-management documentation in Documentation/mm/index.rst.
    dma-api-howto
    dma-attributes
    dma-isa-lpc
+   swiotlb
    mm-api
    genalloc
    pin_user_pages

Documentation/core-api/swiotlb.rst (new file, 321 lines):

@@ -0,0 +1,321 @@
.. SPDX-License-Identifier: GPL-2.0

===============
DMA and swiotlb
===============

swiotlb is a memory buffer allocator used by the Linux kernel DMA layer. It is
typically used when a device doing DMA can't directly access the target memory
buffer because of hardware limitations or other requirements. In such a case,
the DMA layer calls swiotlb to allocate a temporary memory buffer that conforms
to the limitations. The DMA is done to/from this temporary memory buffer, and
the CPU copies the data between the temporary buffer and the original target
memory buffer. This approach is generically called "bounce buffering", and the
temporary memory buffer is called a "bounce buffer".

Device drivers don't interact directly with swiotlb. Instead, drivers inform
the DMA layer of the DMA attributes of the devices they are managing, and use
the normal DMA map, unmap, and sync APIs when programming a device to do DMA.
These APIs use the device DMA attributes and kernel-wide settings to determine
if bounce buffering is necessary. If so, the DMA layer manages the allocation,
freeing, and sync'ing of bounce buffers. Since the DMA attributes are per
device, some devices in a system may use bounce buffering while others do not.
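
As a driver-side illustration, the following hedged sketch uses only the
generic DMA mapping API; whether the transfer is bounce buffered is decided
entirely inside the DMA layer. The device, buffer, and length names are made
up for the example.

.. code-block:: c

   #include <linux/dma-mapping.h>

   /* Illustrative only: "dev", "buf", and "len" stand in for a real driver's
    * device and payload. The driver never mentions swiotlb; if the DMA layer
    * decides bounce buffering is needed, it happens behind these calls.
    */
   static int example_do_dma(struct device *dev, void *buf, size_t len)
   {
           dma_addr_t dma_handle;

           dma_handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
           if (dma_mapping_error(dev, dma_handle))
                   return -ENOMEM;

           /* ... program the device with dma_handle and wait for completion ... */

           dma_unmap_single(dev, dma_handle, len, DMA_TO_DEVICE);
           return 0;
   }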

Because the CPU copies data between the bounce buffer and the original target
memory buffer, doing bounce buffering is slower than doing DMA directly to the
original memory buffer, and it consumes more CPU resources. So it is used only
when necessary for providing DMA functionality.

Usage Scenarios
---------------
swiotlb was originally created to handle DMA for devices with addressing
limitations. As physical memory sizes grew beyond 4 GiB, some devices could
only provide 32-bit DMA addresses. By allocating bounce buffer memory below
the 4 GiB line, these devices with addressing limitations could still work and
do DMA.

More recently, Confidential Computing (CoCo) VMs have the guest VM's memory
encrypted by default, and the memory is not accessible by the host hypervisor
and VMM. For the host to do I/O on behalf of the guest, the I/O must be
directed to guest memory that is unencrypted. CoCo VMs set a kernel-wide option
to force all DMA I/O to use bounce buffers, and the bounce buffer memory is set
up as unencrypted. The host does DMA I/O to/from the bounce buffer memory, and
the Linux kernel DMA layer does "sync" operations to cause the CPU to copy the
data to/from the original target memory buffer. The CPU copying bridges between
the unencrypted and the encrypted memory. This use of bounce buffers allows
device drivers to "just work" in a CoCo VM, with no modifications needed to
handle the memory encryption complexity.

Other edge case scenarios arise for bounce buffers. For example, when IOMMU
mappings are set up for a DMA operation to/from a device that is considered
"untrusted", the device should be given access only to the memory containing
the data being transferred. But if that memory occupies only part of an IOMMU
granule, other parts of the granule may contain unrelated kernel data. Since
IOMMU access control is per-granule, the untrusted device can gain access to
the unrelated kernel data. This problem is solved by bounce buffering the DMA
operation and ensuring that unused portions of the bounce buffers do not
contain any unrelated kernel data.

Core Functionality
------------------
The primary swiotlb APIs are swiotlb_tbl_map_single() and
swiotlb_tbl_unmap_single(). The "map" API allocates a bounce buffer of a
specified size in bytes and returns the physical address of the buffer. The
buffer memory is physically contiguous. The expectation is that the DMA layer
maps the physical memory address to a DMA address, and returns the DMA address
to the driver for programming into the device. If a DMA operation specifies
multiple memory buffer segments, a separate bounce buffer must be allocated for
each segment. swiotlb_tbl_map_single() always does a "sync" operation (i.e., a
CPU copy) to initialize the bounce buffer to match the contents of the original
buffer.
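
The per-segment behavior is visible at the driver level: when a request is
described by a scatterlist, each entry is mapped on its own and may get its
own bounce buffer. A hedged sketch using the scatter-gather DMA API
(identifiers are illustrative):

.. code-block:: c

   #include <linux/dma-mapping.h>
   #include <linux/scatterlist.h>

   /* Illustrative only: each scatterlist entry is mapped individually, so
    * when bounce buffering applies, each segment gets a separate bounce
    * buffer.
    */
   static int example_map_sg(struct device *dev, struct scatterlist *sgl,
                             int nents)
   {
           int mapped;

           mapped = dma_map_sg(dev, sgl, nents, DMA_FROM_DEVICE);
           if (mapped == 0)
                   return -EIO;

           /* ... program the device with the mapped segments ... */

           dma_unmap_sg(dev, sgl, nents, DMA_FROM_DEVICE);
           return 0;
   }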

swiotlb_tbl_unmap_single() does the reverse. If the DMA operation might have
updated the bounce buffer memory and DMA_ATTR_SKIP_CPU_SYNC is not set, the
unmap does a "sync" operation to cause a CPU copy of the data from the bounce
buffer back to the original buffer. Then the bounce buffer memory is freed.
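
A hedged sketch of how a driver can suppress that copy-back when it knows the
CPU does not need to see anything the device wrote (identifiers are
illustrative):

.. code-block:: c

   #include <linux/dma-mapping.h>

   /* Illustrative only: DMA_ATTR_SKIP_CPU_SYNC tells the unmap path not to
    * copy the bounce buffer back to the original buffer. This is safe only
    * when the driver has already synced the parts it cares about, or the
    * device wrote nothing the CPU needs.
    */
   static void example_unmap_no_copyback(struct device *dev,
                                         dma_addr_t dma_handle, size_t len)
   {
           dma_unmap_single_attrs(dev, dma_handle, len, DMA_FROM_DEVICE,
                                  DMA_ATTR_SKIP_CPU_SYNC);
   }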

swiotlb also provides "sync" APIs that correspond to the dma_sync_*() APIs that
a driver may use when control of a buffer transitions between the CPU and the
device. The swiotlb "sync" APIs cause a CPU copy of the data between the
original buffer and the bounce buffer. Like the dma_sync_*() APIs, the swiotlb
"sync" APIs support doing a partial sync, where only a subset of the bounce
buffer is copied to/from the original buffer.
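
A hedged sketch of the driver-level pattern that triggers those copies when a
long-lived streaming mapping is reused (identifiers are illustrative):

.. code-block:: c

   #include <linux/dma-mapping.h>

   /* Illustrative only: ownership of the buffer moves between device and
    * CPU. With bounce buffering in effect, each sync call results in a CPU
    * copy between the original buffer and the bounce buffer (or a subset of
    * it, for a partial sync).
    */
   static void example_reuse_mapping(struct device *dev, dma_addr_t dma_handle,
                                     size_t len)
   {
           /* Device finished writing; let the CPU look at the data. */
           dma_sync_single_for_cpu(dev, dma_handle, len, DMA_FROM_DEVICE);

           /* ... CPU examines the data ... */

           /* Hand the buffer back to the device for the next transfer. */
           dma_sync_single_for_device(dev, dma_handle, len, DMA_FROM_DEVICE);
   }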

Core Functionality Constraints
------------------------------
The swiotlb map/unmap/sync APIs must operate without blocking, as they are
called by the corresponding DMA APIs which may run in contexts that cannot
block. Hence the default memory pool for swiotlb allocations must be
pre-allocated at boot time (but see Dynamic swiotlb below). Because swiotlb
allocations must be physically contiguous, the entire default memory pool is
allocated as a single contiguous block.

The need to pre-allocate the default swiotlb pool creates a boot-time tradeoff.
The pool should be large enough to ensure that bounce buffer requests can
always be satisfied, as the non-blocking requirement means requests can't wait
for space to become available. But a large pool potentially wastes memory, as
this pre-allocated memory is not available for other uses in the system. The
tradeoff is particularly acute in CoCo VMs that use bounce buffers for all DMA
I/O. These VMs use a heuristic to set the default pool size to ~6% of memory,
with a max of 1 GiB, which has the potential to be very wasteful of memory.
Conversely, the heuristic might produce a size that is insufficient, depending
on the I/O patterns of the workload in the VM. The dynamic swiotlb feature
described below can help, but has limitations. Better management of the swiotlb
default memory pool size remains an open issue.

A single allocation from swiotlb is limited to IO_TLB_SIZE * IO_TLB_SEGSIZE
bytes, which is 256 KiB with current definitions. When a device's DMA settings
are such that the device might use swiotlb, the maximum size of a DMA segment
must be limited to that 256 KiB. This value is communicated to higher-level
kernel code via dma_max_mapping_size() and swiotlb_max_mapping_size(). If the
higher-level code fails to account for this limit, it may make requests that
are too large for swiotlb, and get a "swiotlb full" error.
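
A hedged sketch of how higher-level code can honor that limit by capping its
per-segment size with what the DMA layer reports (identifiers are
illustrative):

.. code-block:: c

   #include <linux/dma-mapping.h>
   #include <linux/minmax.h>

   /* Illustrative only: cap a driver's per-request segment size by what the
    * DMA layer (and therefore swiotlb, if it is in use) can map in one go.
    */
   static size_t example_segment_limit(struct device *dev, size_t preferred)
   {
           return min(preferred, dma_max_mapping_size(dev));
   }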

A key device DMA setting is "min_align_mask", which is a power of 2 minus 1
so that some number of low order bits are set, or it may be zero. swiotlb
allocations ensure these min_align_mask bits of the physical address of the
bounce buffer match the same bits in the address of the original buffer. When
min_align_mask is non-zero, it may produce an "alignment offset" in the address
of the bounce buffer that slightly reduces the maximum size of an allocation.
This potential alignment offset is reflected in the value returned by
swiotlb_max_mapping_size(), which can show up in places like
/sys/block/<device>/queue/max_sectors_kb. For example, if a device does not use
swiotlb, max_sectors_kb might be 512 KiB or larger. If a device might use
swiotlb, max_sectors_kb will be 256 KiB. When min_align_mask is non-zero,
max_sectors_kb might be even smaller, such as 252 KiB.
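
Drivers opt in to this behavior explicitly. A hedged sketch of the call,
modeled on how storage drivers preserve offsets within a 4 KiB block (the mask
value here is only an example):

.. code-block:: c

   #include <linux/dma-mapping.h>
   #include <linux/sizes.h>

   /* Illustrative only: ask the DMA layer to preserve the low 12 address
    * bits of the original buffer in any bounce buffer, so offsets within a
    * 4 KiB block look the same to the device whether or not swiotlb is used.
    */
   static void example_set_min_align(struct device *dev)
   {
           dma_set_min_align_mask(dev, SZ_4K - 1);
   }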

swiotlb_tbl_map_single() also takes an "alloc_align_mask" parameter. This
parameter specifies that the allocation of bounce buffer space must start at a
physical address with the alloc_align_mask bits set to zero. But the actual
bounce buffer might start at a larger address if min_align_mask is non-zero.
Hence there may be pre-padding space that is allocated prior to the start of
the bounce buffer. Similarly, the end of the bounce buffer is rounded up to an
alloc_align_mask boundary, potentially resulting in post-padding space. Any
pre-padding or post-padding space is not initialized by swiotlb code. The
"alloc_align_mask" parameter is used by IOMMU code when mapping for untrusted
devices. It is set to the granule size - 1 so that the bounce buffer is
allocated entirely from granules that are not used for any other purpose.

Data structures concepts
------------------------
Memory used for swiotlb bounce buffers is allocated from overall system memory
as one or more "pools". The default pool is allocated during system boot with a
default size of 64 MiB. The default pool size may be modified with the
"swiotlb=" kernel boot line parameter. The default size may also be adjusted
due to other conditions, such as running in a CoCo VM, as described above. If
CONFIG_SWIOTLB_DYNAMIC is enabled, additional pools may be allocated later in
the life of the system. Each pool must be a contiguous range of physical
memory. The default pool is allocated below the 4 GiB physical address line so
it works for devices that can only address 32 bits of physical memory (unless
architecture-specific code provides the SWIOTLB_ANY flag). In a CoCo VM, the
pool memory must be decrypted before swiotlb is used.

Each pool is divided into "slots" of size IO_TLB_SIZE, which is 2 KiB with
current definitions. IO_TLB_SEGSIZE contiguous slots (128 slots) constitute
what might be called a "slot set". When a bounce buffer is allocated, it
occupies one or more contiguous slots. A slot is never shared by multiple
bounce buffers. Furthermore, a bounce buffer must be allocated from a single
slot set, which leads to the maximum bounce buffer size being IO_TLB_SIZE *
IO_TLB_SEGSIZE. Multiple smaller bounce buffers may co-exist in a single slot
set if the alignment and size constraints can be met.
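
The 256 KiB ceiling mentioned earlier follows directly from these two
constants. A hedged restatement of the current definitions from
include/linux/swiotlb.h (the values could change in future kernels):

.. code-block:: c

   #include <linux/swiotlb.h>

   /* Current definitions, restated for illustration:
    *   IO_TLB_SHIFT   = 11   ->  IO_TLB_SIZE = 1 << 11 = 2 KiB per slot
    *   IO_TLB_SEGSIZE = 128  ->  slots per "slot set"
    * Maximum single bounce buffer: 2 KiB * 128 = 256 KiB.
    */
   static const size_t example_max_bounce_bytes = IO_TLB_SIZE * IO_TLB_SEGSIZE;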

Slots are also grouped into "areas", with the constraint that a slot set exists
entirely in a single area. Each area has its own spin lock that must be held to
manipulate the slots in that area. The division into areas avoids contending
for a single global spin lock when swiotlb is heavily used, such as in a CoCo
VM. The number of areas defaults to the number of CPUs in the system for
maximum parallelism, but since an area can't be smaller than IO_TLB_SEGSIZE
slots, it might be necessary to assign multiple CPUs to the same area. The
number of areas can also be set via the "swiotlb=" kernel boot parameter.
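
As a hedged example of the boot-line syntax (see the "swiotlb=" entry in
Documentation/admin-guide/kernel-parameters.txt), the first value is the
number of slots and the optional second value is the number of areas, so the
following requests a 131072 * 2 KiB = 256 MiB default pool split into 4
areas::

    swiotlb=131072,4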

When allocating a bounce buffer, if the area associated with the calling CPU
does not have enough free space, areas associated with other CPUs are tried
sequentially. For each area tried, the area's spin lock must be obtained before
trying an allocation, so contention may occur if swiotlb is relatively busy
overall. But an allocation request fails only if none of the areas has enough
free space.

IO_TLB_SIZE, IO_TLB_SEGSIZE, and the number of areas must all be powers of 2 as
the code uses shifting and bit masking to do many of the calculations. The
number of areas is rounded up to a power of 2 if necessary to meet this
requirement.

The default pool is allocated with PAGE_SIZE alignment. If an alloc_align_mask
argument to swiotlb_tbl_map_single() specifies a larger alignment, one or more
initial slots in each slot set might not meet the alloc_align_mask criterion.
Because a bounce buffer allocation can't cross a slot set boundary, eliminating
those initial slots effectively reduces the max size of a bounce buffer.
Currently, there's no problem because alloc_align_mask is set based on IOMMU
granule size, and granules cannot be larger than PAGE_SIZE. But if that were to
change in the future, the initial pool allocation might need to be done with
alignment larger than PAGE_SIZE.

Dynamic swiotlb
---------------
When CONFIG_SWIOTLB_DYNAMIC is enabled, swiotlb can do on-demand expansion of
the amount of memory available for allocation as bounce buffers. If a bounce
buffer request fails due to lack of available space, an asynchronous background
task is kicked off to allocate memory from general system memory and turn it
into a swiotlb pool. Creating an additional pool must be done asynchronously
because the memory allocation may block, and as noted above, swiotlb requests
are not allowed to block. Once the background task is kicked off, the bounce
buffer request creates a "transient pool" to avoid returning a "swiotlb full"
error. A transient pool has the size of the bounce buffer request, and is
deleted when the bounce buffer is freed. Memory for this transient pool comes
from the general system memory atomic pool so that creation does not block.
Creating a transient pool has relatively high cost, particularly in a CoCo VM
where the memory must be decrypted, so it is done only as a stopgap until the
background task can add another non-transient pool.

Adding a dynamic pool has limitations. Like with the default pool, the memory
must be physically contiguous, so the size is limited to MAX_PAGE_ORDER pages
(e.g., 4 MiB on a typical x86 system). Due to memory fragmentation, a max size
allocation may not be available. The dynamic pool allocator tries smaller sizes
until it succeeds, but with a minimum size of 1 MiB. Given sufficient system
memory fragmentation, dynamically adding a pool might not succeed at all.

The number of areas in a dynamic pool may be different from the number of areas
in the default pool. Because the new pool size is typically a few MiB at most,
the number of areas will likely be smaller. For example, with a new pool size
of 4 MiB and the 256 KiB minimum area size, only 16 areas can be created. If
the system has more than 16 CPUs, multiple CPUs must share an area, creating
more lock contention.

New pools added via dynamic swiotlb are linked together in a linear list.
swiotlb code frequently must search for the pool containing a particular
swiotlb physical address, so that search is linear and not performant with a
large number of dynamic pools. The data structures could be improved for
faster searches.

Overall, dynamic swiotlb works best for small configurations with relatively
few CPUs. It allows the default swiotlb pool to be smaller so that memory is
not wasted, with dynamic pools making more space available if needed (as long
as fragmentation isn't an obstacle). It is less useful for large CoCo VMs.

Data Structure Details
----------------------
swiotlb is managed with four primary data structures: io_tlb_mem, io_tlb_pool,
io_tlb_area, and io_tlb_slot. io_tlb_mem describes a swiotlb memory allocator,
which includes the default memory pool and any dynamic or transient pools
linked to it. Limited statistics on swiotlb usage are kept per memory allocator
and are stored in this data structure. These statistics are available under
/sys/kernel/debug/swiotlb when CONFIG_DEBUG_FS is set.

io_tlb_pool describes a memory pool, either the default pool, a dynamic pool,
or a transient pool. The description includes the start and end addresses of
the memory in the pool, a pointer to an array of io_tlb_area structures, and a
pointer to an array of io_tlb_slot structures that are associated with the pool.

io_tlb_area describes an area. The primary field is the spin lock used to
serialize access to slots in the area. The io_tlb_area array for a pool has an
entry for each area, and is accessed using a 0-based area index derived from the
calling processor ID. Areas exist solely to allow parallel access to swiotlb
from multiple CPUs.

io_tlb_slot describes an individual memory slot in the pool, with size
IO_TLB_SIZE (2 KiB currently). The io_tlb_slot array is indexed by the slot
index computed from the bounce buffer address relative to the starting memory
address of the pool. The size of struct io_tlb_slot is 24 bytes, so the
overhead is about 1% of the slot size.

The io_tlb_slot array is designed to meet several requirements. First, the DMA
APIs and the corresponding swiotlb APIs use the bounce buffer address as the
identifier for a bounce buffer. This address is returned by
swiotlb_tbl_map_single(), and then passed as an argument to
swiotlb_tbl_unmap_single() and the swiotlb_sync_*() functions. The original
memory buffer address obviously must be passed as an argument to
swiotlb_tbl_map_single(), but it is not passed to the other APIs. Consequently,
swiotlb data structures must save the original memory buffer address so that it
can be used when doing sync operations. This original address is saved in the
io_tlb_slot array.

Second, the io_tlb_slot array must handle partial sync requests. In such cases,
the argument to swiotlb_sync_*() is not the address of the start of the bounce
buffer but an address somewhere in the middle of the bounce buffer, and the
address of the start of the bounce buffer isn't known to swiotlb code. But
swiotlb code must be able to calculate the corresponding original memory buffer
address to do the CPU copy dictated by the "sync". So an adjusted original
memory buffer address is populated into the struct io_tlb_slot for each slot
occupied by the bounce buffer. An adjusted "alloc_size" of the bounce buffer is
also recorded in each struct io_tlb_slot so a sanity check can be performed on
the size of the "sync" operation. The "alloc_size" field is not used except for
the sanity check.

Third, the io_tlb_slot array is used to track available slots. The "list" field
in struct io_tlb_slot records how many contiguous available slots exist starting
at that slot. A "0" indicates that the slot is occupied. A value of "1"
indicates only the current slot is available. A value of "2" indicates the
current slot and the next slot are available, etc. The maximum value is
IO_TLB_SEGSIZE, which can appear in the first slot in a slot set, and indicates
that the entire slot set is available. These values are used when searching for
available slots to use for a new bounce buffer. They are updated when allocating
a new bounce buffer and when freeing a bounce buffer. At pool creation time, the
"list" field is initialized to IO_TLB_SEGSIZE down to 1 for the slots in every
slot set.

Fourth, the io_tlb_slot array keeps track of any "padding slots" allocated to
meet alloc_align_mask requirements described above. When
swiotlb_tbl_map_single() allocates bounce buffer space to meet alloc_align_mask
requirements, it may allocate pre-padding space across zero or more slots. But
when swiotlb_tbl_unmap_single() is called with the bounce buffer address, the
alloc_align_mask value that governed the allocation, and therefore the
allocation of any padding slots, is not known. The "pad_slots" field records
the number of padding slots so that swiotlb_tbl_unmap_single() can free them.
The "pad_slots" value is recorded only in the first non-padding slot allocated
to the bounce buffer.
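
The four requirements above can be pulled together in a sketch of the per-slot
structure. This mirrors struct io_tlb_slot in kernel/dma/swiotlb.c at the time
of writing, but the authoritative definition lives in that file and may change:

.. code-block:: c

   /* Simplified sketch of the per-slot bookkeeping described above. */
   struct io_tlb_slot {
           phys_addr_t orig_addr;    /* original buffer address for sync copies */
           size_t alloc_size;        /* recorded size, used for sanity checks */
           unsigned short list;      /* contiguous free slots starting here */
           unsigned short pad_slots; /* pre-padding slots to free at unmap */
   };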

Restricted pools
----------------
The swiotlb machinery is also used for "restricted pools", which are pools of
memory separate from the default swiotlb pool, and that are dedicated for DMA
use by a particular device. Restricted pools provide a level of DMA memory
protection on systems with limited hardware protection capabilities, such as
those lacking an IOMMU. Such usage is specified by DeviceTree entries and
requires that CONFIG_DMA_RESTRICTED_POOL is set. Each restricted pool is based
on its own io_tlb_mem data structure that is independent of the main swiotlb
io_tlb_mem.

Restricted pools add swiotlb_alloc() and swiotlb_free() APIs, which are called
from the dma_alloc_*() and dma_free_*() APIs. The swiotlb_alloc/free() APIs
allocate/free slots from/to the restricted pool directly and do not go through
swiotlb_tbl_map/unmap_single().