.. SPDX-License-Identifier: GPL-2.0

=====
Tmpfs
=====

Tmpfs is a file system which keeps all of its files in virtual memory.

Everything in tmpfs is temporary in the sense that no files will be
created on your hard drive. If you unmount a tmpfs instance,
everything stored therein is lost.

tmpfs puts everything into the kernel internal caches and grows and
shrinks to accommodate the files it contains. It is able to swap
unneeded pages out to swap space, if swap was enabled for the tmpfs
mount. tmpfs also supports Transparent Huge Pages (THP).

tmpfs extends ramfs with a few userspace configurable options listed and
explained further below, some of which can be reconfigured dynamically on the
fly using a remount ('mount -o remount ...') of the filesystem. A tmpfs
filesystem can be resized but it cannot be resized to a size below its current
usage. tmpfs also supports POSIX ACLs, and extended attributes for the
trusted.*, security.* and user.* namespaces. ramfs does not use swap and you
cannot modify any parameter for a ramfs filesystem. The size limit of a ramfs
filesystem is how much memory you have available, so care must be taken when
using it so as not to run out of memory.

An alternative to tmpfs and ramfs is to use brd to create RAM disks
(/dev/ram*), which allows you to simulate a block device disk in physical RAM.
To write data you would then just need to create a regular filesystem on top
of this ramdisk. As with ramfs, brd ramdisks cannot swap. brd ramdisks are also
configured in size at initialization and you cannot dynamically resize them.
Contrary to brd ramdisks, tmpfs has its own filesystem; it does not rely on the
block layer at all.

Since tmpfs lives completely in the page cache and optionally on swap,
all tmpfs pages will be shown as "Shmem" in /proc/meminfo and "Shared" in
free(1). Notice that these counters also include shared memory
(shmem, see ipcs(1)). The most reliable way to get the count is
using df(1) and du(1).
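
As a rough sketch of the accounting described above, the following commands
compare the system-wide counter with the per-mount view. The /dev/shm mount
point is an assumption; substitute any tmpfs mount on your system::

    # System-wide shmem and tmpfs usage in kB (also counts SYSV shared memory)
    grep '^Shmem:' /proc/meminfo

    # Per-mount view: space charged to one tmpfs instance
    df -h /dev/shm

    # Sum of the files currently visible under that mount
    du -sh /dev/shm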

tmpfs has the following uses:

1) There is always a kernel internal mount which you will not see at
   all. This is used for shared anonymous mappings and SYSV shared
   memory.

   This mount does not depend on CONFIG_TMPFS. If CONFIG_TMPFS is not
   set, the user visible part of tmpfs is not built. But the internal
   mechanisms are always present.

2) glibc 2.2 and above expects tmpfs to be mounted at /dev/shm for
   POSIX shared memory (shm_open, shm_unlink). Adding the following
   line to /etc/fstab should take care of this::

	tmpfs	/dev/shm	tmpfs	defaults	0 0

   Remember to create the directory that you intend to mount tmpfs on
   if necessary.

   This mount is _not_ needed for SYSV shared memory. The internal
   mount is used for that. (In the 2.3 kernel versions it was
   necessary to mount the predecessor of tmpfs (shm fs) to use SYSV
   shared memory.)

3) Some people (including me) find it very convenient to mount it
   e.g. on /tmp and /var/tmp and have a big swap partition. And now
   loop mounts of tmpfs files do work, so mkinitrd shipped by most
   distributions should succeed with a tmpfs /tmp.

4) And probably a lot more I do not know about :-)
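
For use 3, an /etc/fstab entry along the following lines could be used. The
size and mode values are illustrative assumptions, not recommendations; both
options are described further below::

	tmpfs	/tmp	tmpfs	size=2g,mode=1777	0 0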

tmpfs has three mount options for sizing:

=========  ============================================================
size       The limit of allocated bytes for this tmpfs instance. The
           default is half of your physical RAM without swap. If you
           oversize your tmpfs instances the machine will deadlock
           since the OOM handler will not be able to free that memory.
nr_blocks  The same as size, but in blocks of PAGE_SIZE.
nr_inodes  The maximum number of inodes for this instance. The default
           is half of the number of your physical RAM pages, or (on a
           machine with highmem) the number of lowmem RAM pages,
           whichever is the lower.
=========  ============================================================

These parameters accept a suffix k, m or g for kilo, mega and giga and
can be changed on remount.  The size parameter also accepts a suffix %
to limit this tmpfs instance to that percentage of your physical RAM:
the default, when neither size nor nr_blocks is specified, is size=50%.
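
As a small worked example of the relationship between size and nr_blocks,
the sketch below converts size=512m into the equivalent nr_blocks value. It
assumes that the m suffix means a multiple of 1024 * 1024, in line with the
10k = 10240 example later in this document, and that getconf(1) reports the
kernel's PAGE_SIZE::

    # Page size of the running system in bytes
    page_size=$(getconf PAGESIZE)

    # size=512m expressed in bytes
    size_bytes=$((512 * 1024 * 1024))

    # nr_blocks counts blocks of PAGE_SIZE
    nr_blocks=$((size_bytes / page_size))

    echo "size=512m corresponds to nr_blocks=${nr_blocks}"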

If nr_blocks=0 (or size=0), blocks will not be limited in that instance;
if nr_inodes=0, inodes will not be limited.  It is generally unwise to
mount with such options, since it allows any user with write access to
use up all the memory on the machine; but it enhances the scalability of
that instance in a system with many CPUs making intensive use of it.

If nr_inodes is not 0, that limited space for inodes is also used up by
extended attributes: "df -i"'s IUsed and IUse% increase, IFree decreases.

tmpfs blocks may be swapped out, when there is a shortage of memory.
tmpfs has a mount option to disable its use of swap:

======  ===========================================================
noswap  Disables swap. Remounts must respect the original settings.
        By default swap is enabled.
======  ===========================================================

tmpfs also supports Transparent Huge Pages, which requires a kernel
configured with CONFIG_TRANSPARENT_HUGEPAGE and with huge pages supported
on your system (has_transparent_hugepage(), which is architecture specific).
The mount options for this are:

================ ================================================================
huge=never       Do not allocate huge pages.  This is the default.
huge=always      Attempt to allocate a huge page every time a new page is needed.
huge=within_size Only allocate a huge page if it will be fully within i_size.
                 Also respect madvise(2) hints.
huge=advise      Only allocate a huge page if requested with madvise(2).
================ ================================================================

See also Documentation/admin-guide/mm/transhuge.rst, which describes the
sysfs file /sys/kernel/mm/transparent_hugepage/shmem_enabled. That file can
be used to deny huge pages on all tmpfs mounts in an emergency, or to
force huge pages on all tmpfs mounts for testing.
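
A minimal sketch of using these options follows. The /mnt/scratch mount
point and the size value are hypothetical, and the mount command needs root
privileges::

    # Show the current system-wide shmem huge page policy
    cat /sys/kernel/mm/transparent_hugepage/shmem_enabled

    # Mount an instance that allocates huge pages only within i_size
    mkdir -p /mnt/scratch
    mount -t tmpfs -o size=1g,huge=within_size tmpfs /mnt/scratch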

tmpfs also supports quota with the following mount options:

======================== =================================================
quota                    User and group quota accounting and enforcement
                         is enabled on the mount. tmpfs uses hidden
                         system quota files that are initialized on mount.
usrquota                 User quota accounting and enforcement is enabled
                         on the mount.
grpquota                 Group quota accounting and enforcement is enabled
                         on the mount.
usrquota_block_hardlimit Set global user quota block hard limit.
usrquota_inode_hardlimit Set global user quota inode hard limit.
grpquota_block_hardlimit Set global group quota block hard limit.
grpquota_inode_hardlimit Set global group quota inode hard limit.
======================== =================================================

None of the quota related mount options can be set or changed on remount.

Quota limit parameters accept a suffix k, m or g for kilo, mega and giga
and can't be changed on remount. Default global quota limits take
effect for any and all user/group/project except root the first time the
quota entry for user/group/project id is accessed - typically the
first time an inode with a particular id ownership is created after
the mount. In other words, instead of the limits being initialized to zero,
they are initialized with the particular value provided with these mount
options. The limits can be changed for any user/group id at any time as they
normally can be.

Note that tmpfs quotas do not support user namespaces so no uid/gid
translation is done if quotas are enabled inside user namespaces.
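
As an illustration, the following sketch mounts an instance with user quota
enabled and default per-user hard limits. The mount point and the limit
values are hypothetical, root privileges are required, and the kernel is
assumed to be built with tmpfs quota support::

    # Each non-root user starts with a 100m block limit and a 10k inode limit
    mkdir -p /mnt/scratch
    mount -t tmpfs -o usrquota,usrquota_block_hardlimit=100m,usrquota_inode_hardlimit=10k tmpfs /mnt/scratch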

tmpfs has a mount option to set the NUMA memory allocation policy for
all files in that instance (if CONFIG_NUMA is enabled), which can be
adjusted on the fly via 'mount -o remount ...':

======================== ==============================================
mpol=default             use the process allocation policy
                         (see set_mempolicy(2))
mpol=prefer:Node         prefers to allocate memory from the given Node
mpol=bind:NodeList       allocates memory only from nodes in NodeList
mpol=interleave          prefers to allocate from each node in turn
mpol=interleave:NodeList allocates from each node of NodeList in turn
mpol=local               prefers to allocate memory from the local node
======================== ==============================================

NodeList format is a comma-separated list of decimal numbers and ranges,
a range being two hyphen-separated decimal numbers, the smallest and
largest node numbers in the range.  For example, mpol=bind:0-3,5,7,9-15

A memory policy with a valid NodeList will be saved, as specified, for
use at file creation time.  When a task allocates a file in the file
system, the mount option memory policy will be applied with a NodeList,
if any, modified by the calling task's cpuset constraints
[See Documentation/admin-guide/cgroup-v1/cpusets.rst] and any optional flags,
listed below.  If the resulting NodeList is the empty set, the effective
memory policy for the file will revert to "default" policy.

NUMA memory allocation policies have optional flags that can be used in
conjunction with their modes.  These optional flags can be specified
when tmpfs is mounted by appending them to the mode before the NodeList.
See Documentation/admin-guide/mm/numa_memory_policy.rst for a list of
all available memory allocation policy mode flags and their effect on
memory policy.

::

	=static		is equivalent to	MPOL_F_STATIC_NODES
	=relative	is equivalent to	MPOL_F_RELATIVE_NODES

For example, mpol=bind=static:NodeList is the equivalent of an
allocation policy of MPOL_BIND | MPOL_F_STATIC_NODES.

Note that trying to mount a tmpfs with an mpol option will fail if the
running kernel does not support NUMA; and will fail if its nodelist
specifies a node which is not online.  If your system relies on that
tmpfs being mounted, but from time to time runs a kernel built without
NUMA capability (perhaps a safe recovery kernel), or with fewer nodes
online, then it is advisable to omit the mpol option from automatic
mount options.  It can be added later, when the tmpfs is already mounted
on MountPoint, by 'mount -o remount,mpol=Policy:NodeList MountPoint'.
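
The deferred approach described above might look like the following sketch.
The mount point, the size and the node list are hypothetical, root privileges
are required, and nodes 0 and 1 are assumed to be online::

    # Mount without a NUMA policy so that this step also works on a
    # kernel built without NUMA support
    mkdir -p /mnt/scratch
    mount -t tmpfs -o size=1g tmpfs /mnt/scratch

    # Later, on a NUMA-capable kernel, interleave allocations over nodes 0-1
    mount -o remount,mpol=interleave:0-1 /mnt/scratch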

To specify the initial root directory you can use the following mount
options:

====  ==================================
mode  The permissions as an octal number
uid   The user id
gid   The group id
====  ==================================

These options do not have any effect on remount. You can change these
parameters with chmod(1), chown(1) and chgrp(1) on a mounted filesystem.

tmpfs has a mount option to select whether it will wrap at 32- or 64-bit inode
numbers:

=======   ========================
inode64   Use 64-bit inode numbers
inode32   Use 32-bit inode numbers
=======   ========================

On a 32-bit kernel, inode32 is implicit, and inode64 is refused at mount time.
On a 64-bit kernel, CONFIG_TMPFS_INODE64 sets the default.  inode64 avoids the
possibility of multiple files with the same inode number on a single device;
but risks glibc failing with EOVERFLOW once 33-bit inode numbers are reached -
if a long-lived tmpfs is accessed by 32-bit applications so ancient that
opening a file larger than 2GiB fails with EINVAL.

So 'mount -t tmpfs -o size=10G,nr_inodes=10k,mode=700 tmpfs /mytmpfs'
will give you a tmpfs instance on /mytmpfs which can allocate 10GB of
RAM/SWAP in 10240 inodes and which is only accessible by root.

tmpfs has the following mount options for case-insensitive lookup support:

================= ================================================================
casefold          Enable casefold support at this mount point using the given
                  argument as the encoding standard. Currently only UTF-8
                  encodings are supported. If no argument is used, it will load
                  the latest UTF-8 encoding available.
strict_encoding   Enable strict encoding at this mount point (disabled by
                  default). In this mode, the filesystem refuses to create files
                  and directories with names containing invalid UTF-8 characters.
================= ================================================================

The casefold option does not render the entire filesystem case-insensitive.
One still needs to set the casefold flag per directory, by setting the +F
attribute on an empty directory. New directories will, however, inherit the
attribute from their parent directory. The mountpoint itself cannot be made
case-insensitive.

Example::

    $ mount -t tmpfs -o casefold=utf8-12.1.0,strict_encoding fs_name /mytmpfs
    $ mount -t tmpfs -o casefold fs_name /mytmpfs
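
Once such an instance is mounted, a directory can be made case-insensitive
as sketched below. The directory and file names are hypothetical, and the
sketch assumes a chattr(1) version that knows the F attribute::

    # The directory must be empty when the attribute is set
    mkdir /mytmpfs/dir
    chattr +F /mytmpfs/dir

    # Lookups in this directory now ignore case, so the second command
    # finds the file created by the first
    touch /mytmpfs/dir/README
    ls /mytmpfs/dir/readme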

:Author:
   Christoph Rohland <cr@sap.com>, 1.12.01
:Updated:
   Hugh Dickins, 4 June 2007
:Updated:
   KOSAKI Motohiro, 16 Mar 2010
:Updated:
   Chris Down, 13 July 2020
:Updated:
   André Almeida, 23 Aug 2024