mirror of
https://github.com/torvalds/linux.git
synced 2025-11-12 22:49:37 +02:00
documented (hopefully adequately) in the respective changelogs. Notable
series include:
- Lucas Stach has provided some page-mapping
cleanup/consolidation/maintainability work in the series "mm/treewide:
Remove pXd_huge() API".
- In the series "Allow migrate on protnone reference with
MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
MPOL_PREFERRED_MANY mode, yielding almost doubled performance in one
test.
- In their series "Memory allocation profiling" Kent Overstreet and
Suren Baghdasaryan have contributed a means of determining (via
/proc/allocinfo) whereabouts in the kernel memory is being allocated:
number of calls and amount of memory.
- Matthew Wilcox has provided the series "Various significant MM
patches" which does a number of rather unrelated things, but in largely
similar code sites.
- In his series "mm: page_alloc: freelist migratetype hygiene" Johannes
Weiner has fixed the page allocator's handling of migratetype requests,
with resulting improvements in compaction efficiency.
- In the series "make the hugetlb migration strategy consistent" Baolin
Wang has fixed a hugetlb migration issue, which should improve hugetlb
allocation reliability.
- Liu Shixin has hit an I/O meltdown caused by readahead in a
memory-tight memcg. Addressed in the series "Fix I/O high when memory
almost met memcg limit".
- In the series "mm/filemap: optimize folio adding and splitting" Kairui
Song has optimized pagecache insertion, yielding ~10% performance
improvement in one test.
- Baoquan He has cleaned up and consolidated the early zone
initialization code in the series "mm/mm_init.c: refactor
free_area_init_core()".
- Baoquan has also redone some MM initializatio code in the series
"mm/init: minor clean up and improvement".
- MM helper cleanups from Christoph Hellwig in his series "remove
follow_pfn".
- More cleanups from Matthew Wilcox in the series "Various page->flags
cleanups".
- Vlastimil Babka has contributed maintainability improvements in the
series "memcg_kmem hooks refactoring".
- More folio conversions and cleanups in Matthew Wilcox's series
"Convert huge_zero_page to huge_zero_folio"
"khugepaged folio conversions"
"Remove page_idle and page_young wrappers"
"Use folio APIs in procfs"
"Clean up __folio_put()"
"Some cleanups for memory-failure"
"Remove page_mapping()"
"More folio compat code removal"
- David Hildenbrand chipped in with "fs/proc/task_mmu: convert hugetlb
functions to work on folis".
- Code consolidation and cleanup work related to GUP's handling of
hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
- Rick Edgecombe has developed some fixes to stack guard gaps in the
series "Cover a guard gap corner case".
- Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the series
"mm/ksm: fix ksm exec support for prctl".
- Baolin Wang has implemented NUMA balancing for multi-size THPs. This
is a simple first-cut implementation for now. The series is "support
multi-size THP numa balancing".
- Cleanups to vma handling helper functions from Matthew Wilcox in the
series "Unify vma_address and vma_pgoff_address".
- Some selftests maintenance work from Dev Jain in the series
"selftests/mm: mremap_test: Optimizations and style fixes".
- Improvements to the swapping of multi-size THPs from Ryan Roberts in
the series "Swap-out mTHP without splitting".
- Kefeng Wang has significantly optimized the handling of arm64's
permission page faults in the series
"arch/mm/fault: accelerate pagefault when badaccess"
"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
- GUP cleanups from David Hildenbrand in "mm/gup: consistently call it
GUP-fast".
- hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault path to
use struct vm_fault".
- selftests build fixes from John Hubbard in the series "Fix
selftests/mm build without requiring "make headers"".
- Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
series "Improved Memory Tier Creation for CPUless NUMA Nodes". Fixes
the initialization code so that migration between different memory types
works as intended.
- David Hildenbrand has improved follow_pte() and fixed an errant driver
in the series "mm: follow_pte() improvements and acrn follow_pte()
fixes".
- David also did some cleanup work on large folio mapcounts in his
series "mm: mapcount for large folios + page_mapcount() cleanups".
- Folio conversions in KSM in Alex Shi's series "transfer page to folio
in KSM".
- Barry Song has added some sysfs stats for monitoring multi-size THP's
in the series "mm: add per-order mTHP alloc and swpout counters".
- Some zswap cleanups from Yosry Ahmed in the series "zswap same-filled
and limit checking cleanups".
- Matthew Wilcox has been looking at buffer_head code and found the
documentation to be lacking. The series is "Improve buffer head
documentation".
- Multi-size THPs get more work, this time from Lance Yang. His series
"mm/madvise: enhance lazyfreeing with mTHP in madvise_free" optimizes
the freeing of these things.
- Kemeng Shi has added more userspace-visible writeback instrumentation
in the series "Improve visibility of writeback".
- Kemeng Shi then sent some maintenance work on top in the series "Fix
and cleanups to page-writeback".
- Matthew Wilcox reduces mmap_lock traffic in the anon vma code in the
series "Improve anon_vma scalability for anon VMAs". Intel's test bot
reported an improbable 3x improvement in one test.
- SeongJae Park adds some DAMON feature work in the series
"mm/damon: add a DAMOS filter type for page granularity access recheck"
"selftests/damon: add DAMOS quota goal test"
- Also some maintenance work in the series
"mm/damon/paddr: simplify page level access re-check for pageout"
"mm/damon: misc fixes and improvements"
- David Hildenbrand has disabled some known-to-fail selftests ni the
series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL".
- memcg metadata storage optimizations from Shakeel Butt in "memcg:
reduce memory consumption by memcg stats".
- DAX fixes and maintenance work from Vishal Verma in the series
"dax/bus.c: Fixups for dax-bus locking".
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZkgQYwAKCRDdBJ7gKXxA
jrdKAP9WVJdpEcXxpoub/vVE0UWGtffr8foifi9bCwrQrGh5mgEAx7Yf0+d/oBZB
nvA4E0DcPrUAFy144FNM0NTCb7u9vAw=
=V3R/
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull mm updates from Andrew Morton:
"The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs.
Notable series include:
- Lucas Stach has provided some page-mapping cleanup/consolidation/
maintainability work in the series "mm/treewide: Remove pXd_huge()
API".
- In the series "Allow migrate on protnone reference with
MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
one test.
- In their series "Memory allocation profiling" Kent Overstreet and
Suren Baghdasaryan have contributed a means of determining (via
/proc/allocinfo) whereabouts in the kernel memory is being
allocated: number of calls and amount of memory.
- Matthew Wilcox has provided the series "Various significant MM
patches" which does a number of rather unrelated things, but in
largely similar code sites.
- In his series "mm: page_alloc: freelist migratetype hygiene"
Johannes Weiner has fixed the page allocator's handling of
migratetype requests, with resulting improvements in compaction
efficiency.
- In the series "make the hugetlb migration strategy consistent"
Baolin Wang has fixed a hugetlb migration issue, which should
improve hugetlb allocation reliability.
- Liu Shixin has hit an I/O meltdown caused by readahead in a
memory-tight memcg. Addressed in the series "Fix I/O high when
memory almost met memcg limit".
- In the series "mm/filemap: optimize folio adding and splitting"
Kairui Song has optimized pagecache insertion, yielding ~10%
performance improvement in one test.
- Baoquan He has cleaned up and consolidated the early zone
initialization code in the series "mm/mm_init.c: refactor
free_area_init_core()".
- Baoquan has also redone some MM initializatio code in the series
"mm/init: minor clean up and improvement".
- MM helper cleanups from Christoph Hellwig in his series "remove
follow_pfn".
- More cleanups from Matthew Wilcox in the series "Various
page->flags cleanups".
- Vlastimil Babka has contributed maintainability improvements in the
series "memcg_kmem hooks refactoring".
- More folio conversions and cleanups in Matthew Wilcox's series:
"Convert huge_zero_page to huge_zero_folio"
"khugepaged folio conversions"
"Remove page_idle and page_young wrappers"
"Use folio APIs in procfs"
"Clean up __folio_put()"
"Some cleanups for memory-failure"
"Remove page_mapping()"
"More folio compat code removal"
- David Hildenbrand chipped in with "fs/proc/task_mmu: convert
hugetlb functions to work on folis".
- Code consolidation and cleanup work related to GUP's handling of
hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
- Rick Edgecombe has developed some fixes to stack guard gaps in the
series "Cover a guard gap corner case".
- Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
series "mm/ksm: fix ksm exec support for prctl".
- Baolin Wang has implemented NUMA balancing for multi-size THPs.
This is a simple first-cut implementation for now. The series is
"support multi-size THP numa balancing".
- Cleanups to vma handling helper functions from Matthew Wilcox in
the series "Unify vma_address and vma_pgoff_address".
- Some selftests maintenance work from Dev Jain in the series
"selftests/mm: mremap_test: Optimizations and style fixes".
- Improvements to the swapping of multi-size THPs from Ryan Roberts
in the series "Swap-out mTHP without splitting".
- Kefeng Wang has significantly optimized the handling of arm64's
permission page faults in the series
"arch/mm/fault: accelerate pagefault when badaccess"
"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
- GUP cleanups from David Hildenbrand in "mm/gup: consistently call
it GUP-fast".
- hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
path to use struct vm_fault".
- selftests build fixes from John Hubbard in the series "Fix
selftests/mm build without requiring "make headers"".
- Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
series "Improved Memory Tier Creation for CPUless NUMA Nodes".
Fixes the initialization code so that migration between different
memory types works as intended.
- David Hildenbrand has improved follow_pte() and fixed an errant
driver in the series "mm: follow_pte() improvements and acrn
follow_pte() fixes".
- David also did some cleanup work on large folio mapcounts in his
series "mm: mapcount for large folios + page_mapcount() cleanups".
- Folio conversions in KSM in Alex Shi's series "transfer page to
folio in KSM".
- Barry Song has added some sysfs stats for monitoring multi-size
THP's in the series "mm: add per-order mTHP alloc and swpout
counters".
- Some zswap cleanups from Yosry Ahmed in the series "zswap
same-filled and limit checking cleanups".
- Matthew Wilcox has been looking at buffer_head code and found the
documentation to be lacking. The series is "Improve buffer head
documentation".
- Multi-size THPs get more work, this time from Lance Yang. His
series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
optimizes the freeing of these things.
- Kemeng Shi has added more userspace-visible writeback
instrumentation in the series "Improve visibility of writeback".
- Kemeng Shi then sent some maintenance work on top in the series
"Fix and cleanups to page-writeback".
- Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
the series "Improve anon_vma scalability for anon VMAs". Intel's
test bot reported an improbable 3x improvement in one test.
- SeongJae Park adds some DAMON feature work in the series
"mm/damon: add a DAMOS filter type for page granularity access recheck"
"selftests/damon: add DAMOS quota goal test"
- Also some maintenance work in the series
"mm/damon/paddr: simplify page level access re-check for pageout"
"mm/damon: misc fixes and improvements"
- David Hildenbrand has disabled some known-to-fail selftests ni the
series "selftests: mm: cow: flag vmsplice() hugetlb tests as
XFAIL".
- memcg metadata storage optimizations from Shakeel Butt in "memcg:
reduce memory consumption by memcg stats".
- DAX fixes and maintenance work from Vishal Verma in the series
"dax/bus.c: Fixups for dax-bus locking""
* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
selftests: cgroup: add tests to verify the zswap writeback path
mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
mm/damon/core: fix return value from damos_wmark_metric_value
mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
selftests: cgroup: remove redundant enabling of memory controller
Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
Docs/mm/damon/design: use a list for supported filters
Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
selftests/damon: classify tests for functionalities and regressions
selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
selftests/damon: add a test for DAMOS quota goal
...
713 lines
17 KiB
C
713 lines
17 KiB
C
// SPDX-License-Identifier: GPL-2.0-only
|
|
/*
|
|
* KSM functional tests
|
|
*
|
|
* Copyright 2022, Red Hat, Inc.
|
|
*
|
|
* Author(s): David Hildenbrand <david@redhat.com>
|
|
*/
|
|
#define _GNU_SOURCE
|
|
#include <stdlib.h>
|
|
#include <string.h>
|
|
#include <stdbool.h>
|
|
#include <stdint.h>
|
|
#include <unistd.h>
|
|
#include <errno.h>
|
|
#include <fcntl.h>
|
|
#include <sys/mman.h>
|
|
#include <sys/prctl.h>
|
|
#include <sys/syscall.h>
|
|
#include <sys/ioctl.h>
|
|
#include <sys/wait.h>
|
|
#include <linux/userfaultfd.h>
|
|
|
|
#include "../kselftest.h"
|
|
#include "vm_util.h"
|
|
|
|
#define KiB 1024u
|
|
#define MiB (1024 * KiB)
|
|
#define FORK_EXEC_CHILD_PRG_NAME "ksm_fork_exec_child"
|
|
|
|
#define MAP_MERGE_FAIL ((void *)-1)
|
|
#define MAP_MERGE_SKIP ((void *)-2)
|
|
|
|
enum ksm_merge_mode {
|
|
KSM_MERGE_PRCTL,
|
|
KSM_MERGE_MADVISE,
|
|
KSM_MERGE_NONE, /* PRCTL already set */
|
|
};
|
|
|
|
static int mem_fd;
|
|
static int ksm_fd;
|
|
static int ksm_full_scans_fd;
|
|
static int proc_self_ksm_stat_fd;
|
|
static int proc_self_ksm_merging_pages_fd;
|
|
static int ksm_use_zero_pages_fd;
|
|
static int pagemap_fd;
|
|
static size_t pagesize;
|
|
|
|
static bool range_maps_duplicates(char *addr, unsigned long size)
|
|
{
|
|
unsigned long offs_a, offs_b, pfn_a, pfn_b;
|
|
|
|
/*
|
|
* There is no easy way to check if there are KSM pages mapped into
|
|
* this range. We only check that the range does not map the same PFN
|
|
* twice by comparing each pair of mapped pages.
|
|
*/
|
|
for (offs_a = 0; offs_a < size; offs_a += pagesize) {
|
|
pfn_a = pagemap_get_pfn(pagemap_fd, addr + offs_a);
|
|
/* Page not present or PFN not exposed by the kernel. */
|
|
if (pfn_a == -1ul || !pfn_a)
|
|
continue;
|
|
|
|
for (offs_b = offs_a + pagesize; offs_b < size;
|
|
offs_b += pagesize) {
|
|
pfn_b = pagemap_get_pfn(pagemap_fd, addr + offs_b);
|
|
if (pfn_b == -1ul || !pfn_b)
|
|
continue;
|
|
if (pfn_a == pfn_b)
|
|
return true;
|
|
}
|
|
}
|
|
return false;
|
|
}
|
|
|
|
static long get_my_ksm_zero_pages(void)
|
|
{
|
|
char buf[200];
|
|
char *substr_ksm_zero;
|
|
size_t value_pos;
|
|
ssize_t read_size;
|
|
unsigned long my_ksm_zero_pages;
|
|
|
|
if (!proc_self_ksm_stat_fd)
|
|
return 0;
|
|
|
|
read_size = pread(proc_self_ksm_stat_fd, buf, sizeof(buf) - 1, 0);
|
|
if (read_size < 0)
|
|
return -errno;
|
|
|
|
buf[read_size] = 0;
|
|
|
|
substr_ksm_zero = strstr(buf, "ksm_zero_pages");
|
|
if (!substr_ksm_zero)
|
|
return 0;
|
|
|
|
value_pos = strcspn(substr_ksm_zero, "0123456789");
|
|
my_ksm_zero_pages = strtol(substr_ksm_zero + value_pos, NULL, 10);
|
|
|
|
return my_ksm_zero_pages;
|
|
}
|
|
|
|
static long get_my_merging_pages(void)
|
|
{
|
|
char buf[10];
|
|
ssize_t ret;
|
|
|
|
if (proc_self_ksm_merging_pages_fd < 0)
|
|
return proc_self_ksm_merging_pages_fd;
|
|
|
|
ret = pread(proc_self_ksm_merging_pages_fd, buf, sizeof(buf) - 1, 0);
|
|
if (ret <= 0)
|
|
return -errno;
|
|
buf[ret] = 0;
|
|
|
|
return strtol(buf, NULL, 10);
|
|
}
|
|
|
|
static long ksm_get_full_scans(void)
|
|
{
|
|
char buf[10];
|
|
ssize_t ret;
|
|
|
|
ret = pread(ksm_full_scans_fd, buf, sizeof(buf) - 1, 0);
|
|
if (ret <= 0)
|
|
return -errno;
|
|
buf[ret] = 0;
|
|
|
|
return strtol(buf, NULL, 10);
|
|
}
|
|
|
|
static int ksm_merge(void)
|
|
{
|
|
long start_scans, end_scans;
|
|
|
|
/* Wait for two full scans such that any possible merging happened. */
|
|
start_scans = ksm_get_full_scans();
|
|
if (start_scans < 0)
|
|
return start_scans;
|
|
if (write(ksm_fd, "1", 1) != 1)
|
|
return -errno;
|
|
do {
|
|
end_scans = ksm_get_full_scans();
|
|
if (end_scans < 0)
|
|
return end_scans;
|
|
} while (end_scans < start_scans + 2);
|
|
|
|
return 0;
|
|
}
|
|
|
|
static int ksm_unmerge(void)
|
|
{
|
|
if (write(ksm_fd, "2", 1) != 1)
|
|
return -errno;
|
|
return 0;
|
|
}
|
|
|
|
static char *__mmap_and_merge_range(char val, unsigned long size, int prot,
|
|
enum ksm_merge_mode mode)
|
|
{
|
|
char *map;
|
|
char *err_map = MAP_MERGE_FAIL;
|
|
int ret;
|
|
|
|
/* Stabilize accounting by disabling KSM completely. */
|
|
if (ksm_unmerge()) {
|
|
ksft_print_msg("Disabling (unmerging) KSM failed\n");
|
|
return err_map;
|
|
}
|
|
|
|
if (get_my_merging_pages() > 0) {
|
|
ksft_print_msg("Still pages merged\n");
|
|
return err_map;
|
|
}
|
|
|
|
map = mmap(NULL, size, PROT_READ|PROT_WRITE,
|
|
MAP_PRIVATE|MAP_ANON, -1, 0);
|
|
if (map == MAP_FAILED) {
|
|
ksft_print_msg("mmap() failed\n");
|
|
return err_map;
|
|
}
|
|
|
|
/* Don't use THP. Ignore if THP are not around on a kernel. */
|
|
if (madvise(map, size, MADV_NOHUGEPAGE) && errno != EINVAL) {
|
|
ksft_print_msg("MADV_NOHUGEPAGE failed\n");
|
|
goto unmap;
|
|
}
|
|
|
|
/* Make sure each page contains the same values to merge them. */
|
|
memset(map, val, size);
|
|
|
|
if (mprotect(map, size, prot)) {
|
|
ksft_print_msg("mprotect() failed\n");
|
|
err_map = MAP_MERGE_SKIP;
|
|
goto unmap;
|
|
}
|
|
|
|
switch (mode) {
|
|
case KSM_MERGE_PRCTL:
|
|
ret = prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0);
|
|
if (ret < 0 && errno == EINVAL) {
|
|
ksft_print_msg("PR_SET_MEMORY_MERGE not supported\n");
|
|
err_map = MAP_MERGE_SKIP;
|
|
goto unmap;
|
|
} else if (ret) {
|
|
ksft_print_msg("PR_SET_MEMORY_MERGE=1 failed\n");
|
|
goto unmap;
|
|
}
|
|
break;
|
|
case KSM_MERGE_MADVISE:
|
|
if (madvise(map, size, MADV_MERGEABLE)) {
|
|
ksft_print_msg("MADV_MERGEABLE failed\n");
|
|
goto unmap;
|
|
}
|
|
break;
|
|
case KSM_MERGE_NONE:
|
|
break;
|
|
}
|
|
|
|
/* Run KSM to trigger merging and wait. */
|
|
if (ksm_merge()) {
|
|
ksft_print_msg("Running KSM failed\n");
|
|
goto unmap;
|
|
}
|
|
|
|
/*
|
|
* Check if anything was merged at all. Ignore the zero page that is
|
|
* accounted differently (depending on kernel support).
|
|
*/
|
|
if (val && !get_my_merging_pages()) {
|
|
ksft_print_msg("No pages got merged\n");
|
|
goto unmap;
|
|
}
|
|
|
|
return map;
|
|
unmap:
|
|
munmap(map, size);
|
|
return err_map;
|
|
}
|
|
|
|
static char *mmap_and_merge_range(char val, unsigned long size, int prot,
|
|
enum ksm_merge_mode mode)
|
|
{
|
|
char *map;
|
|
char *ret = MAP_FAILED;
|
|
|
|
map = __mmap_and_merge_range(val, size, prot, mode);
|
|
if (map == MAP_MERGE_FAIL)
|
|
ksft_test_result_fail("Merging memory failed");
|
|
else if (map == MAP_MERGE_SKIP)
|
|
ksft_test_result_skip("Merging memory skipped");
|
|
else
|
|
ret = map;
|
|
|
|
return ret;
|
|
}
|
|
|
|
static void test_unmerge(void)
|
|
{
|
|
const unsigned int size = 2 * MiB;
|
|
char *map;
|
|
|
|
ksft_print_msg("[RUN] %s\n", __func__);
|
|
|
|
map = mmap_and_merge_range(0xcf, size, PROT_READ | PROT_WRITE, KSM_MERGE_MADVISE);
|
|
if (map == MAP_FAILED)
|
|
return;
|
|
|
|
if (madvise(map, size, MADV_UNMERGEABLE)) {
|
|
ksft_test_result_fail("MADV_UNMERGEABLE failed\n");
|
|
goto unmap;
|
|
}
|
|
|
|
ksft_test_result(!range_maps_duplicates(map, size),
|
|
"Pages were unmerged\n");
|
|
unmap:
|
|
munmap(map, size);
|
|
}
|
|
|
|
static void test_unmerge_zero_pages(void)
|
|
{
|
|
const unsigned int size = 2 * MiB;
|
|
char *map;
|
|
unsigned int offs;
|
|
unsigned long pages_expected;
|
|
|
|
ksft_print_msg("[RUN] %s\n", __func__);
|
|
|
|
if (proc_self_ksm_stat_fd < 0) {
|
|
ksft_test_result_skip("open(\"/proc/self/ksm_stat\") failed\n");
|
|
return;
|
|
}
|
|
if (ksm_use_zero_pages_fd < 0) {
|
|
ksft_test_result_skip("open \"/sys/kernel/mm/ksm/use_zero_pages\" failed\n");
|
|
return;
|
|
}
|
|
if (write(ksm_use_zero_pages_fd, "1", 1) != 1) {
|
|
ksft_test_result_skip("write \"/sys/kernel/mm/ksm/use_zero_pages\" failed\n");
|
|
return;
|
|
}
|
|
|
|
/* Let KSM deduplicate zero pages. */
|
|
map = mmap_and_merge_range(0x00, size, PROT_READ | PROT_WRITE, KSM_MERGE_MADVISE);
|
|
if (map == MAP_FAILED)
|
|
return;
|
|
|
|
/* Check if ksm_zero_pages is updated correctly after KSM merging */
|
|
pages_expected = size / pagesize;
|
|
if (pages_expected != get_my_ksm_zero_pages()) {
|
|
ksft_test_result_fail("'ksm_zero_pages' updated after merging\n");
|
|
goto unmap;
|
|
}
|
|
|
|
/* Try to unmerge half of the region */
|
|
if (madvise(map, size / 2, MADV_UNMERGEABLE)) {
|
|
ksft_test_result_fail("MADV_UNMERGEABLE failed\n");
|
|
goto unmap;
|
|
}
|
|
|
|
/* Check if ksm_zero_pages is updated correctly after unmerging */
|
|
pages_expected /= 2;
|
|
if (pages_expected != get_my_ksm_zero_pages()) {
|
|
ksft_test_result_fail("'ksm_zero_pages' updated after unmerging\n");
|
|
goto unmap;
|
|
}
|
|
|
|
/* Trigger unmerging of the other half by writing to the pages. */
|
|
for (offs = size / 2; offs < size; offs += pagesize)
|
|
*((unsigned int *)&map[offs]) = offs;
|
|
|
|
/* Now we should have no zeropages remaining. */
|
|
if (get_my_ksm_zero_pages()) {
|
|
ksft_test_result_fail("'ksm_zero_pages' updated after write fault\n");
|
|
goto unmap;
|
|
}
|
|
|
|
/* Check if ksm zero pages are really unmerged */
|
|
ksft_test_result(!range_maps_duplicates(map, size),
|
|
"KSM zero pages were unmerged\n");
|
|
unmap:
|
|
munmap(map, size);
|
|
}
|
|
|
|
static void test_unmerge_discarded(void)
|
|
{
|
|
const unsigned int size = 2 * MiB;
|
|
char *map;
|
|
|
|
ksft_print_msg("[RUN] %s\n", __func__);
|
|
|
|
map = mmap_and_merge_range(0xcf, size, PROT_READ | PROT_WRITE, KSM_MERGE_MADVISE);
|
|
if (map == MAP_FAILED)
|
|
return;
|
|
|
|
/* Discard half of all mapped pages so we have pte_none() entries. */
|
|
if (madvise(map, size / 2, MADV_DONTNEED)) {
|
|
ksft_test_result_fail("MADV_DONTNEED failed\n");
|
|
goto unmap;
|
|
}
|
|
|
|
if (madvise(map, size, MADV_UNMERGEABLE)) {
|
|
ksft_test_result_fail("MADV_UNMERGEABLE failed\n");
|
|
goto unmap;
|
|
}
|
|
|
|
ksft_test_result(!range_maps_duplicates(map, size),
|
|
"Pages were unmerged\n");
|
|
unmap:
|
|
munmap(map, size);
|
|
}
|
|
|
|
#ifdef __NR_userfaultfd
|
|
static void test_unmerge_uffd_wp(void)
|
|
{
|
|
struct uffdio_writeprotect uffd_writeprotect;
|
|
const unsigned int size = 2 * MiB;
|
|
struct uffdio_api uffdio_api;
|
|
char *map;
|
|
int uffd;
|
|
|
|
ksft_print_msg("[RUN] %s\n", __func__);
|
|
|
|
map = mmap_and_merge_range(0xcf, size, PROT_READ | PROT_WRITE, KSM_MERGE_MADVISE);
|
|
if (map == MAP_FAILED)
|
|
return;
|
|
|
|
/* See if UFFD is around. */
|
|
uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
|
|
if (uffd < 0) {
|
|
ksft_test_result_skip("__NR_userfaultfd failed\n");
|
|
goto unmap;
|
|
}
|
|
|
|
/* See if UFFD-WP is around. */
|
|
uffdio_api.api = UFFD_API;
|
|
uffdio_api.features = UFFD_FEATURE_PAGEFAULT_FLAG_WP;
|
|
if (ioctl(uffd, UFFDIO_API, &uffdio_api) < 0) {
|
|
ksft_test_result_fail("UFFDIO_API failed\n");
|
|
goto close_uffd;
|
|
}
|
|
if (!(uffdio_api.features & UFFD_FEATURE_PAGEFAULT_FLAG_WP)) {
|
|
ksft_test_result_skip("UFFD_FEATURE_PAGEFAULT_FLAG_WP not available\n");
|
|
goto close_uffd;
|
|
}
|
|
|
|
/* Register UFFD-WP, no need for an actual handler. */
|
|
if (uffd_register(uffd, map, size, false, true, false)) {
|
|
ksft_test_result_fail("UFFDIO_REGISTER_MODE_WP failed\n");
|
|
goto close_uffd;
|
|
}
|
|
|
|
/* Write-protect the range using UFFD-WP. */
|
|
uffd_writeprotect.range.start = (unsigned long) map;
|
|
uffd_writeprotect.range.len = size;
|
|
uffd_writeprotect.mode = UFFDIO_WRITEPROTECT_MODE_WP;
|
|
if (ioctl(uffd, UFFDIO_WRITEPROTECT, &uffd_writeprotect)) {
|
|
ksft_test_result_fail("UFFDIO_WRITEPROTECT failed\n");
|
|
goto close_uffd;
|
|
}
|
|
|
|
if (madvise(map, size, MADV_UNMERGEABLE)) {
|
|
ksft_test_result_fail("MADV_UNMERGEABLE failed\n");
|
|
goto close_uffd;
|
|
}
|
|
|
|
ksft_test_result(!range_maps_duplicates(map, size),
|
|
"Pages were unmerged\n");
|
|
close_uffd:
|
|
close(uffd);
|
|
unmap:
|
|
munmap(map, size);
|
|
}
|
|
#endif
|
|
|
|
/* Verify that KSM can be enabled / queried with prctl. */
|
|
static void test_prctl(void)
|
|
{
|
|
int ret;
|
|
|
|
ksft_print_msg("[RUN] %s\n", __func__);
|
|
|
|
ret = prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0);
|
|
if (ret < 0 && errno == EINVAL) {
|
|
ksft_test_result_skip("PR_SET_MEMORY_MERGE not supported\n");
|
|
return;
|
|
} else if (ret) {
|
|
ksft_test_result_fail("PR_SET_MEMORY_MERGE=1 failed\n");
|
|
return;
|
|
}
|
|
|
|
ret = prctl(PR_GET_MEMORY_MERGE, 0, 0, 0, 0);
|
|
if (ret < 0) {
|
|
ksft_test_result_fail("PR_GET_MEMORY_MERGE failed\n");
|
|
return;
|
|
} else if (ret != 1) {
|
|
ksft_test_result_fail("PR_SET_MEMORY_MERGE=1 not effective\n");
|
|
return;
|
|
}
|
|
|
|
ret = prctl(PR_SET_MEMORY_MERGE, 0, 0, 0, 0);
|
|
if (ret) {
|
|
ksft_test_result_fail("PR_SET_MEMORY_MERGE=0 failed\n");
|
|
return;
|
|
}
|
|
|
|
ret = prctl(PR_GET_MEMORY_MERGE, 0, 0, 0, 0);
|
|
if (ret < 0) {
|
|
ksft_test_result_fail("PR_GET_MEMORY_MERGE failed\n");
|
|
return;
|
|
} else if (ret != 0) {
|
|
ksft_test_result_fail("PR_SET_MEMORY_MERGE=0 not effective\n");
|
|
return;
|
|
}
|
|
|
|
ksft_test_result_pass("Setting/clearing PR_SET_MEMORY_MERGE works\n");
|
|
}
|
|
|
|
static int test_child_ksm(void)
|
|
{
|
|
const unsigned int size = 2 * MiB;
|
|
char *map;
|
|
|
|
/* Test if KSM is enabled for the process. */
|
|
if (prctl(PR_GET_MEMORY_MERGE, 0, 0, 0, 0) != 1)
|
|
return -1;
|
|
|
|
/* Test if merge could really happen. */
|
|
map = __mmap_and_merge_range(0xcf, size, PROT_READ | PROT_WRITE, KSM_MERGE_NONE);
|
|
if (map == MAP_MERGE_FAIL)
|
|
return -2;
|
|
else if (map == MAP_MERGE_SKIP)
|
|
return -3;
|
|
|
|
munmap(map, size);
|
|
return 0;
|
|
}
|
|
|
|
static void test_child_ksm_err(int status)
|
|
{
|
|
if (status == -1)
|
|
ksft_test_result_fail("unexpected PR_GET_MEMORY_MERGE result in child\n");
|
|
else if (status == -2)
|
|
ksft_test_result_fail("Merge in child failed\n");
|
|
else if (status == -3)
|
|
ksft_test_result_skip("Merge in child skipped\n");
|
|
}
|
|
|
|
/* Verify that prctl ksm flag is inherited. */
|
|
static void test_prctl_fork(void)
|
|
{
|
|
int ret, status;
|
|
pid_t child_pid;
|
|
|
|
ksft_print_msg("[RUN] %s\n", __func__);
|
|
|
|
ret = prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0);
|
|
if (ret < 0 && errno == EINVAL) {
|
|
ksft_test_result_skip("PR_SET_MEMORY_MERGE not supported\n");
|
|
return;
|
|
} else if (ret) {
|
|
ksft_test_result_fail("PR_SET_MEMORY_MERGE=1 failed\n");
|
|
return;
|
|
}
|
|
|
|
child_pid = fork();
|
|
if (!child_pid) {
|
|
exit(test_child_ksm());
|
|
} else if (child_pid < 0) {
|
|
ksft_test_result_fail("fork() failed\n");
|
|
return;
|
|
}
|
|
|
|
if (waitpid(child_pid, &status, 0) < 0) {
|
|
ksft_test_result_fail("waitpid() failed\n");
|
|
return;
|
|
}
|
|
|
|
status = WEXITSTATUS(status);
|
|
if (status) {
|
|
test_child_ksm_err(status);
|
|
return;
|
|
}
|
|
|
|
if (prctl(PR_SET_MEMORY_MERGE, 0, 0, 0, 0)) {
|
|
ksft_test_result_fail("PR_SET_MEMORY_MERGE=0 failed\n");
|
|
return;
|
|
}
|
|
|
|
ksft_test_result_pass("PR_SET_MEMORY_MERGE value is inherited\n");
|
|
}
|
|
|
|
static void test_prctl_fork_exec(void)
|
|
{
|
|
int ret, status;
|
|
pid_t child_pid;
|
|
|
|
ksft_print_msg("[RUN] %s\n", __func__);
|
|
|
|
ret = prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0);
|
|
if (ret < 0 && errno == EINVAL) {
|
|
ksft_test_result_skip("PR_SET_MEMORY_MERGE not supported\n");
|
|
return;
|
|
} else if (ret) {
|
|
ksft_test_result_fail("PR_SET_MEMORY_MERGE=1 failed\n");
|
|
return;
|
|
}
|
|
|
|
child_pid = fork();
|
|
if (child_pid == -1) {
|
|
ksft_test_result_skip("fork() failed\n");
|
|
return;
|
|
} else if (child_pid == 0) {
|
|
char *prg_name = "./ksm_functional_tests";
|
|
char *argv_for_program[] = { prg_name, FORK_EXEC_CHILD_PRG_NAME };
|
|
|
|
execv(prg_name, argv_for_program);
|
|
return;
|
|
}
|
|
|
|
if (waitpid(child_pid, &status, 0) > 0) {
|
|
if (WIFEXITED(status)) {
|
|
status = WEXITSTATUS(status);
|
|
if (status) {
|
|
test_child_ksm_err(status);
|
|
return;
|
|
}
|
|
} else {
|
|
ksft_test_result_fail("program didn't terminate normally\n");
|
|
return;
|
|
}
|
|
} else {
|
|
ksft_test_result_fail("waitpid() failed\n");
|
|
return;
|
|
}
|
|
|
|
if (prctl(PR_SET_MEMORY_MERGE, 0, 0, 0, 0)) {
|
|
ksft_test_result_fail("PR_SET_MEMORY_MERGE=0 failed\n");
|
|
return;
|
|
}
|
|
|
|
ksft_test_result_pass("PR_SET_MEMORY_MERGE value is inherited\n");
|
|
}
|
|
|
|
static void test_prctl_unmerge(void)
|
|
{
|
|
const unsigned int size = 2 * MiB;
|
|
char *map;
|
|
|
|
ksft_print_msg("[RUN] %s\n", __func__);
|
|
|
|
map = mmap_and_merge_range(0xcf, size, PROT_READ | PROT_WRITE, KSM_MERGE_PRCTL);
|
|
if (map == MAP_FAILED)
|
|
return;
|
|
|
|
if (prctl(PR_SET_MEMORY_MERGE, 0, 0, 0, 0)) {
|
|
ksft_test_result_fail("PR_SET_MEMORY_MERGE=0 failed\n");
|
|
goto unmap;
|
|
}
|
|
|
|
ksft_test_result(!range_maps_duplicates(map, size),
|
|
"Pages were unmerged\n");
|
|
unmap:
|
|
munmap(map, size);
|
|
}
|
|
|
|
static void test_prot_none(void)
|
|
{
|
|
const unsigned int size = 2 * MiB;
|
|
char *map;
|
|
int i;
|
|
|
|
ksft_print_msg("[RUN] %s\n", __func__);
|
|
|
|
map = mmap_and_merge_range(0x11, size, PROT_NONE, KSM_MERGE_MADVISE);
|
|
if (map == MAP_FAILED)
|
|
goto unmap;
|
|
|
|
/* Store a unique value in each page on one half using ptrace */
|
|
for (i = 0; i < size / 2; i += pagesize) {
|
|
lseek(mem_fd, (uintptr_t) map + i, SEEK_SET);
|
|
if (write(mem_fd, &i, sizeof(i)) != sizeof(i)) {
|
|
ksft_test_result_fail("ptrace write failed\n");
|
|
goto unmap;
|
|
}
|
|
}
|
|
|
|
/* Trigger unsharing on the other half. */
|
|
if (madvise(map + size / 2, size / 2, MADV_UNMERGEABLE)) {
|
|
ksft_test_result_fail("MADV_UNMERGEABLE failed\n");
|
|
goto unmap;
|
|
}
|
|
|
|
ksft_test_result(!range_maps_duplicates(map, size),
|
|
"Pages were unmerged\n");
|
|
unmap:
|
|
munmap(map, size);
|
|
}
|
|
|
|
int main(int argc, char **argv)
|
|
{
|
|
unsigned int tests = 8;
|
|
int err;
|
|
|
|
if (argc > 1 && !strcmp(argv[1], FORK_EXEC_CHILD_PRG_NAME)) {
|
|
exit(test_child_ksm());
|
|
}
|
|
|
|
#ifdef __NR_userfaultfd
|
|
tests++;
|
|
#endif
|
|
|
|
ksft_print_header();
|
|
ksft_set_plan(tests);
|
|
|
|
pagesize = getpagesize();
|
|
|
|
mem_fd = open("/proc/self/mem", O_RDWR);
|
|
if (mem_fd < 0)
|
|
ksft_exit_fail_msg("opening /proc/self/mem failed\n");
|
|
ksm_fd = open("/sys/kernel/mm/ksm/run", O_RDWR);
|
|
if (ksm_fd < 0)
|
|
ksft_exit_skip("open(\"/sys/kernel/mm/ksm/run\") failed\n");
|
|
ksm_full_scans_fd = open("/sys/kernel/mm/ksm/full_scans", O_RDONLY);
|
|
if (ksm_full_scans_fd < 0)
|
|
ksft_exit_skip("open(\"/sys/kernel/mm/ksm/full_scans\") failed\n");
|
|
pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
|
|
if (pagemap_fd < 0)
|
|
ksft_exit_skip("open(\"/proc/self/pagemap\") failed\n");
|
|
proc_self_ksm_stat_fd = open("/proc/self/ksm_stat", O_RDONLY);
|
|
proc_self_ksm_merging_pages_fd = open("/proc/self/ksm_merging_pages",
|
|
O_RDONLY);
|
|
ksm_use_zero_pages_fd = open("/sys/kernel/mm/ksm/use_zero_pages", O_RDWR);
|
|
|
|
test_unmerge();
|
|
test_unmerge_zero_pages();
|
|
test_unmerge_discarded();
|
|
#ifdef __NR_userfaultfd
|
|
test_unmerge_uffd_wp();
|
|
#endif
|
|
|
|
test_prot_none();
|
|
|
|
test_prctl();
|
|
test_prctl_fork();
|
|
test_prctl_fork_exec();
|
|
test_prctl_unmerge();
|
|
|
|
err = ksft_get_fail_cnt();
|
|
if (err)
|
|
ksft_exit_fail_msg("%d out of %d tests failed\n",
|
|
err, ksft_test_num());
|
|
ksft_exit_pass();
|
|
}
|