The bad pages that need to be retired are not all allocated
in the same poison consumption process, so an interface is
needed to query the processes that allocate the bad pages.
By killing all the processes that allocate the bad pages,
the bad pages can be reserved immediately.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set the dirty bit when the memory resource is not cleared
during BO release.
v2(Christian):
- Drop the cleared flag set to false.
- Improve the amdgpu_vram_mgr_set_clear_state() function.
v3:
- Add back the resource clear flag set function call after
being cleared during eviction (Christian).
- Modified the patch subject name.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add clear page support in vram memory region.
v1(Christian):
- Dont handle clear page as TTM flag since when moving the BO back
in from GTT again we don't need that.
- Make a specialized version of amdgpu_fill_buffer() which only
clears the VRAM areas which are not already cleared
- Drop the TTM_PL_FLAG_WIPE_ON_RELEASE check in
amdgpu_object.c
v2:
- Modify the function name amdgpu_ttm_* (Alex)
- Drop the delayed parameter (Christian)
- handle amdgpu_res_cleared(&cursor) just above the size
calculation (Christian)
- Use AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE for clearing the buffers
in the free path to properly wait for fences etc.. (Christian)
v3(Christian):
- Remove buffer clear code in VRAM manager instead change the
AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE handling to set
the DRM_BUDDY_CLEARED flag.
- Remove ! from amdgpu_res_cleared(&cursor) check.
v4(Christian):
- vres flag setting move to vram manager file
- use dma_fence_get_stub in amdgpu_ttm_clear_buffer function
- make fence a mandatory parameter and drop the if and the get/put dance
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240419063538.11957-2-Arunpravin.PaneerSelvam@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
User reported gpu page fault when running graphics applications
and in some cases garbaged graphics are observed as soon as X
starts. This patch fixes all the issues.
Fixed the typecast issue for fpfn and lpfn variables, thus
preventing the overflow problem which resolves the memory
corruption.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20220714101214.7620-1-Arunpravin.PaneerSelvam@amd.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
- Switch to drm buddy allocator
- Add resource cursor support for drm buddy
v2(Matthew Auld):
- replace spinlock with mutex as we call kmem_cache_zalloc
(..., GFP_KERNEL) in drm_buddy_alloc() function
- lock drm_buddy_block_trim() function as it calls
mark_free/mark_split are all globally visible
v3(Matthew Auld):
- remove trim method error handling as we address the failure case
at drm_buddy_block_trim() function
v4:
- fix warnings reported by kernel test robot <lkp@intel.com>
v5:
- fix merge conflict issue
v6:
- fix warnings reported by kernel test robot <lkp@intel.com>
v7:
- remove DRM_BUDDY_RANGE_ALLOCATION flag usage
v8:
- keep DRM_BUDDY_RANGE_ALLOCATION flag usage
- resolve conflicts created by drm/amdgpu: remove VRAM accounting v2
v9(Christian):
- merged the below patch
- drm/amdgpu: move vram inline functions into a header
- rename label name as fallback
- move struct amdgpu_vram_mgr to amdgpu_vram_mgr.h
- remove unnecessary flags from struct amdgpu_vram_reservation
- rewrite block NULL check condition
- change else style as per coding standard
- rewrite the node max size
- add a helper function to fetch the first entry from the list
v10(Christian):
- rename amdgpu_get_node() function name as amdgpu_vram_mgr_first_block
v11:
- if size is not aligned with min_page_size, enable is_contiguous flag,
therefore, the size round up to the power of two and trimmed to the
original size.
v12:
- rename the function names having prefix as amdgpu_vram_mgr_*()
- modify the round_up() logic conforming to contiguous flag enablement
or if size is not aligned to min_block_size
- modify the trim logic
- rename node as block wherever applicable
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407224843.2416-1-Arunpravin.PaneerSelvam@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>