android_kernel_google_msm/mm
Johannes Weiner 8c8971e883 mm: keep page cache radix tree nodes in check
Previously, page cache radix tree nodes were freed after reclaim emptied
out their page pointers.  But now reclaim stores shadow entries in their
place, which are only reclaimed when the inodes themselves are
reclaimed.  This is problematic for bigger files that are still in use
after they have a significant amount of their cache reclaimed, without
any of those pages actually refaulting.  The shadow entries will just
sit there and waste memory.  In the worst case, the shadow entries will
accumulate until the machine runs out of memory.

To get this under control, the VM will track radix tree nodes
exclusively containing shadow entries on a per-NUMA node list.  Per-NUMA
rather than global because we expect the radix tree nodes themselves to
be allocated node-locally and we want to reduce cross-node references of
otherwise independent cache workloads.  A simple shrinker will then
reclaim these nodes on memory pressure.

A few things need to be stored in the radix tree node to implement the
shadow node LRU and allow tree deletions coming from the list:

1. There is no index available that would describe the reverse path
   from the node up to the tree root, which is needed to perform a
   deletion.  To solve this, encode in each node its offset inside the
   parent.  This can be stored in the unused upper bits of the same
   member that stores the node's height at no extra space cost.

2. The number of shadow entries needs to be counted in addition to the
   regular entries, to quickly detect when the node is ready to go to
   the shadow node LRU list.  The current entry count is an unsigned
   int but the maximum number of entries is 64, so a shadow counter
   can easily be stored in the unused upper bits.

3. Tree modification needs tree lock and tree root, which are located
   in the address space, so store an address_space backpointer in the
   node.  The parent pointer of the node is in a union with the 2-word
   rcu_head, so the backpointer comes at no extra cost as well.

4. The node needs to be linked to an LRU list, which requires a list
   head inside the node.  This does increase the size of the node, but
   it does not change the number of objects that fit into a slab page.

[akpm@linux-foundation.org: export the right function]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Metin Doslu <metin@citusdata.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ozgun Erdogan <ozgun@citusdata.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Ryan Mallon <rmallon@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Change-Id: I6da4642d842f91615747957e7f54a5c2d4593427
2021-11-26 22:02:17 +01:00
..
backing-dev.c Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 2017-12-27 17:13:15 +03:00
bootmem.c mm: sparse: fix usemap allocation above node descriptor section 2016-10-29 23:12:12 +08:00
bounce.c
cleancache.c
compaction.c Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 2017-12-27 17:13:15 +03:00
debug-pagealloc.c
dmapool.c
fadvise.c
failslab.c
filemap.c lib/radix-tree.c: make radix_tree_node_alloc() work correctly within interrupt 2020-12-07 21:02:05 +03:00
filemap_xip.c mm: kill vma flag VM_CAN_NONLINEAR 2020-11-29 16:11:40 +03:00
fremap.c mm: kill vma flag VM_CAN_NONLINEAR 2020-11-29 16:11:40 +03:00
highmem.c mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address 2014-06-11 12:04:22 -07:00
huge_memory.c mm, thp: fix collapsing of hugepages on madvise 2015-02-02 17:05:07 +08:00
hugetlb.c Fix incomplete backport of commit 0f792cf949a0 2016-10-26 23:15:44 +08:00
hwpoison-inject.c
init-mm.c
internal.h Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 2017-12-27 17:13:15 +03:00
Kconfig BACKPORT: mm/zsmalloc: add statistics support 2018-01-01 21:27:09 +03:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c mm: kmemleak: allow safe memory scanning during kmemleak disabling 2015-10-22 09:20:06 +08:00
ksm.c Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 2017-12-27 17:13:15 +03:00
list_lru.c mm: keep page cache radix tree nodes in check 2021-11-26 22:02:17 +01:00
maccess.c
madvise.c mm/fs: route MADV_REMOVE to FALLOC_FL_PUNCH_HOLE 2020-12-07 21:00:58 +03:00
Makefile list: add a new LRU list type 2021-11-26 21:56:07 +01:00
memblock.c Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 2017-12-27 17:13:15 +03:00
memcontrol.c shmem: replace page if mapping excludes its zone 2020-12-07 20:57:06 +03:00
memory-failure.c Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 2017-12-27 17:13:15 +03:00
memory.c Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 2017-12-27 17:13:15 +03:00
memory_hotplug.c Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 2017-12-27 17:13:15 +03:00
mempolicy.c Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 2017-12-27 17:13:15 +03:00
mempool.c
migrate.c BACKPORT: Sanitize 'move_pages()' permission checks 2018-01-13 17:13:40 +03:00
mincore.c swap: make each swap partition have one address_space 2018-01-01 22:02:05 +03:00
mlock.c Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 2017-12-27 17:13:15 +03:00
mm_init.c
mmap.c mm: allow drivers to prevent new writable mappings 2020-12-07 21:08:09 +03:00
mmu_context.c
mmu_notifier.c mm: mmu_notifier: re-fix freed page still mapped in secondary MMU 2013-06-07 12:49:25 -07:00
mmzone.c
mprotect.c mm: add a field to store names for private anonymous memory 2013-10-11 10:02:06 -07:00
mremap.c
msync.c
nobootmem.c
nommu.c mm: kill vma flag VM_CAN_NONLINEAR 2020-11-29 16:11:40 +03:00
oom_kill.c Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 2017-12-27 17:13:15 +03:00
page-writeback.c mm: fix calculation of dirtyable memory 2016-10-29 23:12:16 +08:00
page_alloc.c mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces 2020-12-01 19:08:36 +01:00
page_cgroup.c cgroup/kmemleak: add kmemleak_free() for cgroup deallocations. 2015-02-02 17:05:07 +08:00
page_io.c
page_isolation.c
pagewalk.c mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas 2013-06-07 12:49:28 -07:00
percpu-km.c
percpu-vm.c percpu: perform tlb flush after pcpu_map_pages() failure 2014-12-01 18:02:23 +08:00
percpu.c Revert "percpu: free percpu allocation info for uniprocessor system" 2015-02-02 17:04:38 +08:00
pgtable-generic.c
prio_tree.c
process_vm_access.c
quicklist.c
readahead.c mm: change initial readahead window size calculation 2016-10-29 23:12:18 +08:00
rmap.c mm: fix anon_vma->degree underflow in anon_vma endless growing prevention 2015-04-14 17:34:04 +08:00
shmem.c shmem: update memory reservation on truncate 2020-12-23 16:15:47 +03:00
slab.c cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags 2014-12-01 18:02:38 +08:00
slob.c
slub.c Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 2017-12-27 17:13:15 +03:00
sparse-vmemmap.c
sparse.c Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 2017-12-27 17:13:15 +03:00
swap.c swap: make each swap partition have one address_space 2018-01-01 22:02:05 +03:00
swap_state.c mm: allow drivers to prevent new writable mappings 2020-12-07 21:08:09 +03:00
swapfile.c vfs: make path_openat take a struct filename pointer 2018-12-07 22:28:48 +04:00
thrash.c
truncate.c mm/fs: remove truncate_range 2020-12-07 20:57:30 +03:00
util.c swap: make each swap partition have one address_space 2018-01-01 22:02:05 +03:00
vmalloc.c mm/vmalloc.c: fix kernel BUG at mm/vmalloc.c:512! 2020-12-01 19:08:45 +01:00
vmscan.c mm: new shrinker API 2020-11-29 16:11:30 +03:00
vmstat.c Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 2017-12-27 17:13:15 +03:00
zpool.c BACKPORT: mm/zpool: add name argument to create zpool 2018-01-01 21:27:09 +03:00
zsmalloc.c UPSTREAM: zsmalloc: fix a null pointer dereference in destroy_handle_cache() 2018-01-01 21:27:14 +03:00