android_kernel_samsung_msm8976/mm
Sha Zhengju 58cf188ed6 memcg, oom: provide more precise dump info while memcg oom happening
Currently, when a memcg OOM happens, the OOM dump messages still show
global state and provide little useful information for users.  This patch
prints more targeted memcg page statistics for a memcg OOM and takes the
hierarchy into consideration.

Based on Michal's advice, the statistics cover the whole hierarchy below
the memcg whose limit was hit.  Suppose we trigger an OOM on A's limit:

        root_memcg
            |
            A (use_hierarchy=1)
           / \
          B   C
          |
          D

then the printed info will be:

  Memory cgroup stats for /A:...
  Memory cgroup stats for /A/B:...
  Memory cgroup stats for /A/C:...
  Memory cgroup stats for /A/B/D:...
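
The walk over A's subtree can be modelled in userspace as a minimal
sketch. This is not kernel code: the CHILDREN table and both function names
are hypothetical, and breadth-first order is chosen only because it
reproduces the ordering of the four lines above. The kernel's own iteration
order is not described in this message.

```python
from collections import deque

# Hypothetical toy model of the hierarchy above: cgroup path -> child names.
CHILDREN = {
    "/A": ["B", "C"],
    "/A/B": ["D"],
    "/A/C": [],
    "/A/B/D": [],
}


def subtree_paths(root):
    """Return root and every descendant path, visited breadth-first."""
    order = []
    queue = deque([root])
    while queue:
        path = queue.popleft()
        order.append(path)
        for child in CHILDREN.get(path, []):
            queue.append(f"{path}/{child}")
    return order


def dump_header_lines(root):
    """Build one 'Memory cgroup stats' header per cgroup in the subtree."""
    return [f"Memory cgroup stats for {p}:..." for p in subtree_paths(root)]


if __name__ == "__main__":
    for line in dump_header_lines("/A"):
        print(line)
```

In this model, calling subtree_paths("/A/B") yields only /A/B and /A/B/D,
which mirrors the idea that only the subtree under the memcg that hit its
limit is reported.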

Following are samples of oom output:

(1) Before change:

    mal-80 invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
    mal-80 cpuset=/ mems_allowed=0
    Pid: 2976, comm: mal-80 Not tainted 3.7.0+ #10
    Call Trace:
     [<ffffffff8167fbfb>] dump_header+0x83/0x1ca
     ..... (call trace)
     [<ffffffff8168a818>] page_fault+0x28/0x30
                             <<<<<<<<<<<<<<<<<<<<< memcg specific information
    Task in /A/B/D killed as a result of limit of /A
    memory: usage 101376kB, limit 101376kB, failcnt 57
    memory+swap: usage 101376kB, limit 101376kB, failcnt 0
    kmem: usage 0kB, limit 9007199254740991kB, failcnt 0
                             <<<<<<<<<<<<<<<<<<<<< print per cpu pageset stat
    Mem-Info:
    Node 0 DMA per-cpu:
    CPU    0: hi:    0, btch:   1 usd:   0
    ......
    CPU    3: hi:    0, btch:   1 usd:   0
    Node 0 DMA32 per-cpu:
    CPU    0: hi:  186, btch:  31 usd: 173
    ......
    CPU    3: hi:  186, btch:  31 usd: 130
                             <<<<<<<<<<<<<<<<<<<<< print global page state
    active_anon:92963 inactive_anon:40777 isolated_anon:0
     active_file:33027 inactive_file:51718 isolated_file:0
     unevictable:0 dirty:3 writeback:0 unstable:0
     free:729995 slab_reclaimable:6897 slab_unreclaimable:6263
     mapped:20278 shmem:35971 pagetables:5885 bounce:0
     free_cma:0
                             <<<<<<<<<<<<<<<<<<<<< print per zone page state
    Node 0 DMA free:15836kB ... all_unreclaimable? no
    lowmem_reserve[]: 0 3175 3899 3899
    Node 0 DMA32 free:2888564kB ... all_unreclaimable? no
    lowmem_reserve[]: 0 0 724 724
    lowmem_reserve[]: 0 0 0 0
    Node 0 DMA: 1*4kB (U) ... 3*4096kB (M) = 15836kB
    Node 0 DMA32: 41*4kB (UM) ... 702*4096kB (MR) = 2888316kB
    120710 total pagecache pages
    0 pages in swap cache
                             <<<<<<<<<<<<<<<<<<<<< print global swap cache stat
    Swap cache stats: add 0, delete 0, find 0/0
    Free swap  = 499708kB
    Total swap = 499708kB
    1040368 pages RAM
    58678 pages reserved
    169065 pages shared
    173632 pages non-shared
    [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
    [ 2693]     0  2693     6005     1324      17        0             0 god
    [ 2754]     0  2754     6003     1320      16        0             0 god
    [ 2811]     0  2811     5992     1304      18        0             0 god
    [ 2874]     0  2874     6005     1323      18        0             0 god
    [ 2935]     0  2935     8720     7742      21        0             0 mal-30
    [ 2976]     0  2976    21520    17577      42        0             0 mal-80
    Memory cgroup out of memory: Kill process 2976 (mal-80) score 665 or sacrifice child
    Killed process 2976 (mal-80) total-vm:86080kB, anon-rss:69964kB, file-rss:344kB

We can see that the messages dumped by show_free_areas() are lengthy and
provide very little information about the memcg that just hit OOM.

(2) After change:

    mal-80 invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
    mal-80 cpuset=/ mems_allowed=0
    Pid: 2704, comm: mal-80 Not tainted 3.7.0+ #10
    Call Trace:
     [<ffffffff8167fd0b>] dump_header+0x83/0x1d1
     .......(call trace)
     [<ffffffff8168a918>] page_fault+0x28/0x30
    Task in /A/B/D killed as a result of limit of /A
                             <<<<<<<<<<<<<<<<<<<<< memcg specific information
    memory: usage 102400kB, limit 102400kB, failcnt 140
    memory+swap: usage 102400kB, limit 102400kB, failcnt 0
    kmem: usage 0kB, limit 9007199254740991kB, failcnt 0
    Memory cgroup stats for /A: cache:32KB rss:30984KB mapped_file:0KB swap:0KB inactive_anon:6912KB active_anon:24072KB inactive_file:32KB active_file:0KB unevictable:0KB
    Memory cgroup stats for /A/B: cache:0KB rss:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
    Memory cgroup stats for /A/C: cache:0KB rss:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
    Memory cgroup stats for /A/B/D: cache:32KB rss:71352KB mapped_file:0KB swap:0KB inactive_anon:6656KB active_anon:64696KB inactive_file:16KB active_file:16KB unevictable:0KB
    [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
    [ 2260]     0  2260     6006     1325      18        0             0 god
    [ 2383]     0  2383     6003     1319      17        0             0 god
    [ 2503]     0  2503     6004     1321      18        0             0 god
    [ 2622]     0  2622     6004     1321      16        0             0 god
    [ 2695]     0  2695     8720     7741      22        0             0 mal-30
    [ 2704]     0  2704    21520    17839      43        0             0 mal-80
    Memory cgroup out of memory: Kill process 2704 (mal-80) score 669 or sacrifice child
    Killed process 2704 (mal-80) total-vm:86080kB, anon-rss:71016kB, file-rss:340kB

This version provides more targeted memcg information in the "Memory cgroup
stats for XXX" section.
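
To illustrate how the new section can be read, the sketch below parses the
four "Memory cgroup stats" lines from sample (2) and sums cache and rss over
the hierarchy. The function names and regular expressions are assumptions
made for this example, not part of the patch. In this particular sample,
where swap is 0KB everywhere, the sum equals the reported "memory: usage"
value of 102400kB; that equality is an observation about this sample, not a
general guarantee.

```python
import re

# The four stats lines copied from the "After change" sample.
SAMPLE = """\
Memory cgroup stats for /A: cache:32KB rss:30984KB mapped_file:0KB swap:0KB inactive_anon:6912KB active_anon:24072KB inactive_file:32KB active_file:0KB unevictable:0KB
Memory cgroup stats for /A/B: cache:0KB rss:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Memory cgroup stats for /A/C: cache:0KB rss:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Memory cgroup stats for /A/B/D: cache:32KB rss:71352KB mapped_file:0KB swap:0KB inactive_anon:6656KB active_anon:64696KB inactive_file:16KB active_file:16KB unevictable:0KB
"""

# Hypothetical patterns matching the line layout shown in the sample.
LINE_RE = re.compile(r"Memory cgroup stats for (\S+): (.*)")
FIELD_RE = re.compile(r"(\w+):(\d+)KB")


def parse_stats(text):
    """Map each cgroup path to a dict of counter name -> value in KB."""
    stats = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            path, fields = match.groups()
            stats[path] = {name: int(kb) for name, kb in FIELD_RE.findall(fields)}
    return stats


def charged_kb(stats):
    """Sum cache + rss over every cgroup in the parsed dump."""
    return sum(s["cache"] + s["rss"] for s in stats.values())


if __name__ == "__main__":
    parsed = parse_stats(SAMPLE)
    for path, counters in parsed.items():
        print(path, counters["cache"] + counters["rss"], "KB")
    print("total", charged_kb(parsed), "KB")
```

The per-cgroup totals also show at a glance that /A/B/D holds most of the
charged memory, which the global dump in sample (1) could not reveal.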

Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:08 -08:00
..
backing-dev.c bdi: allow block devices to say that they require stable page writes 2013-02-21 17:22:19 -08:00
balloon_compaction.c
bootmem.c mm: Add alloc_bootmem_low_pages_nopanic() 2013-01-29 19:32:59 -08:00
bounce.c block: optionally snapshot page contents to provide stable pages during write 2013-02-21 17:22:20 -08:00
cleancache.c
compaction.c mm: compaction: partially revert capture of suitable high-order page 2013-01-11 14:54:56 -08:00
debug-pagealloc.c
dmapool.c
fadvise.c
failslab.c
filemap.c mm: only enforce stable page writes if the backing device requires it 2013-02-21 17:22:19 -08:00
filemap_xip.c
fremap.c
frontswap.c
highmem.c Some nice cleanups, and even a patch my wife did as a "live" demo for 2012-12-20 08:37:05 -08:00
huge_memory.c thp: avoid dumping huge zero page 2013-02-05 20:38:46 +11:00
hugetlb.c mm/hugetlb: set PTE as huge in hugetlb_change_protection and remove_migration_pte 2013-02-05 20:38:47 +11:00
hugetlb_cgroup.c mm/hugetlb: create hugetlb cgroup file in hugetlb_init 2012-12-18 15:02:15 -08:00
hwpoison-inject.c
init-mm.c
internal.h mm: compaction: partially revert capture of suitable high-order page 2013-01-11 14:54:56 -08:00
interval_tree.c
Kconfig Merge branch 'akpm' (incoming from Andrew) 2013-02-21 17:38:49 -08:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c mm/kmemleak.c: remove obsolete simple_strtoul 2012-12-18 15:02:15 -08:00
ksm.c ksm: make rmap walks more scalable 2012-12-20 07:06:56 -08:00
maccess.c
madvise.c
Makefile
memblock.c memblock: Add memblock_mem_size() 2013-01-29 19:32:57 -08:00
memcontrol.c memcg, oom: provide more precise dump info while memcg oom happening 2013-02-23 17:50:08 -08:00
memory-failure.c
memory.c mm: reinstante dropped pmd_trans_splitting() check 2013-01-09 08:36:54 -08:00
memory_hotplug.c mm/memory_hotplug.c: improve comments 2012-12-18 15:02:15 -08:00
mempolicy.c mm: mempolicy: Convert shared_policy mutex to spinlock 2013-01-02 17:32:13 -08:00
mempool.c
migrate.c mm/hugetlb: set PTE as huge in hugetlb_change_protection and remove_migration_pte 2013-02-05 20:38:47 +11:00
mincore.c
mlock.c mm: don't overwrite mm->def_flags in do_mlockall() 2013-02-12 14:34:00 -08:00
mm_init.c
mmap.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-19 18:19:48 -08:00
mmu_context.c
mmu_notifier.c
mmzone.c
mprotect.c mm/mprotect.c: coding-style cleanups 2012-12-18 15:02:15 -08:00
mremap.c sched: Move sched.h sysctl bits into separate header 2013-02-07 20:50:54 +01:00
msync.c
nobootmem.c mm: Add alloc_bootmem_low_pages_nopanic() 2013-01-29 19:32:59 -08:00
nommu.c sched: Move sched.h sysctl bits into separate header 2013-02-07 20:50:54 +01:00
oom_kill.c memcg, oom: provide more precise dump info while memcg oom happening 2013-02-23 17:50:08 -08:00
page-writeback.c block: optionally snapshot page contents to provide stable pages during write 2013-02-21 17:22:20 -08:00
page_alloc.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-19 18:19:48 -08:00
page_cgroup.c
page_io.c
page_isolation.c mm: fix zone_watermark_ok_safe() accounting of isolated pages 2013-01-04 16:11:46 -08:00
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c
pgtable-generic.c
process_vm_access.c
quicklist.c
readahead.c
rmap.c s390/mm: implement software dirty bits 2013-02-14 15:55:23 +01:00
shmem.c mempolicy: remove arg from mpol_parse_str, mpol_to_str 2013-01-02 09:27:10 -08:00
slab.c memcg: add comments clarifying aspects of cache attribute propagation 2012-12-18 15:02:15 -08:00
slab.h slab: propagate tunable values 2012-12-18 15:02:14 -08:00
slab_common.c slab: propagate tunable values 2012-12-18 15:02:14 -08:00
slob.c sl[au]b: always get the cache from its page in kmem_cache_free() 2012-12-18 15:02:14 -08:00
slub.c slub: drop mutex before deleting sysfs entry 2012-12-18 15:02:15 -08:00
sparse-vmemmap.c
sparse.c
swap.c
swap_state.c
swapfile.c
truncate.c mm: drop vmtruncate 2012-12-20 18:46:29 -05:00
util.c
vmalloc.c
vmscan.c MM: vmscan: remove __devinit attribute. 2013-01-03 15:57:13 -08:00
vmstat.c