android_kernel_google_msm/mm
Johannes Weiner 162692c2f8 mm: vmscan: clear kswapd's special reclaim powers before exiting
commit 71abdc15ad upstream.

When kswapd exits, it can end up taking locks that were previously held
by allocating tasks while they waited for reclaim.  Lockdep currently
warns about this:

On Wed, May 28, 2014 at 06:06:34PM +0800, Gu Zheng wrote:
>  inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage.
>  kswapd2/1151 [HC0[0]:SC0[0]:HE1:SE1] takes:
>   (&sig->group_rwsem){+++++?}, at: exit_signals+0x24/0x130
>  {RECLAIM_FS-ON-W} state was registered at:
>     mark_held_locks+0xb9/0x140
>     lockdep_trace_alloc+0x7a/0xe0
>     kmem_cache_alloc_trace+0x37/0x240
>     flex_array_alloc+0x99/0x1a0
>     cgroup_attach_task+0x63/0x430
>     attach_task_by_pid+0x210/0x280
>     cgroup_procs_write+0x16/0x20
>     cgroup_file_write+0x120/0x2c0
>     vfs_write+0xc0/0x1f0
>     SyS_write+0x4c/0xa0
>     tracesys+0xdd/0xe2
>  irq event stamp: 49
>  hardirqs last  enabled at (49):  _raw_spin_unlock_irqrestore+0x36/0x70
>  hardirqs last disabled at (48):  _raw_spin_lock_irqsave+0x2b/0xa0
>  softirqs last  enabled at (0):  copy_process.part.24+0x627/0x15f0
>  softirqs last disabled at (0):            (null)
>
>  other info that might help us debug this:
>   Possible unsafe locking scenario:
>
>         CPU0
>         ----
>    lock(&sig->group_rwsem);
>    <Interrupt>
>      lock(&sig->group_rwsem);
>
>   *** DEADLOCK ***
>
>  no locks held by kswapd2/1151.
>
>  stack backtrace:
>  CPU: 30 PID: 1151 Comm: kswapd2 Not tainted 3.10.39+ #4
>  Call Trace:
>    dump_stack+0x19/0x1b
>    print_usage_bug+0x1f7/0x208
>    mark_lock+0x21d/0x2a0
>    __lock_acquire+0x52a/0xb60
>    lock_acquire+0xa2/0x140
>    down_read+0x51/0xa0
>    exit_signals+0x24/0x130
>    do_exit+0xb5/0xa50
>    kthread+0xdb/0x100
>    ret_from_fork+0x7c/0xb0

This is because the kswapd thread is still marked as a reclaimer at the
time of exit.  But because it is exiting, nobody is actually waiting on
it to make reclaim progress anymore, and it's nothing but a regular
thread at this point.  Be tidy and strip it of all its powers
(PF_MEMALLOC, PF_SWAPWRITE, PF_KSWAPD, and the lockdep reclaim state)
before returning from the thread function.
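
As a rough userspace sketch of the cleanup described above (not the actual
patch to mm/vmscan.c), the following models a task's flag word and shows the
reclaim-related bits being cleared on exit.  The struct, the function names,
the DEMO_ constants and their values are all hypothetical stand-ins chosen for
illustration; in the kernel the corresponding state lives in task_struct and
in lockdep.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's PF_* task flags.  The values
 * are assumptions made for this demo and need not match any kernel. */
#define DEMO_PF_MEMALLOC  0x00000800u
#define DEMO_PF_KSWAPD    0x00040000u
#define DEMO_PF_KTHREAD   0x00200000u
#define DEMO_PF_SWAPWRITE 0x00800000u

/* Minimal model of the per-task state that the commit message mentions. */
struct demo_task {
	unsigned int flags;     /* models task_struct flags */
	void *reclaim_state;    /* models the task's reclaim_state pointer */
	int lockdep_reclaim;    /* models the lockdep reclaim state (0 or 1) */
};

/* Mark the task as a reclaimer, as kswapd does when it starts. */
void demo_kswapd_enter(struct demo_task *tsk, void *reclaim_state)
{
	tsk->reclaim_state = reclaim_state;
	tsk->flags |= DEMO_PF_MEMALLOC | DEMO_PF_SWAPWRITE | DEMO_PF_KSWAPD;
	tsk->lockdep_reclaim = 1;
}

/* Strip every reclaim power before the thread function returns, so the
 * exit path runs as a regular thread.  Unrelated flags are preserved. */
void demo_kswapd_exit(struct demo_task *tsk)
{
	tsk->flags &= ~(DEMO_PF_MEMALLOC | DEMO_PF_SWAPWRITE | DEMO_PF_KSWAPD);
	tsk->reclaim_state = NULL;
	tsk->lockdep_reclaim = 0;
}
```

Calling demo_kswapd_exit() on a task that also carries DEMO_PF_KTHREAD leaves
only that bit set, which mirrors the intent that the exiting thread keeps its
ordinary identity and loses only the reclaim-specific state.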

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-30 20:01:31 -07:00
backing-dev.c
bootmem.c mm: sparse: fix usemap allocation above node descriptor section 2012-10-02 10:30:36 -07:00
bounce.c
cleancache.c
compaction.c mm: compaction: fix echo 1 > compact_memory return error issue 2013-01-17 08:50:43 -08:00
debug-pagealloc.c
dmapool.c mm: dmapool: use provided gfp flags for all dma_alloc_coherent() calls 2012-12-17 10:37:44 -08:00
fadvise.c mm/fadvise.c: drain all pagevecs if POSIX_FADV_DONTNEED fails to discard all pages 2013-02-28 06:59:01 -08:00
failslab.c
filemap.c
filemap_xip.c
fremap.c
highmem.c mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address 2014-06-11 12:04:22 -07:00
huge_memory.c mm/huge_memory.c: fix potential NULL pointer dereference 2013-09-26 17:15:51 -07:00
hugetlb.c mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages() 2014-06-07 16:01:57 -07:00
hwpoison-inject.c
init-mm.c
internal.h mm: setup pageblock_order before it's used by sparsemem 2014-02-20 10:45:32 -08:00
Kconfig
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c
ksm.c
maccess.c
madvise.c mm: Hold a file reference in madvise_remove 2012-07-16 09:04:43 -07:00
Makefile
memblock.c x86, mm: Trim memory in memblock to be page aligned 2012-10-31 10:02:56 -07:00
memcontrol.c memcg: fix multiple large threshold notifications 2013-09-26 17:15:50 -07:00
memory-failure.c mm/memory-failure.c: don't let collect_procs() skip over processes for MF_ACTION_REQUIRED 2014-06-30 20:01:31 -07:00
memory.c mm: make fixup_user_fault() check the vma access rights too 2014-06-07 16:02:00 -07:00
memory_hotplug.c mm/hotplug: correctly add new zone to all other nodes' zone lists 2014-03-11 16:10:04 -07:00
mempolicy.c tmpfs mempolicy: fix /proc/mounts corrupting memory 2013-01-11 09:06:49 -08:00
mempool.c
migrate.c mm: migration: add migrate_entry_wait_huge() 2013-06-20 11:58:46 -07:00
mincore.c
mlock.c
mm_init.c
mmap.c mm: do not grow the stack vma just because of an overrun on preceding vma 2013-10-22 09:02:25 +01:00
mmu_context.c
mmu_notifier.c mm: mmu_notifier: re-fix freed page still mapped in secondary MMU 2013-06-07 12:49:25 -07:00
mmzone.c
mprotect.c
mremap.c
msync.c
nobootmem.c memblock: free allocated memblock_reserved_regions later 2012-07-16 09:04:45 -07:00
nommu.c vm: add no-mmu vm_iomap_memory() stub 2013-08-20 08:26:27 -07:00
oom_kill.c mm, memcg: give exiting processes access to memory reserves 2013-10-05 07:06:54 -07:00
page-writeback.c mm: __set_page_dirty_nobuffers() uses spin_lock_irqsave() instead of spin_lock_irq() 2014-02-20 10:45:32 -08:00
page_alloc.c mm: setup pageblock_order before it's used by sparsemem 2014-02-20 10:45:32 -08:00
page_cgroup.c
page_io.c
page_isolation.c
pagewalk.c mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas 2013-06-07 12:49:28 -07:00
percpu-km.c
percpu-vm.c
percpu.c percpu: make pcpu_alloc_chunk() use pcpu_mem_free() instead of kfree() 2014-06-07 16:02:03 -07:00
pgtable-generic.c
prio_tree.c
process_vm_access.c Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys 2013-03-14 11:29:51 -07:00
quicklist.c
readahead.c
rmap.c mm: fix sleeping function warning from __put_anon_vma 2014-06-30 20:01:31 -07:00
shmem.c tmpfs: fix use-after-free of mempolicy object 2013-02-28 06:59:01 -08:00
slab.c slab: fix the DEADLOCK issue on l3 alien lock 2012-10-13 05:38:37 +09:00
slob.c
slub.c slub: Fix calculation of cpu slabs 2014-02-13 11:51:09 -08:00
sparse-vmemmap.c
sparse.c mm: setup pageblock_order before it's used by sparsemem 2014-02-20 10:45:32 -08:00
swap.c mm: hugetlbfs: fix hugetlbfs optimization 2014-02-06 11:05:46 -08:00
swap_state.c swap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O completion 2013-06-20 11:58:45 -07:00
swapfile.c swap: fix shmem swapping when more than 8 areas 2012-06-22 11:36:55 -07:00
thrash.c
truncate.c mm: fix invalidate_complete_page2() lock ordering 2012-10-13 05:38:51 +09:00
util.c
vmalloc.c mm: fix faulty initialization in vmalloc_init() 2012-06-10 00:36:06 +09:00
vmscan.c mm: vmscan: clear kswapd's special reclaim powers before exiting 2014-06-30 20:01:31 -07:00
vmstat.c