android_kernel_google_msm/mm
Wu Fengguang 9d823e8f6b writeback: per task dirty rate limit
Add two fields to task_struct.

1) account dirtied pages in the individual tasks, for accuracy
2) per-task balance_dirty_pages() call intervals, for flexibility

The balance_dirty_pages() call interval (ie. nr_dirtied_pause) will
scale near-sqrt to the safety gap between dirty pages and threshold.

The main problem of per-task nr_dirtied is, if 1k+ tasks start dirtying
pages at exactly the same time, each task will be assigned a large
initial nr_dirtied_pause, so that the dirty threshold will be exceeded
long before each task reached its nr_dirtied_pause and hence call
balance_dirty_pages().

The solution is to watch for the number of pages dirtied on each CPU in
between the calls into balance_dirty_pages(). If it exceeds ratelimit_pages
(3% dirty threshold), force call balance_dirty_pages() for a chance to
set bdi->dirty_exceeded. In normal situations, this safeguarding
condition is not expected to trigger at all.

On the sqrt in dirty_poll_interval():

It will serve as an initial guess when dirty pages are still in the
freerun area.

When dirty pages are floating inside the dirty control scope [freerun,
limit], a followup patch will use some refined dirty poll interval to
get the desired pause time.

   thresh-dirty (MB)    sqrt
		   1      16
		   2      22
		   4      32
		   8      45
		  16      64
		  32      90
		  64     128
		 128     181
		 256     256
		 512     362
		1024     512

The above table means, given 1MB (or 1GB) gap and the dd tasks polling
balance_dirty_pages() on every 16 (or 512) pages, the dirty limit won't
be exceeded as long as there are less than 16 (or 512) concurrent dd's.

So sqrt naturally leads to less overheads and more safe concurrent tasks
for large memory servers, which have large (thresh-freerun) gaps.

peter: keep the per-CPU ratelimit for safeguarding the 1k+ tasks case

CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Andrea Righi <andrea@betterlinux.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-10-03 21:08:57 +08:00
..
backing-dev.c writeback: stabilize bdi->dirty_ratelimit 2011-10-03 21:08:57 +08:00
bootmem.c
bounce.c
cleancache.c mm: cleancache core ops functions and config 2011-05-26 10:01:36 -06:00
compaction.c mm: compaction: abort compaction if too many pages are isolated and caller is asynchronous V2 2011-06-15 20:04:02 -07:00
debug-pagealloc.c
dmapool.c devres: fix possible use after free 2011-07-25 20:57:14 -07:00
fadvise.c
failslab.c fault-injection: add ability to export fault_attr in arbitrary directory 2011-08-03 14:25:20 -10:00
filemap.c mm: account skipped entries to avoid looping in find_get_pages 2011-09-14 18:17:56 -07:00
filemap_xip.c mm: Convert i_mmap_lock to a mutex 2011-05-25 08:39:18 -07:00
fremap.c mm: don't access vm_flags as 'int' 2011-05-26 09:20:31 -07:00
highmem.c mm: make HASHED_PAGE_VIRTUAL page_address' struct page argument const. 2011-08-17 13:00:20 -07:00
huge_memory.c mm/huge_memory.c: minor lock simplification in __khugepaged_exit 2011-07-25 20:57:09 -07:00
hugetlb.c mm: hugetlb: fix coding style issues 2011-07-25 20:57:09 -07:00
hwpoison-inject.c
init-mm.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
internal.h mm: nommu: sort mm->mmap list properly 2011-05-25 08:39:05 -07:00
Kconfig mm Kconfig typo: cleancacne -> cleancache 2011-06-10 14:47:52 +02:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
ksm.c ksm: fix NULL pointer dereference in scan_get_next_rmap_item() 2011-06-15 20:04:02 -07:00
maccess.c maccess,probe_kernel: Make write/read src const void * 2011-05-25 19:56:23 -04:00
madvise.c fs: kill i_alloc_sem 2011-07-20 20:47:46 -04:00
Makefile mm: cleancache core ops functions and config 2011-05-26 10:01:36 -06:00
memblock.c mm/memblock.c: avoid abuse of RED_INACTIVE 2011-07-25 20:57:09 -07:00
memcontrol.c memcg: Revert "memcg: add memory.vmscan_stat" 2011-09-14 18:09:38 -07:00
memory-failure.c HWPoison: add memory_failure_queue() 2011-08-03 11:15:58 -04:00
memory.c mm/futex: fix futex writes on archs with SW tracking of dirty & young 2011-07-25 20:57:11 -07:00
memory_hotplug.c mm: extend memory hotplug API to allow memory hotplug in virtual machines 2011-07-25 20:57:08 -07:00
mempolicy.c mm/mempolicy.c: make copy_from_user() provably correct 2011-09-14 18:09:36 -07:00
mempool.c
migrate.c migrate: don't account swapcache as shmem 2011-06-16 15:01:24 -07:00
mincore.c mm: clarify the radix_tree exceptional cases 2011-08-03 14:25:24 -10:00
mlock.c mm: don't access vm_flags as 'int' 2011-05-26 09:20:31 -07:00
mm_init.c
mmap.c mmap: fix and tidy up overcommit page arithmetic 2011-07-25 20:57:09 -07:00
mmu_context.c
mmu_notifier.c
mmzone.c
mprotect.c
mremap.c mm: Convert i_mmap_lock to a mutex 2011-05-25 08:39:18 -07:00
msync.c
nobootmem.c memblock/nobootmem: remove unneeded code from alloc_bootmem_node_high() 2011-05-25 08:39:31 -07:00
nommu.c mmap: fix and tidy up overcommit page arithmetic 2011-07-25 20:57:09 -07:00
oom_kill.c oom: task->mm == NULL doesn't mean the memory was freed 2011-08-01 15:24:12 -10:00
page-writeback.c writeback: per task dirty rate limit 2011-10-03 21:08:57 +08:00
page_alloc.c fault-injection: add ability to export fault_attr in arbitrary directory 2011-08-03 14:25:20 -10:00
page_cgroup.c mm/page_cgroup.c: simplify code by using SECTION_ALIGN_UP() and SECTION_ALIGN_DOWN() macros 2011-07-25 20:57:09 -07:00
page_io.c
page_isolation.c
pagewalk.c pagewalk: fix code comment for THP 2011-07-25 20:57:09 -07:00
percpu-km.c
percpu-vm.c
percpu.c Merge branch 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2011-05-24 11:53:42 -07:00
pgtable-generic.c
prio_tree.c sanitize <linux/prefetch.h> usage 2011-05-20 12:50:29 -07:00
quicklist.c
readahead.c readahead: readahead page allocations are OK to fail 2011-05-25 08:39:25 -07:00
rmap.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback 2011-07-26 10:39:54 -07:00
shmem.c mm: clarify the radix_tree exceptional cases 2011-08-03 14:25:24 -10:00
slab.c slab, lockdep: Annotate the locks before using them 2011-08-04 10:18:00 +02:00
slob.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
slub.c slub: add slab with one free object to partial list tail 2011-08-27 11:58:59 +03:00
sparse-vmemmap.c
sparse.c mm: make some struct page's const 2011-07-25 20:57:07 -07:00
swap.c mm: batch activate_page() to reduce lock contention 2011-05-25 08:39:37 -07:00
swap_state.c
swapfile.c mm: let swap use exceptional entries 2011-08-03 14:25:22 -10:00
thrash.c mm: swap-token: add a comment for priority aging 2011-07-25 20:57:08 -07:00
truncate.c mm: a few small updates for radix-swap 2011-08-03 14:25:24 -10:00
util.c mm: nommu: sort mm->mmap list properly 2011-05-25 08:39:05 -07:00
vmalloc.c mm: sync vmalloc address space page tables in alloc_vm_area() 2011-09-14 18:09:38 -07:00
vmscan.c memcg: Revert "memcg: add memory.vmscan_stat" 2011-09-14 18:09:38 -07:00
vmstat.c numa: fix NUMA compile error when sysfs and procfs are disabled 2011-09-14 18:09:37 -07:00