Commit Graph

727 Commits

Author SHA1 Message Date
Kevin F. Haggerty 0fdd45c3ac Merge remote-tracking branch 'google-common/deprecated/android-3.4' into lineage-16.0
Change-Id: I363f9d4d0623906eaffffb3747a162ccbc92ccb0
Signed-off-by: Kevin F. Haggerty <haggertk@lineageos.org>
2019-08-06 11:41:21 +02:00
Kevin F. Haggerty 238a0fb5ad Merge tag 'v3.4.113' into lineage-16.0
This is the 3.4.113 stable release

Change-Id: I80791430656359c5447a675cbff4431362d18df0
Signed-off-by: Kevin F. Haggerty <haggertk@lineageos.org>
2019-08-05 14:20:47 +02:00
Francescodario Cuzzocrea 85baa390bf misc: Import SM-G900H kernel source code
* Samsung Package Version: G800HXXU1CRJ1
    * CAF Tag: LA.BF.1.1.3-00110-8x26.0
2019-08-02 15:14:10 +02:00
Vinayak Menon 898d548f43 mm: prevent MIGRATE_ISOLATE pages entering other free lists
It is observed that, during a CMA allocation,
test_pages_isolated() fails sometimes.
In low memory situations, this results in continuous
failures of CMA allocations.
When testing for isolation, test_pages_isolated finds
that some pages are allocated, which were available on
pcp list when start_isolate_page_range marked the
pageblock as MIGRATE_ISOLATE. These pages, though
marked as isolated in the respective pageblock, are
moved to non-isolate freelists while being drained.
This results in these pages being allocated from other
paths while a contig allocation is in progress. In
certain cases, these pages get allocated from the
migrate_pages() of current alloc_contig_range() itself.

In the function free_pcppages_bulk(), when a page is
moved to buddy, the migrate list is picked based on
page_private. In the case of CMA pages, which were on
pcp list during start_isolate_page_range, page_private
will point to MIGRATE_CMA, and the pageblock migrate
type can be MIGRATE_ISOLATE. This results in these pages
being moved to cma freelist, rather than isolate freelist,
inside free_pcppages_bulk. This means they can get allocated.

If cma pages are never allowed to enter pcp lists, we
will not hit this issue. This is partly done by
free_hot_cold_page(), where we prevent cma pages
entering pcp. But cma pages can still get on pcp lists
through rmqueue_bulk. Preventing cma pages enter pcp
via rmqueue_bulk, can result in underutilization of
CMA pages for non-contiguous allocations, as rmqueue_bulk
is the frequent path for order 0 allocations.

CRs-fixed:720761
Change-Id: I01b12e6455c75519f1dce5d31467123cb1eb54b7
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2014-10-20 23:31:23 -07:00
Linux Build Service Account f6eeaa0fb3 Merge "mm: change freepage state correctly in __isolate_free_page" 2013-12-11 05:07:38 -08:00
Laura Abbott 102d27f7a0 mm: change freepage state correctly in __isolate_free_page
Commit 2e30abd1730751d58463d88bc0844ab8fd7112a9
(mm: cma: skip watermarks check for already isolated blocks
in split_free_page()) changed the ordering of where the watermark
checks go for isolated pages. There was already an 'enhancement'
present in __isolate_free_page to skip the watermark checks for
CMA pages to increase success. The merging of the enhancment and
the aforementioned commit was done incorrectly, resulting in
the free page state never being modified for CMA pages even if
the CMA page was removed from the free list. Add the enhancement
properly by only checking for CMA pages at the watermark level
and allow the page state to be modfied for CMA pages as well.

Change-Id: Iabea982108d98150f54e5c42b7dbf30f0743653a
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-12-10 13:23:13 -08:00
Rik van Riel 3b321ebfc9 add extra free kbytes tunable
Add a userspace visible knob to tell the VM to keep an extra amount
of memory free, by increasing the gap between each zone's min and
low watermarks.

This is useful for realtime applications that call system
calls and have a bound on the number of allocations that happen
in any short time period.  In this application, extra_free_kbytes
would be left at an amount equal to or larger than than the
maximum number of allocations that happen in any burst.

It may also be useful to reduce the memory use of virtual
machines (temporarily?), in a way that does not cause memory
fragmentation like ballooning does.

[ccross]
Revived for use on old kernels where no other solution exists.
The tunable will be removed on kernels that do better at avoiding
direct reclaim.

Change-Id: I765a42be8e964bfd3e2886d1ca85a29d60c3bb3e
Signed-off-by: Rik van Riel<riel@redhat.com>
Signed-off-by: Colin Cross <ccross@android.com>
Git-commit: 92189d47f66c67e5fd92eafaa287e153197a454f
Git-repo: https://android.googlesource.com/kernel/common/
[bhargavuln@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Bhargav Upperla <bhargavuln@codeaurora.org>
2013-11-27 11:01:37 -08:00
Lisa Du e0935219e1 mm: vmscan: fix do_try_to_free_pages() livelock
This patch is based on KOSAKI's work and I add a little more description,
please refer https://lkml.org/lkml/2012/6/14/74.

Currently, I found system can enter a state that there are lots of free
pages in a zone but only order-0 and order-1 pages which means the zone is
heavily fragmented, then high order allocation could make direct reclaim
path's long stall(ex, 60 seconds) especially in no swap and no compaciton
enviroment.  This problem happened on v3.4, but it seems issue still lives
in current tree, the reason is do_try_to_free_pages enter live lock:

kswapd will go to sleep if the zones have been fully scanned and are still
not balanced.  As kswapd thinks there's little point trying all over again
to avoid infinite loop.  Instead it changes order from high-order to
0-order because kswapd think order-0 is the most important.  Look at
73ce02e9 in detail.  If watermarks are ok, kswapd will go back to sleep
and may leave zone->all_unreclaimable =3D 0.  It assume high-order users
can still perform direct reclaim if they wish.

Direct reclaim continue to reclaim for a high order which is not a
COSTLY_ORDER without oom-killer until kswapd turn on
zone->all_unreclaimble= .  This is because to avoid too early oom-kill.
So it means direct_reclaim depends on kswapd to break this loop.

In worst case, direct-reclaim may continue to page reclaim forever when
kswapd sleeps forever until someone like watchdog detect and finally kill
the process.  As described in:
http://thread.gmane.org/gmane.linux.kernel.mm/103737

We can't turn on zone->all_unreclaimable from direct reclaim path because
direct reclaim path don't take any lock and this way is racy.  Thus this
patch removes zone->all_unreclaimable field completely and recalculates
zone reclaimable state every time.

Note: we can't take the idea that direct-reclaim see zone->pages_scanned
directly and kswapd continue to use zone->all_unreclaimable.  Because, it
is racy.  commit 929bea7c71 (vmscan: all_unreclaimable() use
zone->all_unreclaimable as a name) describes the detail.

Change-Id: I49970a0fa751cf33af293fd1ee784e36422785b1
[akpm@linux-foundation.org: uninline zone_reclaimable_pages() and zone_reclaimable()]
Cc: Aaditya Kumar <aaditya.kumar.30@gmail.com>
Cc: Ying Han <yinghan@google.com>
Cc: Nick Piggin <npiggin@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Bob Liu <lliubbo@gmail.com>
Cc: Neil Zhang <zhangwm@marvell.com>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Lisa Du <cldu@marvell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Git-commit: 6e543d5780e36ff5ee56c44d7e2e30db3457a7ed
[lauraa@codeaurora.org: Some context fixups and variable name changes due
to backporting. Dropped parts that don't apply to older kernels]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-11-01 11:00:26 -07:00
Laura Abbott f8382d82e1 mm: Remove __init annotations from free_bootmem_late
free_bootmem_late is currently set up to only be used in init
functions. Some clients need to use this function past initcalls.
The functions themselves have no restrictions on being used later
minus the __init annotations so remove the annotation.

Change-Id: I7c7e15cf2780a8843ebb4610da5b633c9abb0b3d
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-09-17 12:52:21 -07:00
Lee Susman 0cc1a0cdb3 mm: pass readahead info down to the i/o scheduler
Some i/o schedulers (i.e. row-iosched, cfq-iosched) deploy an idling
algorithm in order to be better synced with the readahead algorithm.
Idling is a prediction algorithm for incoming read requests.

In this patch we mark pages which are part of a readahead window, by
setting a newly introduced flag. With this flag, the i/o scheduler can
identify a request which is associated with a readahead page. This
enables the i/o scheduler's idling mechanism to be en-sync with the
readahead mechanism and, in turn, can increase read throughput.

Change-Id: I0654f23315b6d19d71bcc9cc029c6b281a44b196
Signed-off-by: Lee Susman <lsusman@codeaurora.org>
2013-06-23 16:43:22 +03:00
Mel Gorman 9814940006 mm: compaction: partially revert capture of suitable high-order page
Eric Wong reported on 3.7 and 3.8-rc2 that ppoll() got stuck when
waiting for POLLIN on a local TCP socket.  It was easier to trigger if
there was disk IO and dirty pages at the same time and he bisected it to
commit 1fb3f8ca0e92 ("mm: compaction: capture a suitable high-order page
immediately when it is made available").

The intention of that patch was to improve high-order allocations under
memory pressure after changes made to reclaim in 3.6 drastically hurt
THP allocations but the approach was flawed.  For Eric, the problem was
that page->pfmemalloc was not being cleared for captured pages leading
to a poor interaction with swap-over-NFS support causing the packets to
be dropped.  However, I identified a few more problems with the patch
including the fact that it can increase contention on zone->lock in some
cases which could result in async direct compaction being aborted early.

In retrospect the capture patch took the wrong approach.  What it should
have done is mark the pageblock being migrated as MIGRATE_ISOLATE if it
was allocating for THP and avoided races that way.  While the patch was
showing to improve allocation success rates at the time, the benefit is
marginal given the relative complexity and it should be revisited from
scratch in the context of the other reclaim-related changes that have
taken place since the patch was first written and tested.  This patch
partially reverts commit 1fb3f8ca0e92 ("mm: compaction: capture a
suitable high-order page immediately when it is made available").

Change-Id: I985725a72aac0fdecbf4310c04d176f39e0386dd
Reported-and-tested-by: Eric Wong <normalperson@yhbt.net>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 8fb74b9fb2b182d54beee592350d9ea1f325917a
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-06-18 10:24:51 -07:00
Marek Szyprowski 993e9516a3 mm: cma: skip watermarks check for already isolated blocks in split_free_page()
Since commit 2139cbe627b8 ("cma: fix counting of isolated pages") free
pages in isolated pageblocks are not accounted to NR_FREE_PAGES counters,
so watermarks check is not required if one operates on a free page in
isolated pageblock.

Change-Id: Id9b38d2e4e504336e11274a30044e3aacbc37d03
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 2e30abd1730751d58463d88bc0844ab8fd7112a9
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[lauraa@codeaurora.org: Context fixups due to being merged
out of order. Bring over an added 'feature' where we don't
obey watermarks for CMA pages]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-06-18 10:24:51 -07:00
Minchan Kim d05b452041 CMA: migrate mlocked pages
Presently CMA cannot migrate mlocked pages so it ends up failing to allocate
contiguous memory space.

This patch makes mlocked pages be migrated out.  Of course, it can affect
realtime processes but in CMA usecase, contiguous memory allocation failing
is far worse than access latency to an mlocked page being variable while
CMA is running.  If someone wants to make the system realtime, he shouldn't
enable CMA because stalls can still happen at random times.

Change-Id: I560f43fdeb94f8fd2a4cc9e2ac12a1593ca38ecb
[akpm@linux-foundation.org: tweak comment text, per Mel]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: e46a28790e594c0876d1a84270926abf75460f61
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[lauraa@codeaurora.org: Minor context fixups]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-06-18 10:24:47 -07:00
Mel Gorman 9317a2b709 mm: compaction: clear PG_migrate_skip based on compaction and reclaim activity
Compaction caches if a pageblock was scanned and no pages were isolated so
that the pageblocks can be skipped in the future to reduce scanning.  This
information is not cleared by the page allocator based on activity due to
the impact it would have to the page allocator fast paths.  Hence there is
a requirement that something clear the cache or pageblocks will be skipped
forever.  Currently the cache is cleared if there were a number of recent
allocation failures and it has not been cleared within the last 5 seconds.
Time-based decisions like this are terrible as they have no relationship
to VM activity and is basically a big hammer.

Unfortunately, accurate heuristics would add cost to some hot paths so
this patch implements a rough heuristic.  There are two cases where the
cache is cleared.

1. If a !kswapd process completes a compaction cycle (migrate and free
   scanner meet), the zone is marked compact_blockskip_flush. When kswapd
   goes to sleep, it will clear the cache. This is expected to be the
   common case where the cache is cleared. It does not really matter if
   kswapd happens to be asleep or going to sleep when the flag is set as
   it will be woken on the next allocation request.

2. If there have been multiple failures recently and compaction just
   finished being deferred then a process will clear the cache and start a
   full scan.  This situation happens if there are multiple high-order
   allocation requests under heavy memory pressure.

The clearing of the PG_migrate_skip bits and other scans is inherently
racy but the race is harmless.  For allocations that can fail such as THP,
they will simply fail.  For requests that cannot fail, they will retry the
allocation.  Tests indicated that scanning rates were roughly similar to
when the time-based heuristic was used and the allocation success rates
were similar.

Change-Id: If690ae126badb9f9cc5632e9ffb9d376bf210fb0
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Richard Davies <richard@arachsys.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 62997027ca5b3d4618198ed8b1aba40b61b1137b
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-06-18 10:24:47 -07:00
Minchan Kim 13f662c943 mm: clean up __count_immobile_pages()
The __count_immobile_pages() naming is rather awkward.  Choose a more
clear name and add a comment.

Change-Id: Ic1d8573bbc7eaa82dd5e3f9a1199ee6dd4ac9fc0
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 80934513b230bfcf70265f2ef0fdae89fb391633
Git-repo: git://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-06-18 10:24:45 -07:00
Minchan Kim 259ca6a75e mm: do not use page_count() without a page pin
d179e84ba ("mm: vmscan: do not use page_count without a page pin") fixed
this problem in vmscan.c but same problem is in __count_immobile_pages().

I copy and paste d179e84ba's contents for description.

"It is unsafe to run page_count during the physical pfn scan because
compound_head could trip on a dangling pointer when reading
page->first_page if the compound page is being freed by another CPU."

Change-Id: I2b620c07adfa2c5a0f56219a11c042dc97428085
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Wanpeng Li <liwp.linux@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 97d255c816946388bab504122937730d3447c612
Git-repo: git://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-06-18 10:24:45 -07:00
Marek Szyprowski dad0d8bd09 mm: cma: WARN if freed memory is still in use
Memory returned to free_contig_range() must have no other references.
Let kernel to complain loudly if page reference count is not equal to 1.

Change-Id: If1b84bb383d97eff441d7a1e18b874c64b7f5f85
[rientjes@google.com: support sparsemem]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: bcc2b02f4c1b36bc67272df7119b75bac78525ab
Git-repo: git://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-06-18 10:24:44 -07:00
Mel Gorman 6b50664102 mm: compaction: cache if a pageblock was scanned and no pages were isolated
When compaction was implemented it was known that scanning could
potentially be excessive.  The ideal was that a counter be maintained for
each pageblock but maintaining this information would incur a severe
penalty due to a shared writable cache line.  It has reached the point
where the scanning costs are a serious problem, particularly on
long-lived systems where a large process starts and allocates a large
number of THPs at the same time.

Instead of using a shared counter, this patch adds another bit to the
pageblock flags called PG_migrate_skip.  If a pageblock is scanned by
either migrate or free scanner and 0 pages were isolated, the pageblock is
marked to be skipped in the future.  When scanning, this bit is checked
before any scanning takes place and the block skipped if set.

The main difficulty with a patch like this is "when to ignore the cached
information?" If it's ignored too often, the scanning rates will still be
excessive.  If the information is too stale then allocations will fail
that might have otherwise succeeded.  In this patch

o CMA always ignores the information
o If the migrate and free scanner meet then the cached information will
  be discarded if it's at least 5 seconds since the last time the cache
  was discarded
o If there are a large number of allocation failures, discard the cache.

The time-based heuristic is very clumsy but there are few choices for a
better event.  Depending solely on multiple allocation failures still
allows excessive scanning when THP allocations are failing in quick
succession due to memory pressure.  Waiting until memory pressure is
relieved would cause compaction to continually fail instead of using
reclaim/compaction to try allocate the page.  The time-based mechanism is
clumsy but a better option is not obvious.

Change-Id: I17a4887aca9bb3d2d9d3756089ad7c9b89922727
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Richard Davies <richard@arachsys.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: Avi Kivity <avi@redhat.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: bb13ffeb9f6bfeb301443994dfbf29f91117dfb3
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[lauraa@codeaurora.org: Context fixup due to merging patches out of order]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-06-18 10:24:43 -07:00
Mel Gorman 0a1104a834 mm: compaction: capture a suitable high-order page immediately when it is made available
While compaction is migrating pages to free up large contiguous blocks
for allocation it races with other allocation requests that may steal
these blocks or break them up.  This patch alters direct compaction to
capture a suitable free page as soon as it becomes available to reduce
this race.  It uses similar logic to split_free_page() to ensure that
watermarks are still obeyed.

Change-Id: I46fc38ca67bc50aa7a77a59255caf563f50343a9
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 1fb3f8ca0e9222535a39b884cb67a34628411b9f
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-06-18 10:24:40 -07:00
Mel Gorman 27d1af88b8 mm: compaction: Abort async compaction if locks are contended or taking too long
Jim Schutt reported a problem that pointed at compaction contending
heavily on locks.  The workload is straight-forward and in his own words;

	The systems in question have 24 SAS drives spread across 3 HBAs,
	running 24 Ceph OSD instances, one per drive.  FWIW these servers
	are dual-socket Intel 5675 Xeons w/48 GB memory.  I've got ~160
	Ceph Linux clients doing dd simultaneously to a Ceph file system
	backed by 12 of these servers.

Early in the test everything looks fine

  procs -------------------memory------------------ ---swap-- -----io---- --system-- -----cpu-------
   r  b       swpd       free       buff      cache   si   so    bi    bo   in   cs  us sy  id wa st
  31 15          0     287216        576   38606628    0    0     2  1158    2   14   1  3  95  0  0
  27 15          0     225288        576   38583384    0    0    18 2222016 203357 134876  11 56  17 15  0
  28 17          0     219256        576   38544736    0    0    11 2305932 203141 146296  11 49  23 17  0
   6 18          0     215596        576   38552872    0    0     7 2363207 215264 166502  12 45  22 20  0
  22 18          0     226984        576   38596404    0    0     3 2445741 223114 179527  12 43  23 22  0

and then it goes to pot

  procs -------------------memory------------------ ---swap-- -----io---- --system-- -----cpu-------
   r  b       swpd       free       buff      cache   si   so    bi    bo   in   cs  us sy  id wa st
  163  8          0     464308        576   36791368    0    0    11 22210  866  536   3 13  79  4  0
  207 14          0     917752        576   36181928    0    0   712 1345376 134598 47367   7 90   1  2  0
  123 12          0     685516        576   36296148    0    0   429 1386615 158494 60077   8 84   5  3  0
  123 12          0     598572        576   36333728    0    0  1107 1233281 147542 62351   7 84   5  4  0
  622  7          0     660768        576   36118264    0    0   557 1345548 151394 59353   7 85   4  3  0
  223 11          0     283960        576   36463868    0    0    46 1107160 121846 33006   6 93   1  1  0

Note that system CPU usage is very high blocks being written out has
dropped by 42%. He analysed this with perf and found

  perf record -g -a sleep 10
  perf report --sort symbol --call-graph fractal,5
    34.63%  [k] _raw_spin_lock_irqsave
            |
            |--97.30%-- isolate_freepages
            |          compaction_alloc
            |          unmap_and_move
            |          migrate_pages
            |          compact_zone
            |          compact_zone_order
            |          try_to_compact_pages
            |          __alloc_pages_direct_compact
            |          __alloc_pages_slowpath
            |          __alloc_pages_nodemask
            |          alloc_pages_vma
            |          do_huge_pmd_anonymous_page
            |          handle_mm_fault
            |          do_page_fault
            |          page_fault
            |          |
            |          |--87.39%-- skb_copy_datagram_iovec
            |          |          tcp_recvmsg
            |          |          inet_recvmsg
            |          |          sock_recvmsg
            |          |          sys_recvfrom
            |          |          system_call
            |          |          __recv
            |          |          |
            |          |           --100.00%-- (nil)
            |          |
            |           --12.61%-- memcpy
             --2.70%-- [...]

There was other data but primarily it is all showing that compaction is
contended heavily on the zone->lock and zone->lru_lock.

commit [b2eef8c0: mm: compaction: minimise the time IRQs are disabled
while isolating pages for migration] noted that it was possible for
migration to hold the lru_lock for an excessive amount of time. Very
broadly speaking this patch expands the concept.

This patch introduces compact_checklock_irqsave() to check if a lock
is contended or the process needs to be scheduled. If either condition
is true then async compaction is aborted and the caller is informed.
The page allocator will fail a THP allocation if compaction failed due
to contention. This patch also introduces compact_trylock_irqsave()
which will acquire the lock only if it is not contended and the process
does not need to schedule.

Change-Id: Ia5318c923b903948072ff279dc9aed698bb6d02d
Reported-by: Jim Schutt <jaschut@sandia.gov>
Tested-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: c67fe3752abe6ab47639e2f9b836900c3dc3da84
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[lauraa@codeaurora.org: Minor context fixup in isolate_migratepages_range]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-06-18 10:24:31 -07:00
Minchan Kim 726cfb48b6 cma: decrease cc.nr_migratepages after reclaiming pagelist
reclaim_clean_pages_from_list() reclaims clean pages before migration so
cc.nr_migratepages should be updated.  Currently, there is no problem but
it can be wrong if we try to use the value in future.

Change-Id: I8b3f1238645ba1b3adcc0fe3c41e10f7074b9a96
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: beb51eaa88238daba698ad837222ad277d440b6d
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[lauraa@codeaurora.org: Change cc to be structure instead of pointer]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-06-18 10:24:19 -07:00
Minchan Kim aab50f141c mm: cma: discard clean pages during contiguous allocation instead of migration
Drop clean cache pages instead of migration during alloc_contig_range() to
minimise allocation latency by reducing the amount of migration that is
necessary.  It's useful for CMA because latency of migration is more
important than evicting the background process's working set.  In
addition, as pages are reclaimed then fewer free pages for migration
targets are required so it avoids memory reclaiming to get free pages,
which is a contributory factor to increased latency.

I measured elapsed time of __alloc_contig_migrate_range() which migrates
10M in 40M movable zone in QEMU machine.

Before - 146ms, After - 7ms

Change-Id: Ia527b7253bc5fa63b555ac445b676588b6def119
[akpm@linux-foundation.org: fix nommu build]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Mel Gorman <mgorman@suse.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Rik van Riel <riel@redhat.com>
Tested-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 02c6de8d757cb32c0829a45d81c3dfcbcafd998b
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[lauraa@codeaurora.org: Fixups in mm/internal.h due to contexts]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-06-18 10:24:18 -07:00
Tomasz Stanislawski 18d4e2d157 mm: page_alloc: fix watermark check in __zone_watermark_ok()
The watermark check consists of two sub-checks.
The first one is:

	if (free_pages <= min + lowmem_reserve)
		return false;

The check assures that there is minimal amount of RAM in the zone.  If CMA is
used then the free_pages is reduced by the number of free pages in CMA prior
to the over-mentioned check.

	if (!(alloc_flags & ALLOC_CMA))
		free_pages -= zone_page_state(z, NR_FREE_CMA_PAGES);

This prevents the zone from being drained from pages available for non-movable
allocations.

The second check prevents the zone from getting too fragmented.

	for (o = 0; o < order; o++) {
		free_pages -= z->free_area[o].nr_free << o;
		min >>= 1;
		if (free_pages <= min)
			return false;
	}

The field z->free_area[o].nr_free is equal to the number of free pages
including free CMA pages.  Therefore the CMA pages are subtracted twice.  This
may cause a false positive fail of __zone_watermark_ok() if the CMA area gets
strongly fragmented.  In such a case there are many 0-order free pages located
in CMA. Those pages are subtracted twice therefore they will quickly drain
free_pages during the check against fragmentation.  The test fails even though
there are many free non-cma pages in the zone.

This patch fixes this issue by subtracting CMA pages only for a purpose of
(free_pages <= min + lowmem_reserve) check.

Change-Id: I899919476cfca4bfa48bb5877ed6644784a43ce0
Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Patch-mainline: linux-mm @ 05/09/2013, 12:50AM
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-05-15 19:04:38 -07:00
Marek Szyprowski 178a68efb0 mm: cma: fix accounting of CMA pages placed in high memory
The total number of low memory pages is determined as totalram_pages -
totalhigh_pages, so without this patch all CMA pageblocks placed in
highmem were accounted to low memory.

Change-Id: I10b78fa6a710828520a487b2fc2419b4f7521a6f
CRs-Fixed: 480377
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Git-commit: 41a7973447b0b8717f0a214d4328dc31ec2291d7
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-04-30 13:45:08 -07:00
Marek Szyprowski 800f2b20ed mm: cma: remove watermark hacks
Commits 2139cbe627b8 ("cma: fix counting of isolated pages") and
d95ea5d18e69 ("cma: fix watermark checking") introduced a reliable
method of free page accounting when memory is being allocated from CMA
regions, so the workaround introduced earlier by commit 49f223a9cd96
("mm: trigger page reclaim in alloc_contig_range() to stabilise
watermarks") can be finally removed.

Change-Id: Iae17de8185eeabffd46752dbaf819591e6585869
CRs-Fixed: 480377
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: bc357f431c836c6631751e3ef7dfe7882394ad67
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[lauraa@codeaurora.org: Context fixup in mmzone.h, keep zone definition in
alloc_contig_range for other purposes]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-04-30 13:45:02 -07:00
Laura Abbott e2958b822e mm: Retry original migrate type if CMA failed
Currently, __rmqueue_cma will disregard the original migrate type
and only try MIGRATE_CMA for allocations. If the MIGRATE_CMA
allocation fails, the fallback types of the original migrate type
are used. Note that in this current path we never try to actually
allocate from the original migrate type. If the only pages left
in the system are the original migrate type, we will fail the
allocation since we never actually try the original migrate type.
This may lead to infinite looping since the system still (correctly)
calculates there are pages available for allocation and will keep
trying to allocate pages. Fix this degenerate case by allocating
from the original migrate type if the MIGRATE_CMA allocation fails.

Change-Id: I62ab293dc694955eaf88e790131a8565395ba8cb
CRs-Fixed: 470615
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-04-05 12:46:59 -07:00
Liam Mark 3f1e551a0e android/lowmemorykiller: Selectively count free CMA pages
In certain memory configurations there can be a large number of
CMA pages which are not suitable to satisfy certain memory
requests.

This large number of unsuitable pages can cause the
lowmemorykiller to not kill any tasks because the
lowmemorykiller counts all free pages.
In order to ensure the lowmemorykiller properly evaluates the
free memory only count the free CMA pages if they are suitable
for satisfying the memory request.

Change-Id: I7f06d53e2d8cfe7439e5561fe6e5209ce73b1c90
CRs-fixed: 437016
Signed-off-by: Liam Mark <lmark@codeaurora.org>
2013-03-29 14:56:04 -07:00
Olav Haugan 603c272027 Revert "android/lowmemorykiller: Only consider gfp_mask free pages"
This change is not needed anymore due to a new and improved change to the
android low memory killer that is incompatible with this change. To be able
to apply the new change this change needs to be removed.

This reverts commit cc4baf7066.

Change-Id: I33801faf9ac7371f46cf409009939d853adee95e
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2013-03-27 17:27:01 -07:00
Laura Abbott be911befd7 mm: Don't put CMA pages on per cpu lists
CMA allocations rely on being able to migrate pages out
quickly to fulfill the allocations. Most use cases for
movable allocations meet this requirement. File system
allocations may take an unaccpetably long time to
migrate, which creates delays from CMA. Prevent CMA
pages from ending up on the per-cpu lists to avoid
code paths grabbing CMA pages on the fast path. CMA
pages can still be allocated as a fallback under tight
memory pressure.

CRs-Fixed: 452508
Change-Id: I79a28f697275a2a1870caabae53c8ea345b4b47d
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-02-18 07:48:07 -08:00
Heesub Shin 771aaa629a cma: redirect page allocation to CMA
CMA pages are designed to be used as fallback for movable allocations
and cannot be used for non-movable allocations. If CMA pages are
utilized poorly, non-movable allocations may end up getting starved if
all regular movable pages are allocated and the only pages left are
CMA. Always using CMA pages first creates unacceptable performance
problems. As a midway alternative, use CMA pages for certain
userspace allocations. The userspace pages can be migrated or dropped
quickly which giving decent utilization.

Change-Id: I6165dda01b705309eebabc6dfa67146b7a95c174
CRs-Fixed: 452508
[lauraa@codeaurora.org: Missing CONFIG_CMA guards, add commit text]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-02-18 07:24:18 -08:00
Laura Abbott 434f55a0a6 Revert "mm: cma: on movable allocations try MIGRATE_CMA first"
This reverts commit b5662d64fa.

Using CMA pages first creates good utilization but has some
unfortunate side effects. Many movable allocations come from
the filesystem layer which can hold on to pages for long periods
of time which causes high allocation times (~200ms) and high
rates of failure. Revert this patch and use alternate allocation
strategies to get better utilization.

Change-Id: I917e137d5fb292c9f8282506f71a799a6451ccfa
CRs-Fixed: 452508
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-02-18 07:23:04 -08:00
Bartlomiej Zolnierkiewicz 1964431f5d cma: fix watermark checking
* Add ALLOC_CMA alloc flag and pass it to [__]zone_watermark_ok()
  (from Minchan Kim).

* During watermark check decrease available free pages number by
  free CMA pages number if necessary (unmovable allocations cannot
  use pages from CMA areas).

Change-Id: Ibd069b028eb80b70701c1b81cb28a503d8265be0
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[lauraa@codeaurora.org: context fixups]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-02-03 08:15:42 -08:00
Michal Nazarewicz b5662d64fa mm: cma: on movable allocations try MIGRATE_CMA first
It has been observed that system tends to keep a lot of CMA free pages
even in very high memory pressure use cases.  The CMA fallback for
movable pages is used very rarely, only when system is completely
pruned from MOVABLE pages.  This means that the out-of-memory is
triggered for unmovable allocations even when there are many CMA pages
available.  This problem was not observed previously since movable
pages were used as a fallback for unmovable allocations.

To avoid such situation this commit changes the allocation order so
that on movable allocations the MIGRATE_CMA pageblocks are used first.

This change means that the MIGRATE_CMA can be removed from fallback
path of the MIGRATE_MOVABLE type.  This means that the
__rmqueue_fallback() function will never deal with CMA pages and thus
all the checks around MIGRATE_CMA can be removed from that function.

Change-Id: Ie13312d62a6af12d7aa78b4283ed25535a6d49fd
CRs-Fixed: 435287
Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-01-07 10:13:17 -08:00
Rabin Vincent 394b808b23 mm: show migration types in show_mem
This is useful to diagnose the reason for page allocation failure for
cases where there appear to be several free pages.

Example, with this alloc_pages(GFP_ATOMIC) failure:

 swapper/0: page allocation failure: order:0, mode:0x0
 ...
 Mem-info:
 Normal per-cpu:
 CPU    0: hi:   90, btch:  15 usd:  48
 CPU    1: hi:   90, btch:  15 usd:  21
 active_anon:0 inactive_anon:0 isolated_anon:0
  active_file:0 inactive_file:84 isolated_file:0
  unevictable:0 dirty:0 writeback:0 unstable:0
  free:4026 slab_reclaimable:75 slab_unreclaimable:484
  mapped:0 shmem:0 pagetables:0 bounce:0
 Normal free:16104kB min:2296kB low:2868kB high:3444kB active_anon:0kB
 inactive_anon:0kB active_file:0kB inactive_file:336kB unevictable:0kB
 isolated(anon):0kB isolated(file):0kB present:331776kB mlocked:0kB
 dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:300kB
 slab_unreclaimable:1936kB kernel_stack:328kB pagetables:0kB unstable:0kB
 bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
 lowmem_reserve[]: 0 0

Before the patch, it's hard (for me, at least) to say why all these free
chunks weren't considered for allocation:

 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 1*256kB 1*512kB
 1*1024kB 1*2048kB 3*4096kB = 16128kB

After the patch, it's obvious that the reason is that all of these are
in the MIGRATE_CMA (C) freelist:

 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 1*256kB (C) 1*512kB
 (C) 1*1024kB (C) 1*2048kB (C) 3*4096kB (C) = 16128kB

Change-Id: Ic5fe77d762e0c03715bfb917774e7c4f03ac43f5
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2013-01-07 10:13:04 -08:00
Liam Mark cc4baf7066 android/lowmemorykiller: Only consider gfp_mask free pages
In certain memory configurations there can be a large number of
CMA pages which are not suitable to satisfy certain memory
requests.
This large number of unsuitable pages can cause the
lowmemorykiller to not kill any tasks because the
lowmemorykiller counts all free pages.

In order to ensure the lowmemorykiller properly evaluates the
free memory only count the free pages which are suitable for
satisfying the memory request.

Change-Id: Iabc3bebc54be9dec7fcbffb94a2fbf18dd684669
Signed-off-by: Liam Mark <lmark@codeaurora.org>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2012-12-17 17:23:52 -08:00
Larry Bassel 6d6e2c9bcf mm: make counts of CMA free pages correct
Both patches needed, second patch (among other things) fixes
a bug in the first.

commit 2139cbe627b8910ded55148f87ee10f7485408ed
Author: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Date:   Mon Oct 8 16:32:00 2012 -0700

    cma: fix counting of isolated pages

    Isolated free pages shouldn't be accounted to NR_FREE_PAGES counter.  Fix
    it by properly decreasing/increasing NR_FREE_PAGES counter in
    set_migratetype_isolate()/unset_migratetype_isolate() and removing counter
    adjustment for isolated pages from free_one_page() and split_free_page().

    Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
    Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
    Cc: Marek Szyprowski <m.szyprowski@samsung.com>
    Cc: Michal Nazarewicz <mina86@mina86.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Hugh Dickins <hughd@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    [lbassel@codeaurora.org: backport from 3.7, small changes needed]
    Signed-off-by: Larry Bassel <lbassel@codeaurora.org>

commit d1ce749a0db12202b711d1aba1d29e823034648d
Author: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Date:   Mon Oct 8 16:32:02 2012 -0700

    cma: count free CMA pages

    Add NR_FREE_CMA_PAGES counter to be later used for checking watermark in
    __zone_watermark_ok().  For simplicity and to avoid #ifdef hell make this
    counter always available (not only when CONFIG_CMA=y).

    [akpm@linux-foundation.org: use conventional migratetype naming]
    Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
    Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
    Cc: Marek Szyprowski <m.szyprowski@samsung.com>
    Cc: Michal Nazarewicz <mina86@mina86.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Hugh Dickins <hughd@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    [lbassel@codeaurora.org: backport from 3.7, small changes needed]
    Signed-off-by: Larry Bassel <lbassel@codeaurora.org>

Change-Id: I3c694620021efa5ddee1b9dcd87f95b63ad8557c
Signed-off-by: Larry Bassel <lbassel@codeaurora.org>
2012-12-17 14:48:27 -08:00
woojoong.lee 539118fd55 cma : use migrate_prep() instead of migrate_prep_local()
__alloc_contig_migrate_range() should use all possible ways to get all the
pages migrated from the given memory range, so pruning per-cpu lru lists
for all CPUs is required, regadless the cost of such operation. Otherwise
some pages which got stuck at per-cpu lru list might get missed by
migration procedure causing the contiguous allocation to fail.

Change-Id: I70cc0864c57dd49e89f57797122a3fd0f300647a
Signed-off-by: woojoong.lee <woojoong.lee@samsung.com>
Reviewed-on: http://165.213.202.130:8080/43063
Tested-by: System S/W SCM <scm.systemsw@samsung.com>
Reviewed-by: daeho jeong <daeho.jeong@samsung.com>
Reviewed-by: Jeong-Ho Kim <jammer@samsung.com>
Tested-by: Jeong-Ho Kim <jammer@samsung.com>
[lauraa@codeaurora.org: Applied to correct file]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2012-12-12 06:50:52 -08:00
Linux Build Service Account 40d5552104 Merge "mm: Add is_cma_pageblock definition" 2012-12-10 14:56:15 -08:00
Linux Build Service Account 11e09273e8 Merge "cma: fix migration mode" 2012-12-10 14:56:13 -08:00
Laura Abbott 5bca72232c mm: Use aligned zone start for pfn_to_bitidx calculation
The current calculation in pfn_to_bitidx assumes that
(pfn - zone->zone_start_pfn) >> pageblock_order will return the
same bit for all pfn in a pageblock. If zone_start_pfn is not
aligned to pageblock_nr_pages, this may not always be correct.

Consider the following with pageblock order = 10, zone start 2MB:

pfn     | pfn - zone start | (pfn - zone start) >> page block order
----------------------------------------------------------------
0x26000 | 0x25e00	   |  0x97
0x26100 | 0x25f00	   |  0x97
0x26200 | 0x26000	   |  0x98
0x26300 | 0x26100	   |  0x98

This means that calling {get,set}_pageblock_migratetype on a single
page will not set the migratetype for the full block. Fix this by
rounding down zone_start_pfn when doing the bitidx calculation.

Change-Id: I13e2f53f50db294f38ec86138c17c6fe29f0ee82
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2012-12-06 13:08:04 -08:00
Linux Build Service Account fac1e451e8 Merge "Revert "android/lowmemorykiller: Only consider gfp_mask free pages"" 2012-12-05 20:28:21 -08:00
Linux Build Service Account 3eda4a946c Merge "mm: split_free_page ignore memory watermarks for CMA" 2012-12-05 18:03:06 -08:00
Liam Mark d47b12e245 Revert "android/lowmemorykiller: Only consider gfp_mask free pages"
This reverts commit f714041949.

Iterating the page free_list without locking it can crash
because the list can change while being iterated.

Iterating the page free_list while locking it can cause a
deadlock.

CRs-fixed: 427061
Change-Id: I9e89b121e7436c8dcac922ad973d8c06a31ce9b0
Signed-off-by: Liam Mark <lmark@codeaurora.org>
2012-12-05 13:14:38 -08:00
Laura Abbott 364dcd43fe mm: Add is_cma_pageblock definition
Bring back the is_cma_pageblock definition for determining if a
page is CMA or not.

Change-Id: I39fd546e22e240b752244832c79514f109c8e84b
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2012-12-03 10:18:01 -08:00
Minchan Kim 132fb79ea0 cma: fix migration mode
__alloc_contig_migrate_range calls migrate_pages with wrong argument
for migrate_mode. Fix it.

Change-Id: I84697cf7c6aef6253e9ee7e5b3028c946b95e253
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2012-12-03 10:17:27 -08:00
Liam Mark 4ca9a215f2 mm: split_free_page ignore memory watermarks for CMA
Memory watermarks were sometimes preventing CMA allocations
in low memory.

Change-Id: I550ec987cbd6bc6dadd72b4a764df20cd0758479
Signed-off-by: Liam Mark <lmark@codeaurora.org>
2012-11-27 18:49:58 -08:00
Liam Mark f714041949 android/lowmemorykiller: Only consider gfp_mask free pages
In certain memory configurations there can be a large number of
pages which are not suitable to satisfy certain memory requests
(such as when CMA is enabled).
This large number of unsuitable pages can cause the
lowmemorykiller to not kill any tasks because the
lowmemorykiller counts all free pages.

In order to ensure the lowmemorykiller properly evaluates the
free memory only count the free pages which are suitable for
satisfying the memory request.

Change-Id: Iaf8c06bb6f34d61e02d2f80f9d7e90e762ee75e8
Signed-off-by: Liam Mark <lmark@codeaurora.org>
2012-11-02 14:28:10 -07:00
Rabin Vincent fbfcd6fa25 mm: cma: don't replace lowmem pages with highmem
The filesystem layer expects pages in the block device's mapping to not
be in highmem (the mapping's gfp mask is set in bdget()), but CMA can
currently replace lowmem pages with highmem pages, leading to crashes in
filesystem code such as the one below:

  Unable to handle kernel NULL pointer dereference at virtual address 00000400
  pgd = c0c98000
  [00000400] *pgd=00c91831, *pte=00000000, *ppte=00000000
  Internal error: Oops: 817 [#1] PREEMPT SMP ARM
  CPU: 0    Not tainted  (3.5.0-rc5+ #80)
  PC is at __memzero+0x24/0x80
  ...
  Process fsstress (pid: 323, stack limit = 0xc0cbc2f0)
  Backtrace:
  [<c010e3f0>] (ext4_getblk+0x0/0x180) from [<c010e58c>] (ext4_bread+0x1c/0x98)
  [<c010e570>] (ext4_bread+0x0/0x98) from [<c0117944>] (ext4_mkdir+0x160/0x3bc)
   r4:c15337f0
  [<c01177e4>] (ext4_mkdir+0x0/0x3bc) from [<c00c29e0>] (vfs_mkdir+0x8c/0x98)
  [<c00c2954>] (vfs_mkdir+0x0/0x98) from [<c00c2a60>] (sys_mkdirat+0x74/0xac)
   r6:00000000 r5:c152eb40 r4:000001ff r3:c14b43f0
  [<c00c29ec>] (sys_mkdirat+0x0/0xac) from [<c00c2ab8>] (sys_mkdir+0x20/0x24)
   r6:beccdcf0 r5:00074000 r4:beccdbbc
  [<c00c2a98>] (sys_mkdir+0x0/0x24) from [<c000e3c0>] (ret_fast_syscall+0x0/0x30)

Fix this by replacing only highmem pages with highmem.

Change-Id: I6af2d509af48b5a586037be14bd3593b3f269d95
Reported-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2012-07-23 14:48:17 -07:00
Marek Szyprowski f1f6388781 mm: trigger page reclaim in alloc_contig_range() to stabilise watermarks
alloc_contig_range() performs memory allocation so it also should keep
track on keeping the correct level of memory watermarks. This commit adds
a call to *_slowpath style reclaim to grab enough pages to make sure that
the final collection of contiguous pages from freelists will not starve
the system.

Change-Id: I2d68d9ac2cfcd32ca6f515fc7e44e8d9d850dff1
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
CC: Michal Nazarewicz <mina86@mina86.com>
Tested-by: Rob Clark <rob.clark@linaro.org>
Tested-by: Ohad Ben-Cohen <ohad@wizery.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Tested-by: Robert Nelson <robertcnelson@gmail.com>
Tested-by: Barry Song <Baohua.Song@csr.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2012-06-29 16:06:50 -07:00
Marek Szyprowski 1ed3651116 mm: extract reclaim code from __alloc_pages_direct_reclaim()
This patch extracts common reclaim code from __alloc_pages_direct_reclaim()
function to separate function: __perform_reclaim() which can be later used
by alloc_contig_range().

Change-Id: Ia9d8b82018d91dc669488955b20f69f1cba43147
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Tested-by: Rob Clark <rob.clark@linaro.org>
Tested-by: Ohad Ben-Cohen <ohad@wizery.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Tested-by: Robert Nelson <robertcnelson@gmail.com>
Tested-by: Barry Song <Baohua.Song@csr.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2012-06-29 16:06:49 -07:00