Commit Graph

887 Commits

Author SHA1 Message Date
syphyr dece380b97 Revert "lowmemorykiller: adapt to vmpressure"
This reverts commit a7668cd5e2.
2019-07-27 22:09:44 +02:00
Vlastimil Babka 674e801619 mm: when stealing freepages, also take pages created by splitting buddy page
When studying page stealing, I noticed some weird looking decisions in
try_to_steal_freepages().  The first I assume is a bug (Patch 1), the
following two patches were driven by evaluation.

Testing was done with stress-highalloc of mmtests, using the
mm_page_alloc_extfrag tracepoint and postprocessing to get counts of how
often page stealing occurs for individual migratetypes, and what
migratetypes are used for fallbacks.  Arguably, the worst case of page
stealing is when UNMOVABLE allocation steals from MOVABLE pageblock.
RECLAIMABLE allocation stealing from MOVABLE allocation is also not ideal,
so the goal is to minimize these two cases.

The evaluation of v2 wasn't always clear win and Joonsoo questioned the
results.  Here I used different baseline which includes RFC compaction
improvements from [1].  I found that the compaction improvements reduce
variability of stress-highalloc, so there's less noise in the data.

First, let's look at stress-highalloc configured to do sync compaction,
and how these patches reduce page stealing events during the test.  First
column is after fresh reboot, other two are reiterations of test without
reboot.  That was all accumulater over 5 re-iterations (so the benchmark
was run 5x3 times with 5 fresh restarts).

Baseline:

                                                   3.19-rc4        3.19-rc4        3.19-rc4
                                                  5-nothp-1       5-nothp-2       5-nothp-3
Page alloc extfrag event                               10264225     8702233    10244125
Extfrag fragmenting                                    10263271     8701552    10243473
Extfrag fragmenting for unmovable                         13595       17616       15960
Extfrag fragmenting unmovable placed with movable          7989       12193        8447
Extfrag fragmenting for reclaimable                         658        1840        1817
Extfrag fragmenting reclaimable placed with movable         558        1677        1679
Extfrag fragmenting for movable                        10249018     8682096    10225696

With Patch 1:
                                                   3.19-rc4        3.19-rc4        3.19-rc4
                                                  6-nothp-1       6-nothp-2       6-nothp-3
Page alloc extfrag event                               11834954     9877523     9774860
Extfrag fragmenting                                    11833993     9876880     9774245
Extfrag fragmenting for unmovable                          7342       16129       11712
Extfrag fragmenting unmovable placed with movable          4191       10547        6270
Extfrag fragmenting for reclaimable                         373        1130         923
Extfrag fragmenting reclaimable placed with movable         302         906         738
Extfrag fragmenting for movable                        11826278     9859621     9761610

With Patch 2:
                                                   3.19-rc4        3.19-rc4        3.19-rc4
                                                  7-nothp-1       7-nothp-2       7-nothp-3
Page alloc extfrag event                                4725990     3668793     3807436
Extfrag fragmenting                                     4725104     3668252     3806898
Extfrag fragmenting for unmovable                          6678        7974        7281
Extfrag fragmenting unmovable placed with movable          2051        3829        4017
Extfrag fragmenting for reclaimable                         429        1208        1278
Extfrag fragmenting reclaimable placed with movable         369         976        1034
Extfrag fragmenting for movable                         4717997     3659070     3798339

With Patch 3:
                                                   3.19-rc4        3.19-rc4        3.19-rc4
                                                  8-nothp-1       8-nothp-2       8-nothp-3
Page alloc extfrag event                                5016183     4700142     3850633
Extfrag fragmenting                                     5015325     4699613     3850072
Extfrag fragmenting for unmovable                          1312        3154        3088
Extfrag fragmenting unmovable placed with movable          1115        2777        2714
Extfrag fragmenting for reclaimable                         437        1193        1097
Extfrag fragmenting reclaimable placed with movable         330         969         879
Extfrag fragmenting for movable                         5013576     4695266     3845887

In v2 we've seen apparent regression with Patch 1 for unmovable events,
this is now gone, suggesting it was indeed noise.  Here, each patch
improves the situation for unmovable events.  Reclaimable is improved by
patch 1 and then either the same modulo noise, or perhaps sligtly worse -
a small price for unmovable improvements, IMHO.  The number of movable
allocations falling back to other migratetypes is most noisy, but it's
reduced to half at Patch 2 nevertheless.  These are least critical as
compaction can move them around.

If we look at success rates, the patches don't affect them, that didn't change.

Baseline:
                             3.19-rc4              3.19-rc4              3.19-rc4
                            5-nothp-1             5-nothp-2             5-nothp-3
Success 1 Min         49.00 (  0.00%)       42.00 ( 14.29%)       41.00 ( 16.33%)
Success 1 Mean        51.00 (  0.00%)       45.00 ( 11.76%)       42.60 ( 16.47%)
Success 1 Max         55.00 (  0.00%)       51.00 (  7.27%)       46.00 ( 16.36%)
Success 2 Min         53.00 (  0.00%)       47.00 ( 11.32%)       44.00 ( 16.98%)
Success 2 Mean        59.60 (  0.00%)       50.80 ( 14.77%)       48.20 ( 19.13%)
Success 2 Max         64.00 (  0.00%)       56.00 ( 12.50%)       52.00 ( 18.75%)
Success 3 Min         84.00 (  0.00%)       82.00 (  2.38%)       78.00 (  7.14%)
Success 3 Mean        85.60 (  0.00%)       82.80 (  3.27%)       79.40 (  7.24%)
Success 3 Max         86.00 (  0.00%)       83.00 (  3.49%)       80.00 (  6.98%)

Patch 1:
                             3.19-rc4              3.19-rc4              3.19-rc4
                            6-nothp-1             6-nothp-2             6-nothp-3
Success 1 Min         49.00 (  0.00%)       44.00 ( 10.20%)       44.00 ( 10.20%)
Success 1 Mean        51.80 (  0.00%)       46.00 ( 11.20%)       45.80 ( 11.58%)
Success 1 Max         54.00 (  0.00%)       49.00 (  9.26%)       49.00 (  9.26%)
Success 2 Min         58.00 (  0.00%)       49.00 ( 15.52%)       48.00 ( 17.24%)
Success 2 Mean        60.40 (  0.00%)       51.80 ( 14.24%)       50.80 ( 15.89%)
Success 2 Max         63.00 (  0.00%)       54.00 ( 14.29%)       55.00 ( 12.70%)
Success 3 Min         84.00 (  0.00%)       81.00 (  3.57%)       79.00 (  5.95%)
Success 3 Mean        85.00 (  0.00%)       81.60 (  4.00%)       79.80 (  6.12%)
Success 3 Max         86.00 (  0.00%)       82.00 (  4.65%)       82.00 (  4.65%)

Patch 2:

                             3.19-rc4              3.19-rc4              3.19-rc4
                            7-nothp-1             7-nothp-2             7-nothp-3
Success 1 Min         50.00 (  0.00%)       44.00 ( 12.00%)       39.00 ( 22.00%)
Success 1 Mean        52.80 (  0.00%)       45.60 ( 13.64%)       42.40 ( 19.70%)
Success 1 Max         55.00 (  0.00%)       46.00 ( 16.36%)       47.00 ( 14.55%)
Success 2 Min         52.00 (  0.00%)       48.00 (  7.69%)       45.00 ( 13.46%)
Success 2 Mean        53.40 (  0.00%)       49.80 (  6.74%)       48.80 (  8.61%)
Success 2 Max         57.00 (  0.00%)       52.00 (  8.77%)       52.00 (  8.77%)
Success 3 Min         84.00 (  0.00%)       81.00 (  3.57%)       79.00 (  5.95%)
Success 3 Mean        85.00 (  0.00%)       82.40 (  3.06%)       79.60 (  6.35%)
Success 3 Max         86.00 (  0.00%)       83.00 (  3.49%)       80.00 (  6.98%)

Patch 3:
                             3.19-rc4              3.19-rc4              3.19-rc4
                            8-nothp-1             8-nothp-2             8-nothp-3
Success 1 Min         46.00 (  0.00%)       44.00 (  4.35%)       42.00 (  8.70%)
Success 1 Mean        50.20 (  0.00%)       45.60 (  9.16%)       44.00 ( 12.35%)
Success 1 Max         52.00 (  0.00%)       47.00 (  9.62%)       47.00 (  9.62%)
Success 2 Min         53.00 (  0.00%)       49.00 (  7.55%)       48.00 (  9.43%)
Success 2 Mean        55.80 (  0.00%)       50.60 (  9.32%)       49.00 ( 12.19%)
Success 2 Max         59.00 (  0.00%)       52.00 ( 11.86%)       51.00 ( 13.56%)
Success 3 Min         84.00 (  0.00%)       80.00 (  4.76%)       79.00 (  5.95%)
Success 3 Mean        85.40 (  0.00%)       81.60 (  4.45%)       80.40 (  5.85%)
Success 3 Max         87.00 (  0.00%)       83.00 (  4.60%)       82.00 (  5.75%)

While there's no improvement here, I consider reduced fragmentation events
to be worth on its own.  Patch 2 also seems to reduce scanning for free
pages, and migrations in compaction, suggesting it has somewhat less work
to do:

Patch 1:

Compaction stalls                 4153        3959        3978
Compaction success                1523        1441        1446
Compaction failures               2630        2517        2531
Page migrate success           4600827     4943120     5104348
Page migrate failure             19763       16656       17806
Compaction pages isolated      9597640    10305617    10653541
Compaction migrate scanned    77828948    86533283    87137064
Compaction free scanned      517758295   521312840   521462251
Compaction cost                   5503        5932        6110

Patch 2:

Compaction stalls                 3800        3450        3518
Compaction success                1421        1316        1317
Compaction failures               2379        2134        2201
Page migrate success           4160421     4502708     4752148
Page migrate failure             19705       14340       14911
Compaction pages isolated      8731983     9382374     9910043
Compaction migrate scanned    98362797    96349194    98609686
Compaction free scanned      496512560   469502017   480442545
Compaction cost                   5173        5526        5811

As with v2, /proc/pagetypeinfo appears unaffected with respect to numbers
of unmovable and reclaimable pageblocks.

Configuring the benchmark to allocate like THP page fault (i.e.  no sync
compaction) gives much noisier results for iterations 2 and 3 after
reboot.  This is not so surprising given how [1] offers lower improvements
in this scenario due to less restarts after deferred compaction which
would change compaction pivot.

Baseline:
                                                   3.19-rc4        3.19-rc4        3.19-rc4
                                                    5-thp-1         5-thp-2         5-thp-3
Page alloc extfrag event                                8148965     6227815     6646741
Extfrag fragmenting                                     8147872     6227130     6646117
Extfrag fragmenting for unmovable                         10324       12942       15975
Extfrag fragmenting unmovable placed with movable          5972        8495       10907
Extfrag fragmenting for reclaimable                         601        1707        2210
Extfrag fragmenting reclaimable placed with movable         520        1570        2000
Extfrag fragmenting for movable                         8136947     6212481     6627932

Patch 1:
                                                   3.19-rc4        3.19-rc4        3.19-rc4
                                                    6-thp-1         6-thp-2         6-thp-3
Page alloc extfrag event                                8345457     7574471     7020419
Extfrag fragmenting                                     8343546     7573777     7019718
Extfrag fragmenting for unmovable                         10256       18535       30716
Extfrag fragmenting unmovable placed with movable          6893       11726       22181
Extfrag fragmenting for reclaimable                         465        1208        1023
Extfrag fragmenting reclaimable placed with movable         353         996         843
Extfrag fragmenting for movable                         8332825     7554034     6987979

Patch 2:
                                                   3.19-rc4        3.19-rc4        3.19-rc4
                                                    7-thp-1         7-thp-2         7-thp-3
Page alloc extfrag event                                3512847     3020756     2891625
Extfrag fragmenting                                     3511940     3020185     2891059
Extfrag fragmenting for unmovable                          9017        6892        6191
Extfrag fragmenting unmovable placed with movable          1524        3053        2435
Extfrag fragmenting for reclaimable                         445        1081        1160
Extfrag fragmenting reclaimable placed with movable         375         918         986
Extfrag fragmenting for movable                         3502478     3012212     2883708

Patch 3:
                                                   3.19-rc4        3.19-rc4        3.19-rc4
                                                    8-thp-1         8-thp-2         8-thp-3
Page alloc extfrag event                                3181699     3082881     2674164
Extfrag fragmenting                                     3180812     3082303     2673611
Extfrag fragmenting for unmovable                          1201        4031        4040
Extfrag fragmenting unmovable placed with movable           974        3611        3645
Extfrag fragmenting for reclaimable                         478        1165        1294
Extfrag fragmenting reclaimable placed with movable         387         985        1030
Extfrag fragmenting for movable                         3179133     3077107     2668277

The improvements for first iteration are clear, the rest is much noisier
and can appear like regression for Patch 1.  Anyway, patch 2 rectifies it.

Allocation success rates are again unaffected so there's no point in
making this e-mail any longer.

[1] http://marc.info/?l=linux-mm&m=142166196321125&w=2

This patch (of 3):

When __rmqueue_fallback() is called to allocate a page of order X, it will
find a page of order Y >= X of a fallback migratetype, which is different
from the desired migratetype.  With the help of try_to_steal_freepages(),
it may change the migratetype (to the desired one) also of:

1) all currently free pages in the pageblock containing the fallback page
2) the fallback pageblock itself
3) buddy pages created by splitting the fallback page (when Y > X)

These decisions take the order Y into account, as well as the desired
migratetype, with the goal of preventing multiple fallback allocations
that could e.g.  distribute UNMOVABLE allocations among multiple
pageblocks.

Originally, decision for 1) has implied the decision for 3).  Commit
47118af076 ("mm: mmzone: MIGRATE_CMA migration type added") changed that
(probably unintentionally) so that the buddy pages in case 3) are always
changed to the desired migratetype, except for CMA pageblocks.

Commit fef903efcf0c ("mm/page_allo.c: restructure free-page stealing code
and fix a bug") did some refactoring and added a comment that the case of
3) is intended.  Commit 0cbef29a7821 ("mm: __rmqueue_fallback() should
respect pageblock type") removed the comment and tried to restore the
original behavior where 1) implies 3), but due to the previous
refactoring, the result is instead that only 2) implies 3) - and the
conditions for 2) are less frequently met than conditions for 1).  This
may increase fragmentation in situations where the code decides to steal
all free pages from the pageblock (case 1)), but then gives back the buddy
pages produced by splitting.

This patch restores the original intended logic where 1) implies 3).
During testing with stress-highalloc from mmtests, this has shown to
decrease the number of events where UNMOVABLE and RECLAIMABLE allocations
steal from MOVABLE pageblocks, which can lead to permanent fragmentation.
In some cases it has increased the number of events when MOVABLE
allocations steal from UNMOVABLE or RECLAIMABLE pageblocks, but these are
fixable by sync compaction and thus less harmful.

Note that evaluation has shown that the behavior introduced by
47118af076 for buddy pages in case 3) is actually even better than the
original logic, so the following patch will introduce it properly once
again.  For stable backports of this patch it makes thus sense to only fix
versions containing 0cbef29a7821.

[iamjoonsoo.kim@lge.com: tracepoint fix]
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>	[3.13+ containing 0cbef29a7821]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-27 22:09:38 +02:00
KOSAKI Motohiro c865a91570 mm: get rid of unnecessary overhead of trace_mm_page_alloc_extfrag()
In general, every tracepoint should be zero overhead if it is disabled.
However, trace_mm_page_alloc_extfrag() is one of exception.  It evaluate
"new_type == start_migratetype" even if tracepoint is disabled.

However, the code can be moved into tracepoint's TP_fast_assign() and
TP_fast_assign exist exactly such purpose.  This patch does it.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-27 22:09:37 +02:00
Srivatsa S. Bhat d2b37143d9 mm/page_alloc.c: fix the value of fallback_migratetype in alloc_extfrag tracepoint()
In the current code, the value of fallback_migratetype that is printed
using the mm_page_alloc_extfrag tracepoint, is the value of the
migratetype *after* it has been set to the preferred migratetype (if the
ownership was changed).  Obviously that wouldn't have been the original
intent.  (We already have a separate 'change_ownership' field to tell
whether the ownership of the pageblock was changed from the
fallback_migratetype to the preferred type.)

The intent of the fallback_migratetype field is to show the migratetype
from which we borrowed pages in order to satisfy the allocation request.
So fix the code to print that value correctly.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-27 22:09:37 +02:00
Theodore Ts'o 27eb83a15e ext4: force inode writes when nfsd calls commit_metadata()
commit fde872682e175743e0c3ef939c89e3c6008a1529 upstream.

Some time back, nfsd switched from calling vfs_fsync() to using a new
commit_metadata() hook in export_operations().  If the file system did
not provide a commit_metadata() hook, it fell back to using
sync_inode_metadata().  Unfortunately doesn't work on all file
systems.  In particular, it doesn't work on ext4 due to how the inode
gets journalled --- the VFS writeback code will not always call
ext4_write_inode().

So we need to provide our own ext4_nfs_commit_metdata() method which
calls ext4_write_inode() directly.

Google-Bug-Id: 121195940
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-27 21:53:31 +02:00
Thierry Strudel f37b18c195 trace: cpufreq: fix typo in min/max cpufreq
Change-Id: Ieed402d3a912b7a318826e101efe2c24b07ebfe4
Signed-off-by: Thierry Strudel <tstrudel@google.com>
Git-commit: 483c0cc57aa888b2c81d11eb72a47135da7c36f4
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
2019-07-27 21:51:10 +02:00
mukesh agrawal c194121320 ANDROID: trace: net: use %pK for kernel pointers
We want to use network trace events in production
builds, to help diagnose Wifi problems. However, we
don't want to expose raw kernel pointers in such
builds.

Change the format specifier for the skbaddr field,
so that, if kptr_restrict is enabled, the pointers
will be reported as 0.

Bug: 30090733
Change-Id: Ic4bd583d37af6637343601feca875ee24479ddff
Signed-off-by: mukesh agrawal <quiche@google.com>
Git-commit: 0020e178ac5c7069d45fef7ed1dafedf168dcfaf
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
2019-07-27 21:50:48 +02:00
Rik van Riel 3dc7f9279e tracing: Add #undef to fix compile error
commit bf7165cfa23695c51998231c4efa080fe1d3548d upstream.

There are several trace include files that define TRACE_INCLUDE_FILE.

Include several of them in the same .c file (as I currently have in
some code I am working on), and the compile will blow up with a
"warning: "TRACE_INCLUDE_FILE" redefined #define TRACE_INCLUDE_FILE syscalls"

Every other include file in include/trace/events/ avoids that issue
by having a #undef TRACE_INCLUDE_FILE before the #define; syscalls.h
should have one, too.

Link: http://lkml.kernel.org/r/20160928225554.13bd7ac6@annuminas.surriel.com

Fixes: b8007ef742 ("tracing: Separate raw syscall from syscall tracer")
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:43:11 +02:00
Luca Stefani 82b37d9f2f Merge remote-tracking branch 'f2fs/linux-3.10.y' into HEAD
Change-Id: Ic2fe24529f029909ddd96490bd6d885d60f88be2
2017-04-18 17:02:28 +02:00
LuK1337 fc9499e55a Import latest Samsung release
* Package version: T713XXU2BQCO

Change-Id: I293d9e7f2df458c512d59b7a06f8ca6add610c99
2017-04-18 03:43:52 +02:00
Hou Pengyang 1be145cb9d f2fs: add f2fs_drop_inode tracepoint
Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-03-08 19:13:49 -08:00
Jaegeuk Kim 78fdc8a8e1 f2fs: show actual device info in tracepoints
This patch shows actual device information in the tracepoints.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	include/trace/events/f2fs.h
2017-02-23 16:30:19 -08:00
Jaegeuk Kim e3b8583b25 f2fs: add submit_bio tracepoint
This patch adds final submit_bio() tracepoint.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/data.c
2017-02-06 13:45:25 -08:00
Jaegeuk Kim e27aa59132 f2fs: resolve op and op_flags confilcts
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-06 13:45:13 -08:00
Damien Le Moal 4fd379f3be f2fs: Trace reset zone events
Similarly to the regular discard, trace zone reset events.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23 13:04:54 -08:00
Srinivas Rao L 28caff276f cpuidle: lpm-levels: Consider cluster history for LPM selection
Consider recent history (residencies) of the cluster and core
low power modes while the cluster level low power mode to enter
is selected.

Change-Id: Ifdc847a71136f55aded8a758b92179bb9aebfdcb
Signed-off-by: Srinivas Rao L <lsrao@codeaurora.org>
2016-11-08 16:10:30 +05:30
Srinivas Rao L e4c619e932 cpuidle: lpm-levels: Consider history during LPM selection
Consider recent history (residencies) of the low power modes per
core while the next low power mode to enter is selected. If most
of the history says the pattern of residencies is repeating with
minimal deviation then use the average of these for predicting
the next mode to enter.

If the pattern is not repeating then if more than 50 percent of
the samples out of history have exited a low power mode earlier
than the minumim residency of that mode, restrict it and also low
power modes deeper than that.

In any of the above case, trigger a hrtimer to wakeup cpu with
timeout as predicted+delta or max residency of the mode selected
if a deeper state can be selected after waking up incase if
prediction goes wrong.

Change-Id: I902a06939e19ac51dfd8c2db6b727b203ebfda14
Signed-off-by: Srinivas Rao L <lsrao@codeaurora.org>
2016-11-08 16:10:30 +05:30
Ruchi Kandoi da252e4730 trace: cpufreq: Add tracing for min/max cpufreq
Change-Id: I73f6ec437c1f805437d9376abb6510d1364b07ec
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>

Conflicts:
	drivers/cpufreq/cpufreq.c
	include/trace/events/power.h
2016-05-18 14:34:41 +05:30
Riley Andrews a9f3059ae9 sched: add sched blocked tracepoint which dumps out context of sleep.
Decare war on uninterruptible sleep. Add a tracepoint which
walks the kernel stack and dumps the first non-scheduler function
called before the scheduler is invoked.

Change-Id: I19e965d5206329360a92cbfe2afcc8c30f65c229
Signed-off-by: Riley Andrews <riandrews@google.com>
2016-05-18 14:34:41 +05:30
Jin Qian 51473e62ee trace/events: fix compilation error
include/trace/events/filemap.h: In function 'ftrace_raw_output_mm_filemap_find_page_cache_miss':
include/trace/ftrace.h:232:9: error: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'size_t' [-Werror=format=]

Change-Id: Ic2d76f18fc8802cf1e2246f96d84a06d267a30ad
Signed-off-by: Jin Qian <jinqian@google.com>
2016-05-18 14:31:35 +05:30
Daniel Campello a7a2c2347c Page cache miss tracing using ftrace on mm/filemap
This patch includes two trace events on generic_perform_write and
do_generic_file_read to check on the address_space mapping for the
pages to be accessed by the request.

Change-Id: Ib319b9b2c971b9e5c76645be6cfd995ef9465d77
Signed-off-by: Daniel Campello <campello@google.com>

Conflicts:
	include/linux/pagemap.h
2016-05-18 14:31:34 +05:30
Jaegeuk Kim 09a336dfe0 f2fs: use percpu_counter for page counters
This patch substitutes percpu_counter for atomic_counter when counting
various types of pages.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-17 17:19:22 -07:00
Chao Yu a1b1884c50 f2fs: support in batch multi blocks preallocation
This patch introduces reserve_new_blocks to make preallocation of multi
blocks as in batch operation, so it can avoid lots of redundant
operation, result in better performance.

In virtual machine, with rotational device:

time fallocate -l 32G /mnt/f2fs/file

Before:
real	0m4.584s
user	0m0.000s
sys	0m4.580s

After:
real	0m0.292s
user	0m0.000s
sys	0m0.272s

In x86, with SSD:

time fallocate -l 500G $MNT/testfile

Before : 24.758 s
After  :  1.604 s

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix bugs and add performance numbers measured in x86.]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-17 13:46:06 -07:00
Chao Yu 9505a2fe31 f2fs: trace old block address for CoWed page
This patch enables to trace old block address of CoWed page for better
debugging.

f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f0, oldaddr = 0xfe8ab, newaddr = 0xfee90 rw = WRITE_SYNC, type = NODE
f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f8, oldaddr = 0xfe8b0, newaddr = 0xfee91 rw = WRITE_SYNC, type = NODE
f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4fa, oldaddr = 0xfe8ae, newaddr = 0xfee92 rw = WRITE_SYNC, type = NODE

f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x96, oldaddr = 0xf049b, newaddr = 0x2bbe rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x97, oldaddr = 0xf049c, newaddr = 0x2bbf rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x98, oldaddr = 0xf049d, newaddr = 0x2bc0 rw = WRITE, type = DATA

f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x47, oldaddr = 0xffffffff, newaddr = 0xf2631 rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x48, oldaddr = 0xffffffff, newaddr = 0xf2632 rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x49, oldaddr = 0xffffffff, newaddr = 0xf2633 rw = WRITE, type = DATA

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-03-01 16:58:59 -08:00
Chao Yu 76e295a1e8 f2fs: support revoking atomic written pages
f2fs support atomic write with following semantics:
1. open db file
2. ioctl start atomic write
3. (write db file) * n
4. ioctl commit atomic write
5. close db file

With this flow we can avoid file becoming corrupted when abnormal power
cut, because we hold data of transaction in referenced pages linked in
inmem_pages list of inode, but without setting them dirty, so these data
won't be persisted unless we commit them in step 4.

But we should still hold journal db file in memory by using volatile
write, because our semantics of 'atomic write support' is incomplete, in
step 4, we could fail to submit all dirty data of transaction, once
partial dirty data was committed in storage, then after a checkpoint &
abnormal power-cut, db file will be corrupted forever.

So this patch tries to improve atomic write flow by adding a revoking flow,
once inner error occurs in committing, this gives another chance to try to
revoke these partial submitted data of current transaction, it makes
committing operation more like aotmical one.

If we're not lucky, once revoking operation was failed, EAGAIN will be
reported to user for suggesting doing the recovery with held journal file,
or retrying current transaction again.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-03-01 16:58:48 -08:00
Chao Yu b9eee2bf19 f2fs: add a tracepoint for sync_dirty_inodes
This patch adds a tracepoint for sync_dirty_inodes.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-01-07 19:15:09 -08:00
Jaegeuk Kim bfc9891be1 f2fs: catch up to v4.4-rc1
The last patch is:

commit beaa57dd986d4f398728c060692fc2452895cfd8
Author: Chao Yu <chao2.yu@samsung.com>
Date:   Thu Oct 22 18:24:12 2015 +0800

    f2fs: fix to skip shrinking extent nodes

    In f2fs_shrink_extent_tree we should stop shrink flow if we have already
    shrunk enough nodes in extent cache.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-11-29 00:05:40 -08:00
Srivatsa Vaddagiri a330f3d5bc sched: colocate related threads
Provide userspace interface for tasks to be grouped together as
"related" threads. For example, all threads involved in updating
display buffer could be tagged as related.

Scheduler will attempt to provide special treatment for group of
related threads such as:

1) Colocation of related threads in same "preferred" cluster
2) Aggregation of demand towards determination of cluster frequency

This patch extends scheduler to provide best-effort colocation support
for a group of related threads.

Change-Id: Ic2cd769faf5da4d03a8f3cb0ada6224d0101a5f5
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2015-11-25 21:43:34 -08:00
Srivatsa Vaddagiri cdbbd96a7f sched: Consolidate cluster-specific information
Many cluster-shared attributes like cur_freq, max_freq etc are
needlessly maintained in per-cpu 'struct rq' currently. Consolidate
them in cluster structure.

Change-Id: I36e508082bb1e8a7c1a60e99902b5bc260f5f8f6
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2015-11-25 21:43:15 -08:00
Manaf Meethalavalappu Pallikunhi abe105f868 msm: thermal: Add ftrace to KTM progressive mitigation
Add ftrace events for KTM progressive algorithm mitigation.
Below ftrace events are added to KTM,
thermal_progressive_sampling
thermal_progressive_state
thermal_progressive_mitigate

Change-Id: Iec8a3a62f5655a59901ebaf744233457c22c84a5
Signed-off-by: Manaf Meethalavalappu Pallikunhi <manafm@codeaurora.org>
2015-11-03 00:45:25 -08:00
Alok Chauhan a8296aa2f0 msm: msm_bus: Correct the timestamp signature for logging
correct the time-stamp signature for logging in the format
of seconds.nanoseconds with all the digit.

Change-Id: I3c93c2c8daa546c4488096b2c2af939b93254b5c
Signed-off-by: Alok Chauhan <alokc@codeaurora.org>
2015-09-23 07:31:37 -07:00
Alok Chauhan 7ff79463e0 platform: msm: msm_bus: Add trace events for bus AB
Add trace events to the ad-hoc bus driver to log
Average BW (AB) that bus driver sends to RPM for
shared slaves.

Change-Id: I8fad0a3b3df6a6be5c659ca371f15fb27710b3f0
Signed-off-by: Alok Chauhan <alokc@codeaurora.org>
2015-09-08 23:44:06 -07:00
Alok Chauhan 4bb1959616 platform: msm: Add snapshot of missing msm_bus driver changes
The msm_bus driver provides offers a means of managing performance
levels for the buses, fabrics and NoCs, which connect the peripherals
and processes within MSM chipsets.

This snapshot is taken to merge missing changes from msm-3.18 to msm-3.10.

Change-Id: If6fff441265716632ee2b8d666cbbdaf5974a34e
Signed-off-by: Alok Chauhan <alokc@codeaurora.org>
Signed-off-by: Kiran Gunda <kgunda@codeaurora.org>
2015-08-25 17:06:31 +05:30
Linux Build Service Account a54b10a29c Merge "qcom:rpm-smd: Add debug ftrace events" 2015-06-27 21:50:29 -07:00
Achyuth Sai Vadrav 333377b102 qcom:rpm-smd: Add debug ftrace events
Added ftrace events for debugging purpose

Change-Id: I925ec327d83f55fd4e391f3ed903f925d19aede9
Signed-off-by: Achyuth Sai Vadrav <avadra@codeaurora.org>
2015-06-25 15:22:55 +05:30
Devesh Jhunjhunwala 2e7ce64042 tracing: power: Define clock_set_rate_complete trace event
clk_set_rate event involves a variable overhead due to a
number of additional operations needed (regulator set voltage
operations, etc). Since the overhead may differ based on a
number of factors, add a clk_set_rate event so as to enable
recording of clk_set_rate operation latency.

Change-Id: Ie5c0ab1929f32e08b04d2a255e631c95a68e49fc
Signed-off-by: Devesh Jhunjhunwala <deveshj@codeaurora.org>
2015-06-15 16:20:12 -07:00
Linux Build Service Account 383071d529 Merge "msm: lpm-levels: Modify ftrace events to track latency/sleep time" 2015-06-13 11:59:55 -07:00
Linux Build Service Account 26b4e48668 Merge "soc: qcom: msm_perf: Add support for enter/exit cycle for io detection" 2015-06-11 13:07:57 -07:00
Maulik Shah 7f2c9922b1 msm: lpm-levels: Modify ftrace events to track latency/sleep time
cpuidle enter and exit ftrace events can mismatch if lpm driver falls
back after choosing a low power mode. Modify ftrace events to clearly
distinguish between cpu power select event and cpuidle enter event.

Change-Id: I81bc2b175264c6bdeec555b2b1271c0f40870b1b
Signed-off-by: Maulik Shah <mkshah@codeaurora.org>
2015-06-10 21:23:35 +05:30
Konstantin Dorfman 5aee76ab1e mmc: core: add trace events to profile mmc runtime suspend-resume
Add trace events to capture mmc runtime suspend-resume latencies.
This would be useful to capture latencies for variety of MMC/SD cards
and decide on best possible runtime PM timeout.

Usage -
cd /sys/d/tracing
echo 1 > events/mmc/mmc_host_runtime_suspend/enable
echo 1 > events/mmc/mmc_host_runtime_resume/enable

Change-Id: Ia5dbb175c16f3e697d1239cb1faa5764ffeb5414
Signed-off-by: Konstantin Dorfman <kdorfman@codeaurora.org>
2015-06-03 14:34:01 +03:00
Tapas Kumar Kundu 464ca24c00 soc: qcom: msm_perf: Add support for enter/exit cycle for io detection
Add support for enter/exit cycle sysfs nodes for io detection

There are some usecases which may benefit from different enter/exit
cycle load criteria for IO load. This change adds support for
that.

Change-Id: Iff135ed11b92becc374ace4578e0efc212d2b731
Signed-off-by: Tapas Kumar Kundu <tkundu@codeaurora.org>
2015-06-01 19:15:46 +05:30
Tapas Kumar Kundu 27be82907e soc: qcom: msm_perf: Add support for multi_cycle entry/exit nodes
Add support for multi_enter_cycles/multi_exit_cycles per cluster

There are some usecases which may benefit from different enter/exit
cycle load criteria for multimode cpu load. This change adds support for
that.

Change-Id: I3408405307ca03b9bba3f03e216ef59b98f29832
Signed-off-by: Tapas Kumar Kundu <tkundu@codeaurora.org>
2015-06-01 19:10:31 +05:30
Tapas Kumar Kundu d69075f454 soc: qcom: msm_perf: Add timers to exit SINGLE mode
Certain governors may stop sending out notifications once CPUs enter
idle at min frequency.If governor's notifications stop then single mode
will not exit for long time. It can happen only if the exit conditions are
set in such a way that the time taken to exit single mode exceeds the time
for the governor to ramp down, idle out and hence stop sending
notifications leaving the system in single mode indefinitely.

This change adds seperate enter/exit cycle sysfs nodes along with a per
cluster non-deferrable timer for single mode exit. The timer is armed only
when the load starts falling below the exit load threshold and is
cancelled when either the load starts going up or SINGLE mode is exited
due to exceeding exit cycle count. On expiry the timer resets SINGLE mode
and the enter/exit cycle counts.

Change-Id: I02dd3fa8af39ca320e80da6391eb2b1ea635a433
Signed-off-by: Tapas Kumar Kundu <tkundu@codeaurora.org>
2015-06-01 19:04:53 +05:30
Ritesh Harjani 4583bf6dba msm: sdhci-msm: Revert multiple pm_qos configs
Reverting previous pm_qos changes since msm-3.10
will be supporting only 1 qos design for 8939 based
targets. There is no need to support multiple Qos
designs since there will be no target which needs
this.

Revert "mmc: sdhci-msm: add pm_qos trace-points"
This reverts commit 2ada209031.

Revert "mmc: sdhci-msm: support multiple pm_qos configurations"
This reverts commit a9070cf869.

Conflicts:
	drivers/mmc/host/sdhci-msm.c

Change-Id: I84ce58f97632ae8006146d260ede9a1ef0f6428a
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
2015-05-07 02:27:58 -07:00
Vinayak Menon 17429185a5 mm: process reclaim: vmpressure based process reclaim
With this patch, anon pages of incative tasks can be reclaimed,
depending on memory pressure. Memory pressure is detected
using vmpressure events. 'N' best tasks in terms of anon
size is selected and pages proportional to their tasksize
is reclaimed. The total number of pages reclaimed at each
run of the swap work, can be tuned from userspace, the
default being SWAP_CLUSTER_MAX * 32.

The patch also adds tracepoints to debug and tune the
feature.

echo 1 > /sys/module/process_reclaim/parameters/enable_process_reclaim
to enable the feature.

echo <pages> > /sys/module/process_reclaim/parameters/per_swap_size,
to set the number of pages reclaimed in each scan.

/sys/module/process_reclaim/parameters/reclaim_avg_efficiency, provides
the average efficiency (scan to reclaim ratio) of the algorithm.

/sys/module/process_reclaim/parameters/swap_eff_win, to set the window
period (in unit of number of times reclaim is triggered) to detect
low efficiency runs.

/sys/module/process_reclaim/parameters/swap_opt_eff, to set the optimal
efficiency threshold for low efficiency detection.

Change-Id: I895986f10c997d1715761eaaadc4bbbee60db9d2
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2015-04-16 11:00:47 -07:00
Ram Chandrasekar 4052d1f57d power: bcl: Add ftrace events
Add ftrace events when battery parameter change
is notified, when mitigation is initiated and
cleared.

CRs-Fixed: 760435
Change-Id: I901d8e737f9ba2150ef7c2a9c19bbcf79e93c0e6
Signed-off-by: Ram Chandrasekar <rkumbako@codeaurora.org>
2015-03-31 11:29:26 -06:00
Ram Chandrasekar db2f4ef4fd power: bcl_peripheral: Add ftrace events
Add ftrace events to bcl driver events. Events are
added in the places where the registers are read/write,
when interrupts are triggered and bcl state changes.

CRs-Fixed: 760435
Change-Id: I53985292a3bccb9a2e33f144528c56f4d659205a
Signed-off-by: Ram Chandrasekar <rkumbako@codeaurora.org>
2015-03-31 11:29:23 -06:00
Linux Build Service Account d486629893 Merge "tracing: ufs: create a trace event class template for common events" 2015-03-14 04:51:41 -07:00
Linux Build Service Account 32882c438d Merge "lowmemorykiller: adapt to vmpressure" 2015-03-14 00:59:33 -07:00
Gilad Broner e0fbe16232 tracing: ufs: create a trace event class template for common events
The following three trace events:
ufshcd_clk_gating, ufshcd_hibern8_on_idle and ufshcd_auto_bkops_state
share the same arguments and meaning - logging some state change
in the UFS driver.
Defining those as template instances takes up less memory compared
to be defined as separate trace events.

Change-Id: I92c2bf3eada6f876b8c9e8a7bfc4568c7886548f
Signed-off-by: Gilad Broner <gbroner@codeaurora.org>
2015-03-12 13:54:47 +02:00