The logic used to skip reclaim on low efficiency results
in process reclaim not triggering at all. Fix it by
properly handling the skip_reclaim atomic variable.
Change-Id: I119097bb9b1baf8f3e8d4afa0a6dc2c30c0de6e7
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
For instrumenting global variables KASan will shadow memory backing memory
for modules. So on module loading we will need to allocate memory for
shadow and map it at address in shadow that corresponds to the address
allocated in module_alloc().
__vmalloc_node_range() could be used for this purpose, except it puts a
guard hole after allocated area. Guard hole in shadow memory should be a
problem because at some future point we might need to have a shadow memory
at address occupied by guard hole. So we could fail to allocate shadow
for module_alloc().
Now we have VM_NO_GUARD flag disabling guard page, so we need to pass into
__vmalloc_node_range(). Add new parameter 'vm_flags' to
__vmalloc_node_range() function.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[dkeitel@codeaurora.org: resolve trivial merge conflicts. Only apply arm
and arm64 relevant changes.]
Git-commit: cb9e3c292d0115499c660028ad35ac5501d722b5
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: David Keitel <dkeitel@codeaurora.org>
Change-Id: I5ed16f4719b7fa3654b00358bbe40ed8a0e77d2e
Current approach in handling shadow memory for modules is broken.
Shadow memory could be freed only after memory shadow corresponds it is no
longer used. vfree() called from interrupt context could use memory its
freeing to store 'struct llist_node' in it:
void vfree(const void *addr)
{
...
if (unlikely(in_interrupt())) {
struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
if (llist_add((struct llist_node *)addr, &p->list))
schedule_work(&p->wq);
Later this list node used in free_work() which actually frees memory.
Currently module_memfree() called in interrupt context will free shadow
before freeing module's memory which could provoke kernel crash.
So shadow memory should be freed after module's memory. However, such
deallocation order could race with kasan_module_alloc() in module_alloc().
Free shadow right before releasing vm area. At this point vfree()'d
memory is not used anymore and yet not available for other allocations.
New VM_KASAN flag used to indicate that vm area has dynamically allocated
shadow memory so kasan frees shadow only if it was previously allocated.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[dkeitel@codeaurora.org: resolved trivial merge conflicts]
Git-commit: a5af5aa8b67dfdba36c853b70564fd2dfe73d478
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: David Keitel <dkeitel@codeaurora.org>
Change-Id: I9e2cca957a8dfda65ff9ad01cb13c89904abf8e4
This feature let us to detect accesses out of bounds of global variables.
This will work as for globals in kernel image, so for globals in modules.
Currently this won't work for symbols in user-specified sections (e.g.
__init, __read_mostly, ...)
The idea of this is simple. Compiler increases each global variable by
redzone size and add constructors invoking __asan_register_globals()
function. Information about global variable (address, size, size with
redzone ...) passed to __asan_register_globals() so we could poison
variable's redzone.
This patch also forces module_alloc() to return 8*PAGE_SIZE aligned
address making shadow memory handling (
kasan_module_alloc()/kasan_module_free() ) more simple. Such alignment
guarantees that each shadow page backing modules address space correspond
to only one module_alloc() allocation.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[dkeitel@codeaurora.org: resolve trivial merge conflicts]
Git-commit: bebf56a1b176c2e1c9efe44e7e6915532cc682cf
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: David Keitel <dkeitel@codeaurora.org>
Change-Id: I4dda6aa06fc53fd018a87ce8b08b62a9712f54fe
Recently instrumentation of builtin functions calls was removed from GCC
5.0. To check the memory accessed by such functions, userspace asan
always uses interceptors for them.
So now we should do this as well. This patch declares
memset/memmove/memcpy as weak symbols. In mm/kasan/kasan.c we have our
own implementation of those functions which checks memory before accessing
it.
Default memset/memmove/memcpy now now always have aliases with '__'
prefix. For files that built without kasan instrumentation (e.g.
mm/slub.c) original mem* replaced (via #define) with prefixed variants,
cause we don't want to check memory accesses there.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[dkeitel@codeaurora.org: section of patch which edits non-existing
efistub header.]
Git-commit: 393f203f5fd54421fddb1e2a263f64d3876eeadb
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: David Keitel <dkeitel@codeaurora.org>
Change-Id: I4418c6396f526e66a0d85ca8ed830a747ab0c8ac
With this patch kasan will be able to catch bugs in memory allocated by
slub. Initially all objects in newly allocated slab page, marked as
redzone. Later, when allocation of slub object happens, requested by
caller number of bytes marked as accessible, and the rest of the object
(including slub's metadata) marked as redzone (inaccessible).
We also mark object as accessible if ksize was called for this object.
There is some places in kernel where ksize function is called to inquire
size of really allocated area. Such callers could validly access whole
allocated memory, so it should be marked as accessible.
Code in slub.c and slab_common.c files could validly access to object's
metadata, so instrumentation for this files are disabled.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Signed-off-by: Dmitry Chernenkov <dmitryc@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[dkeitel@codeaurora.org: resolve merge conflicts, also remove pieces of
that do not apply to 3.10 version of kernel]
Git-commit: 0316bec22ec95ea2faca6406437b0b5950553b7c
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: David Keitel <dkeitel@codeaurora.org>
Change-Id: I306a4d3851670d8a237c6da1b7244eee24bc1d8e
It's ok for slub to access memory that marked by kasan as inaccessible
(object's metadata). Kasan shouldn't print report in that case because
these accesses are valid. Disabling instrumentation of slub.c code is not
enough to achieve this because slub passes pointer to object's metadata
into external functions like memchr_inv().
We don't want to disable instrumentation for memchr_inv() because this is
quite generic function, and we don't want to miss bugs.
metadata_access_enable/metadata_access_disable used to tell KASan where
accesses to metadata starts/end, so we could temporarily disable KASan
reports.
Change-Id: Icbd15f42c71332399eccafe2a05e3034dcd90d67
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: a79316c6178ca419e35feef47d47f50b4e0ee9f2
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: David Keitel <dkeitel@codeaurora.org>
Currently memory hotplug won't work with KASan. As we don't have shadow
for hotplugged memory, kernel will crash on the first access to it. To
make this work we will need to allocate shadow for new memory.
At some future point proper memory hotplug support will be implemented.
Until then, print a warning at startup and disable memory hot-add.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 786a8959912eb94fc2381c2ae487a96ce55dabca
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Change-Id: I1661e8c9699228c105d653b13bc6bfbadd8695af
Signed-off-by: David Keitel <dkeitel@codeaurora.org>
Kernel Address sanitizer (KASan) is a dynamic memory error detector. It
provides fast and comprehensive solution for finding use-after-free and
out-of-bounds bugs.
KASAN uses compile-time instrumentation for checking every memory access,
therefore GCC > v4.9.2 required. v4.9.2 almost works, but has issues with
putting symbol aliases into the wrong section, which breaks kasan
instrumentation of globals.
This patch only adds infrastructure for kernel address sanitizer. It's
not available for use yet. The idea and some code was borrowed from [1].
Basic idea:
The main idea of KASAN is to use shadow memory to record whether each byte
of memory is safe to access or not, and use compiler's instrumentation to
check the shadow memory on each memory access.
Address sanitizer uses 1/8 of the memory addressable in kernel for shadow
memory and uses direct mapping with a scale and offset to translate a
memory address to its corresponding shadow address.
Here is function to translate address to corresponding shadow address:
unsigned long kasan_mem_to_shadow(unsigned long addr)
{
return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}
where KASAN_SHADOW_SCALE_SHIFT = 3.
So for every 8 bytes there is one corresponding byte of shadow memory.
The following encoding used for each shadow byte: 0 means that all 8 bytes
of the corresponding memory region are valid for access; k (1 <= k <= 7)
means that the first k bytes are valid for access, and other (8 - k) bytes
are not; Any negative value indicates that the entire 8-bytes are
inaccessible. Different negative values used to distinguish between
different kinds of inaccessible memory (redzones, freed memory) (see
mm/kasan/kasan.h).
To be able to detect accesses to bad memory we need a special compiler.
Such compiler inserts a specific function calls (__asan_load*(addr),
__asan_store*(addr)) before each memory access of size 1, 2, 4, 8 or 16.
These functions check whether memory region is valid to access or not by
checking corresponding shadow memory. If access is not valid an error
printed.
Historical background of the address sanitizer from Dmitry Vyukov:
"We've developed the set of tools, AddressSanitizer (Asan),
ThreadSanitizer and MemorySanitizer, for user space. We actively use
them for testing inside of Google (continuous testing, fuzzing,
running prod services). To date the tools have found more than 10'000
scary bugs in Chromium, Google internal codebase and various
open-source projects (Firefox, OpenSSL, gcc, clang, ffmpeg, MySQL and
lots of others): [2] [3] [4].
The tools are part of both gcc and clang compilers.
We have not yet done massive testing under the Kernel AddressSanitizer
(it's kind of chicken and egg problem, you need it to be upstream to
start applying it extensively). To date it has found about 50 bugs.
Bugs that we've found in upstream kernel are listed in [5].
We've also found ~20 bugs in out internal version of the kernel. Also
people from Samsung and Oracle have found some.
[...]
As others noted, the main feature of AddressSanitizer is its
performance due to inline compiler instrumentation and simple linear
shadow memory. User-space Asan has ~2x slowdown on computational
programs and ~2x memory consumption increase. Taking into account that
kernel usually consumes only small fraction of CPU and memory when
running real user-space programs, I would expect that kernel Asan will
have ~10-30% slowdown and similar memory consumption increase (when we
finish all tuning).
I agree that Asan can well replace kmemcheck. We have plans to start
working on Kernel MemorySanitizer that finds uses of unitialized
memory. Asan+Msan will provide feature-parity with kmemcheck. As
others noted, Asan will unlikely replace debug slab and pagealloc that
can be enabled at runtime. Asan uses compiler instrumentation, so even
if it is disabled, it still incurs visible overheads.
Asan technology is easily portable to other architectures. Compiler
instrumentation is fully portable. Runtime has some arch-dependent
parts like shadow mapping and atomic operation interception. They are
relatively easy to port."
Comparison with other debugging features:
========================================
KMEMCHECK:
- KASan can do almost everything that kmemcheck can. KASan uses
compile-time instrumentation, which makes it significantly faster than
kmemcheck. The only advantage of kmemcheck over KASan is detection of
uninitialized memory reads.
Some brief performance testing showed that kasan could be
x500-x600 times faster than kmemcheck:
$ netperf -l 30
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
no debug: 87380 16384 16384 30.00 41624.72
kasan inline: 87380 16384 16384 30.00 12870.54
kasan outline: 87380 16384 16384 30.00 10586.39
kmemcheck: 87380 16384 16384 30.03 20.23
- Also kmemcheck couldn't work on several CPUs. It always sets
number of CPUs to 1. KASan doesn't have such limitation.
DEBUG_PAGEALLOC:
- KASan is slower than DEBUG_PAGEALLOC, but KASan works on sub-page
granularity level, so it able to find more bugs.
SLUB_DEBUG (poisoning, redzones):
- SLUB_DEBUG has lower overhead than KASan.
- SLUB_DEBUG in most cases are not able to detect bad reads,
KASan able to detect both reads and writes.
- In some cases (e.g. redzone overwritten) SLUB_DEBUG detect
bugs only on allocation/freeing of object. KASan catch
bugs right before it will happen, so we always know exact
place of first bad read/write.
[1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel
[2] https://code.google.com/p/address-sanitizer/wiki/FoundBugs
[3] https://code.google.com/p/thread-sanitizer/wiki/FoundBugs
[4] https://code.google.com/p/memory-sanitizer/wiki/FoundBugs
[5] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel#Trophies
Based on work by Andrey Konovalov.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Acked-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[tsoni@codeaurora.org: trivial merge conflicts]
Git-commit: 0b24becc810dc3be6e3f94103a866f214c282394
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Change-Id: If62dffc8bb54d92654f221f5b365ed3f1a07fd3a
Signed-off-by: David Keitel <dkeitel@codeaurora.org>
commit 9c145c56d0c8a0b62e48c8d71e055ad0fb2012ba upstream.
The stack guard page error case has long incorrectly caused a SIGBUS
rather than a SIGSEGV, but nobody actually noticed until commit
fee7e49d4514 ("mm: propagate error from stack expansion even for guard
page") because that error case was never actually triggered in any
normal situations.
Now that we actually report the error, people noticed the wrong signal
that resulted. So far, only the test suite of libsigsegv seems to have
actually cared, but there are real applications that use libsigsegv, so
let's not wait for any of those to break.
Reported-and-tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Jan Engelhardt <jengelh@inai.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
Cc: linux-arch@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 33692f27597fcab536d7cbbcc8f52905133e4aa7 upstream.
The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
"you should SIGSEGV" error, because the SIGSEGV case was generally
handled by the caller - usually the architecture fault handler.
That results in lots of duplication - all the architecture fault
handlers end up doing very similar "look up vma, check permissions, do
retries etc" - but it generally works. However, there are cases where
the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.
In particular, when accessing the stack guard page, libsigsegv expects a
SIGSEGV. And it usually got one, because the stack growth is handled by
that duplicated architecture fault handler.
However, when the generic VM layer started propagating the error return
from the stack expansion in commit fee7e49d4514 ("mm: propagate error
from stack expansion even for guard page"), that now exposed the
existing VM_FAULT_SIGBUS result to user space. And user space really
expected SIGSEGV, not SIGBUS.
To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
duplicate architecture fault handlers about it. They all already have
the code to handle SIGSEGV, so it's about just tying that new return
value to the existing code, but it's all a bit annoying.
This is the mindless minimal patch to do this. A more extensive patch
would be to try to gather up the mostly shared fault handling logic into
one generic helper routine, and long-term we really should do that
cleanup.
Just from this patch, you can generally see that most architectures just
copied (directly or indirectly) the old x86 way of doing things, but in
the meantime that original x86 model has been improved to hold the VM
semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
"newer" things, so it would be a good idea to bring all those
improvements to the generic case and teach other architectures about
them too.
Reported-and-tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Jan Engelhardt <jengelh@inai.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
Cc: linux-arch@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[shengyong: Backport to 3.10
- adjust context
- ignore modification for arch nios2, because 3.10 does not support it
- ignore modification for driver lustre, because 3.10 does not support it
- ignore VM_FAULT_FALLBACK in VM_FAULT_ERROR, becase 3.10 does not support
this flag
- add SIGSEGV handling to powerpc/cell spu_fault.c, because 3.10 does not
separate it to copro_fault.c
- add SIGSEGV handling in mm/memory.c, because 3.10 does not separate it
to gup.c
]
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
vmstat_work is currently a per cpu worker thread that requeues
itself using schedule_delayed_work().
schedule_delayed_work() makes the worker thread unbound. Since
its unbound, when the timer for the delayed workqueue is migrated,
the current code path can cause the per cpu worker to get
executed on the CPU other than what it is intended for causing
undesired effects. This overrides the choice of making the worker
per cpu in the first place.
Fix this by using schedule_delayed_work_on() and make it CPU
bound.
Change-Id: Ib7952c544bda7d8ec0a79c52de8f2d80b11637e8
Signed-off-by: Vignesh Radhakrishnan <vigneshr@codeaurora.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJVFBE+AAoJEDjbvchgkmk+oTkP/j2ipSvgXghFEipZbOJUQkqC
fa8elfoF7riTKpKOuDtDU2WI1ttCGYs5gmTNpd4KaEt23eJOQgVqIpV8GhAkW5Af
NVyGhjF3dXNqpBkxnyuIkk5OLrNKGRNS2xpz1U254iGObYrK+tr62IzGPxEcPAhX
Y+58xPVSjLtNdTJW3YLT3DohUbnbHG6Br9geI1IHtlxg1oDiTxtnX2FmOFzzDpP5
qu8gnPIekg/+1EE46nEiq0C59AwC3aCzNxwlYe1Kd41SY3LUFF1eZMzmOnnwyI5K
3FslAzT6x/sOmGJFTYrKjFA4GKsW67xHVkB/hp/Mu768RqxiQCxV4kgmPsAFLbXb
D5qbNwr3i0iQ/9AaD7h8HJkxC/KHmszMux00L/mgZ3SGdGMEIBxHg+oP8+nP8V6C
WfXKSWA94dpdRyULEfWdnKnUnp2860C7kt7ASTkOl8rIgU8HgaRqeu+U/KPM2ovD
ZJtXPVB5UXCRuVAhZwbvvrLOY8UMZTnv2auAaeLYG8YptcvGeN5Z398/8qdV/z7c
A9kOsgebs74X+lR3rbVgSDPQaq2AEiuIvtX77SfmrWXBXGmc99i9+PikuFggRprz
cJm5bCM9DaHu/3b77X9Fwl7vnpReB0zPHiwTdH/p7OPMf5m1uQt7SqegC6btLPHs
iYgjLd4oW+6uiV/2X1Vx
=L+mC
-----END PGP SIGNATURE-----
Merge commit 'v3.10.73' into msm-3.10
This merge brings us up to date with upstream kernel.org tag v3.10.73.
As part of the conflict resolution, changes introduced by commit 72684eae7
("arm64: Fix up /proc/cpuinfo") have been intentionally dropped, as they
conflict with Android changes msm-3.10 kernel to solve the problems
in a different way. Since userspace readers of this file may depend on
the existing msm-3.10 implementation, it's left as-is for now. The
commit may later be introduced if it is found to not impact userspaces
paired with this kernel.
* commit 'v3.10.73' (264 commits):
Linux 3.10.73
target: Allow Write Exclusive non-reservation holders to READ
target: Allow AllRegistrants to re-RESERVE existing reservation
target: Fix R_HOLDER bit usage for AllRegistrants
target/pscsi: Fix NULL pointer dereference in get_device_type
iscsi-target: Avoid early conn_logout_comp for iser connections
target: Fix reference leak in target_get_sess_cmd() error path
ARM: at91: pm: fix at91rm9200 standby
ipvs: rerouting to local clients is not needed anymore
ipvs: add missing ip_vs_pe_put in sync code
powerpc/smp: Wait until secondaries are active & online
x86/vdso: Fix the build on GCC5
x86/fpu: Drop_fpu() should not assume that tsk equals current
x86/fpu: Avoid math_state_restore() without used_math() in __restore_xstate_sig()
crypto: aesni - fix memory usage in GCM decryption
libsas: Fix Kernel Crash in smp_execute_task
xen-pciback: limit guest control of command register
nilfs2: fix deadlock of segment constructor during recovery
regulator: core: Fix enable GPIO reference counting
regulator: Only enable disabled regulators on resume
ALSA: hda - Treat stereo-to-mono mix properly
ALSA: hda - Add workaround for MacBook Air 5,2 built-in mic
ALSA: hda - Set single_adc_amp flag for CS420x codecs
ALSA: hda - Don't access stereo amps for mono channel widgets
ALSA: hda - Fix built-in mic on Compaq Presario CQ60
ALSA: control: Add sanity checks for user ctl id name string
spi: pl022: Fix race in giveback() leading to driver lock-up
tpm/ibmvtpm: Additional LE support for tpm_ibmvtpm_send
workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE
can: add missing initialisations in CAN related skbuffs
Change email address for 8250_pci
virtio_console: init work unconditionally
fuse: notify: don't move pages
fuse: set stolen page uptodate
drm/radeon: drop setting UPLL to sleep mode
drm/radeon: do a posting read in rs600_set_irq
drm/radeon: do a posting read in si_set_irq
drm/radeon: do a posting read in r600_set_irq
drm/radeon: do a posting read in r100_set_irq
drm/radeon: do a posting read in evergreen_set_irq
drm/radeon: fix DRM_IOCTL_RADEON_CS oops
tcp: make connect() mem charging friendly
net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour
tcp: fix tcp fin memory accounting
Revert "net: cx82310_eth: use common match macro"
rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg()
caif: fix MSG_OOB test in caif_seqpkt_recvmsg()
inet_diag: fix possible overflow in inet_diag_dump_one_icsk()
rds: avoid potential stack overflow
net: sysctl_net_core: check SNDBUF and RCVBUF for min length
sparc64: Fix several bugs in memmove().
sparc: Touch NMI watchdog when walking cpus and calling printk
sparc: perf: Make counting mode actually work
sparc: perf: Remove redundant perf_pmu_{en|dis}able calls
sparc: semtimedop() unreachable due to comparison error
sparc32: destroy_context() and switch_mm() needs to disable interrupts.
Linux 3.10.72
ath5k: fix spontaneus AR5312 freezes
ACPI / video: Load the module even if ACPI is disabled
drm/radeon: fix 1 RB harvest config setup for TN/RL
Drivers: hv: vmbus: incorrect device name is printed when child device is unregistered
HID: fixup the conflicting keyboard mappings quirk
HID: input: fix confusion on conflicting mappings
staging: comedi: cb_pcidas64: fix incorrect AI range code handling
dm snapshot: fix a possible invalid memory access on unload
dm: fix a race condition in dm_get_md
dm io: reject unsupported DISCARD requests with EOPNOTSUPP
dm mirror: do not degrade the mirror on discard error
staging: comedi: comedi_compat32.c: fix COMEDI_CMD copy back
clk: sunxi: Support factor clocks with N factor starting not from 0
fixed invalid assignment of 64bit mask to host dma_boundary for scatter gather segment boundary limit.
nilfs2: fix potential memory overrun on inode
IB/qib: Do not write EEPROM
sg: fix read() error reporting
ALSA: hda - Add pin configs for ASUS mobo with IDT 92HD73XX codec
ALSA: pcm: Don't leave PREPARED state after draining
tty: fix up atime/mtime mess, take four
sunrpc: fix braino in ->poll()
procfs: fix race between symlink removals and traversals
debugfs: leave freeing a symlink body until inode eviction
autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation
USB: serial: fix potential use-after-free after failed probe
TTY: fix tty_wait_until_sent on 64-bit machines
USB: serial: fix infinite wait_until_sent timeout
net: irda: fix wait_until_sent poll timeout
xhci: fix reporting of 0-sized URBs in control endpoint
xhci: Allocate correct amount of scratchpad buffers
usb: ftdi_sio: Add jtag quirk support for Cyber Cortex AV boards
USB: usbfs: don't leak kernel data in siginfo
USB: serial: cp210x: Adding Seletek device id's
KVM: MIPS: Fix trace event to save PC directly
KVM: emulate: fix CMPXCHG8B on 32-bit hosts
Btrfs:__add_inode_ref: out of bounds memory read when looking for extended ref.
Btrfs: fix data loss in the fast fsync path
btrfs: fix lost return value due to variable shadowing
iio: imu: adis16400: Fix sign extension
x86/asm/entry/64: Remove a bogus 'ret_from_fork' optimization
PM / QoS: remove duplicate call to pm_qos_update_target
target: Check for LBA + sectors wrap-around in sbc_parse_cdb
mm/memory.c: actually remap enough memory
mm/compaction: fix wrong order check in compact_finished()
mm/nommu.c: fix arithmetic overflow in __vm_enough_memory()
mm/mmap.c: fix arithmetic overflow in __vm_enough_memory()
mm/hugetlb: add migration entry check in __unmap_hugepage_range
team: don't traverse port list using rcu in team_set_mac_address
udp: only allow UFO for packets from SOCK_DGRAM sockets
usb: plusb: Add support for National Instruments host-to-host cable
macvtap: make sure neighbour code can push ethernet header
net: compat: Ignore MSG_CMSG_COMPAT in compat_sys_{send, recv}msg
team: fix possible null pointer dereference in team_handle_frame
net: reject creation of netdev names with colons
ematch: Fix auto-loading of ematch modules.
net: phy: Fix verification of EEE support in phy_init_eee
ipv4: ip_check_defrag should not assume that skb_network_offset is zero
ipv4: ip_check_defrag should correctly check return value of skb_copy_bits
gen_stats.c: Duplicate xstats buffer for later use
rtnetlink: call ->dellink on failure when ->newlink exists
ipv6: fix ipv6_cow_metrics for non DST_HOST case
rtnetlink: ifla_vf_policy: fix misuses of NLA_BINARY
Linux 3.10.71
libceph: fix double __remove_osd() problem
libceph: change from BUG to WARN for __remove_osd() asserts
libceph: assert both regular and lingering lists in __remove_osd()
MIPS: Export FP functions used by lose_fpu(1) for KVM
x86, mm/ASLR: Fix stack randomization on 64-bit systems
blk-throttle: check stats_cpu before reading it from sysfs
jffs2: fix handling of corrupted summary length
md/raid1: fix read balance when a drive is write-mostly.
md/raid5: Fix livelock when array is both resyncing and degraded.
metag: Fix KSTK_EIP() and KSTK_ESP() macros
gpio: tps65912: fix wrong container_of arguments
arm64: compat Fix siginfo_t -> compat_siginfo_t conversion on big endian
hx4700: regulator: declare full constraints
KVM: x86: update masterclock values on TSC writes
KVM: MIPS: Don't leak FPU/DSP to guest
ARC: fix page address calculation if PAGE_OFFSET != LINUX_LINK_BASE
ntp: Fixup adjtimex freq validation on 32-bit systems
kdb: fix incorrect counts in KDB summary command output
ARM: pxa: add regulator_has_full_constraints to poodle board file
ARM: pxa: add regulator_has_full_constraints to corgi board file
vt: provide notifications on selection changes
usb: core: buffer: smallest buffer should start at ARCH_DMA_MINALIGN
USB: fix use-after-free bug in usb_hcd_unlink_urb()
USB: cp210x: add ID for RUGGEDCOM USB Serial Console
tty: Prevent untrappable signals from malicious program
axonram: Fix bug in direct_access
cfq-iosched: fix incorrect filing of rt async cfqq
cfq-iosched: handle failure of cfq group allocation
iscsi-target: Drop problematic active_ts_list usage
NFSv4.1: Fix a kfree() of uninitialised pointers in decode_cb_sequence_args
Added Little Endian support to vtpm module
tpm/tpm_i2c_stm_st33: Fix potential bug in tpm_stm_i2c_send
tpm: Fix NULL return in tpm_ibmvtpm_get_desired_dma
tpm_tis: verify interrupt during init
ARM: 8284/1: sa1100: clear RCSR_SMR on resume
tracing: Fix unmapping loop in tracing_mark_write
MIPS: KVM: Deliver guest interrupts after local_irq_disable()
nfs: don't call blocking operations while !TASK_RUNNING
mmc: sdhci-pxav3: fix setting of pdata->clk_delay_cycles
power_supply: 88pm860x: Fix leaked power supply on probe fail
ALSA: hdspm - Constrain periods to 2 on older cards
ALSA: off by one bug in snd_riptide_joystick_probe()
lmedm04: Fix usb_submit_urb BOGUS urb xfer, pipe 1 != type 3 in interrupt urb
cpufreq: speedstep-smi: enable interrupts when waiting
PCI: Fix infinite loop with ROM image of size 0
PCI: Generate uppercase hex for modalias var in uevent
HID: i2c-hid: Limit reads to wMaxInputLength bytes for input events
iwlwifi: mvm: always use mac color zero
iwlwifi: mvm: fix failure path when power_update fails in add_interface
iwlwifi: mvm: validate tid and sta_id in ba_notif
iwlwifi: pcie: disable the SCD_BASE_ADDR when we resume from WoWLAN
fsnotify: fix handling of renames in audit
xfs: set superblock buffer type correctly
xfs: inode unlink does not set AGI buffer type
xfs: ensure buffer types are set correctly
Bluetooth: ath3k: workaround the compatibility issue with xHCI controller
Linux 3.10.70
rbd: drop an unsafe assertion
media/rc: Send sync space information on the lirc device
net: sctp: fix passing wrong parameter header to param_type2af in sctp_process_param
ppp: deflate: never return len larger than output buffer
ipv4: tcp: get rid of ugly unicast_sock
tcp: ipv4: initialize unicast_sock sk_pacing_rate
bridge: dont send notification when skb->len == 0 in rtnl_bridge_notify
ipv6: replacing a rt6_info needs to purge possible propagated rt6_infos too
ping: Fix race in free in receive path
udp_diag: Fix socket skipping within chain
ipv4: try to cache dst_entries which would cause a redirect
net: sctp: fix slab corruption from use after free on INIT collisions
netxen: fix netxen_nic_poll() logic
ipv6: stop sending PTB packets for MTU < 1280
net: rps: fix cpu unplug
ip: zero sockaddr returned on error queue
Linux 3.10.69
crypto: crc32c - add missing crypto module alias
x86,kvm,vmx: Preserve CR4 across VM entry
kvm: vmx: handle invvpid vm exit gracefully
smpboot: Add missing get_online_cpus() in smpboot_register_percpu_thread()
ALSA: ak411x: Fix stall in work callback
ASoC: sgtl5000: add delay before first I2C access
ASoC: atmel_ssc_dai: fix start event for I2S mode
lib/checksum.c: fix build for generic csum_tcpudp_nofold
ext4: prevent bugon on race between write/fcntl
arm64: Fix up /proc/cpuinfo
nilfs2: fix deadlock of segment constructor over I_SYNC flag
lib/checksum.c: fix carry in csum_tcpudp_nofold
mm: pagewalk: call pte_hole() for VM_PFNMAP during walk_page_range
MIPS: Fix kernel lockup or crash after CPU offline/online
MIPS: IRQ: Fix disable_irq on CPU IRQs
PCI: Add NEC variants to Stratus ftServer PCIe DMI check
gpio: sysfs: fix memory leak in gpiod_sysfs_set_active_low
gpio: sysfs: fix memory leak in gpiod_export_link
Linux 3.10.68
target: Drop arbitrary maximum I/O size limit
iser-target: Fix implicit termination of connections
iser-target: Handle ADDR_CHANGE event for listener cm_id
iser-target: Fix connected_handler + teardown flow race
iser-target: Parallelize CM connection establishment
iser-target: Fix flush + disconnect completion handling
iscsi,iser-target: Initiate termination only once
vhost-scsi: Add missing virtio-scsi -> TCM attribute conversion
tcm_loop: Fix wrong I_T nexus association
vhost-scsi: Take configfs group dependency during VHOST_SCSI_SET_ENDPOINT
ib_isert: Add max_send_sge=2 minimum for control PDU responses
IB/isert: Adjust CQ size to HW limits
workqueue: fix subtle pool management issue which can stall whole worker_pool
gpio: squelch a compiler warning
efi-pstore: Make efi-pstore return a unique id
pstore/ram: avoid atomic accesses for ioremapped regions
pstore: Fix NULL pointer fault if get NULL prz in ramoops_get_next_prz
pstore: skip zero size persistent ram buffer in traverse
pstore: clarify clearing of _read_cnt in ramoops_context
pstore: d_alloc_name() doesn't return an ERR_PTR
pstore: Fail to unlink if a driver has not defined pstore_erase
ARM: 8109/1: mm: Modify pte_write and pmd_write logic for LPAE
ARM: 8108/1: mm: Introduce {pte,pmd}_isset and {pte,pmd}_isclear
ARM: DMA: ensure that old section mappings are flushed from the TLB
ARM: 7931/1: Correct virt_addr_valid
ARM: fix asm/memory.h build error
ARM: 7867/1: include: asm: use 'int' instead of 'unsigned long' for 'oldval' in atomic_cmpxchg().
ARM: 7866/1: include: asm: use 'long long' instead of 'u64' within atomic.h
ARM: lpae: fix definition of PTE_HWTABLE_PTRS
ARM: fix type of PHYS_PFN_OFFSET to unsigned long
ARM: LPAE: use phys_addr_t in alloc_init_pud()
ARM: LPAE: use signed arithmetic for mask definitions
ARM: mm: correct pte_same behaviour for LPAE.
ARM: 7829/1: Add ".text.unlikely" and ".text.hot" to arm unwind tables
drivers: net: cpsw: discard dual emac default vlan configuration
regulator: core: fix race condition in regulator_put()
spi/pxa2xx: Clear cur_chip pointer before starting next message
dm cache: fix missing ERR_PTR returns and handling
dm thin: don't allow messages to be sent to a pool target in READ_ONLY or FAIL mode
nl80211: fix per-station group key get/del and memory leak
NFSv4.1: Fix an Oops in nfs41_walk_client_list
nfs: fix dio deadlock when O_DIRECT flag is flipped
Input: i8042 - add noloop quirk for Medion Akoya E7225 (MD98857)
ALSA: seq-dummy: remove deadlock-causing events on close
powerpc/xmon: Fix another endiannes issue in RTAS call from xmon
can: kvaser_usb: Fix state handling upon BUS_ERROR events
can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT
can: kvaser_usb: Send correct context to URB completion
can: kvaser_usb: Do not sleep in atomic context
ASoC: wm8960: Fix capture sample rate from 11250 to 11025
spi: dw-mid: fix FIFO size
Signed-off-by: Ian Maund <imaund@codeaurora.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJUyuGRAAoJEDjbvchgkmk+7EwQALYPOeh+AManQFB1MQvFuOgZ
/4ulpjhGXw/RPTKHMeyHo8vRfUhMOx8UPF62uql+g1l9b/Zt2bs6qXu4QcxRRsQc
trSTUpi+U14y1hkgqOVOcFYP2ZaTjNEBQgLJ4eGn46CliLqme+rfoyRYm2GXzcR4
6cbSAr3mufdFIpi9/8Dn62Gv0aws5lIv3qkHJXznyuux3tisPT5y6Ux2KJoivPn/
SqADtRpwo+7lTjl15fE++9AqNsGMorV6toT2OO/7nXP+824psInKLmREAT2qC99b
BG61vcYdxOuHtzmwrvCf1jSRjxhvZT0j2xhBr/vCKcxy08AT0vDv68zrV1r6TIuu
U7/CKXtFBY95cjfnkTLJuswBSuIA/+sQHV6DaddH0V8fcZ6rQMLrblQ9ZcFFFkmT
2SG6lmlXqZvcEKYGMnL/Dcow1rkRhB5stiGgTkYxjiRSRpzAHISRJ/GGpsT+rRqK
HpBs5p9JshvRl7RWKwAu+DNGaEK1X/WYxc4/jw6dZFWX7lEWSMIPlr9zXgZCZ39y
V6lV1VVlT9/CSs1swKHUyhHHehlFsnIlQ6Fkiycr/KkuqBLs92Hyb7WhpVa819yX
osXdxSm6J54skiOLKYpBWHpnY09Tc+p28VEfMpErTExgp2oE8F34K7kdhoQPQb97
2mHiXNa+J4CLUNQ+sRmw
=HDBo
-----END PGP SIGNATURE-----
Merge commit 'v3.10.67' into msm-3.10
This merge brings us up to date with upstream kernel.org tag v3.10.67.
It also contains changes to allow forbidden warnings introduced in
the commit 'core, nfqueue, openvswitch: Orphan frags in skb_zerocopy
and handle errors'. Once upstream has corrected these warnings, the
changes to scripts/gcc-wrapper.py, in this commit, can be reverted.
* commit 'v3.10.67' (915 commits)
Linux 3.10.67
md/raid5: fetch_block must fetch all the blocks handle_stripe_dirtying wants.
ext4: fix warning in ext4_da_update_reserve_space()
quota: provide interface for readding allocated space into reserved space
crypto: add missing crypto module aliases
crypto: include crypto- module prefix in template
crypto: prefix module autoloading with "crypto-"
drbd: merge_bvec_fn: properly remap bvm->bi_bdev
Revert "swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single"
ipvs: uninitialized data with IP_VS_IPV6
KEYS: close race between key lookup and freeing
sata_dwc_460ex: fix resource leak on error path
x86/asm/traps: Disable tracing and kprobes in fixup_bad_iret and sync_regs
x86, tls: Interpret an all-zero struct user_desc as "no segment"
x86, tls, ldt: Stop checking lm in LDT_empty
x86/tsc: Change Fast TSC calibration failed from error to info
x86, hyperv: Mark the Hyper-V clocksource as being continuous
clocksource: exynos_mct: Fix bitmask regression for exynos4_mct_write
can: dev: fix crtlmode_supported check
bus: mvebu-mbus: fix support of MBus window 13
ARM: dts: imx25: Fix PWM "per" clocks
time: adjtimex: Validate the ADJ_FREQUENCY values
time: settimeofday: Validate the values of tv from user
dm cache: share cache-metadata object across inactive and active DM tables
ipr: wait for aborted command responses
drm/i915: Fix mutex->owner inspection race under DEBUG_MUTEXES
scripts/recordmcount.pl: There is no -m32 gcc option on Super-H anymore
ALSA: usb-audio: Add mic volume fix quirk for Logitech Webcam C210
libata: prevent HSM state change race between ISR and PIO
pinctrl: Fix two deadlocks
gpio: sysfs: fix gpio device-attribute leak
gpio: sysfs: fix gpio-chip device-attribute leak
Linux 3.10.66
s390/3215: fix tty output containing tabs
s390/3215: fix hanging console issue
fsnotify: next_i is freed during fsnotify_unmount_inodes.
netfilter: ipset: small potential read beyond the end of buffer
mmc: sdhci: Fix sleep in atomic after inserting SD card
LOCKD: Fix a race when initialising nlmsvc_timeout
x86, um: actually mark system call tables readonly
um: Skip futex_atomic_cmpxchg_inatomic() test
decompress_bunzip2: off by one in get_next_block()
ARM: shmobile: sh73a0 legacy: Set .control_parent for all irqpin instances
ARM: omap5/dra7xx: Fix frequency typos
ARM: clk-imx6q: fix video divider for rev T0 1.0
ARM: imx6q: drop unnecessary semicolon
ARM: dts: imx25: Fix the SPI1 clocks
Input: I8042 - add Acer Aspire 7738 to the nomux list
Input: i8042 - reset keyboard to fix Elantech touchpad detection
can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels
can: kvaser_usb: Reset all URB tx contexts upon channel close
can: kvaser_usb: Don't free packets when tight on URBs
USB: keyspan: fix null-deref at probe
USB: cp210x: add IDs for CEL USB sticks and MeshWorks devices
USB: cp210x: fix ID for production CEL MeshConnect USB Stick
usb: dwc3: gadget: Stop TRB preparation after limit is reached
usb: dwc3: gadget: Fix TRB preparation during SG
OHCI: add a quirk for ULi M5237 blocking on reset
gpiolib: of: Correct error handling in of_get_named_gpiod_flags
NFSv4.1: Fix client id trunking on Linux
ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing
vfio-pci: Fix the check on pci device type in vfio_pci_probe()
uvcvideo: Fix destruction order in uvc_delete()
smiapp: Take mutex during PLL update in sensor initialisation
af9005: fix kernel panic on init if compiled without IR
smiapp-pll: Correct clock debug prints
video/logo: prevent use of logos after they have been freed
storvsc: ring buffer failures may result in I/O freeze
iscsi-target: Fail connection on short sendmsg writes
hp_accel: Add support for HP ZBook 15
cfg80211: Fix 160 MHz channels with 80+80 and 160 MHz drivers
ARC: [nsimosci] move peripherals to match model to FPGA
drm/i915: Force the CS stall for invalidate flushes
drm/i915: Invalidate media caches on gen7
drm/radeon: properly filter DP1.2 4k modes on non-DP1.2 hw
drm/radeon: check the right ring in radeon_evict_flags()
drm/vmwgfx: Fix fence event code
enic: fix rx skb checksum
alx: fix alx_poll()
tcp: Do not apply TSO segment limit to non-TSO packets
tg3: tg3_disable_ints using uninitialized mailbox value to disable interrupts
netlink: Don't reorder loads/stores before marking mmap netlink frame as available
netlink: Always copy on mmap TX.
Linux 3.10.65
mm: Don't count the stack guard page towards RLIMIT_STACK
mm: propagate error from stack expansion even for guard page
mm, vmscan: prevent kswapd livelock due to pfmemalloc-throttled process being killed
perf session: Do not fail on processing out of order event
perf: Fix events installation during moving group
perf/x86/intel/uncore: Make sure only uncore events are collected
Btrfs: don't delay inode ref updates during log replay
ARM: mvebu: disable I/O coherency on non-SMP situations on Armada 370/375/38x/XP
scripts/kernel-doc: don't eat struct members with __aligned
nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races
nfsd4: fix xdr4 inclusion of escaped char
fs: nfsd: Fix signedness bug in compare_blob
serial: samsung: wait for transfer completion before clock disable
writeback: fix a subtle race condition in I_DIRTY clearing
cdc-acm: memory leak in error case
genhd: check for int overflow in disk_expand_part_tbl()
USB: cdc-acm: check for valid interfaces
ALSA: hda - Fix wrong gpio_dir & gpio_mask hint setups for IDT/STAC codecs
ALSA: hda - using uninitialized data
ALSA: usb-audio: extend KEF X300A FU 10 tweak to Arcam rPAC
driver core: Fix unbalanced device reference in drivers_probe
x86, vdso: Use asm volatile in __getcpu
x86_64, vdso: Fix the vdso address randomization algorithm
HID: Add a new id 0x501a for Genius MousePen i608X
HID: add battery quirk for USB_DEVICE_ID_APPLE_ALU_WIRELESS_2011_ISO keyboard
HID: roccat: potential out of bounds in pyra_sysfs_write_settings()
HID: i2c-hid: prevent buffer overflow in early IRQ
HID: i2c-hid: fix race condition reading reports
iommu/vt-d: Fix an off-by-one bug in __domain_mapping()
UBI: Fix double free after do_sync_erase()
UBI: Fix invalid vfree()
pstore-ram: Allow optional mapping with pgprot_noncached
pstore-ram: Fix hangs by using write-combine mappings
PCI: Restore detection of read-only BARs
ASoC: dwc: Ensure FIFOs are flushed to prevent channel swap
ASoC: max98090: Fix ill-defined sidetone route
ASoC: sigmadsp: Refuse to load firmware files with a non-supported version
ath5k: fix hardware queue index assignment
swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single
can: peak_usb: fix memset() usage
can: peak_usb: fix cleanup sequence order in case of error during init
ath9k: fix BE/BK queue order
ath9k_hw: fix hardware queue allocation
ocfs2: fix journal commit deadlock
Linux 3.10.64
Btrfs: fix fs corruption on transaction abort if device supports discard
Btrfs: do not move em to modified list when unpinning
eCryptfs: Remove buggy and unnecessary write in file name decode routine
eCryptfs: Force RO mount when encrypted view is enabled
udf: Verify symlink size before loading it
exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting
ncpfs: return proper error from NCP_IOC_SETROOT ioctl
crypto: af_alg - fix backlog handling
userns: Unbreak the unprivileged remount tests
userns: Allow setting gid_maps without privilege when setgroups is disabled
userns: Add a knob to disable setgroups on a per user namespace basis
userns: Rename id_map_mutex to userns_state_mutex
userns: Only allow the creator of the userns unprivileged mappings
userns: Check euid no fsuid when establishing an unprivileged uid mapping
userns: Don't allow unprivileged creation of gid mappings
userns: Don't allow setgroups until a gid mapping has been setablished
userns: Document what the invariant required for safe unprivileged mappings.
groups: Consolidate the setgroups permission checks
umount: Disallow unprivileged mount force
mnt: Update unprivileged remount test
mnt: Implicitly add MNT_NODEV on remount when it was implicitly added by mount
mac80211: free management frame keys when removing station
mac80211: fix multicast LED blinking and counter
KEYS: Fix stale key registration at error path
isofs: Fix unchecked printing of ER records
x86/tls: Don't validate lm in set_thread_area() after all
dm space map metadata: fix sm_bootstrap_get_nr_blocks()
dm bufio: fix memleak when using a dm_buffer's inline bio
nfs41: fix nfs4_proc_layoutget error handling
megaraid_sas: corrected return of wait_event from abort frame path
mmc: block: add newline to sysfs display of force_ro
mfd: tc6393xb: Fail ohci suspend if full state restore is required
md/bitmap: always wait for writes on unplug.
x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit
x86_64, switch_to(): Load TLS descriptors before switching DS and ES
x86/tls: Disallow unusual TLS segments
x86/tls: Validate TLS entries to protect espfix
isofs: Fix infinite looping over CE entries
Linux 3.10.63
ALSA: usb-audio: Don't resubmit pending URBs at MIDI error recovery
powerpc: 32 bit getcpu VDSO function uses 64 bit instructions
ARM: sched_clock: Load cycle count after epoch stabilizes
igb: bring link up when PHY is powered up
ext2: Fix oops in ext2_get_block() called from ext2_quota_write()
nEPT: Nested INVEPT
net: sctp: use MAX_HEADER for headroom reserve in output path
net: mvneta: fix Tx interrupt delay
rtnetlink: release net refcnt on error in do_setlink()
net/mlx4_core: Limit count field to 24 bits in qp_alloc_res
tg3: fix ring init when there are more TX than RX channels
ipv6: gre: fix wrong skb->protocol in WCCP
sata_fsl: fix error handling of irq_of_parse_and_map
ahci: disable MSI on SAMSUNG 0xa800 SSD
AHCI: Add DeviceIDs for Sunrise Point-LP SATA controller
media: smiapp: Only some selection targets are settable
drm/i915: Unlock panel even when LVDS is disabled
drm/radeon: kernel panic in drm_calc_vbltimestamp_from_scanoutpos with 3.18.0-rc6
i2c: davinci: generate STP always when NACK is received
i2c: omap: fix i207 errata handling
i2c: omap: fix NACK and Arbitration Lost irq handling
xen-netfront: Remove BUGs on paged skb data which crosses a page boundary
mm: fix swapoff hang after page migration and fork
mm: frontswap: invalidate expired data on a dup-store failure
Linux 3.10.62
nfsd: Fix ACL null pointer deref
powerpc/powernv: Honor the generic "no_64bit_msi" flag
bnx2fc: do not add shared skbs to the fcoe_rx_list
nfsd4: fix leak of inode reference on delegation failure
nfsd: Fix slot wake up race in the nfsv4.1 callback code
rt2x00: do not align payload on modern H/W
can: dev: avoid calling kfree_skb() from interrupt context
spi: dw: Fix dynamic speed change.
iser-target: Handle DEVICE_REMOVAL event on network portal listener correctly
target: Don't call TFO->write_pending if data_length == 0
srp-target: Retry when QP creation fails with ENOMEM
Input: xpad - use proper endpoint type
ARM: 8222/1: mvebu: enable strex backoff delay
ARM: 8216/1: xscale: correct auxiliary register in suspend/resume
ALSA: usb-audio: Add ctrl message delay quirk for Marantz/Denon devices
can: esd_usb2: fix memory leak on disconnect
USB: xhci: don't start a halted endpoint before its new dequeue is set
usb-quirks: Add reset-resume quirk for MS Wireless Laser Mouse 6000
usb: serial: ftdi_sio: add PIDs for Matrix Orbital products
USB: serial: cp210x: add IDs for CEL MeshConnect USB Stick
USB: keyspan: fix tty line-status reporting
USB: keyspan: fix overrun-error reporting
USB: ssu100: fix overrun-error reporting
iio: Fix IIO_EVENT_CODE_EXTRACT_DIR bit mask
powerpc/pseries: Fix endiannes issue in RTAS call from xmon
powerpc/pseries: Honor the generic "no_64bit_msi" flag
of/base: Fix PowerPC address parsing hack
ASoC: wm_adsp: Avoid attempt to free buffers that might still be in use
ASoC: sgtl5000: Fix SMALL_POP bit definition
PCI/MSI: Add device flag indicating that 64-bit MSIs don't work
ipx: fix locking regression in ipx_sendmsg and ipx_recvmsg
pptp: fix stack info leak in pptp_getname()
qmi_wwan: Add support for HP lt4112 LTE/HSPA+ Gobi 4G Modem
ieee802154: fix error handling in ieee802154fake_probe()
ipv4: Fix incorrect error code when adding an unreachable route
inetdevice: fixed signed integer overflow
sparc64: Fix constraints on swab helpers.
uprobes, x86: Fix _TIF_UPROBE vs _TIF_NOTIFY_RESUME
x86, mm: Set NX across entire PMD at boot
x86: Require exact match for 'noxsave' command line option
x86_64, traps: Rework bad_iret
x86_64, traps: Stop using IST for #SS
x86_64, traps: Fix the espfix64 #DF fixup and rewrite it in C
MIPS: Loongson: Make platform serial setup always built-in.
MIPS: oprofile: Fix backtrace on 64-bit kernel
Linux 3.10.61
mm: memcg: handle non-error OOM situations more gracefully
mm: memcg: do not trap chargers with full callstack on OOM
mm: memcg: rework and document OOM waiting and wakeup
mm: memcg: enable memcg OOM killer only for user faults
x86: finish user fault error path with fatal signal
arch: mm: pass userspace fault flag to generic fault handler
arch: mm: do not invoke OOM killer on kernel fault OOM
arch: mm: remove obsolete init OOM protection
mm: invoke oom-killer from remaining unconverted page fault handlers
net: sctp: fix skb_over_panic when receiving malformed ASCONF chunks
net: sctp: fix panic on duplicate ASCONF chunks
net: sctp: fix remote memory pressure from excessive queueing
KVM: x86: Don't report guest userspace emulation error to userspace
SCSI: hpsa: fix a race in cmd_free/scsi_done
net/mlx4_en: Fix BlueFlame race
ARM: Correct BUG() assembly to ensure it is endian-agnostic
perf/x86/intel: Use proper dTLB-load-misses event on IvyBridge
mei: bus: fix possible boundaries violation
perf: Handle compat ioctl
MIPS: Fix forgotten preempt_enable() when CPU has inclusive pcaches
dell-wmi: Fix access out of memory
ARM: probes: fix instruction fetch order with <asm/opcodes.h>
br: fix use of ->rx_handler_data in code executed on non-rx_handler path
netfilter: nf_nat: fix oops on netns removal
netfilter: xt_bpf: add mising opaque struct sk_filter definition
netfilter: nf_log: release skbuff on nlmsg put failure
netfilter: nfnetlink_log: fix maximum packet length logged to userspace
netfilter: nf_log: account for size of NLMSG_DONE attribute
ipc: always handle a new value of auto_msgmni
clocksource: Remove "weak" from clocksource_default_clock() declaration
kgdb: Remove "weak" from kgdb_arch_pc() declaration
media: ttusb-dec: buffer overflow in ioctl
NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return
nfs: Fix use of uninitialized variable in nfs_getattr()
NFS: Don't try to reclaim delegation open state if recovery failed
NFSv4: Ensure that we remove NFSv4.0 delegations when state has expired
Input: alps - allow up to 2 invalid packets without resetting device
Input: alps - ignore potential bare packets when device is out of sync
dm raid: ensure superblock's size matches device's logical block size
dm btree: fix a recursion depth bug in btree walking code
block: Fix computation of merged request priority
parisc: Use compat layer for msgctl, shmat, shmctl and semtimedop syscalls
scsi: only re-lock door after EH on devices that were reset
nfs: fix pnfs direct write memory leak
firewire: cdev: prevent kernel stack leaking into ioctl arguments
arm64: __clear_user: handle exceptions on strb
ARM: 8198/1: make kuser helpers depend on MMU
drm/radeon: add missing crtc unlock when setting up the MC
mac80211: fix use-after-free in defragmentation
macvtap: Fix csum_start when VLAN tags are present
iwlwifi: configure the LTR
libceph: do not crash on large auth tickets
xtensa: re-wire umount syscall to sys_oldumount
ALSA: usb-audio: Fix memory leak in FTU quirk
ahci: disable MSI instead of NCQ on Samsung pci-e SSDs on macbooks
ahci: Add Device IDs for Intel Sunrise Point PCH
audit: keep inode pinned
x86, x32, audit: Fix x32's AUDIT_ARCH wrt audit
sparc32: Implement xchg and atomic_xchg using ATOMIC_HASH locks
sparc64: Do irq_{enter,exit}() around generic_smp_call_function*().
sparc64: Fix crashes in schizo_pcierr_intr_other().
sunvdc: don't call VD_OP_GET_VTOC
vio: fix reuse of vio_dring slot
sunvdc: limit each sg segment to a page
sunvdc: compute vdisk geometry from capacity
sunvdc: add cdrom and v1.1 protocol support
net: sctp: fix memory leak in auth key management
net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet
gre6: Move the setting of dev->iflink into the ndo_init functions.
ip6_tunnel: Use ip6_tnl_dev_init as the ndo_init function.
Linux 3.10.60
libceph: ceph-msgr workqueue needs a resque worker
Btrfs: fix kfree on list_head in btrfs_lookup_csums_range error cleanup
of: Fix overflow bug in string property parsing functions
sysfs: driver core: Fix glue dir race condition by gdp_mutex
i2c: at91: don't account as iowait
acer-wmi: Add acpi_backlight=video quirk for the Acer KAV80
rbd: Fix error recovery in rbd_obj_read_sync()
drm/radeon: remove invalid pci id
usb: gadget: udc: core: fix kernel oops with soft-connect
usb: gadget: function: acm: make f_acm pass USB20CV Chapter9
usb: dwc3: gadget: fix set_halt() bug with pending transfers
crypto: algif - avoid excessive use of socket buffer in skcipher
mm: Remove false WARN_ON from pagecache_isize_extended()
x86, apic: Handle a bad TSC more gracefully
posix-timers: Fix stack info leak in timer_create()
mac80211: fix typo in starting baserate for rts_cts_rate_idx
PM / Sleep: fix recovery during resuming from hibernation
tty: Fix high cpu load if tty is unreleaseable
quota: Properly return errors from dquot_writeback_dquots()
ext3: Don't check quota format when there are no quota files
nfsd4: fix crash on unknown operation number
cpc925_edac: Report UE events properly
e7xxx_edac: Report CE events properly
i3200_edac: Report CE events properly
i82860_edac: Report CE events properly
scsi: Fix error handling in SCSI_IOCTL_SEND_COMMAND
lib/bitmap.c: fix undefined shift in __bitmap_shift_{left|right}()
cgroup/kmemleak: add kmemleak_free() for cgroup deallocations.
usb: Do not allow usb_alloc_streams on unconfigured devices
USB: opticon: fix non-atomic allocation in write path
usb-storage: handle a skipped data phase
spi: pxa2xx: toggle clocks on suspend if not disabled by runtime PM
spi: pl022: Fix incorrect dma_unmap_sg
usb: dwc3: gadget: Properly initialize LINK TRB
wireless: rt2x00: add new rt2800usb device
USB: option: add Haier CE81B CDMA modem
usb: option: add support for Telit LE910
USB: cdc-acm: only raise DTR on transitions from B0
USB: cdc-acm: add device id for GW Instek AFG-2225
usb: serial: ftdi_sio: add "bricked" FTDI device PID
usb: serial: ftdi_sio: add Awinda Station and Dongle products
USB: serial: cp210x: add Silicon Labs 358x VID and PID
serial: Fix divide-by-zero fault in uart_get_divisor()
staging:iio:ade7758: Remove "raw" from channel name
staging:iio:ade7758: Fix check if channels are enabled in prenable
staging:iio:ade7758: Fix NULL pointer deref when enabling buffer
staging:iio:ad5933: Drop "raw" from channel names
staging:iio:ad5933: Fix NULL pointer deref when enabling buffer
OOM, PM: OOM killed task shouldn't escape PM suspend
freezer: Do not freeze tasks killed by OOM killer
ext4: fix oops when loading block bitmap failed
cpufreq: intel_pstate: Fix setting max_perf_pct in performance policy
ext4: fix overflow when updating superblock backups after resize
ext4: check s_chksum_driver when looking for bg csum presence
ext4: fix reservation overflow in ext4_da_write_begin
ext4: add ext4_iget_normal() which is to be used for dir tree lookups
ext4: grab missed write_count for EXT4_IOC_SWAP_BOOT
ext4: don't check quota format when there are no quota files
ext4: check EA value offset when loading
jbd2: free bh when descriptor block checksum fails
MIPS: tlbex: Properly fix HUGE TLB Refill exception handler
target: Fix APTPL metadata handling for dynamic MappedLUNs
target: Fix queue full status NULL pointer for SCF_TRANSPORT_TASK_SENSE
qla_target: don't delete changed nacls
ARC: Update order of registers in KGDB to match GDB 7.5
ARC: [nsimosci] Allow "headless" models to boot
KVM: x86: Emulator fixes for eip canonical checks on near branches
KVM: x86: Fix wrong masking on relative jump/call
kvm: x86: don't kill guest on unknown exit reason
KVM: x86: Check non-canonical addresses upon WRMSR
KVM: x86: Improve thread safety in pit
KVM: x86: Prevent host from panicking on shared MSR writes.
kvm: fix excessive pages un-pinning in kvm_iommu_map error path.
media: tda7432: Fix setting TDA7432_MUTE bit for TDA7432_RF register
media: ds3000: fix LNB supply voltage on Tevii S480 on initialization
media: em28xx-v4l: give back all active video buffers to the vb2 core properly on streaming stop
media: v4l2-common: fix overflow in v4l_bound_align_image()
drm/nouveau/bios: memset dcb struct to zero before parsing
drm/tilcdc: Fix the error path in tilcdc_load()
drm/ast: Fix HW cursor image
Input: i8042 - quirks for Fujitsu Lifebook A544 and Lifebook AH544
Input: i8042 - add noloop quirk for Asus X750LN
framebuffer: fix border color
modules, lock around setting of MODULE_STATE_UNFORMED
dm log userspace: fix memory leak in dm_ulog_tfr_init failure path
block: fix alignment_offset math that assumes io_min is a power-of-2
drbd: compute the end before rb_insert_augmented()
dm bufio: update last_accessed when relinking a buffer
virtio_pci: fix virtio spec compliance on restore
selinux: fix inode security list corruption
pstore: Fix duplicate {console,ftrace}-efi entries
mfd: rtsx_pcr: Fix MSI enable error handling
mnt: Prevent pivot_root from creating a loop in the mount tree
UBI: add missing kmem_cache_free() in process_pool_aeb error path
random: add and use memzero_explicit() for clearing data
crypto: more robust crypto_memneq
fix misuses of f_count() in ppp and netlink
kill wbuf_queued/wbuf_dwork_lock
ALSA: pcm: Zero-clear reserved fields of PCM status ioctl in compat mode
evm: check xattr value length and type in evm_inode_setxattr()
x86, pageattr: Prevent overflow in slow_virt_to_phys() for X86_PAE
x86_64, entry: Fix out of bounds read on sysenter
x86_64, entry: Filter RFLAGS.NT on entry from userspace
x86, flags: Rename X86_EFLAGS_BIT1 to X86_EFLAGS_FIXED
x86, fpu: shift drop_init_fpu() from save_xstate_sig() to handle_signal()
x86, fpu: __restore_xstate_sig()->math_state_restore() needs preempt_disable()
x86: Reject x32 executables if x32 ABI not supported
vfs: fix data corruption when blocksize < pagesize for mmaped data
UBIFS: fix free log space calculation
UBIFS: fix a race condition
UBIFS: remove mst_mutex
fs: Fix theoretical division by 0 in super_cache_scan().
fs: make cont_expand_zero interruptible
mmc: rtsx_pci_sdmmc: fix incorrect last byte in R2 response
libata-sff: Fix controllers with no ctl port
pata_serverworks: disable 64-KB DMA transfers on Broadcom OSB4 IDE Controller
Revert "percpu: free percpu allocation info for uniprocessor system"
lockd: Try to reconnect if statd has moved
drivers/net: macvtap and tun depend on INET
ipv4: dst_entry leak in ip_send_unicast_reply()
ax88179_178a: fix bonding failure
ipv4: fix nexthop attlen check in fib_nh_match
tracing/syscalls: Ignore numbers outside NR_syscalls' range
Linux 3.10.59
ecryptfs: avoid to access NULL pointer when write metadata in xattr
ARM: at91/PMC: don't forget to write PMC_PCDR register to disable clocks
ALSA: usb-audio: Add support for Steinberg UR22 USB interface
ALSA: emu10k1: Fix deadlock in synth voice lookup
ALSA: pcm: use the same dma mmap codepath both for arm and arm64
arm64: compat: fix compat types affecting struct compat_elf_prpsinfo
spi: dw-mid: terminate ongoing transfers at exit
kernel: add support for gcc 5
fanotify: enable close-on-exec on events' fd when requested in fanotify_init()
mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set
Bluetooth: Fix issue with USB suspend in btusb driver
Bluetooth: Fix HCI H5 corrupted ack value
rt2800: correct BBP1_TX_POWER_CTRL mask
PCI: Generate uppercase hex for modalias interface class
PCI: Increase IBM ipr SAS Crocodile BARs to at least system page size
iwlwifi: Add missing PCI IDs for the 7260 series
NFSv4.1: Fix an NFSv4.1 state renewal regression
NFSv4: fix open/lock state recovery error handling
NFSv4: Fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails
lzo: check for length overrun in variable length encoding.
Revert "lzo: properly check for overruns"
Documentation: lzo: document part of the encoding
m68k: Disable/restore interrupts in hwreg_present()/hwreg_write()
Drivers: hv: vmbus: Fix a bug in vmbus_open()
Drivers: hv: vmbus: Cleanup vmbus_establish_gpadl()
Drivers: hv: vmbus: Cleanup vmbus_teardown_gpadl()
Drivers: hv: vmbus: Cleanup vmbus_post_msg()
firmware_class: make sure fw requests contain a name
qla2xxx: Use correct offset to req-q-out for reserve calculation
mptfusion: enable no_write_same for vmware scsi disks
be2iscsi: check ip buffer before copying
regmap: fix NULL pointer dereference in _regmap_write/read
regmap: debugfs: fix possbile NULL pointer dereference
spi: dw-mid: check that DMA was inited before exit
spi: dw-mid: respect 8 bit mode
x86/intel/quark: Switch off CR4.PGE so TLB flush uses CR3 instead
kvm: don't take vcpu mutex for obviously invalid vcpu ioctls
KVM: s390: unintended fallthrough for external call
kvm: x86: fix stale mmio cache bug
fs: Add a missing permission check to do_umount
Btrfs: fix race in WAIT_SYNC ioctl
Btrfs: fix build_backref_tree issue with multiple shared blocks
Btrfs: try not to ENOSPC on log replay
Linux 3.10.58
USB: cp210x: add support for Seluxit USB dongle
USB: serial: cp210x: added Ketra N1 wireless interface support
USB: Add device quirk for ASUS T100 Base Station keyboard
ipv6: reallocate addrconf router for ipv6 address when lo device up
tcp: fixing TLP's FIN recovery
sctp: handle association restarts when the socket is closed.
ip6_gre: fix flowi6_proto value in xmit path
hyperv: Fix a bug in netvsc_start_xmit()
tg3: Allow for recieve of full-size 8021AD frames
tg3: Work around HW/FW limitations with vlan encapsulated frames
l2tp: fix race while getting PMTU on PPP pseudo-wire
openvswitch: fix panic with multiple vlan headers
packet: handle too big packets for PACKET_V3
tcp: fix tcp_release_cb() to dispatch via address family for mtu_reduced()
sit: Fix ipip6_tunnel_lookup device matching criteria
myri10ge: check for DMA mapping errors
Linux 3.10.57
cpufreq: ondemand: Change the calculation of target frequency
cpufreq: Fix wrong time unit conversion
nl80211: clear skb cb before passing to netlink
drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
jiffies: Fix timeval conversion to jiffies
md/raid5: disable 'DISCARD' by default due to safety concerns.
media: vb2: fix VBI/poll regression
mm: numa: Do not mark PTEs pte_numa when splitting huge pages
mm, thp: move invariant bug check out of loop in __split_huge_page_map
ring-buffer: Fix infinite spin in reading buffer
init/Kconfig: Fix HAVE_FUTEX_CMPXCHG to not break up the EXPERT menu
perf: fix perf bug in fork()
udf: Avoid infinite loop when processing indirect ICBs
Linux 3.10.56
vm_is_stack: use for_each_thread() rather then buggy while_each_thread()
oom_kill: add rcu_read_lock() into find_lock_task_mm()
oom_kill: has_intersects_mems_allowed() needs rcu_read_lock()
oom_kill: change oom_kill.c to use for_each_thread()
introduce for_each_thread() to replace the buggy while_each_thread()
kernel/fork.c:copy_process(): unify CLONE_THREAD-or-thread_group_leader code
arm: multi_v7_defconfig: Enable Zynq UART driver
ext2: Fix fs corruption in ext2_get_xip_mem()
serial: 8250_dma: check the result of TX buffer mapping
ARM: 7748/1: oabi: handle faults when loading swi instruction from userspace
netfilter: nf_conntrack: avoid large timeout for mid-stream pickup
PM / sleep: Use valid_state() for platform-dependent sleep states only
PM / sleep: Add state field to pm_states[] entries
ipvs: fix ipv6 hook registration for local replies
ipvs: Maintain all DSCP and ECN bits for ipv6 tun forwarding
ipvs: avoid netns exit crash on ip_vs_conn_drop_conntrack
md/raid1: fix_read_error should act on all non-faulty devices.
media: cx18: fix kernel oops with tda8290 tuner
Fix nasty 32-bit overflow bug in buffer i/o code.
perf kmem: Make it work again on non NUMA machines
perf: Fix a race condition in perf_remove_from_context()
alarmtimer: Lock k_itimer during timer callback
alarmtimer: Do not signal SIGEV_NONE timers
parisc: Only use -mfast-indirect-calls option for 32-bit kernel builds
powerpc/perf: Fix ABIv2 kernel backtraces
sched: Fix unreleased llc_shared_mask bit during CPU hotplug
ocfs2/dlm: do not get resource spinlock if lockres is new
nilfs2: fix data loss with mmap()
fs/notify: don't show f_handle if exportfs_encode_inode_fh failed
fsnotify/fdinfo: use named constants instead of hardcoded values
kcmp: fix standard comparison bug
Revert "mac80211: disable uAPSD if all ACs are under ACM"
usb: dwc3: core: fix ordering for PHY suspend
usb: dwc3: core: fix order of PM runtime calls
usb: host: xhci: fix compliance mode workaround
genhd: fix leftover might_sleep() in blk_free_devt()
lockd: fix rpcbind crash on lockd startup failure
rtlwifi: rtl8192cu: Add new ID
percpu: perform tlb flush after pcpu_map_pages() failure
percpu: fix pcpu_alloc_pages() failure path
percpu: free percpu allocation info for uniprocessor system
ata_piix: Add Device IDs for Intel 9 Series PCH
Input: i8042 - add nomux quirk for Avatar AVIU-145A6
Input: i8042 - add Fujitsu U574 to no_timeout dmi table
Input: atkbd - do not try 'deactivate' keyboard on any LG laptops
Input: elantech - fix detection of touchpad on ASUS s301l
Input: synaptics - add support for ForcePads
Input: serport - add compat handling for SPIOCSTYPE ioctl
dm crypt: fix access beyond the end of allocated space
block: Fix dev_t minor allocation lifetime
workqueue: apply __WQ_ORDERED to create_singlethread_workqueue()
Revert "iwlwifi: dvm: don't enable CTS to self"
SCSI: libiscsi: fix potential buffer overrun in __iscsi_conn_send_pdu
NFC: microread: Potential overflows in microread_target_discovered()
iscsi-target: Fix memory corruption in iscsit_logout_post_handler_diffcid
iscsi-target: avoid NULL pointer in iscsi_copy_param_list failure
Target/iser: Don't put isert_conn inside disconnected handler
Target/iser: Get isert_conn reference once got to connected_handler
iio:inkern: fix overwritten -EPROBE_DEFER in of_iio_channel_get_by_name
iio:magnetometer: bugfix magnetometers gain values
iio: adc: ad_sigma_delta: Fix indio_dev->trig assignment
iio: st_sensors: Fix indio_dev->trig assignment
iio: meter: ade7758: Fix indio_dev->trig assignment
iio: inv_mpu6050: Fix indio_dev->trig assignment
iio: gyro: itg3200: Fix indio_dev->trig assignment
iio:trigger: modify return value for iio_trigger_get
CIFS: Fix SMB2 readdir error handling
CIFS: Fix directory rename error
ASoC: davinci-mcasp: Correct rx format unit configuration
shmem: fix nlink for rename overwrite directory
x86 early_ioremap: Increase FIX_BTMAPS_SLOTS to 8
KVM: x86: handle idiv overflow at kvm_write_tsc
regmap: Fix handling of volatile registers for format_write() chips
ACPICA: Update to GPIO region handler interface.
MIPS: mcount: Adjust stack pointer for static trace in MIPS32
MIPS: ZBOOT: add missing <linux/string.h> include
ARM: 8165/1: alignment: don't break misaligned NEON load/store
ARM: 7897/1: kexec: Use the right ISA for relocate_new_kernel
ARM: 8133/1: use irq_set_affinity with force=false when migrating irqs
ARM: 8128/1: abort: don't clear the exclusive monitors
NFSv4: Fix another bug in the close/open_downgrade code
NFSv4: nfs4_state_manager() vs. nfs_server_remove_lists()
usb:hub set hub->change_bits when over-current happens
usb: dwc3: omap: fix ordering for runtime pm calls
USB: EHCI: unlink QHs even after the controller has stopped
USB: storage: Add quirks for Entrega/Xircom USB to SCSI converters
USB: storage: Add quirk for Ariston Technologies iConnect USB to SCSI adapter
USB: storage: Add quirk for Adaptec USBConnect 2000 USB-to-SCSI Adapter
storage: Add single-LUN quirk for Jaz USB Adapter
usb: hub: take hub->hdev reference when processing from eventlist
xhci: fix oops when xhci resumes from hibernate with hw lpm capable devices
xhci: Fix null pointer dereference if xhci initialization fails
USB: zte_ev: fix removed PIDs
USB: ftdi_sio: add support for NOVITUS Bono E thermal printer
USB: sierra: add 1199:68AA device ID
USB: sierra: avoid CDC class functions on "68A3" devices
USB: zte_ev: remove duplicate Qualcom PID
USB: zte_ev: remove duplicate Gobi PID
Revert "USB: option,zte_ev: move most ZTE CDMA devices to zte_ev"
USB: option: add VIA Telecom CDS7 chipset device id
USB: option: reduce interrupt-urb logging verbosity
USB: serial: fix potential heap buffer overflow
USB: sisusb: add device id for Magic Control USB video
USB: serial: fix potential stack buffer overflow
USB: serial: pl2303: add device id for ztek device
xtensa: fix a6 and a7 handling in fast_syscall_xtensa
xtensa: fix TLBTEMP_BASE_2 region handling in fast_second_level_miss
xtensa: fix access to THREAD_RA/THREAD_SP/THREAD_DS
xtensa: fix address checks in dma_{alloc,free}_coherent
xtensa: replace IOCTL code definitions with constants
drm/radeon: add connector quirk for fujitsu board
drm/vmwgfx: Fix a potential infinite spin waiting for fifo idle
drm/ast: AST2000 cannot be detected correctly
drm/i915: Wait for vblank before enabling the TV encoder
drm/i915: Remove bogus __init annotation from DMI callbacks
HID: logitech-dj: prevent false errors to be shown
HID: magicmouse: sanity check report size in raw_event() callback
HID: picolcd: sanity check report size in raw_event() callback
cfq-iosched: Fix wrong children_weight calculation
ALSA: pcm: fix fifo_size frame calculation
ALSA: hda - Fix invalid pin powermap without jack detection
ALSA: hda - Fix COEF setups for ALC1150 codec
ALSA: core: fix buffer overflow in snd_info_get_line()
arm64: ptrace: fix compat hardware watchpoint reporting
trace: Fix epoll hang when we race with new entries
i2c: at91: Fix a race condition during signal handling in at91_do_twi_xfer.
i2c: at91: add bound checking on SMBus block length bytes
arm64: flush TLS registers during exec
ibmveth: Fix endian issues with rx_no_buffer statistic
ahci: add pcid for Marvel 0x9182 controller
ahci: Add Device IDs for Intel 9 Series PCH
pata_scc: propagate return value of scc_wait_after_reset
drm/i915: read HEAD register back in init_ring_common() to enforce ordering
drm/radeon: load the lm63 driver for an lm64 thermal chip.
drm/ttm: Choose a pool to shrink correctly in ttm_dma_pool_shrink_scan().
drm/ttm: Fix possible division by 0 in ttm_dma_pool_shrink_scan().
drm/tilcdc: fix double kfree
drm/tilcdc: fix release order on exit
drm/tilcdc: panel: fix leak when unloading the module
drm/tilcdc: tfp410: fix dangling sysfs connector node
drm/tilcdc: slave: fix dangling sysfs connector node
drm/tilcdc: panel: fix dangling sysfs connector node
carl9170: fix sending URBs with wrong type when using full-speed
Linux 3.10.55
libceph: gracefully handle large reply messages from the mon
libceph: rename ceph_msg::front_max to front_alloc_len
tpm: Provide a generic means to override the chip returned timeouts
vfs: fix bad hashing of dentries
dcache.c: get rid of pointless macros
IB/srp: Fix deadlock between host removal and multipathd
blkcg: don't call into policy draining if root_blkg is already gone
mtd: nand: omap: Fix 1-bit Hamming code scheme, omap_calculate_ecc()
mtd/ftl: fix the double free of the buffers allocated in build_maps()
CIFS: Fix wrong restart readdir for SMB1
CIFS: Fix wrong filename length for SMB2
CIFS: Fix wrong directory attributes after rename
CIFS: Possible null ptr deref in SMB2_tcon
CIFS: Fix async reading on reconnects
CIFS: Fix STATUS_CANNOT_DELETE error mapping for SMB2
libceph: do not hard code max auth ticket len
libceph: add process_one_ticket() helper
libceph: set last_piece in ceph_msg_data_pages_cursor_init() correctly
md/raid1,raid10: always abort recover on write error.
xfs: don't zero partial page cache pages during O_DIRECT writes
xfs: don't zero partial page cache pages during O_DIRECT writes
xfs: don't dirty buffers beyond EOF
xfs: quotacheck leaves dquot buffers without verifiers
RDMA/iwcm: Use a default listen backlog if needed
md/raid10: Fix memory leak when raid10 reshape completes.
md/raid10: fix memory leak when reshaping a RAID10.
md/raid6: avoid data corruption during recovery of double-degraded RAID6
Bluetooth: Avoid use of session socket after the session gets freed
Bluetooth: never linger on process exit
mnt: Add tests for unprivileged remount cases that have found to be faulty
mnt: Change the default remount atime from relatime to the existing value
mnt: Correct permission checks in do_remount
mnt: Move the test for MNT_LOCK_READONLY from change_mount_flags into do_remount
mnt: Only change user settable mount flags in remount
ring-buffer: Up rb_iter_peek() loop count to 3
ring-buffer: Always reset iterator to reader page
ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock
ACPI: Run fixed event device notifications in process context
ACPICA: Utilities: Fix memory leak in acpi_ut_copy_iobject_to_iobject
bfa: Fix undefined bit shift on big-endian architectures with 32-bit DMA address
ASoC: pxa-ssp: drop SNDRV_PCM_FMTBIT_S24_LE
ASoC: max98090: Fix missing free_irq
ASoC: samsung: Correct I2S DAI suspend/resume ops
ASoC: wm_adsp: Add missing MODULE_LICENSE
ASoC: pcm: fix dpcm_path_put in dpcm runtime update
openrisc: Rework signal handling
MIPS: Fix accessing to per-cpu data when flushing the cache
MIPS: OCTEON: make get_system_type() thread-safe
MIPS: asm: thread_info: Add _TIF_SECCOMP flag
MIPS: Cleanup flags in syscall flags handlers.
MIPS: asm/reg.h: Make 32- and 64-bit definitions available at the same time
MIPS: Remove BUG_ON(!is_fpu_owner()) in do_ade()
MIPS: tlbex: Fix a missing statement for HUGETLB
MIPS: Prevent user from setting FCSR cause bits
MIPS: GIC: Prevent array overrun
drivers: scsi: storvsc: Correctly handle TEST_UNIT_READY failure
Drivers: scsi: storvsc: Implement a eh_timed_out handler
powerpc/pseries: Failure on removing device node
powerpc/mm: Use read barrier when creating real_pte
powerpc/mm/numa: Fix break placement
regulator: arizona-ldo1: remove bypass functionality
mfd: omap-usb-host: Fix improper mask use.
kernel/smp.c:on_each_cpu_cond(): fix warning in fallback path
CAPABILITIES: remove undefined caps from all processes
tpm: missing tpm_chip_put in tpm_get_random()
firmware: Do not use WARN_ON(!spin_is_locked())
spi: omap2-mcspi: Configure hardware when slave driver changes mode
spi: orion: fix incorrect handling of cell-index DT property
iommu/amd: Fix cleanup_domain for mass device removal
media: media-device: Remove duplicated memset() in media_enum_entities()
media: au0828: Only alt setting logic when needed
media: xc4000: Fix get_frequency()
media: xc5000: Fix get_frequency()
Linux 3.10.54
USB: fix build error with CONFIG_PM_RUNTIME disabled
NFSv4: Fix problems with close in the presence of a delegation
NFSv3: Fix another acl regression
svcrdma: Select NFSv4.1 backchannel transport based on forward channel
NFSD: Decrease nfsd_users in nfsd_startup_generic fail
usb: hub: Prevent hub autosuspend if usbcore.autosuspend is -1
USB: whiteheat: Added bounds checking for bulk command response
USB: ftdi_sio: Added PID for new ekey device
USB: ftdi_sio: add Basic Micro ATOM Nano USB2Serial PID
ARM: OMAP2+: hwmod: Rearm wake-up interrupts for DT when MUSB is idled
usb: xhci: amd chipset also needs short TX quirk
xhci: Treat not finding the event_seg on COMP_STOP the same as COMP_STOP_INVAL
Staging: speakup: Update __speakup_paste_selection() tty (ab)usage to match vt
jbd2: fix infinite loop when recovering corrupt journal blocks
mei: nfc: fix memory leak in error path
mei: reset client state on queued connect request
Btrfs: fix csum tree corruption, duplicate and outdated checksums
hpsa: fix bad -ENOMEM return value in hpsa_big_passthru_ioctl
x86/efi: Enforce CONFIG_RELOCATABLE for EFI boot stub
x86_64/vsyscall: Fix warn_bad_vsyscall log output
x86: don't exclude low BIOS area when allocating address space for non-PCI cards
drm/radeon: add additional SI pci ids
ext4: fix BUG_ON in mb_free_blocks()
kvm: iommu: fix the third parameter of kvm_iommu_put_pages (CVE-2014-3601)
Revert "KVM: x86: Increase the number of fixed MTRR regs to 10"
KVM: nVMX: fix "acknowledge interrupt on exit" when APICv is in use
KVM: x86: always exit on EOIs for interrupts listed in the IOAPIC redir table
KVM: x86: Inter-privilege level ret emulation is not implemeneted
crypto: ux500 - make interrupt mode plausible
serial: core: Preserve termios c_cflag for console resume
ext4: fix ext4_discard_allocated_blocks() if we can't allocate the pa struct
drivers/i2c/busses: use correct type for dma_map/unmap
hwmon: (dme1737) Prevent overflow problem when writing large limits
hwmon: (ads1015) Fix out-of-bounds array access
hwmon: (lm85) Fix various errors on attribute writes
hwmon: (ads1015) Fix off-by-one for valid channel index checking
hwmon: (gpio-fan) Prevent overflow problem when writing large limits
hwmon: (lm78) Fix overflow problems seen when writing large temperature limits
hwmon: (sis5595) Prevent overflow problem when writing large limits
drm: omapdrm: fix compiler errors
ARM: OMAP3: Fix choice of omap3_restore_es function in OMAP34XX rev3.1.2 case.
mei: start disconnect request timer consistently
ALSA: hda/realtek - Avoid setting wrong COEF on ALC269 & co
ALSA: hda/ca0132 - Don't try loading firmware at resume when already failed
ALSA: virtuoso: add Xonar Essence STX II support
ALSA: hda - fix an external mic jack problem on a HP machine
USB: Fix persist resume of some SS USB devices
USB: ehci-pci: USB host controller support for Intel Quark X1000
USB: serial: ftdi_sio: Add support for new Xsens devices
USB: serial: ftdi_sio: Annotate the current Xsens PID assignments
USB: OHCI: don't lose track of EDs when a controller dies
isofs: Fix unbounded recursion when processing relocated directories
HID: fix a couple of off-by-ones
HID: logitech: perform bounds checking on device_id early enough
stable_kernel_rules: Add pointer to netdev-FAQ for network patches
Linux 3.10.53
arch/sparc/math-emu/math_32.c: drop stray break operator
sparc64: ldc_connect() should not return EINVAL when handshake is in progress.
sunsab: Fix detection of BREAK on sunsab serial console
bbc-i2c: Fix BBC I2C envctrl on SunBlade 2000
sparc64: Guard against flushing openfirmware mappings.
sparc64: Do not insert non-valid PTEs into the TSB hash table.
sparc64: Add membar to Niagara2 memcpy code.
sparc64: Fix huge TSB mapping on pre-UltraSPARC-III cpus.
sparc64: Don't bark so loudly about 32-bit tasks generating 64-bit fault addresses.
sparc64: Fix top-level fault handling bugs.
sparc64: Handle 32-bit tasks properly in compute_effective_address().
sparc64: Make itc_sync_lock raw
sparc64: Fix argument sign extension for compat_sys_futex().
sctp: fix possible seqlock seadlock in sctp_packet_transmit()
iovec: make sure the caller actually wants anything in memcpy_fromiovecend
net: Correctly set segment mac_len in skb_segment().
macvlan: Initialize vlan_features to turn on offload support.
net: sctp: inherit auth_capable on INIT collisions
tcp: Fix integer-overflow in TCP vegas
tcp: Fix integer-overflows in TCP veno
net: sendmsg: fix NULL pointer dereference
ip: make IP identifiers less predictable
inetpeer: get rid of ip_id_count
bnx2x: fix crash during TSO tunneling
Linux 3.10.52
x86/espfix/xen: Fix allocation of pages for paravirt page tables
lib/btree.c: fix leak of whole btree nodes
net/l2tp: don't fall back on UDP [get|set]sockopt
net: mvneta: replace Tx timer with a real interrupt
net: mvneta: add missing bit descriptions for interrupt masks and causes
net: mvneta: do not schedule in mvneta_tx_timeout
net: mvneta: use per_cpu stats to fix an SMP lock up
net: mvneta: increase the 64-bit rx/tx stats out of the hot path
Revert "mac80211: move "bufferable MMPDU" check to fix AP mode scan"
staging: vt6655: Fix Warning on boot handle_irq_event_percpu.
x86_64/entry/xen: Do not invoke espfix64 on Xen
x86, espfix: Make it possible to disable 16-bit support
x86, espfix: Make espfix64 a Kconfig option, fix UML
x86, espfix: Fix broken header guard
x86, espfix: Move espfix definitions into a separate header file
x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack
Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option"
timer: Fix lock inversion between hrtimer_bases.lock and scheduler locks
printk: rename printk_sched to printk_deferred
iio: buffer: Fix demux table creation
staging: vt6655: Fix disassociated messages every 10 seconds
mm, thp: do not allow thp faults to avoid cpuset restrictions
scsi: handle flush errors properly
rapidio/tsi721_dma: fix failure to obtain transaction descriptor
cfg80211: fix mic_failure tracing
ARM: 8115/1: LPAE: reduce damage caused by idmap to virtual memory layout
crypto: af_alg - properly label AF_ALG socket
Linux 3.10.51
core, nfqueue, openvswitch: Orphan frags in skb_zerocopy and handle errors
x86/efi: Include a .bss section within the PE/COFF headers
s390/ptrace: fix PSW mask check
Fix gcc-4.9.0 miscompilation of load_balance() in scheduler
mm: hugetlb: fix copy_hugetlb_page_range()
x86_32, entry: Store badsys error code in %eax
hwmon: (smsc47m192) Fix temperature limit and vrm write operations
parisc: Remove SA_RESTORER define
coredump: fix the setting of PF_DUMPCORE
Input: fix defuzzing logic
slab_common: fix the check for duplicate slab names
slab_common: Do not check for duplicate slab names
tracing: Fix wraparound problems in "uptime" trace clock
blkcg: don't call into policy draining if root_blkg is already gone
ahci: add support for the Promise FastTrak TX8660 SATA HBA (ahci mode)
libata: introduce ata_host->n_tags to avoid oops on SAS controllers
libata: support the ata host which implements a queue depth less than 32
block: don't assume last put of shared tags is for the host
block: provide compat ioctl for BLKZEROOUT
media: tda10071: force modulation to QPSK on DVB-S
media: hdpvr: fix two audio bugs
Linux 3.10.50
ARC: Implement ptrace(PTRACE_GET_THREAD_AREA)
sched: Fix possible divide by zero in avg_atom() calculation
locking/mutex: Disable optimistic spinning on some architectures
PM / sleep: Fix request_firmware() error at resume
dm cache metadata: do not allow the data block size to change
dm thin metadata: do not allow the data block size to change
alarmtimer: Fix bug where relative alarm timers were treated as absolute
drm/radeon: avoid leaking edid data
drm/qxl: return IRQ_NONE if it was not our irq
drm/radeon: set default bl level to something reasonable
irqchip: gic: Fix core ID calculation when topology is read from DT
irqchip: gic: Add support for cortex a7 compatible string
ring-buffer: Fix polling on trace_pipe
mwifiex: fix Tx timeout issue
perf/x86/intel: ignore CondChgd bit to avoid false NMI handling
ipv4: fix buffer overflow in ip_options_compile()
dns_resolver: Null-terminate the right string
dns_resolver: assure that dns_query() result is null-terminated
sunvnet: clean up objects created in vnet_new() on vnet_exit()
net: pppoe: use correct channel MTU when using Multilink PPP
net: sctp: fix information leaks in ulpevent layer
tipc: clear 'next'-pointer of message fragments before reassembly
be2net: set EQ DB clear-intr bit in be_open()
netlink: Fix handling of error from netlink_dump().
net: mvneta: Fix big endian issue in mvneta_txq_desc_csum()
net: mvneta: fix operation in 10 Mbit/s mode
appletalk: Fix socket referencing in skb
tcp: fix false undo corner cases
igmp: fix the problem when mc leave group
net: qmi_wwan: add two Sierra Wireless/Netgear devices
net: qmi_wwan: Add ID for Telewell TW-LTE 4G v2
ipv4: icmp: Fix pMTU handling for rare case
tcp: Fix divide by zero when pushing during tcp-repair
bnx2x: fix possible panic under memory stress
net: fix sparse warning in sk_dst_set()
ipv4: irq safe sk_dst_[re]set() and ipv4_sk_update_pmtu() fix
ipv4: fix dst race in sk_dst_get()
8021q: fix a potential memory leak
net: sctp: check proc_dointvec result in proc_sctp_do_auth
tcp: fix tcp_match_skb_to_sack() for unaligned SACK at end of an skb
ip_tunnel: fix ip_tunnel_lookup
shmem: fix splicing from a hole while it's punched
shmem: fix faulting into a hole, not taking i_mutex
shmem: fix faulting into a hole while it's punched
iwlwifi: dvm: don't enable CTS to self
igb: do a reset on SR-IOV re-init if device is down
hwmon: (adt7470) Fix writes to temperature limit registers
hwmon: (da9052) Don't use dash in the name attribute
hwmon: (da9055) Don't use dash in the name attribute
tracing: Add ftrace_trace_stack into __trace_puts/__trace_bputs
tracing: Fix graph tracer with stack tracer on other archs
fuse: handle large user and group ID
Bluetooth: Ignore H5 non-link packets in non-active state
Drivers: hv: util: Fix a bug in the KVP code
media: gspca_pac7302: Add new usb-id for Genius i-Look 317
usb: Check if port status is equal to RxDetect
Signed-off-by: Ian Maund <imaund@codeaurora.org>
commit c72efb658f7c8b27ca3d0efb5cfd5ded9fcac89e upstream.
From 1ebf33901ecc75d9496862dceb1ef0377980587c Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Mon, 23 Mar 2015 00:08:19 -0400
2f800fbd77 ("writeback: fix dirtied pages accounting on redirty")
introduced account_page_redirty() which reverts stat updates for a
redirtied page, making BDI_DIRTIED no longer monotonically increasing.
bdi_update_write_bandwidth() uses the delta in BDI_DIRTIED as the
basis for bandwidth calculation. While unlikely, since the above
patch, the newer value may be lower than the recorded past value and
underflow the bandwidth calculation leading to a wild result.
Fix it by subtracing min of the old and new values when calculating
delta. AFAIK, there hasn't been any report of it happening but the
resulting erratic behavior would be non-critical and temporary, so
it's possible that the issue is happening without being reported. The
risk of the fix is very low, so tagged for -stable.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Fixes: 2f800fbd77 ("writeback: fix dirtied pages accounting on redirty")
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d70e15480c0450d2bfafaad338a32e884fc215e upstream.
global_update_bandwidth() uses static variable update_time as the
timestamp for the last update but forgets to initialize it to
INITIALIZE_JIFFIES.
This means that global_dirty_limit will be 5 mins into the future on
32bit and some large amount jiffies into the past on 64bit. This
isn't critical as the only effect is that global_dirty_limit won't be
updated for the first 5 mins after booting on 32bit machines,
especially given the auxiliary nature of global_dirty_limit's role -
protecting against global dirty threshold's sudden dips; however, it
does lead to unintended suboptimal behavior. Fix it.
Fixes: c42843f2f0 ("writeback: introduce smoothed global dirty limit")
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With this patch, anon pages of incative tasks can be reclaimed,
depending on memory pressure. Memory pressure is detected
using vmpressure events. 'N' best tasks in terms of anon
size is selected and pages proportional to their tasksize
is reclaimed. The total number of pages reclaimed at each
run of the swap work, can be tuned from userspace, the
default being SWAP_CLUSTER_MAX * 32.
The patch also adds tracepoints to debug and tune the
feature.
echo 1 > /sys/module/process_reclaim/parameters/enable_process_reclaim
to enable the feature.
echo <pages> > /sys/module/process_reclaim/parameters/per_swap_size,
to set the number of pages reclaimed in each scan.
/sys/module/process_reclaim/parameters/reclaim_avg_efficiency, provides
the average efficiency (scan to reclaim ratio) of the algorithm.
/sys/module/process_reclaim/parameters/swap_eff_win, to set the window
period (in unit of number of times reclaim is triggered) to detect
low efficiency runs.
/sys/module/process_reclaim/parameters/swap_opt_eff, to set the optimal
efficiency threshold for low efficiency detection.
Change-Id: I895986f10c997d1715761eaaadc4bbbee60db9d2
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
This patch adds address range reclaim of a process.
The requirement is following as,
Like webkit1, it uses a address space for handling multi tabs.
IOW, it uses *one* process model so all tabs shares address space
of the process. In such scenario, per-process reclaim is rather
coarse-grained so this patch supports more fine-grained reclaim
for being able to reclaim target address range of the process.
For reclaim target range, you should use following format.
echo [addr] [size-byte] > /proc/pid/reclaim
The addr should be page-aligned.
So now reclaim konb's interface is following as.
echo file > /proc/pid/reclaim
reclaim file-backed pages only
echo anon > /proc/pid/reclaim
reclaim anonymous pages only
echo all > /proc/pid/reclaim
reclaim all pages
echo 0x100000 8K > /proc/pid/reclaim
reclaim pages in (0x100000 - 0x102000)
Change-Id: I111131d31be1cfcfa246617b634a9a8bc4078098
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 08:39:01
[vinmenon@codeaurora.org: trivial merge conflict fixes]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Some pages could be shared by several processes. (ex, libc)
In case of that, it's too bad to reclaim them from the beginnig.
This patch causes VM to keep them on memory until last task
try to reclaim them so shared pages will be reclaimed only if
all of task has gone swapping out.
This feature doesn't handle non-linear mapping on ramfs because
it's very time-consuming and doesn't make sure of reclaiming and
not common.
Change-Id: I7e5f34f2e947f5db6d405867fe2ad34863ca40f7
Signed-off-by: Sangseok Lee <sangseok.lee@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:27
[vinmenon@codeaurora.org: trivial merge conflict fixes]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Shrink_page_list expects all pages come from a same zone
but it's too limited to use.
This patch removes the dependency so next patch can use
shrink_page_list with pages from multiple zones.
Change-Id: I34469b7f0a79f2b79e30e40033ba8b3e1dd5f2d0
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:25
[vinmenon@codeaurora.org: trivial merge conflict fixes]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
These day, there are many platforms avaiable in the embedded market
and they are smarter than kernel which has very limited information
about working set so they want to involve memory management more heavily
like android's lowmemory killer and ashmem or recent many lowmemory
notifier(there was several trial for various company NOKIA, SAMSUNG,
Linaro, Google ChromeOS, Redhat).
One of the simple imagine scenario about userspace's intelligence is that
platform can manage tasks as forground and backgroud so it would be
better to reclaim background's task pages for end-user's *responsibility*
although it has frequent referenced pages.
This patch adds new knob "reclaim under proc/<pid>/" so task manager
can reclaim any target process anytime, anywhere. It could give another
method to platform for using memory efficiently.
It can avoid process killing for getting free memory, which was really
terrible experience because I lost my best score of game I had ever
after I switch the phone call while I enjoyed the game.
Reclaim file-backed pages only.
echo file > /proc/PID/reclaim
Reclaim anonymous pages only.
echo anon > /proc/PID/reclaim
Reclaim all pages
echo all > /proc/PID/reclaim
Change-Id: Iabdb7bc2ef3dc4d94e3ea005fbe18f4cd06739ab
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:24
[vinmenon@codeaurora.org: trivial merge conflict fixes]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Now, local variable references in shrink_page_list is
PAGEREF_RECLAIM_CLEAN as default. It is for preventing to reclaim
dirty pages when CMA try to migrate pages.
Strictly speaking, we don't need it because CMA already didn't allow
to write out by .may_writepage = 0 in reclaim_clean_pages_from_list.
Morever, it has a problem to prevent anonymous pages's swap out when
we use force_reclaim = true in shrink_page_list(ex, per process reclaim
can do it)
So this patch makes references's default value to PAGEREF_RECLAIM
and declare .may_writepage = 0 of scan_control in CMA part to make
code more clear.
Change-Id: I5edc3c955d106ecebc4949ce27daf5b7b7a18089
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mgorman@suse.de>
Reported-by: Minkyung Kim <minkyung88@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:23
[vinmenon@codeaurora.org: trivial merge conflict fixes]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
The existing calculation of vmpressure takes into account only
the ratio of reclaimed to scanned pages, but not the time spent
or the difficulty in reclaiming those pages. For e.g. when there
are quite a number of file pages in the system, an allocation
request can be satisfied by reclaiming the file pages alone. If
such a reclaim is succesful, the vmpressure value will remain low
irrespective of the time spent by the reclaim code to free up the
file pages. With a feature like lowmemorykiller, killing a task
can be faster than reclaiming the file pages alone. So if the
vmpressure values reflect the reclaim difficulty level, clients
can make a decision based on that, for e.g. to kill a task early.
This patch monitors the number of pages scanned in the direct
reclaim path and scales the vmpressure level according to that.
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Change-Id: I6e643d29a9a1aa0814309253a8b690ad86ec0b13
Since memtest might be used by other architectures pass input parameters
as phys_addr_t instead of long to prevent overflow.
Change-Id: If189b91fb308315369631a5016ca6eda92ca13ab
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Patch-mainline: linux-arm-kernel @ 03/09/15, 10:27
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org>
There is nothing platform dependent in the core memtest code, so other platform
might benefit of this feature too.
Change-Id: I2f1fca080cffe1d887fe724885e337e7117482d8
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Patch-mainline: linux-arm-kernel @ 03/09/15, 10:27
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org>
This reverts commit 1ba8ad85c1.
This patch is one among the 6 patches that were initially picked
to fix an issue where tasks were getting blocked in reclaim path.
But these patches are found to cause cpu wakeups.
Change-Id: I7fca0bd378d6183dfc0f8bc302397d03d04fe865
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
This reverts commit 37b13061d3.
This patch is one among the 6 patches that were initially picked
to fix an issue where tasks were getting blocked in reclaim path.
But these patches are found to cause cpu wakeups.
Change-Id: I46f1e9fa0921edbccbf9625f82ba8d6506094d52
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
This reverts commit 47f7dcdd58.
This patch is one among the 6 patches that were initially picked
to fix an issue where tasks were getting blocked in reclaim path.
But these patches are found to cause cpu wakeups.
Change-Id: I9469c1ce4ff7db60749cf8fd62567c9ad3a2ef97
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
This reverts commit d75907c9fa.
This patch is one among the 6 patches that were initially picked
to fix an issue where tasks were getting blocked in reclaim path.
But these patches are found to cause cpu wakeups.
Change-Id: I4c76dff00e5728f16a8fb0bda2529fd2bfd837d7
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
This reverts commit ae5ffa6b56.
This patch is one among the 6 patches that were initially picked
to fix an issue where tasks were getting blocked in reclaim path.
But these patches are found to cause cpu wakeups.
Change-Id: I8c1133552fb4fa0dfd19e98fc6cf8040c1fd8472
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
This reverts commit f2d841b160.
This patch is one among the 6 patches that were initially picked
to fix an issue where tasks were getting blocked in reclaim path.
But these patches are found to cause cpu wakeups.
Change-Id: Id8fd4fd39e76faf1f3ae0825dc77cfbd4b5a8670
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
It was found that a number of tasks were blocked in the reclaim path
(throttle_vm_writeout) for seconds, because of vmstat_diff not being
synced in time. Fix that by adding a new function
global_page_state_snapshot.
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Change-Id: Iec167635ad724a55c27bdbd49eb8686e7857216c
Commit "mm: vmscan: fix the page state calculation in too_many_isolated"
fixed an issue where a number of tasks were blocked in reclaim path
for seconds, because of vmstat_diff not being synced in time.
A similar problem can happen in isolate_migratepages_block, where
similar calculation is performed. This patch fixes that.
Change-Id: Ie74f108ef770da688017b515fe37faea6f384589
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Use the 'allow_attach' handler for the 'mem' cgroup to allow
non-root processes to add arbitrary processes to a 'mem' cgroup
if it has the CAP_SYS_NICE capability set.
Bug: 18260435
Change-Id: If7d37bf90c1544024c4db53351adba6a64966250
Signed-off-by: Rom Lemarchand <romlem@android.com>
Git-commit: cce78bc02ff0ea2d21e88e3438d65272b898aa35
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Commit d1037ba0b8 (mm/page_alloc: restrict max order of
merging on isolated pageblock) changed the logic of unset_migratetype_isolate
to check the buddy allocator and explicitly call __free_pages to
merge. The page that is being freed in this path never had prep_new_page
called so set_page_refcounted is called explicitly but there is
no call to kernel_map_pages. With the default kernel_map_pages this
is mostly harmless but if kernel_map_pages does any manipulation
of the page tables (unmapping or setting pages to read only) this
may trigger a fault:
alloc_contig_range test_pages_isolated(ceb00, ced00) failed
Unable to handle kernel paging request at virtual address ffffffc0cec00000
pgd = ffffffc045fc4000
[ffffffc0cec00000] *pgd=0000000000000000
Internal error: Oops: 9600004f [#1] PREEMPT SMP
Modules linked in: exfatfs
CPU: 1 PID: 23237 Comm: TimedEventQueue Not tainted 3.10.49-gc72ad36-dirty #1
task: ffffffc03de52100 ti: ffffffc015388000 task.ti: ffffffc015388000
PC is at memset+0xc8/0x1c0
LR is at kernel_map_pages+0x1ec/0x244
Fix this by calling kernel_map_pages to ensure the page is set in the
page table properly
Change-Id: Ie0c7f38fce24683b6ddebf95874be662ef25021b
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Add page owner support for dma allocations.
CRs-Fixed: 809977
Change-Id: I0d5e785b9bf29a99c263d7f90bc80ab26f7b2ff5
Signed-off-by: Liam Mark <lmark@codeaurora.org>
commit 9cb12d7b4ccaa976f97ce0c5fd0f1b6a83bc2a75 upstream.
For whatever reason, generic_access_phys() only remaps one page, but
actually allows to access arbitrary size. It's quite easy to trigger
large reads, like printing out large structure with gdb, which leads to a
crash. Fix it by remapping correct size.
Fixes: 28b2ee20c7 ("access_process_vm device memory infrastructure")
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 372549c2a3778fd3df445819811c944ad54609ca upstream.
What we want to check here is whether there is highorder freepage in buddy
list of other migratetype in order to steal it without fragmentation.
But, current code just checks cc->order which means allocation request
order. So, this is wrong.
Without this fix, non-movable synchronous compaction below pageblock order
would not stopped until compaction is complete, because migratetype of
most pageblocks are movable and high order freepage made by compaction is
usually on movable type buddy list.
There is some report related to this bug. See below link.
http://www.spinics.net/lists/linux-mm/msg81666.html
Although the issued system still has load spike comes from compaction,
this makes that system completely stable and responsive according to his
report.
stress-highalloc test in mmtests with non movable order 7 allocation
doesn't show any notable difference in allocation success rate, but, it
shows more compaction success rate.
Compaction success rate (Compaction success * 100 / Compaction stalls, %)
18.47 : 28.94
Fixes: 1fb3f8ca0e ("mm: compaction: capture a suitable high-order page immediately when it is made available")
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8138a67a5557ffea3a21dfd6f037842d4e748513 upstream.
I noticed that "allowed" can easily overflow by falling below 0, because
(total_vm / 32) can be larger than "allowed". The problem occurs in
OVERCOMMIT_NONE mode.
In this case, a huge allocation can success and overcommit the system
(despite OVERCOMMIT_NONE mode). All subsequent allocations will fall
(system-wide), so system become unusable.
The problem was masked out by commit c9b1d0981f
("mm: limit growth of 3% hardcoded other user reserve"),
but it's easy to reproduce it on older kernels:
1) set overcommit_memory sysctl to 2
2) mmap() large file multiple times (with VM_SHARED flag)
3) try to malloc() large amount of memory
It also can be reproduced on newer kernels, but miss-configured
sysctl_user_reserve_kbytes is required.
Fix this issue by switching to signed arithmetic here.
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Cc: Andrew Shewmaker <agshew@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5703b087dc8eaf47bfb399d6cf512d471beff405 upstream.
I noticed, that "allowed" can easily overflow by falling below 0,
because (total_vm / 32) can be larger than "allowed". The problem
occurs in OVERCOMMIT_NONE mode.
In this case, a huge allocation can success and overcommit the system
(despite OVERCOMMIT_NONE mode). All subsequent allocations will fall
(system-wide), so system become unusable.
The problem was masked out by commit c9b1d0981f
("mm: limit growth of 3% hardcoded other user reserve"),
but it's easy to reproduce it on older kernels:
1) set overcommit_memory sysctl to 2
2) mmap() large file multiple times (with VM_SHARED flag)
3) try to malloc() large amount of memory
It also can be reproduced on newer kernels, but miss-configured
sysctl_user_reserve_kbytes is required.
Fix this issue by switching to signed arithmetic here.
[akpm@linux-foundation.org: use min_t]
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Cc: Andrew Shewmaker <agshew@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9fbc1f635fd0bd28cb32550211bf095753ac637a upstream.
If __unmap_hugepage_range() tries to unmap the address range over which
hugepage migration is on the way, we get the wrong page because pte_page()
doesn't work for migration entries. This patch simply clears the pte for
migration entries as we do for hwpoison entries.
Fixes: 290408d4a2 ("hugetlb: hugepage migration core")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently if kmemleak is disabled, the kmemleak objects can never be
freed, no matter if it's disabled by a user or due to fatal errors.
Those objects can be a big waste of memory.
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
1200264 1197433 99% 0.30K 46164 26 369312K kmemleak_object
With this patch, after kmemleak was disabled you can reclaim memory
with:
# echo clear > /sys/kernel/debug/kmemleak
Also inform users about this with a printk.
Change-Id: I13fdd5f0439bf6bc23824d37a44813eb0fe6a392
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: c89da70c7360294e715df5abd4b7239db3274c86
Git-repo: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/
Signed-off-by: Vignesh Radhakrishnan <vigneshr@codeaurora.org>
Currently if you stop kmemleak thread before disabling kmemleak,
kmemleak objects will be freed and so you won't be able to check
previously reported leaks.
With this patch, kmemleak objects won't be freed if there're leaks that
can be reported.
Change-Id: I31c837fa63f99f65a553471de46729d8d8e08ed5
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: dc9b3f424903f7d6992778b69b1e35d864914ae5
Git-repo: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/
Signed-off-by: Vignesh Radhakrishnan <vigneshr@codeaurora.org>
Commit 248ac0e194 ("mm/vmalloc: remove guard page from between vmap
blocks") had the side effect of making vmap_area.va_end member point to
the next vmap_area.va_start. This was creating an artificial reference
to vmalloc'ed objects and kmemleak was rarely reporting vmalloc() leaks.
This patch marks the vmap_area containing pointers explicitly and
reduces the min ref_count to 2 as vm_struct still contains a reference
to the vmalloc'ed object. The kmemleak add_scan_area() function has
been improved to allow a SIZE_MAX argument covering the rest of the
object (for simpler calling sites).
Change-Id: I237dd983d000c956ede36b5492dfe046a3060ce4
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 7f88f88f83ed609650a01b18572e605ea50cd163
Git-repo: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/
Signed-off-by: Vignesh Radhakrishnan <vigneshr@codeaurora.org>
It was noted that the vm stat shepherd runs every 2 seconds and that the
vmstat update is then scheduled 2 seconds in the future.
This yields an interval of double the time interval which is not desired.
Change the shepherd so that it does not delay the vmstat update on the
other cpu. We stil have to use schedule_delayed_work since we are using a
delayed_work_struct but we can set the delay to 0.
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 57c2e36b6f4dd52e7e90f4c748a665b13fa228d2
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
CRs-Fixed: 802343
Change-Id: I452090aec907c55fed60f107313d617e023f80d2
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Vinayak Menon has reported that an excessive number of tasks was throttled
in the direct reclaim inside too_many_isolated() because NR_ISOLATED_FILE
was relatively high compared to NR_INACTIVE_FILE. However it turned out
that the real number of NR_ISOLATED_FILE was 0 and the per-cpu
vm_stat_diff wasn't transferred into the global counter.
vmstat_work which is responsible for the sync is defined as deferrable
delayed work which means that the defined timeout doesn't wake up an idle
CPU. A CPU might stay in an idle state for a long time and general effort
is to keep such a CPU in this state as long as possible which might lead
to all sorts of troubles for vmstat consumers as can be seen with the
excessive direct reclaim throttling.
This patch basically reverts 39bf6270f5 ("VM statistics: Make timer
deferrable") but it shouldn't cause any problems for idle CPUs because
only CPUs with an active per-cpu drift are woken up since 7cc36bbddde5
("vmstat: on-demand vmstat workers v8") and CPUs which are idle for a
longer time shouldn't have per-cpu drift.
Fixes: 39bf6270f5 (VM statistics: Make timer deferrable)
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Vinayak Menon <vinmenon@codeaurora.org>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: ba4877b9ca51f80b5d30f304a46762f0509e1635
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
CRs-Fixed: 802343
Change-Id: I4cb34a859b7cba307268fa529964fd729e833ef3
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
vmstat workers are used for folding counter differentials into the zone,
per node and global counters at certain time intervals. They currently
run at defined intervals on all processors which will cause some holdoff
for processors that need minimal intrusion by the OS.
The current vmstat_update mechanism depends on a deferrable timer firing
every other second by default which registers a work queue item that runs
on the local CPU, with the result that we have 1 interrupt and one
additional schedulable task on each CPU every 2 seconds If a workload
indeed causes VM activity or multiple tasks are running on a CPU, then
there are probably bigger issues to deal with.
However, some workloads dedicate a CPU for a single CPU bound task. This
is done in high performance computing, in high frequency financial
applications, in networking (Intel DPDK, EZchip NPS) and with the advent
of systems with more and more CPUs over time, this may become more and
more common to do since when one has enough CPUs one cares less about
efficiently sharing a CPU with other tasks and more about efficiently
monopolizing a CPU per task.
The difference of having this timer firing and workqueue kernel thread
scheduled per second can be enormous. An artificial test measuring the
worst case time to do a simple "i++" in an endless loop on a bare metal
system and under Linux on an isolated CPU with dynticks and with and
without this patch, have Linux match the bare metal performance (~700
cycles) with this patch and loose by couple of orders of magnitude (~200k
cycles) without it[*]. The loss occurs for something that just calculates
statistics. For networking applications, for example, this could be the
difference between dropping packets or sustaining line rate.
Statistics are important and useful, but it would be great if there would
be a way to not cause statistics gathering produce a huge performance
difference. This patche does just that.
This patch creates a vmstat shepherd worker that monitors the per cpu
differentials on all processors. If there are differentials on a
processor then a vmstat worker local to the processors with the
differentials is created. That worker will then start folding the diffs
in regular intervals. Should the worker find that there is no work to be
done then it will make the shepherd worker monitor the differentials
again.
With this patch it is possible then to have periods longer than
2 seconds without any OS event on a "cpu" (hardware thread).
The patch shows a very minor increased in system performance.
hackbench -s 512 -l 2000 -g 15 -f 25 -P
Results before the patch:
Running in process mode with 15 groups using 50 file descriptors each (== 750 tasks)
Each sender will pass 2000 messages of 512 bytes
Time: 4.992
Running in process mode with 15 groups using 50 file descriptors each (== 750 tasks)
Each sender will pass 2000 messages of 512 bytes
Time: 4.971
Running in process mode with 15 groups using 50 file descriptors each (== 750 tasks)
Each sender will pass 2000 messages of 512 bytes
Time: 5.063
Hackbench after the patch:
Running in process mode with 15 groups using 50 file descriptors each (== 750 tasks)
Each sender will pass 2000 messages of 512 bytes
Time: 4.973
Running in process mode with 15 groups using 50 file descriptors each (== 750 tasks)
Each sender will pass 2000 messages of 512 bytes
Time: 4.990
Running in process mode with 15 groups using 50 file descriptors each (== 750 tasks)
Each sender will pass 2000 messages of 512 bytes
Time: 4.993
[fengguang.wu@intel.com: cpu_stat_off can be static]
Signed-off-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Gilad Ben-Yossef <gilad@benyossef.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Max Krasnyansky <maxk@qti.qualcomm.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 7cc36bbddde5cd0c98f0c06e3304ab833d662565
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
CRs-Fixed: 802343
Change-Id: I30e9cef8fd22efaae3908c3e43b9558d19a2e33a
[vinmenon@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Both functions that update global counters use the same mechanism.
Create a function that contains the common code.
Signed-off-by: Christoph Lameter <cl@linux.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Tejun Heo <tj@kernel.org>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 4edb0748b23887140578d68f5f4e6e2de337a481
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
CRs-Fixed: 802343
Change-Id: I950dd72d854a8a078ea4e6d90422fa7b5e50b61c
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
The main idea behind this patchset is to reduce the vmstat update overhead
by avoiding interrupt enable/disable and the use of per cpu atomics.
This patch (of 3):
It is better to have a separate folding function because
refresh_cpu_vm_stats() also does other things like expire pages in the
page allocator caches.
If we have a separate function then refresh_cpu_vm_stats() is only called
from the local cpu which allows additional optimizations.
The folding function is only called when a cpu is being downed and
therefore no other processor will be accessing the counters. Also
simplifies synchronization.
[akpm@linux-foundation.org: fix UP build]
Signed-off-by: Christoph Lameter <cl@linux.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Tejun Heo <tj@kernel.org>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 2bb921e526656556e68f99f5f15a4a1bf2691844
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
CRs-Fixed: 802343
Change-Id: Iaa04b1418a86072b73babc38186c4e29b9e28cf1
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Currently, vmpressure is tied to memcg and its events are
available only to userspace clients. This patch removes
the dependency on CONFIG_MEMCG and adds a mechanism for
in-kernel clients to subscribe for vmpressure events (in
fact raw vmpressure values are delivered instead of vmpressure
levels, to provide clients more flexibility to take actions
on custom pressure levels which are not currently defined
by vmpressure module).
Change-Id: I38010f166546e8d7f12f5f355b5dbfd6ba04d587
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
The lowmem_shrink function discounts all the swap cache pages from
the file cache count. The zone aware code also discounts all file
cache pages from a certain zone. This results in some swap cache
pages being discounted twice, which can result in the low memory
killer being unnecessarily aggressive.
Fix the low memory killer to only discount the swap cache pages
once.
Change-Id: I650bbfbf0fbbabd01d82bdb3502b57ff59c3e14f
Signed-off-by: Liam Mark <lmark@codeaurora.org>
There are couple of issues with swapcache usage when ZRAM is used
as swap device.
1) Kernel does a swap readahead which can be around 6 to 8 pages
depending on total ram, which is not required for zram since
accesses are fast.
2) Kernel delays the freeing up of swapcache expecting a later hit,
which again is useless in the case of zram.
3) This is not related to swapcache, but zram usage itself.
As mentioned in (2) kernel delays freeing of swapcache, but along with
that it delays zram compressed page free also. i.e. there can be 2 copies,
though one is compressed.
This patch addresses these issues using two new flags
QUEUE_FLAG_FAST and SWP_FAST, to indicate that accesses to the device
will be fast and cheap, and instructs the swap layer to free up
swap space agressively, and not to do read ahead.
Change-Id: I5d2d5176a5f9420300bb2f843f6ecbdb25ea80e4
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
commit 23aaed6659df9adfabe9c583e67a36b54e21df46 upstream.
walk_page_range() silently skips vma having VM_PFNMAP set, which leads
to undesirable behaviour at client end (who called walk_page_range).
Userspace applications get the wrong data, so the effect is like just
confusing users (if the applications just display the data) or sometimes
killing the processes (if the applications do something with
misunderstanding virtual addresses due to the wrong data.)
For example for pagemap_read, when no callbacks are called against
VM_PFNMAP vma, pagemap_read may prepare pagemap data for next virtual
address range at wrong index.
Eventually userspace may get wrong pagemap data for a task.
Corresponding to a VM_PFNMAP marked vma region, kernel may report
mappings from subsequent vma regions. User space in turn may account
more pages (than really are) to the task.
In my case I was using procmem, procrack (Android utility) which uses
pagemap interface to account RSS pages of a task. Due to this bug it
was giving a wrong picture for vmas (with VM_PFNMAP set).
Fixes: a9ff785e44 ("mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas")
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently we have kmemleak_stack_scan enabled by default.
This can hog the cpu with pre-emption disabled for a long
time starving other tasks.
Make this optional at compile time, since if required
we can always write to sysfs entry and enable this option.
Change-Id: Ie30447861c942337c7ff25ac269b6025a527e8eb
Signed-off-by: Vignesh Radhakrishnan <vigneshr@codeaurora.org>
walk_page_range silently skips vma having VM_PFNMAP set,
which leads to undesirable behaviour at client end (who
called walk_page_range). For example for pagemap_read,
when no callbacks are called against VM_PFNMAP vma,
pagemap_read may prepare pagemap data at wrong index.
Change-Id: I057b5c8ede1ae4bb9e3f8639e10bd4fcbf23da7e
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
commit 690eac53daff34169a4d74fc7bfbd388c4896abb upstream.
Commit fee7e49d4514 ("mm: propagate error from stack expansion even for
guard page") made sure that we return the error properly for stack
growth conditions. It also theorized that counting the guard page
towards the stack limit might break something, but also said "Let's see
if anybody notices".
Somebody did notice. Apparently android-x86 sets the stack limit very
close to the limit indeed, and including the guard page in the rlimit
check causes the android 'zygote' process problems.
So this adds the (fairly trivial) code to make the stack rlimit check be
against the actual real stack size, rather than the size of the vma that
includes the guard page.
Reported-and-tested-by: Chih-Wei Huang <cwhuang@android-x86.org>
Cc: Jay Foad <jay.foad@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fee7e49d45149fba60156f5b59014f764d3e3728 upstream.
Jay Foad reports that the address sanitizer test (asan) sometimes gets
confused by a stack pointer that ends up being outside the stack vma
that is reported by /proc/maps.
This happens due to an interaction between RLIMIT_STACK and the guard
page: when we do the guard page check, we ignore the potential error
from the stack expansion, which effectively results in a missing guard
page, since the expected stack expansion won't have been done.
And since /proc/maps explicitly ignores the guard page (commit
d7824370e2: "mm: fix up some user-visible effects of the stack guard
page"), the stack pointer ends up being outside the reported stack area.
This is the minimal patch: it just propagates the error. It also
effectively makes the guard page part of the stack limit, which in turn
measn that the actual real stack is one page less than the stack limit.
Let's see if anybody notices. We could teach acct_stack_growth() to
allow an extra page for a grow-up/grow-down stack in the rlimit test,
but I don't want to add more complexity if it isn't needed.
Reported-and-tested-by: Jay Foad <jay.foad@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9e5e3661727eaf960d3480213f8e87c8d67b6956 upstream.
Charles Shirron and Paul Cassella from Cray Inc have reported kswapd
stuck in a busy loop with nothing left to balance, but
kswapd_try_to_sleep() failing to sleep. Their analysis found the cause
to be a combination of several factors:
1. A process is waiting in throttle_direct_reclaim() on pgdat->pfmemalloc_wait
2. The process has been killed (by OOM in this case), but has not yet been
scheduled to remove itself from the waitqueue and die.
3. kswapd checks for throttled processes in prepare_kswapd_sleep():
if (waitqueue_active(&pgdat->pfmemalloc_wait)) {
wake_up(&pgdat->pfmemalloc_wait);
return false; // kswapd will not go to sleep
}
However, for a process that was already killed, wake_up() does not remove
the process from the waitqueue, since try_to_wake_up() checks its state
first and returns false when the process is no longer waiting.
4. kswapd is running on the same CPU as the only CPU that the process is
allowed to run on (through cpus_allowed, or possibly single-cpu system).
5. CONFIG_PREEMPT_NONE=y kernel is used. If there's nothing to balance, kswapd
encounters no voluntary preemption points and repeatedly fails
prepare_kswapd_sleep(), blocking the process from running and removing
itself from the waitqueue, which would let kswapd sleep.
So, the source of the problem is that we prevent kswapd from going to
sleep until there are processes waiting on the pfmemalloc_wait queue,
and a process waiting on a queue is guaranteed to be removed from the
queue only when it gets scheduled. This was done to make sure that no
process is left sleeping on pfmemalloc_wait when kswapd itself goes to
sleep.
However, it isn't necessary to postpone kswapd sleep until the
pfmemalloc_wait queue actually empties. To prevent processes from being
left sleeping, it's actually enough to guarantee that all processes
waiting on pfmemalloc_wait queue have been woken up by the time we put
kswapd to sleep.
This patch therefore fixes this issue by substituting 'wake_up' with
'wake_up_all' and removing 'return false' in the code snippet from
prepare_kswapd_sleep() above. Note that if any process puts itself in
the queue after this waitqueue_active() check, or after the wake up
itself, it means that the process will also wake up kswapd - and since
we are under prepare_to_wait(), the wake up won't be missed. Also we
update the comment prepare_kswapd_sleep() to hopefully more clearly
describe the races it is preventing.
Fixes: 5515061d22 ("mm: throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It is observed that sometimes multiple tasks get blocked in
the congestion_wait loop below, in shrink_inactive_list.
(__schedule) from [<c0a03328>]
(schedule_timeout) from [<c0a04940>]
(io_schedule_timeout) from [<c01d585c>]
(congestion_wait) from [<c01cc9d8>]
(shrink_inactive_list) from [<c01cd034>]
(shrink_zone) from [<c01cdd08>]
(try_to_free_pages) from [<c01c442c>]
(__alloc_pages_nodemask) from [<c01f1884>]
(new_slab) from [<c09fcf60>]
(__slab_alloc) from [<c01f1a6c>]
In one such instance, zone_page_state(zone, NR_ISOLATED_FILE)
had returned 14, zone_page_state(zone, NR_INACTIVE_FILE)
returned 92, and the gfp_flag was GFP_KERNEL which resulted
in too_many_isolated to return true. But one of the CPU pageset
vmstat diff had NR_ISOLATED_FILE as -14. As there weren't any more
update to per cpu pageset, the threshold wasn't met, and the
tasks were blocked in the congestion wait.
This patch uses zone_page_state_snapshot instead, but restricts
its usage to avoid performance penalty.
Change-Id: Iec767a548e524729c7ed79a92fe4718cdd08ce69
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Current pageblock isolation logic could isolate each pageblock
individually. This causes freepage accounting problem if freepage with
pageblock order on isolate pageblock is merged with other freepage on
normal pageblock. We can prevent merging by restricting max order of
merging to pageblock order if freepage is on isolate pageblock.
A side-effect of this change is that there could be non-merged buddy
freepage even if finishing pageblock isolation, because undoing
pageblock isolation is just to move freepage from isolate buddy list to
normal buddy list rather than to consider merging. So, the patch also
makes undoing pageblock isolation consider freepage merge. When
un-isolation, freepage with more than pageblock order and it's buddy are
checked. If they are on normal pageblock, instead of just moving, we
isolate the freepage and free it in order to get merged.
CRs-fixed: 771472
Change-Id: I50d132eeea59de58e68e82f797edf85334512468
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Heesub Shin <heesub.shin@samsung.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Ritesh Harjani <ritesh.list@gmail.com>
Cc: Gioh Kim <gioh.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 3c605096d3158216ba9326a16266f6ba128c2c8d
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[lmark@codeaurora.org: fix merge conflicts]
Signed-off-by: Liam Mark <lmark@codeaurora.org>
commit 2022b4d18a491a578218ce7a4eca8666db895a73 upstream.
I've been seeing swapoff hangs in recent testing: it's cycling around
trying unsuccessfully to find an mm for some remaining pages of swap.
I have been exercising swap and page migration more heavily recently,
and now notice a long-standing error in copy_one_pte(): it's trying to
add dst_mm to swapoff's mmlist when it finds a swap entry, but is doing
so even when it's a migration entry or an hwpoison entry.
Which wouldn't matter much, except it adds dst_mm next to src_mm,
assuming src_mm is already on the mmlist: which may not be so. Then if
pages are later swapped out from dst_mm, swapoff won't be able to find
where to replace them.
There's already a !non_swap_entry() test for stats: move that up before
the swap_duplicate() and the addition to mmlist.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fb993fa1a2f669215fa03a09eed7848f2663e336 upstream.
If a frontswap dup-store failed, it should invalidate the expired page
in the backend, or it could trigger some data corruption issue.
Such as:
1. use zswap as the frontswap backend with writeback feature
2. store a swap page(version_1) to entry A, success
3. dup-store a newer page(version_2) to the same entry A, fail
4. use __swap_writepage() write version_2 page to swapfile, success
5. zswap do shrink, writeback version_1 page to swapfile
6. version_2 page is overwrited by version_1, data corrupt.
This patch fixes this issue by invalidating expired data immediately
when meet a dup-store failure.
Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In a system like Android, a process with SYS_ADMIN rights
controls the system for things like moving process from
one cgroup to another. The native cgroup capabilities
are only allowed to execute by root user and not system.
While adding a new cgroup sub-system, one may override
and relax the permission so that 'system' can also control
cgroup. Here, memcg is one such cgroup sub system which
requires system level control for that.
Allow non-root processes to add arbitrary into 'memory'
cgroups if it has 'CAP_SYS_ADMIN' capability set.
Change-Id: I43d4468186f142c176cb5b5f060751bb1b160344
Signed-off-by: Chintan Pandya <cpandya@codeaurora.org>
shrink_inactive_list() used to wait 0.1s to avoid congestion when all
the pages that were isolated from the inactive list were dirty but not
under active writeback. That makes no real sense, and apparently causes
major interactivity issues under some loads since 3.11.
The ostensible reason for it was to wait for kswapd to start writing
pages, but that seems questionable as well, since the congestion wait
code seems to trigger for kswapd itself as well. Also, the logic behind
delaying anything when we haven't actually started writeback is not
clear - it only delays actually starting that writeback.
We'll still trigger the congestion waiting if
(a) the process is kswapd, and we hit pages flagged for immediate
reclaim
(b) the process is not kswapd, and the zone backing dev writeback is
actually congested.
This probably needs to be revisited, but as it is this fixes a reported
regression.
Reported-by: Felipe Contreras <felipe.contreras@gmail.com>
Pinpointed-by: Hillf Danton <dhillf@gmail.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: b738d764652dc5aab1c8939f637112981fce9e0e
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Change-Id: I4fbcbb10d7ba242caf80da06bd8ed11770571cff
[vinmenon@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>