Commit graph

306734 commits

Author SHA1 Message Date
Mikulas Patocka
f093342759 crypto: arm-aes - fix encryption of unaligned data
Fix the same alignment bug as in arm64 - we need to pass the number of
unprocessed (residue) bytes as the last argument to blkcipher_walk_done.
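The arithmetic involved can be sketched in userspace (illustration only, not the kernel API; `residue_after_blocks` is a hypothetical helper): the walk hands the cipher a byte count, the cipher may only consume whole AES blocks, and the leftover count is what must be reported back to blkcipher_walk_done.

```c
#include <assert.h>
#include <stddef.h>

#define AES_BLOCK_SIZE 16

/* Hypothetical illustration: number of bytes left unprocessed after
 * encrypting as many whole blocks as possible - the value the fix
 * passes as the last argument to blkcipher_walk_done(). */
size_t residue_after_blocks(size_t nbytes)
{
    return nbytes % AES_BLOCK_SIZE;
}
```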

Change-Id: I8d49b8a190327b46801a3db4884e2b309138525b
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org	# 3.13+
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-01-07 10:45:07 -08:00
Russell King
6799ccfaf1 CRYPTO: Fix more AES build errors
Building a multi-arch kernel results in:

arch/arm/crypto/built-in.o: In function `aesbs_xts_decrypt':
sha1_glue.c:(.text+0x15c8): undefined reference to `bsaes_xts_decrypt'
arch/arm/crypto/built-in.o: In function `aesbs_xts_encrypt':
sha1_glue.c:(.text+0x1664): undefined reference to `bsaes_xts_encrypt'
arch/arm/crypto/built-in.o: In function `aesbs_ctr_encrypt':
sha1_glue.c:(.text+0x184c): undefined reference to `bsaes_ctr32_encrypt_blocks'
arch/arm/crypto/built-in.o: In function `aesbs_cbc_decrypt':
sha1_glue.c:(.text+0x19b4): undefined reference to `bsaes_cbc_encrypt'

This code is already runtime-conditional on NEON being supported, so
there's no point compiling it out depending on the minimum build
architecture.

Change-Id: I219dc496b3ad60754f95a6db2a71ce73d037a6e0
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-01-07 10:45:06 -08:00
Russell King
bdc278ed1d ARM: add .gitignore entry for aesbs-core.S
This avoids this file being incorrectly added to git.

Change-Id: Ibafeec2c5d3ca806737f8d865716d3b2ea419e93
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-01-07 10:45:05 -08:00
Ard Biesheuvel
fe7aa76ea0 ARM: add support for bit sliced AES using NEON instructions
Bit sliced AES gives around 45% speedup on Cortex-A15 for encryption
and around 25% for decryption. This implementation of the AES algorithm
does not rely on any lookup tables so it is believed to be invulnerable
to cache timing attacks.

This algorithm processes up to 8 blocks in parallel in constant time. This
means that it is not usable by chaining modes that are strictly sequential
in nature, such as CBC encryption. CBC decryption, however, can benefit from
this implementation and runs about 25% faster. The other chaining modes
implemented in this module, XTS and CTR, can execute fully in parallel in
both directions.
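The data dependencies can be sketched in plain C (toy one-byte "cipher", purely illustrative): each CTR keystream block depends only on the counter, so a batch of 8 can be computed in one bit-sliced pass, whereas each CBC-encrypted block needs the previous ciphertext.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for one AES block encryption; illustration only. */
static uint8_t toy_enc(uint8_t in) { return (uint8_t)(in ^ 0xA5); }

/* CTR: the 8 keystream blocks are independent of each other, so a
 * bit-sliced implementation can compute them all in one NEON pass. */
void ctr_keystream_batch(uint8_t ctr0, uint8_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = toy_enc((uint8_t)(ctr0 + i));
}

/* CBC encryption: block i needs ciphertext i-1, so it is strictly
 * sequential and cannot use the 8-way parallel path. */
uint8_t cbc_encrypt_block(uint8_t plain, uint8_t prev_cipher)
{
    return toy_enc((uint8_t)(plain ^ prev_cipher));
}
```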

The core code has been adopted from the OpenSSL project (in collaboration
with the original author, on cc). For ease of maintenance, this version is
identical to the upstream OpenSSL code, i.e., all modifications that were
required to make it suitable for inclusion into the kernel have been made
upstream. The original can be found here:

    http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=6f6a6130

Note to integrators:
While this implementation is significantly faster than the existing table
based ones (generic or ARM asm), especially in CTR mode, the effects on
power efficiency are unclear as of yet. This code fundamentally does
more work, calculating values that the table based code obtains by a
simple lookup; only by doing all of that work in a SIMD fashion does
it manage to perform better.

Change-Id: I936dc7142b91133c55c7cf0af6a565d219d62e11
Cc: Andy Polyakov <appro@openssl.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
2015-01-07 10:45:03 -08:00
Ard Biesheuvel
8096db709c ARM: move AES typedefs and function prototypes to separate header
Put the struct definitions for AES keys and the asm function prototypes in a
separate header and export the asm functions from the module.
This allows other drivers to use them directly.

Change-Id: I5ce0cf285e2981755adb55b66a846eb738cedd58
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
2015-01-07 10:45:02 -08:00
Ard Biesheuvel
495306e9f6 ARM: 7837/3: fix Thumb-2 bug in AES assembler code
commit 40190c85f4 upstream.

Patch 638591c enabled building the AES assembler code in Thumb2 mode.
However, this code used arithmetic involving PC rather than adr{l}
instructions to generate PC-relative references to the lookup tables,
and this needs to take into account the different PC offset when
running in Thumb mode.
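The underlying architectural rule (a documented ARM behaviour, shown here as a tiny model rather than the patched assembler): an instruction that reads PC observes its own address plus 8 in ARM state but plus 4 in Thumb state, so a PC-relative table offset computed for ARM is off by 4 when the same code runs in Thumb.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the PC value an instruction at `insn_addr` observes when it
 * reads the PC register directly: +8 in ARM state, +4 in Thumb state. */
uint32_t observed_pc(uint32_t insn_addr, int thumb)
{
    return insn_addr + (thumb ? 4u : 8u);
}
```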

Change-Id: Iadf37cb5db3a826ced7b99e5ee6d298479355cbd
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-07 10:45:00 -08:00
Ard Biesheuvel
e849816c83 ARM: 7723/1: crypto: sha1-armv4-large.S: fix SP handling
Make the SHA1 asm code ABI conformant by making sure all stack
accesses occur above the stack pointer.

Origin:
http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=1a9d60d2

Change-Id: I1f17f23f168d40de14b907f470476b7fd9bdd274
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-01-07 10:44:59 -08:00
Dave Martin
37eada158d ARM: 7626/1: arm/crypto: Make asm SHA-1 and AES code Thumb-2 compatible
This patch fixes aes-armv4.S and sha1-armv4-large.S to work
natively in Thumb.  This allows ARM/Thumb interworking workarounds
to be removed.

I also take the opportunity to convert some explicit assembler
directives for exported functions to the standard
ENTRY()/ENDPROC().

For the code itself:

  * In sha1_block_data_order, use of TEQ with sp is deprecated in
    ARMv7 and not supported in Thumb.  For the branches back to
    .L_00_15 and .L_40_59, the TEQ is converted to a CMP, under the
    assumption that clobbering the C flag here will not cause
    incorrect behaviour.

    For the first branch back to .L_20_39_or_60_79 the C flag is
    important, so sp is moved temporarily into another register so
    that TEQ can be used for the comparison.

  * In the AES code, most forms of register-indexed addressing with
    shifts and rotates are not permitted for loads and stores in
    Thumb, so the address calculation is done using a separate
    instruction for the Thumb case.

The resulting code is unlikely to be optimally scheduled, but it
should not have a large impact given the overall size of the code.
I haven't run any benchmarks.

Change-Id: I8b015aa239e5513d43680d82aeb93db07c5adf9f
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Tested-by: David McCullough <ucdevel@gmail.com> (ARM only)
Acked-by: David McCullough <ucdevel@gmail.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-01-07 10:44:58 -08:00
David McCullough
d29bf6311a arm/crypto: Add optimized AES and SHA1 routines
Add assembler versions of AES and SHA1 for ARM platforms.  This has provided
up to a 50% improvement in IPsec/TCP throughput for tunnels using AES128/SHA1.

Platform   CPU Speed    Endian   Before (bps)   After (bps)   Improvement

IXP425      533 MHz      big     11217042        15566294        ~38%
KS8695      166 MHz     little    3828549         5795373        ~51%

Change-Id: I6e950d8c858ef1134352bf959804eeaf5b879d7e
Signed-off-by: David McCullough <ucdevel@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-01-07 10:44:57 -08:00
Patrick Tjin
f09dd72856 Fix sizeof-pointer-memaccess warnings
Changed to use strcmp since the size of the buffer is not known.
Changed sizeof to PAGE_SIZE in snprintf.

Signed-off-by: Patrick Tjin <pattjin@google.com>
2015-01-06 09:22:29 -08:00
Mike Galbraith
2cfd3b5853 sched,cgroup: Fix up task_groups list
With multiple instances of task_groups, for_each_rt_rq() is a noop,
no task groups having been added to the rt.c list instance.  This
renders __enable/disable_runtime() and print_rt_stats() noop, the
user (non) visible effect being that rt task groups are missing in
/proc/sched_debug.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: stable@kernel.org # v3.3+
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1344308413.6846.7.camel@marge.simpson.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Bug: 18729519
2015-01-06 09:21:31 -08:00
Shubhraprakash Das
fbe4de5b7b msm: kgsl: Get rid of KGSL_FLAGS_STARTED
The KGSL_FLAGS_STARTED flag is redundant, since device start and
stop already set a flag to indicate device start/stop state.

Change-Id: I17f3ab7fc2aca7b58b610c3b3414c125babc273e
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
2014-12-10 15:26:33 -08:00
Carter Cooper
c56d10a6d4 msm: kgsl: Clear pending transactions from VBIF on hang
When resetting the device on a hang, the pending transactions in the
VBIF should be cleared, since the GPU is hung and unable to accept
any transactions. These pending transactions can cause the VBIF pipe
to block the IOMMU, so clear them.

Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Change-Id: I6e0171a6e61c0dd831ce7afdc177775b2ae3f07f
2014-12-10 15:26:32 -08:00
Carter Cooper
f523656514 msm: kgsl: Modify which MMU clocks are enabled/disabled
There is no need to try to attach a clock if it is already attached,
or detach a clock if it is already detached. Restructure this
logic to only attach/detach the clocks when needed, and protect
ourselves by using the MMU lock more readily.

Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Change-Id: Ib5edfe7800cc246bc4b5e9aca8e02621aa6f7c3c
2014-12-10 15:26:32 -08:00
Tarun Karra
3a919bd760 msm: kgsl: Prevent adreno stop after gpu is power collapsed
When the GPU is power collapsed it is already stopped. If kgsl
release gets called, do not try to stop the GPU again; trying to
stop an already stopped GPU can lead to errors.

When content protection is enabled we cannot write to VBIF
registers with iommu detached. With this limitation if
adreno stop gets called twice, the second adreno stop will
cause NOC errors/XPU violations because trustzone will
XPU lock down all VBIF registers after first adreno stop.

Prevent adreno stop from being called twice by checking whether the
device is started; only if it is started, go ahead with adreno stop.

CRs-fixed: 726670
Change-Id: I4e3c7a9b37eb88d458d65763ed6818a4fd96bd06
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>
2014-12-10 15:26:31 -08:00
Shubhraprakash Das
6399588be3 msm: kgsl: Check for mmu pagefault before recovery
Check whether there is a pagefault before running recovery.
If recovery runs before the bottom half of the pagefault handler
runs, then there could be a pending pagefault at the end of recovery
that can stall the IOMMU. With the IOMMU stalled, the GPU would only
read back zeroes even after recovery.

CRs-Fixed: 642562
Change-Id: I78fb225b2ee57e87ac6ebd1f2c9bca18aa81d942
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
2014-12-10 15:26:31 -08:00
Carter Cooper
964a85a898 msm: kgsl: Fix IOMMU version naming for old driver
Fix one minor naming discrepancy between the new and old IOMMU drivers.

Change-Id: Ia3b74c2b1ecbcc70e2a2836212bbea9b49c9770d
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
2014-12-10 15:26:30 -08:00
Erik Kline
1e46eaf4f2 net: ipv6: allow choosing optimistic addresses with use_optimistic
The use_optimistic sysctl makes optimistic IPv6 addresses
equivalent to preferred addresses for source address selection
(e.g., when calling connect()), but it does not allow an
application to bind to optimistic addresses. This behaviour is
inconsistent - for example, it doesn't make sense for bind() to
an optimistic address to fail with EADDRNOTAVAIL while connect()
chooses that address as the outgoing address on the same socket.

Bug: 17769720
Bug: 18609055
Change-Id: I9de0d6c92ac45e29d28e318ac626c71806666f13
Signed-off-by: Erik Kline <ek@google.com>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
2014-12-10 09:58:22 +09:00
Shengzhe Zhao
0b778076d9 vfs: check if f_count is 0 or negative
filp_close uses !file_count(filp) to check whether f_count is 0. If
it is 0, filp_close assumes the file is already closed and returns.
However, for a closed file, f_count can be reduced to -1, in which
case !file_count(filp) is false and filp_close proceeds to handle
the file, which can lead to a panic. Check whether f_count is 0 or
negative instead of only checking for 0, to avoid the panic.
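The changed guard can be modelled in one line (sketch only; `filp_close_should_bail` is an invented name, the kernel code open-codes the comparison):

```c
#include <assert.h>

/* Before the fix the "already closed" test was f_count == 0; a racing
 * release can drive the count to -1 and slip past it. The fix treats
 * any non-positive count as already closed. */
int filp_close_should_bail(long f_count)
{
    return f_count <= 0;   /* was: f_count == 0 */
}
```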

b/18200219 LRX21M: kernel_panic

Change-Id: I5117853dcbebec399021abf34338b1f6aff6ad14
Signed-off-by: Shengzhe Zhao <a18689@motorola.com>
Reviewed-by: Yi-Wei Zhao <gbjc64@motorola.com>
Signed-off-by: Iliyan Malchev <malchev@google.com>
2014-12-04 13:01:58 -08:00
Naseer Ahmed
a7a5f36fe2 Rotator getting stuck leading to fence timeout
Even though cancel_delayed_work should cancel the worker, in some
race conditions it can fail and the work can still get scheduled.
To avoid this situation use cancel_delayed_work_sync.
Also, the rotator_lock mutex need not be unlocked while waiting for
the ISR, as the ISR does not acquire this mutex for its operations.
It is after this unlock of the mutex that, in a race condition, the
rotator clock sometimes gets disabled via msm_rotator_rot_clk_work_f.

Conflicts:
	drivers/char/msm_rotator.c

Change-Id: I5405f2c4d9505c1b288d1f1ac3d9892955306f87

Signed-off-by: Justin Philip <jphili@codeaurora.org>
Signed-off-by: Naseer Ahmed <naseer@codeaurora.org>
2014-12-03 01:00:23 +00:00
Naseer Ahmed
1349c4cab2 msm: rotator: Wait for the pending commits in finish IOCTL
Due to the asynchronous rotator mechanism, sometimes the
MSM_ROTATOR_IOCTL_FINISH arrives before the previously queued
do_rotate work is completed. This causes the fence to be signalled
before the buffer is used by the rotator. In the fast YUV 2-pass
scenario, this causes an IOMMU page fault on the 2-pass buffers,
since the buffer is unmapped while the rotator is still using it.
Hence, wait for the pending commit work to finish before releasing
the fence and freeing the 2-pass buffers.

Change-Id: Iec9edd11406d102c7dd102c2ad7935184bbbba93
Signed-off-by: Padmanabhan Komanduru <pkomandu@codeaurora.org>
Signed-off-by: Naseer Ahmed <naseer@codeaurora.org>
2014-12-02 16:54:01 -08:00
Jane Zhou
3cc8cc4884 net/ping: handle protocol mismatching scenario
ping_lookup() may return a wrong sock if the sk_buff's and sock's
protocols don't match. For example, the sk_buff's protocol is
ETH_P_IPV6, but the sock's sk_family is AF_INET; in that case, if
sk->sk_bound_dev_if is zero, a wrong sock will be returned.
The fix is to "continue" the search; if nothing matches, return NULL.
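A sketch of the lookup change (toy types, not the kernel's hash-chain walk): keep searching when the packet's family and the socket's family disagree, and return NULL only after exhausting the candidates.

```c
#include <assert.h>
#include <stddef.h>

struct toy_sock { int family; int bound_dev_if; };

/* Hypothetical model of the fix: skip ("continue") sockets whose
 * address family does not match the packet, instead of returning the
 * first wildcard-bound socket encountered. */
struct toy_sock *toy_ping_lookup(struct toy_sock *tab, size_t n,
                                 int pkt_family, int dif)
{
    for (size_t i = 0; i < n; i++) {
        if (tab[i].family != pkt_family)
            continue;       /* the added check: skip, don't return */
        if (tab[i].bound_dev_if == 0 || tab[i].bound_dev_if == dif)
            return &tab[i];
    }
    return NULL;            /* no match at all */
}
```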

[cherry-pick of net 91a0b60346]

Bug: 18512516
Change-Id: I520223ce53c0d4e155c37d6b65a03489cc7fd494
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Jane Zhou <a17711@motorola.com>
Signed-off-by: Yiwei Zhao <gbjc64@motorola.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
(cherry picked from commit 08a90f2f7ecbe6e03e43acfd7aaf26bc68c73354)
2014-12-01 16:45:58 -08:00
Devin Kim
d6524f8187 cgroup: remove synchronize_rcu() from cgroup_attach_{task|proc}()
These two synchronize_rcu() calls make attaching a task to a cgroup
quite slow, and this can't be ignored in some situations.

A real case from Colin Cross: Android uses cgroups heavily to
manage thread priorities, putting threads in a background group
with reduced cpu.shares when they are not visible to the user,
and in a foreground group when they are. Some RPCs from foreground
threads to background threads will temporarily move the background
thread into the foreground group for the duration of the RPC.
This results in many calls to cgroup_attach_task.

In cgroup_attach_task() it's task->cgroups that is protected by RCU,
and put_css_set() calls kfree_rcu() to free it.

If we remove this synchronize_rcu(), there can be threads in RCU-read
sections accessing their old cgroup via current->cgroups with
concurrent rmdir operation, but this is safe.

 # time for ((i=0; i<50; i++)) { echo $$ > /mnt/sub/tasks; echo $$ > /mnt/tasks; }

real    0m2.524s
user    0m0.008s
sys     0m0.004s

With this patch:

real    0m0.004s
user    0m0.004s
sys     0m0.000s

tj: These synchronize_rcu()s are utterly confused.  synchronize_rcu()
    necessarily has to come between two operations to guarantee that
    the changes made by the former operation are visible to all rcu
    readers before proceeding to the latter operation.  Here, the
    synchronize_rcu()s are at the end of attach operations with nothing
    beyond them.  Their only effect would be delaying completion of
    write(2) to sysfs tasks/procs files until all rcu readers see the
    change, which doesn't mean anything.

cherry-picked from:
5d65bc0ca1

Bug: 17709419
Change-Id: I98dacd6c13da27cb3496fe4a24a24084e46bdd9c
Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Colin Cross <ccross@google.com>
Signed-off-by: Devin Kim <dojip.kim@lge.com>
2014-12-01 16:09:15 -08:00
Devin Kim
d7da214bbb usb: dwc3: gadget: Protect against ep disabling during completion
In dwc3_cleanup_done_reqs(), a potential race condition
could arise when dwc3_gadget_giveback() temporarily
releases the main spinlock.  If during this window the
very endpoint being handled becomes disabled, it would
lead to a NULL pointer dereference in the code that
follows.  Guard against this by making sure the endpoint
is still enabled after returning from the giveback call.

cherry-picked from:
https://www.codeaurora.org/cgit/quic/la/kernel/msm-3.10/commit/drivers/usb/dwc3/gadget.c?h=msm-3.10&id=b7ed96c4fc37351d77af87c792cd5d11ceb1e6e4

Change-Id: Idb7651c57db3273623cf664153e7cbaf0bf9dd9d
CRs-fixed: 628972
Bug: 18541764
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Devin Kim <dojip.kim@lge.com>
2014-12-01 16:08:33 -08:00
Erik Kline
efe8261b88 net: ipv6: Add a sysctl to make optimistic addresses useful candidates
Add a sysctl that causes an interface's optimistic addresses
to be considered equivalent to other non-deprecated addresses
for source address selection purposes.  Preferred addresses
will still take precedence over optimistic addresses, subject
to other ranking in the source address selection algorithm.

This is useful where different interfaces are connected to
different networks from different ISPs (e.g., a cell network
and a home wifi network).

The current behaviour complies with RFC 3484/6724, and it
makes sense if the host has only one interface, or has
multiple interfaces on the same network (same or cooperating
administrative domain(s)), but not in the multiple distinct
networks case.
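A hypothetical scoring sketch of the selection change (the real rule logic in net/ipv6/addrconf.c is more involved; the names here are invented): with use_optimistic set, an optimistic address ranks with other non-deprecated addresses, while preferred still outranks it.

```c
#include <assert.h>

enum addr_state { ADDR_DEPRECATED = 0, ADDR_OPTIMISTIC = 1, ADDR_PREFERRED = 2 };

/* Invented ranking helper: higher rank wins source address selection. */
int addr_rank(enum addr_state state, int use_optimistic)
{
    if (state == ADDR_OPTIMISTIC && !use_optimistic)
        return ADDR_DEPRECATED;   /* old behaviour: treated like deprecated */
    return (int)state;
}
```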

For example, if a mobile device has an IPv6 address on an LTE
network and then connects to IPv6-enabled wifi, while the wifi
IPv6 address is undergoing DAD, IPv6 connections will try to use
the wifi default route with the LTE IPv6 address, and will get
stuck until they time out.

Also, because optimistic nodes can receive frames, issue
an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMISTIC
flag appropriately set).  A second RTM_NEWADDR is sent if DAD
completes (the address flags have changed), otherwise an
RTM_DELADDR is sent.

Also: add an entry in ip-sysctl.txt for optimistic_dad.

[backport of net-next 7fd2561e4ebdd070ebba6d3326c4c5b13942323f]

Signed-off-by: Erik Kline <ek@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug: 17769720
Bug: 18180674
Change-Id: I440a9b8c788db6767d191bbebfd2dff481aa9e0d
2014-12-01 19:37:26 +00:00
Colin Cross
445d7b856d ARM: msm: flo: add limit_mem= kernel command line parameter
Add a command line parameter to limit available memory (after
all carveouts are reserved) to the specified size.  This can
be used to help test low memory situations.

Change-Id: Ia25e028315260b706365afe820e6e9986e8e7e2d
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Iliyan Malchev <malchev@google.com>
2014-12-01 18:46:31 +00:00
Iliyan Malchev
3af5ec050e Effectively revert "gpu: ion: replace __GFP_ZERO with manual zero'ing"
commit d21375bd0e
	Author: Mitchel Humpherys <mitchelh@codeaurora.org>
	Date:   Thu Jan 31 10:30:40 2013 -0800

	    gpu: ion: replace __GFP_ZERO with manual zero'ing

	    As a performance optimization, omit the __GFP_ZERO flag when
	    allocating individual pages and, instead, zero out all of the pages in
	    one fell swoop.

	    CRs-Fixed: 449035
	    Change-Id: Ieb9a895d8792727a8a40b1e27cb1bbeae098f581
	    Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>

b/18402205 External reports: Video playback failing on Flo after upgrade to
	   Lollipop

Change-Id: Ibd07d3ac0edd11278306d4dbe72050408cc8e09b
Signed-off-by: Iliyan Malchev <malchev@google.com>
(cherry picked from commit 154bef423c)
2014-11-20 22:56:00 +00:00
Iliyan Malchev
319c000d44 kgsl: do not vmap/memset to zero-out pages
b/18402205 External reports: Video playback failing on Flo after upgrade to
	   Lollipop

Change-Id: I358328ba2bd543d77e4218f32b0695c2f6f6e6c9
Signed-off-by: Iliyan Malchev <malchev@google.com>
(cherry picked from commit f6e71eaa5d)
2014-11-20 22:55:11 +00:00
Liam Mark
8d48547f37 lowmemorykiller: enhance debug information
Add extra debug information to make it easier to both determine
why the lowmemorykiller killed a process and to help find the source
of memory leaks.

Also increase the debug level for "select" statements to help prevent
flooding the log.

Bug: 17871993
Change-Id: I3b6876c5ecdf192ecc271aed3f37579f66d47a08
Signed-off-by: Liam Mark <lmark@codeaurora.org>
Signed-off-by: Naveen Ramaraj <nramaraj@codeaurora.org>
Signed-off-by: Iliyan Malchev <malchev@google.com>

Conflicts:
	drivers/staging/android/lowmemorykiller.c
2014-11-18 15:13:25 -08:00
Liam Mark
f096fee99b mm, oom: make dump_tasks public
Allow other functions to dump the list of tasks.
Useful for when debugging memory leaks.

Bug: 17871993
Change-Id: I76c33a118a9765b4c2276e8c76de36399c78dbf6
Signed-off-by: Liam Mark <lmark@codeaurora.org>
Signed-off-by: Naveen Ramaraj <nramaraj@codeaurora.org>
2014-11-18 15:13:25 -08:00
Jordan Crouse
5075f68b3f fs/seq_file: Use vmalloc by default for allocations > PAGE_SIZE
Some OOM implementations are pretty trigger-happy when it comes to
releasing memory for kmalloc() allocations.  We might as well head
straight to vmalloc for allocations over PAGE_SIZE.

Bug: 17871993
Change-Id: Ic0dedbadc8bf551d34cc5d77c8073938d4adef80
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Naveen Ramaraj <nramaraj@codeaurora.org>
2014-11-18 15:13:24 -08:00
Heiko Carstens
ce1247b1e2 fs/seq_file: fallback to vmalloc allocation
There are a couple of seq_files which use the single_open() interface.
This interface requires that the whole output must fit into a single
buffer.

E.g.  for /proc/stat allocation failures have been observed because an
order-4 memory allocation failed due to memory fragmentation.  In such
situations reading /proc/stat is not possible anymore.

Therefore change the seq_file code to fallback to vmalloc allocations
which will usually result in a couple of order-0 allocations and hence
also work if memory is fragmented.
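The fallback shape can be modelled in userspace (sketch only; the kernel change tries kmalloc first and takes vmalloc on failure, whereas here the contiguity limit is an explicit parameter so the branch can be exercised):

```c
#include <assert.h>
#include <stddef.h>

enum alloc_path { PATH_KMALLOC, PATH_VMALLOC };

/* Model: a contiguous ("kmalloc") allocation succeeds only up to
 * max_contig bytes when memory is fragmented; anything larger takes
 * the vmalloc path, which needs only order-0 pages. */
enum alloc_path seq_buf_alloc_path(size_t size, size_t max_contig)
{
    return size <= max_contig ? PATH_KMALLOC : PATH_VMALLOC;
}
```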

For reference a call trace where reading from /proc/stat failed:

  sadc: page allocation failure: order:4, mode:0x1040d0
  CPU: 1 PID: 192063 Comm: sadc Not tainted 3.10.0-123.el7.s390x #1
  [...]
  Call Trace:
    show_stack+0x6c/0xe8
    warn_alloc_failed+0xd6/0x138
    __alloc_pages_nodemask+0x9da/0xb68
    __get_free_pages+0x2e/0x58
    kmalloc_order_trace+0x44/0xc0
    stat_open+0x5a/0xd8
    proc_reg_open+0x8a/0x140
    do_dentry_open+0x1bc/0x2c8
    finish_open+0x46/0x60
    do_last+0x382/0x10d0
    path_openat+0xc8/0x4f8
    do_filp_open+0x46/0xa8
    do_sys_open+0x114/0x1f0
    sysc_tracego+0x14/0x1a

Conflicts:
	fs/seq_file.c

Bug: 17871993
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: David Rientjes <rientjes@google.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Thorsten Diehl <thorsten.diehl@de.ibm.com>
Cc: Andrea Righi <andrea@betterlinux.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Stefan Bader <stefan.bader@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 058504edd0
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Change-Id: Iad795a92fee1983c300568429a6283c48625bd9a
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Naveen Ramaraj <nramaraj@codeaurora.org>
2014-11-18 15:13:24 -08:00
Al Viro
e0441174bb nick kvfree() from apparmor
too many places open-code it
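The helper's dispatch is simple (userspace model below; the kernel version uses is_vmalloc_addr() on the pointer itself, where this sketch records the origin explicitly):

```c
#include <assert.h>
#include <stdlib.h>

struct kvbuf { void *ptr; int from_vmalloc; };

/* kernel: if (is_vmalloc_addr(addr)) vfree(addr); else kfree(addr);
 * Here free() stands in for both paths; the return value reports which
 * path was taken, for illustration only. */
int kvfree_model(struct kvbuf *b)
{
    int took_vfree = b->from_vmalloc;
    free(b->ptr);
    b->ptr = NULL;
    return took_vfree;
}
```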

Conflicts:
	mm/util.c

Bug: 17871993
Change-Id: I007f4b663d7af564b2ce4009f5e13eeeeb82929a
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Git-commit: 39f1f78d53
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[jgebben@codeaurora.org: Remove redundant apparmor code not present upstream]
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Naveen Ramaraj <nramaraj@codeaurora.org>
2014-11-18 15:13:23 -08:00
Colin Cross
02660b51be lowmemorykiller: make default lowmemorykiller debug message useful
lowmemorykiller debug messages are inscrutable and mostly useful
for debugging the lowmemorykiller, not explaining why a process
was killed.  Make the messages more useful by prefixing them
with "lowmemorykiller: " and explaining in more readable terms
what was killed, who it was killed for, and why it was killed.

The messages now look like:
[   76.997631] lowmemorykiller: Killing 'droid.gallery3d' (2172), adj 1000,
[   76.997635]    to free 27436kB on behalf of 'kswapd0' (29) because
[   76.997638]    cache 122624kB is below limit 122880kB for oom_score_adj 1000
[   76.997641]    Free memory is -53356kB above reserved

A negative number for free memory above reserved means some of the
reserved memory has been used and is being regenerated by kswapd,
which is likely what called the shrinkers.

Bug: 17871993
Change-Id: I1fe983381e73e124b90aa5d91cb66e55eaca390f
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Naveen Ramaraj <nramaraj@codeaurora.org>
2014-11-18 15:13:22 -08:00
Robert Sesek
ba88cbdbc0 flo_defconfig: Enable CONFIG_SECCOMP.
Bug: 15986335
Change-Id: I1bcb8206f3bb9809ce9c7012556e2ae342d7f201
Signed-off-by: Iliyan Malchev <malchev@google.com>
2014-11-01 12:09:35 -07:00
Robert Sesek
e8c64bc644 seccomp: Use atomic operations that are present in kernel 3.4.
Signed-off-by: Robert Sesek <rsesek@google.com>
2014-10-31 19:46:31 -07:00
Kees Cook
6756f10b76 seccomp: implement SECCOMP_FILTER_FLAG_TSYNC
Applying restrictive seccomp filter programs to large or diverse
codebases often requires handling threads which may be started early in
the process lifetime (e.g., by code that is linked in). While it is
possible to apply permissive programs prior to process start up, it is
difficult to further restrict the kernel ABI to those threads after that
point.

This change adds a new seccomp syscall flag to SECCOMP_SET_MODE_FILTER for
synchronizing thread group seccomp filters at filter installation time.

When calling seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC,
filter) an attempt will be made to synchronize all threads in current's
threadgroup to its new seccomp filter program. This is possible iff all
threads are using a filter that is an ancestor to the filter current is
attempting to synchronize to. NULL filters (where the task is running as
SECCOMP_MODE_NONE) are also treated as ancestors allowing threads to be
transitioned into SECCOMP_MODE_FILTER. If prctl(PR_SET_NO_NEW_PRIVS,
...) has been set on the calling thread, no_new_privs will be set for
all synchronized threads too. On success, 0 is returned. On failure,
the pid of one of the failing threads will be returned and no filters
will have been applied.

The race conditions against another thread are:
- requesting TSYNC (already handled by sighand lock)
- performing a clone (already handled by sighand lock)
- changing its filter (already handled by sighand lock)
- calling exec (handled by cred_guard_mutex)
The clone case is assisted by the fact that new threads will have their
seccomp state duplicated from their parent before appearing on the tasklist.

Holding cred_guard_mutex means that seccomp filters cannot be assigned
while in the middle of another thread's exec (potentially bypassing
no_new_privs or similar). The call to de_thread() may kill threads waiting
for the mutex.

Changes across threads to the filter pointer includes a barrier.

Based on patches by Will Drewry.
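The ancestor rule can be sketched as a walk up the filter chain (toy types, illustration only; NULL is treated as an ancestor of every filter, matching the SECCOMP_MODE_NONE case above):

```c
#include <assert.h>
#include <stddef.h>

struct toy_filter { struct toy_filter *prev; };

/* A thread may be synchronized iff its current filter is an ancestor
 * of (i.e., on the prev-chain of) the caller's filter; a NULL filter
 * (SECCOMP_MODE_NONE) always qualifies. */
int toy_is_ancestor(struct toy_filter *parent, struct toy_filter *child)
{
    if (parent == NULL)
        return 1;
    for (struct toy_filter *f = child; f; f = f->prev)
        if (f == parent)
            return 1;
    return 0;
}
```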

Suggested-by: Julien Tinnes <jln@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>

Conflicts:
	include/linux/seccomp.h
	include/uapi/linux/seccomp.h
2014-10-31 19:46:31 -07:00
Oleg Nesterov
8c2c32a33e introduce for_each_thread() to replace the buggy while_each_thread()
while_each_thread() and next_thread() should die, almost every lockless
usage is wrong.

1. Unless g == current, the lockless while_each_thread() is not safe.

   while_each_thread(g, t) can loop forever if g exits, next_thread()
   can't reach the unhashed thread in this case. Note that this can
   happen even if g is the group leader, it can exec.

2. Even if while_each_thread() itself was correct, people often use
   it wrongly.

   It was never safe to just take rcu_read_lock() and loop unless
   you verify that pid_alive(g) == T, even the first next_thread()
   can point to the already freed/reused memory.

This patch adds signal_struct->thread_head and task->thread_node to
create the normal rcu-safe list with the stable head.  The new
for_each_thread(g, t) helper is always safe under rcu_read_lock() as
long as this task_struct can't go away.

Note: of course it is ugly to have both task_struct->thread_node and the
old task_struct->thread_group, we will kill it later, after we change
the users of while_each_thread() to use for_each_thread().

Perhaps we can kill it even before we convert all users, we can
reimplement next_thread(t) using the new thread_head/thread_node.  But
we can't do this right now because this will lead to subtle behavioural
changes.  For example, do/while_each_thread() always sees at least one
task, while for_each_thread() can do nothing if the whole thread group
has died.  Or thread_group_empty(), currently its semantics is not clear
unless thread_group_leader(p) and we need to audit the callers before we
can change it.

So this patch adds the new interface which has to coexist with the old
one for some time, hopefully the next changes will be more or less
straightforward and the old one will go away soon.
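Problem 1 can be modelled with a toy circular list (illustration only): a walk that looks for the starting task never terminates once that task has been unlinked from the ring, which is exactly why the new iteration starts from a stable head instead.

```c
#include <assert.h>

struct toy_task { struct toy_task *next; };

/* Model of the lockless while_each_thread() walk: follow the ring
 * until we get back to the starting task g. Bounded by max_steps so
 * the non-terminating case can be observed instead of hanging.
 * Returns 1 if the walk came back to g. */
int ring_walk_returns_to(struct toy_task *start, struct toy_task *g,
                         int max_steps)
{
    struct toy_task *t = start;
    for (int i = 0; i < max_steps; i++) {
        if (t == g)
            return 1;
        t = t->next;
    }
    return 0;
}
```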

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Sergey Dyasly <dserrg@gmail.com>
Tested-by: Sergey Dyasly <dserrg@gmail.com>
Reviewed-by: Sameer Nanda <snanda@chromium.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: "Ma, Xindong" <xindong.ma@intel.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: "Tu, Xiaobing" <xiaobing.tu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Conflicts:
	kernel/fork.c
2014-10-31 19:46:30 -07:00
Kees Cook
63d9416f1f seccomp: allow mode setting across threads
This changes the mode setting helper to allow threads to change the
seccomp mode from another thread. We must maintain barriers to keep
TIF_SECCOMP synchronized with the rest of the seccomp state.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>

Conflicts:
	kernel/seccomp.c
2014-10-31 19:46:30 -07:00
Kees Cook
52cc75eef8 seccomp: introduce writer locking
Normally, task_struct.seccomp.filter is only ever read or modified by
the task that owns it (current). This property aids in fast access
during system call filtering as read access is lockless.

Updating the pointer from another task, however, opens up race
conditions. To allow cross-thread filter pointer updates, writes to the
seccomp fields are now protected by the sighand spinlock (which is shared
by all threads in the thread group). Read access remains lockless because
pointer updates themselves are atomic.  However, writes (or cloning)
often entail additional checking (like maximum instruction counts),
which requires locking to perform safely.

In the case of cloning threads, the child is invisible to the system
until it enters the task list. To make sure a child can't be cloned from
a thread and left in a prior state, seccomp duplication is additionally
moved under the sighand lock. Then parent and child are certain to have
the same seccomp state when they exit the lock.

Based on patches by Will Drewry and David Drysdale.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>

Conflicts:
	kernel/fork.c
2014-10-31 19:46:29 -07:00
Kees Cook
dbfe5a4223 seccomp: split filter prep from check and apply
In preparation for adding seccomp locking, move filter creation away
from where it is checked and applied. This will allow for locking where
no memory allocation is happening. The validation, filter attachment,
and seccomp mode setting can all happen under the future locks.

For extreme defensiveness, I've added a BUG_ON check for the calculated
size of the buffer allocation in case BPF_MAXINSNS ever changes, which
shouldn't ever happen. The compiler should actually optimize out this
check since the test above it makes it impossible.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>

Conflicts:
	kernel/seccomp.c
2014-10-31 19:46:28 -07:00
Kees Cook
0901f9aec4 sched: move no_new_privs into new atomic flags
Since seccomp transitions between threads require updates to the
no_new_privs flag to be atomic, the flag must be part of an atomic flag
set. This moves the nnp flag into a separate task field, and introduces
accessors.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>

Conflicts:
	fs/exec.c
	include/linux/sched.h
	kernel/sys.c
2014-10-31 19:46:28 -07:00
Kees Cook
61d45b4a98 ARM: add seccomp syscall
Wires up the new seccomp syscall.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>

Conflicts:
	arch/arm/include/uapi/asm/unistd.h
	arch/arm/kernel/calls.S
2014-10-31 19:46:27 -07:00
Kees Cook
18540f293a seccomp: add "seccomp" syscall
This adds the new "seccomp" syscall with both an "operation" and "flags"
parameter for future expansion. The third argument is a pointer value,
used with the SECCOMP_SET_MODE_FILTER operation. Currently, flags must
be 0. This is functionally equivalent to prctl(PR_SET_SECCOMP, ...).

In addition to the TSYNC flag later in this patch series, there is a
non-zero chance that this syscall could be used for configuring a fixed
argument area for seccomp-tracer-aware processes to pass syscall arguments
in the future. Hence the use of "seccomp" rather than simply "seccomp_add_filter"
for this syscall. Additionally, this syscall uses operation, flags,
and user pointer for arguments because strictly passing arguments via
a user pointer would mean seccomp itself would be unable to trivially
filter the seccomp syscall itself.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>

Conflicts:
	arch/x86/syscalls/syscall_32.tbl
	arch/x86/syscalls/syscall_64.tbl
	include/linux/syscalls.h
	include/uapi/asm-generic/unistd.h
	include/uapi/linux/seccomp.h
	kernel/seccomp.c
	kernel/sys_ni.c
2014-10-31 19:46:27 -07:00
Kees Cook
40f7177aae seccomp: split mode setting routines
Separates the two mode setting paths to make things more readable with
fewer #ifdefs within function bodies.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-10-31 19:46:26 -07:00
Kees Cook
d7a9b42a52 seccomp: extract check/assign mode helpers
To support splitting mode 1 from mode 2, extract the mode checking and
assignment logic into common functions.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-10-31 19:46:25 -07:00
Kees Cook
00bd3c7881 seccomp: create internal mode-setting function
In preparation for having other callers of the seccomp mode setting
logic, split the prctl entry point away from the core logic that performs
seccomp mode setting.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-10-31 19:46:25 -07:00
Kees Cook
e038ce5ef6 MAINTAINERS: create seccomp entry
Add myself as seccomp maintainer.

Suggested-by: James Morris <jmorris@namei.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
2014-10-31 19:46:24 -07:00
Will Drewry
5ec08ba218 CHROMIUM: ARM: r1->r0 for get/set arguments
ARM reuses r0 as the first argument. This fixes the mistaken
assumption in the original patchset.  These will be merged
into one change when sent upstream.

Signed-off-by: Will Drewry <wad@chromium.org>
TEST=emerge tegra2_kaen; run seccomp testsuite
BUG=chromium-os:27878

Change-Id: Iaaa09995d35f78ee8cef7b600d526e71f3b2fcec
Reviewed-on: https://gerrit.chromium.org/gerrit/21342
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Will Drewry <wad@chromium.org>
Tested-by: Will Drewry <wad@chromium.org>
Signed-off-by: Sasha Levitskiy <sanek@google.com>
2014-10-31 19:46:24 -07:00
Will Drewry
db3d3076a5 CHROMIUM: seccomp: set -ENOSYS if there is no tracer
[Will attempt to add to -next, but this may need to wait
 until there is a motivating usecase, like ARM, since x86
 does the right thing already.]

On some arches, -ENOSYS is not set as the default system call
return value.  This means that a skipped or invalid system call
does not yield this response.  That behavior is not in line with
the stated ABI of seccomp filter.  To that end, we ensure we set
that value here to avoid arch idiosyncrasies.

Signed-off-by: Will Drewry <wad@chromium.org>
TEST=tegra2_kaen; boot, strace works, seccomp testsuite trace tests pass
BUG=chromium-os:27878

Change-Id: I03a5e633d2fbb5d3d3cc33c067b2887068364c17
Reviewed-on: https://gerrit.chromium.org/gerrit/21337
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Will Drewry <wad@chromium.org>
Tested-by: Will Drewry <wad@chromium.org>
Signed-off-by: Sasha Levitskiy <sanek@google.com>
2014-10-31 19:46:23 -07:00