Commit graph

307397 commits

Author SHA1 Message Date
Vladis Dronov
e74a14ad5f ALSA: usb-audio: Fix double-free in error paths after snd_usb_add_audio_stream() call
create_fixed_stream_quirk(), snd_usb_parse_audio_interface() and
create_uaxx_quirk() functions allocate the audioformat object by themselves
and free it upon error before returning. However, once the object is linked
to a stream, it's freed again in snd_usb_audio_pcm_free(), thus it'll be
double-freed, eventually resulting in a memory corruption.

This patch fixes these failures in the error paths by unlinking the audioformat
object before freeing it.

Based on a patch by Takashi Iwai <tiwai@suse.de>

[Note for stable backports:
 this patch requires the commit 902eb7fd1e4a ('ALSA: usb-audio: Minor
 code cleanup in create_fixed_stream_quirk()')]

Change-Id: I129dc4f3b0ae4cb6f790c16d24dd768c9ee06822
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1283358
Reported-by: Ralf Spenneberg <ralf@spenneberg.net>
Cc: <stable@vger.kernel.org> # see the note above
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2016-11-11 13:37:28 +11:00
Takashi Iwai
74f53621d2 ALSA: usb-audio: Minor code cleanup in create_fixed_stream_quirk()
Just a minor code cleanup: unify the error paths.

Change-Id: I31346b08ed1024819c58eff797c63bb42c283512
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2016-11-11 13:37:09 +11:00
Calvin Owens
aa37926b7e sg: Fix double-free when drives detach during SG_IO
In sg_common_write(), we free the block request and return -ENODEV if
the device is detached in the middle of the SG_IO ioctl().

Unfortunately, sg_finish_rem_req() also tries to free srp->rq, so we
end up freeing rq->cmd in the already free rq object, and then free
the object itself out from under the current user.

This ends up corrupting random memory via the list_head on the rq
object. The most common crash trace I saw is this:

  ------------[ cut here ]------------
  kernel BUG at block/blk-core.c:1420!
  Call Trace:
  [<ffffffff81281eab>] blk_put_request+0x5b/0x80
  [<ffffffffa0069e5b>] sg_finish_rem_req+0x6b/0x120 [sg]
  [<ffffffffa006bcb9>] sg_common_write.isra.14+0x459/0x5a0 [sg]
  [<ffffffff8125b328>] ? selinux_file_alloc_security+0x48/0x70
  [<ffffffffa006bf95>] sg_new_write.isra.17+0x195/0x2d0 [sg]
  [<ffffffffa006cef4>] sg_ioctl+0x644/0xdb0 [sg]
  [<ffffffff81170f80>] do_vfs_ioctl+0x90/0x520
  [<ffffffff81258967>] ? file_has_perm+0x97/0xb0
  [<ffffffff811714a1>] SyS_ioctl+0x91/0xb0
  [<ffffffff81602afb>] tracesys+0xdd/0xe2
    RIP [<ffffffff81281e04>] __blk_put_request+0x154/0x1a0

The solution is straightforward: just set srp->rq to NULL in the
failure branch so that sg_finish_rem_req() doesn't attempt to re-free
it.

Additionally, since sg_rq_end_io() will never be called on the object
when this happens, we need to free memory backing ->cmd if it isn't
embedded in the object itself.

KASAN was extremely helpful in finding the root cause of this bug.

Change-Id: I8c2389a4e2e1b5f753a47f8af60502a761b891b5
Signed-off-by: Calvin Owens <calvinowens@fb.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-11 13:36:31 +11:00
Omar Sandoval
dbe124efba block: fix use-after-free in sys_ioprio_get()
get_task_ioprio() accesses the task->io_context without holding the task
lock and thus can race with exit_io_context(), leading to a
use-after-free. The reproducer below hits this within a few seconds on
my 4-core QEMU VM:

int main(int argc, char **argv)
{
	pid_t pid, child;
	long nproc, i;

	/* ioprio_set(IOPRIO_WHO_PROCESS, 0, IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0)); */
	syscall(SYS_ioprio_set, 1, 0, 0x6000);

	nproc = sysconf(_SC_NPROCESSORS_ONLN);

	for (i = 0; i < nproc; i++) {
		pid = fork();
		assert(pid != -1);
		if (pid == 0) {
			for (;;) {
				pid = fork();
				assert(pid != -1);
				if (pid == 0) {
					_exit(0);
				} else {
					child = wait(NULL);
					assert(child == pid);
				}
			}
		}

		pid = fork();
		assert(pid != -1);
		if (pid == 0) {
			for (;;) {
				/* ioprio_get(IOPRIO_WHO_PGRP, 0); */
				syscall(SYS_ioprio_get, 2, 0);
			}
		}
	}

	for (;;) {
		/* ioprio_get(IOPRIO_WHO_PGRP, 0); */
		syscall(SYS_ioprio_get, 2, 0);
	}

	return 0;
}

This gets us KASAN dumps like this:

[   35.526914] ==================================================================
[   35.530009] BUG: KASAN: out-of-bounds in get_task_ioprio+0x7b/0x90 at addr ffff880066f34e6c
[   35.530009] Read of size 2 by task ioprio-gpf/363
[   35.530009] =============================================================================
[   35.530009] BUG blkdev_ioc (Not tainted): kasan: bad access detected
[   35.530009] -----------------------------------------------------------------------------

[   35.530009] Disabling lock debugging due to kernel taint
[   35.530009] INFO: Allocated in create_task_io_context+0x2b/0x370 age=0 cpu=0 pid=360
[   35.530009] 	___slab_alloc+0x55d/0x5a0
[   35.530009] 	__slab_alloc.isra.20+0x2b/0x40
[   35.530009] 	kmem_cache_alloc_node+0x84/0x200
[   35.530009] 	create_task_io_context+0x2b/0x370
[   35.530009] 	get_task_io_context+0x92/0xb0
[   35.530009] 	copy_process.part.8+0x5029/0x5660
[   35.530009] 	_do_fork+0x155/0x7e0
[   35.530009] 	SyS_clone+0x19/0x20
[   35.530009] 	do_syscall_64+0x195/0x3a0
[   35.530009] 	return_from_SYSCALL_64+0x0/0x6a
[   35.530009] INFO: Freed in put_io_context+0xe7/0x120 age=0 cpu=0 pid=1060
[   35.530009] 	__slab_free+0x27b/0x3d0
[   35.530009] 	kmem_cache_free+0x1fb/0x220
[   35.530009] 	put_io_context+0xe7/0x120
[   35.530009] 	put_io_context_active+0x238/0x380
[   35.530009] 	exit_io_context+0x66/0x80
[   35.530009] 	do_exit+0x158e/0x2b90
[   35.530009] 	do_group_exit+0xe5/0x2b0
[   35.530009] 	SyS_exit_group+0x1d/0x20
[   35.530009] 	entry_SYSCALL_64_fastpath+0x1a/0xa4
[   35.530009] INFO: Slab 0xffffea00019bcd00 objects=20 used=4 fp=0xffff880066f34ff0 flags=0x1fffe0000004080
[   35.530009] INFO: Object 0xffff880066f34e58 @offset=3672 fp=0x0000000000000001
[   35.530009] ==================================================================

Fix it by grabbing the task lock while we poke at the io_context.

Change-Id: I4261aaf076fab943a80a45b0a77e023aa4ecbbd8
Cc: stable@vger.kernel.org
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-11 13:35:57 +11:00
Vegard Nossum
415e7642f8 block: fix use-after-free in seq file
I got a KASAN report of use-after-free:

    ==================================================================
    BUG: KASAN: use-after-free in klist_iter_exit+0x61/0x70 at addr ffff8800b6581508
    Read of size 8 by task trinity-c1/315
    =============================================================================
    BUG kmalloc-32 (Not tainted): kasan: bad access detected
    -----------------------------------------------------------------------------

    Disabling lock debugging due to kernel taint
    INFO: Allocated in disk_seqf_start+0x66/0x110 age=144 cpu=1 pid=315
            ___slab_alloc+0x4f1/0x520
            __slab_alloc.isra.58+0x56/0x80
            kmem_cache_alloc_trace+0x260/0x2a0
            disk_seqf_start+0x66/0x110
            traverse+0x176/0x860
            seq_read+0x7e3/0x11a0
            proc_reg_read+0xbc/0x180
            do_loop_readv_writev+0x134/0x210
            do_readv_writev+0x565/0x660
            vfs_readv+0x67/0xa0
            do_preadv+0x126/0x170
            SyS_preadv+0xc/0x10
            do_syscall_64+0x1a1/0x460
            return_from_SYSCALL_64+0x0/0x6a
    INFO: Freed in disk_seqf_stop+0x42/0x50 age=160 cpu=1 pid=315
            __slab_free+0x17a/0x2c0
            kfree+0x20a/0x220
            disk_seqf_stop+0x42/0x50
            traverse+0x3b5/0x860
            seq_read+0x7e3/0x11a0
            proc_reg_read+0xbc/0x180
            do_loop_readv_writev+0x134/0x210
            do_readv_writev+0x565/0x660
            vfs_readv+0x67/0xa0
            do_preadv+0x126/0x170
            SyS_preadv+0xc/0x10
            do_syscall_64+0x1a1/0x460
            return_from_SYSCALL_64+0x0/0x6a

    CPU: 1 PID: 315 Comm: trinity-c1 Tainted: G    B           4.7.0+ #62
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
     ffffea0002d96000 ffff880119b9f918 ffffffff81d6ce81 ffff88011a804480
     ffff8800b6581500 ffff880119b9f948 ffffffff8146c7bd ffff88011a804480
     ffffea0002d96000 ffff8800b6581500 fffffffffffffff4 ffff880119b9f970
    Call Trace:
     [<ffffffff81d6ce81>] dump_stack+0x65/0x84
     [<ffffffff8146c7bd>] print_trailer+0x10d/0x1a0
     [<ffffffff814704ff>] object_err+0x2f/0x40
     [<ffffffff814754d1>] kasan_report_error+0x221/0x520
     [<ffffffff8147590e>] __asan_report_load8_noabort+0x3e/0x40
     [<ffffffff83888161>] klist_iter_exit+0x61/0x70
     [<ffffffff82404389>] class_dev_iter_exit+0x9/0x10
     [<ffffffff81d2e8ea>] disk_seqf_stop+0x3a/0x50
     [<ffffffff8151f812>] seq_read+0x4b2/0x11a0
     [<ffffffff815f8fdc>] proc_reg_read+0xbc/0x180
     [<ffffffff814b24e4>] do_loop_readv_writev+0x134/0x210
     [<ffffffff814b4c45>] do_readv_writev+0x565/0x660
     [<ffffffff814b8a17>] vfs_readv+0x67/0xa0
     [<ffffffff814b8de6>] do_preadv+0x126/0x170
     [<ffffffff814b92ec>] SyS_preadv+0xc/0x10

This problem can occur in the following situation:

open()
 - pread()
    - .seq_start()
       - iter = kmalloc() // succeeds
       - seqf->private = iter
    - .seq_stop()
       - kfree(seqf->private)
 - pread()
    - .seq_start()
       - iter = kmalloc() // fails
    - .seq_stop()
       - class_dev_iter_exit(seqf->private) // boom! old pointer

As the comment in disk_seqf_stop() says, stop is called even if start
failed, so we need to reinitialise the private pointer to NULL when seq
iteration stops.

An alternative would be to set the private pointer to NULL when the
kmalloc() in disk_seqf_start() fails.

Change-Id: I41ee55505a213f99a92ce630885e6c31b4b60232
Cc: stable@vger.kernel.org
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-11 13:32:03 +11:00
Linus Torvalds
82a3f4741a mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
commit 19be0eaffa3ac7d8eb6784ad9bdbc7d67ed8e619 upstream.

This is an ancient bug that was actually attempted to be fixed once
(badly) by me eleven years ago in commit 4ceb5db975 ("Fix
get_user_pages() race for write access") but that was then undone due to
problems on s390 by commit f33ea7f404 ("fix get_user_pages bug").

In the meantime, the s390 situation has long been fixed, and we can now
fix it by checking the pte_dirty() bit properly (and do it better).  The
s390 dirty bit was implemented in abf09bed3c ("s390/mm: implement
software dirty bits") which made it into v3.9.  Earlier kernels will
have to look at the page state itself.

Also, the VM has become more scalable, and what used a purely
theoretical race back then has become easier to trigger.

To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
we already did a COW" rather than play racy games with FOLL_WRITE that
is very fundamental, and then use the pte dirty flag to validate that
the FOLL_COW flag is still valid.

Change-Id: Ifcb16e37ceb6b8845d7a77a97f5fda6670d08378
Reported-and-tested-by: Phil "not Paul" Oester <kernel@linuxace.com>
Acked-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[wt: s/gup.c/memory.c; s/follow_page_pte/follow_page_mask;
     s/faultin_page/__get_user_page]
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 59747d5d21)
2016-10-31 05:37:50 -07:00
David Howells
9537b8ff2e KEYS: Fix short sprintf buffer in /proc/keys show function
Fix a short sprintf buffer in proc_keys_show().  If the gcc stack protector
is turned on, this can cause a panic due to stack corruption.

The problem is that xbuf[] is not big enough to hold a 64-bit timeout
rendered as weeks:

	(gdb) p 0xffffffffffffffffULL/(60*60*24*7)
	$2 = 30500568904943

That's 14 chars plus NUL, not 11 chars plus NUL.

Expand the buffer to 16 chars.

I think the unpatched code apparently works if the stack-protector is not
enabled because on a 32-bit machine the buffer won't be overflowed and on a
64-bit machine there's a 64-bit aligned pointer at one side and an int that
isn't checked again on the other side.

The panic incurred looks something like:

Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81352ebe
CPU: 0 PID: 1692 Comm: reproducer Not tainted 4.7.2-201.fc24.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
 0000000000000086 00000000fbbd2679 ffff8800a044bc00 ffffffff813d941f
 ffffffff81a28d58 ffff8800a044bc98 ffff8800a044bc88 ffffffff811b2cb6
 ffff880000000010 ffff8800a044bc98 ffff8800a044bc30 00000000fbbd2679
Call Trace:
 [<ffffffff813d941f>] dump_stack+0x63/0x84
 [<ffffffff811b2cb6>] panic+0xde/0x22a
 [<ffffffff81352ebe>] ? proc_keys_show+0x3ce/0x3d0
 [<ffffffff8109f7f9>] __stack_chk_fail+0x19/0x30
 [<ffffffff81352ebe>] proc_keys_show+0x3ce/0x3d0
 [<ffffffff81350410>] ? key_validate+0x50/0x50
 [<ffffffff8134db30>] ? key_default_cmp+0x20/0x20
 [<ffffffff8126b31c>] seq_read+0x2cc/0x390
 [<ffffffff812b6b12>] proc_reg_read+0x42/0x70
 [<ffffffff81244fc7>] __vfs_read+0x37/0x150
 [<ffffffff81357020>] ? security_file_permission+0xa0/0xc0
 [<ffffffff81246156>] vfs_read+0x96/0x130
 [<ffffffff81247635>] SyS_read+0x55/0xc0
 [<ffffffff817eb872>] entry_SYSCALL_64_fastpath+0x1a/0xa4

Change-Id: I0787d5a38c730ecb75d3c08f28f0ab36295d59e7
Reported-by: Ondrej Kozina <okozina@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Ondrej Kozina <okozina@redhat.com>
2016-10-31 23:36:27 +11:00
Eric Dumazet
1c6a02e572 tcp: fix use after free in tcp_xmit_retransmit_queue()
When tcp_sendmsg() allocates a fresh and empty skb, it puts it at the
tail of the write queue using tcp_add_write_queue_tail()

Then it attempts to copy user data into this fresh skb.

If the copy fails, we undo the work and remove the fresh skb.

Unfortunately, this undo lacks the change done to tp->highest_sack and
we can leave a dangling pointer (to a freed skb)

Later, tcp_xmit_retransmit_queue() can dereference this pointer and
access freed memory. For regular kernels where memory is not unmapped,
this might cause SACK bugs because tcp_highest_sack_seq() is buggy,
returning garbage instead of tp->snd_nxt, but with various debug
features like CONFIG_DEBUG_PAGEALLOC, this can crash the kernel.

This bug was found by Marco Grassi thanks to syzkaller.

Change-Id: I264f97d30d0a623011d9ee811c63fa0e0c2149a2
Fixes: 6859d49475 ("[TCP]: Abstract tp->highest_sack accessing & point to next skb")
Reported-by: Marco Grassi <marco.gra@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31 23:36:25 +11:00
Lars-Peter Clausen
c2f44eb7b6 ALSA: control: Fix replacing user controls
There are two issues with the current implementation for replacing user
controls. The first is that the code does not check if the control is actually a
user control and neither does it check if the control is owned by the process
that tries to remove it. That allows userspace applications to remove arbitrary
controls, which can cause a user after free if a for example a driver does not
expect a control to be removed from under its feed.

The second issue is that on one hand when a control is replaced the
user_ctl_count limit is not checked and on the other hand the user_ctl_count is
increased (even though the number of user controls does not change). This allows
userspace, once the user_ctl_count limit as been reached, to repeatedly replace
a control until user_ctl_count overflows. Once that happens new controls can be
added effectively bypassing the user_ctl_count limit.

Both issues can be fixed by instead of open-coding the removal of the control
that is to be replaced to use snd_ctl_remove_user_ctl(). This function does
proper permission checks as well as decrements user_ctl_count after the control
has been removed.

Note that by using snd_ctl_remove_user_ctl() the check which returns -EBUSY at
beginning of the function if the control already exists is removed. This is not
a problem though since the check is quite useless, because the lock that is
protecting the control list is released between the check and before adding the
new control to the list, which means that it is possible that a different
control with the same settings is added to the list after the check. Luckily
there is another check that is done while holding the lock in snd_ctl_add(), so
we'll rely on that to make sure that the same control is not added twice.

Change-Id: Ia4bd6bff33e86ee8b971031381d07b80bd383171
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Jaroslav Kysela <perex@perex.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2016-10-31 23:36:23 +11:00
Nick Desaulniers
c3f8d15467 fs: ext4: disable support for fallocate FALLOC_FL_PUNCH_HOLE
Bug: 28760453
Change-Id: I019c2de559db9e4b95860ab852211b456d78c4ca
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
2016-10-31 23:29:10 +11:00
Nick Desaulniers
f1c0c6df64 binder: prevent kptr leak by using %pK format specifier
Works in conjunction with kptr_restrict.
Bug: 30143283

Change-Id: I2b3ce22f4e206e74614d51453a1d59b7080ab05a
2016-10-31 23:26:31 +11:00
Arnaldo Carvalho de Melo
a84ab4f34b net: Fix use after free in the recvmmsg exit path
The syzkaller fuzzer hit the following use-after-free:

  Call Trace:
   [<ffffffff8175ea0e>] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:295
   [<ffffffff851cc31a>] __sys_recvmmsg+0x6fa/0x7f0 net/socket.c:2261
   [<     inline     >] SYSC_recvmmsg net/socket.c:2281
   [<ffffffff851cc57f>] SyS_recvmmsg+0x16f/0x180 net/socket.c:2270
   [<ffffffff86332bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a
  arch/x86/entry/entry_64.S:185

And, as Dmitry rightly assessed, that is because we can drop the
reference and then touch it when the underlying recvmsg calls return
some packets and then hit an error, which will make recvmmsg to set
sock->sk->sk_err, oops, fix it.

Reported-and-Tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Fixes: a2e2725541 ("net: Introduce recvmmsg socket syscall")
http://lkml.kernel.org/r/20160122211644.GC2470@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Change-Id: Ie3b6ee89ad3e8cd3a0fe8f50f74aaa4834d0b4ca
2016-10-31 23:25:23 +11:00
Weiyin Jiang
1301ad430e ASoC: msm: audio-effects: misc fixes in h/w accelerated
effect

Adding memory copy size check and integer overflow check in h/w
accelerated effect driver.

Change-Id: I17d4cc0a38770f0c5067fa8047cd63e7bf085e48
CRs-Fixed: 1006609
Signed-off-by: Weiyin Jiang <wjiang@codeaurora.org>
2016-10-31 23:20:27 +11:00
Karthikeyan Ramasubramanian
f6c251ef75 net: ipc_router: Bind only a client port as control port
IPC Router binds any port as a control port and moves it from the client
port list to control port list. Misbehaving clients can exploit this
incorrect behavior.

IPC Router to check if the port is a client port before binding it as a
control port.

CRs-Fixed: 974577
Change-Id: I9f189b76967d5f85750218a7cb6537d187a69663
Signed-off-by: Karthikeyan Ramasubramanian <kramasub@codeaurora.org>
2016-10-31 23:08:48 +11:00
Sunil Khatri
c7bc413d93 ashmem: Validate ashmem memory with fops pointer
Validate the ashmem memory entry against f_op pointer
rather then comparing its name with path of the dentry.

This is to avoid any invalid access to ashmem area in cases
where some one deliberately set the dentry name to /ashmem.

Change-Id: I74e50cd244f68cb13009cf2355e528485f4de34b
Signed-off-by: Sunil Khatri <sunilkh@codeaurora.org>
2016-10-31 23:03:53 +11:00
Florian Westphal
71db8fdf0f UPSTREAM: netfilter: x_tables: validate e->target_offset early
(cherry pick from commit bdf533de6968e9686df777dc178486f600c6e617)

We should check that e->target_offset is sane before
mark_source_chains gets called since it will fetch the target entry
for loop detection.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Change-Id: Ic2dbc31c9525d698e94d4d8875886acf3524abbd
Bug: 29637687
2016-10-31 23:01:48 +11:00
Florian Westphal
efeea2b956 netfilter: x_tables: make sure e->next_offset covers remaining blob size
Otherwise this function may read data beyond the ruleset blob.

Change-Id: I22f514057d3e0403d1af61f4d9555403ab9f72ea
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-10-31 23:00:43 +11:00
Scott Bauer
cd3f4552ef HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES commands
This patch validates the num_values parameter from userland during the
HIDIOCGUSAGES and HIDIOCSUSAGES commands. Previously, if the report id was set
to HID_REPORT_ID_UNKNOWN, we would fail to validate the num_values parameter
leading to a heap overflow.

Change-Id: I10866ee01c7ba430eab2b5cc3356c9519c7f9730
Cc: stable@vger.kernel.org
Signed-off-by: Scott Bauer <sbauer@plzdonthack.me>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-10-31 22:59:04 +11:00
Eric W. Biederman
61f1c9ad34 mnt: Fail collect_mounts when applied to unmounted mounts
The only users of collect_mounts are in audit_tree.c

In audit_trim_trees and audit_add_tree_rule the path passed into
collect_mounts is generated from kern_path passed an audit_tree
pathname which is guaranteed to be an absolute path.   In those cases
collect_mounts is obviously intended to work on mounted paths and
if a race results in paths that are unmounted when collect_mounts
it is reasonable to fail early.

The paths passed into audit_tag_tree don't have the absolute path
check.  But are used to play with fsnotify and otherwise interact with
the audit_trees, so again operating only on mounted paths appears
reasonable.

Avoid having to worry about what happens when we try and audit
unmounted filesystems by restricting collect_mounts to mounts
that appear in the mount tree.

Change-Id: I2edfee6d6951a2179ce8f53785b65ddb1eb95629
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-10-31 22:51:16 +11:00
Dan Carpenter
0f76e6ad56 KEYS: potential uninitialized variable
If __key_link_begin() failed then "edit" would be uninitialized.  I've
added a check to fix that.

Change-Id: I0e28bdba07f645437db2b08daf67ca27f16c6f5c
Fixes: f70e2e0619 ('KEYS: Do preallocation for __key_link()')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2016-10-31 22:49:16 +11:00
Vladis Dronov
cdd4896645 Input: aiptek - fix crash on detecting device without endpoints
The aiptek driver crashes in aiptek_probe() when a specially crafted USB
device without endpoints is detected. This fix adds a check that the device
has proper configuration expected by the driver. Also an error return value
is changed to more matching one in one of the error paths.

Change-Id: I02fa4ffcbe9a71948947ef5baeb72632688d9d07
Reported-by: Ralf Spenneberg <ralf@spenneberg.net>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-10-31 22:46:10 +11:00
Julia Lawall
45fcc17a90 Input: aiptek - adjust error-handling code label
At the point of this error-handling code, aiptek->urb has been allocated,
and it does not appear to be less necessary to free it here than in the
error-handling code just below.

Change-Id: I1b07d7cd62a3df78759dd5a9a5ad27e58350df01
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2016-10-31 22:45:54 +11:00
Benjamin Randazzo
7f5686b2df md: use kzalloc() when bitmap is disabled
commit b6878d9e03043695dbf3fa1caa6dfc09db225b16 upstream.

In drivers/md/md.c get_bitmap_file() uses kmalloc() for creating a
mdu_bitmap_file_t called "file".

5769         file = kmalloc(sizeof(*file), GFP_NOIO);
5770         if (!file)
5771                 return -ENOMEM;

This structure is copied to user space at the end of the function.

5786         if (err == 0 &&
5787             copy_to_user(arg, file, sizeof(*file)))
5788                 err = -EFAULT

But if bitmap is disabled only the first byte of "file" is initialized
with zero, so it's possible to read some bytes (up to 4095) of kernel
space memory from user space. This is an information leak.

5775         /* bitmap disabled, zero the first byte and copy out */
5776         if (!mddev->bitmap_info.file)
5777                 file->pathname[0] = '\0';

Change-Id: I7cd2a3c7fad2e2cb9edb8b4eff2af8a3a8f40149
Signed-off-by: Benjamin Randazzo <benjamin@randazzo.fr>
Signed-off-by: NeilBrown <neilb@suse.com>
[lizf: Backported to 3.4: fix both branches]
Signed-off-by: Zefan Li <lizefan@huawei.com>
2016-10-31 22:27:40 +11:00
Sabrina Dubroca
955be94706 net: add length argument to skb_copy_and_csum_datagram_iovec
Without this length argument, we can read past the end of the iovec in
memcpy_toiovec because we have no way of knowing the total length of the
iovec's buffers.

This is needed for stable kernels where 89c22d8c3b27 ("net: Fix skb
csum races when peeking") has been backported but that don't have the
ioviter conversion, which is almost all the stable trees <= 3.18.

This also fixes a kernel crash for NFS servers when the client uses
 -onfsvers=3,proto=udp to mount the export.

Change-Id: I1865e3d7a1faee42a5008a9ad58c4d3323ea4bab
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
2016-10-31 22:26:43 +11:00
Thierry Strudel
1bbb8f5128 flo: disable CONFIG_OABI_COMPAT
Bug: 28522518
Change-Id: I11ec8e02bdb330c10f06e923c1c3d45a145ced15
Signed-off-by: Thierry Strudel <tstrudel@google.com>
2016-10-29 23:12:41 +08:00
Siqi Lin
ba467f4b95 radio: iris: Remove FM radio driver from defconfig
FM radio is not used on flo.

Bug: 28769368
Bug: 28769546
Change-Id: Ice4c4cb66e7ea7b7e34efe125e29377f896e80f1
Signed-off-by: Siqi Lin <siqilin@google.com>
2016-10-29 23:12:41 +08:00
Mekala Natarajan
15042fc2ed flo: enable SECURITY_PERF_EVENTS_RESTRICT
Bug: 29119870
Change-Id: Ib9d25b69486bb34ea5749e1342453f3f7f3a2920
Signed-off-by: Mekala Natarajan <mnatarajan@google.com>
2016-10-29 23:12:41 +08:00
Rusty Russell
55efcf3ed3 PTR_RET is now PTR_ERR_OR_ZERO
True, it's often used in return statements, but after much bikeshedding
it's probably better to have an explicit name.

(I tried just putting the IS_ERR check inside PTR_ERR itself and gcc
usually generated no more code.  But that clashes current expectations
of how PTR_ERR behaves, so having a separate function is better).

Change-Id: I8379ab1f9517703e51c5a94b213ba9c66013f858
Suggested-by: Julia Lawall <julia.lawall@lip6.fr>
Suggested-by: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2016-10-29 23:12:41 +08:00
Arnd Bergmann
0639e5b7f3 Turn off -Wmaybe-uninitialized when building with -Os
gcc-4.7 and higher add a lot of false positive warnings about
potential uses of uninitialized warnings, but only when optimizing
for size (-Os). This is the default when building allyesconfig,
which turns on CONFIG_CC_OPTIMIZE_FOR_SIZE.

In order to avoid getting a lot of patches that initialize such
variables and accidentally hide real errors along the way, let's
just turn off this warning on the respective gcc versions
when building with size optimizations. The -Wmaybe-uninitialized
option was introduced in the same gcc version (4.7) that is now
causing the false positives, so there is no effect on older compilers.

A side effect is that when building with CONFIG_CC_OPTIMIZE_FOR_SIZE,
we might now see /fewer/ warnings about possibly uninitialized
warnings than with -O2, but that is still much better than seeing
warnings known to be bogus.

Change-Id: I5a863923ad347526e6dc58c0bbbc259eb20bfe5c
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Git-commit: e74fc973b6
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Bryan Huntsman <bryanh@codeaurora.org>
2016-10-29 23:12:41 +08:00
Stephen Smalley
12d3c9a750 UPSTREAM: selinux: fix bug in conditional rules handling
(cherry picked from commit commit f3bef67992e8698897b584616535803887c4a73e)

commit fa1aa143ac4a ("selinux: extended permissions for ioctls")
introduced a bug into the handling of conditional rules, skipping the
processing entirely when the caller does not provide an extended
permissions (xperms) structure.  Access checks from userspace using
/sys/fs/selinux/access do not include such a structure since that
interface does not presently expose extended permission information.
As a result, conditional rules were being ignored entirely on userspace
access requests, producing denials when access was allowed by
conditional rules in the policy.  Fix the bug by only skipping
computation of extended permissions in this situation, not the entire
conditional rules processing.

Change-Id: I6f81765f6cdb9ce72f93c290d7987d93688651a0
Reported-by: Laurent Bigonville <bigon@debian.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: fixed long lines in patch description]
Cc: stable@vger.kernel.org # 4.3
Signed-off-by: Paul Moore <pmoore@redhat.com>
2016-10-29 23:12:41 +08:00
Jeff Vander Stoep
cff5f5bf9b selinux: Android kernel compatibility with M userspace
NOT intended for new Android devices - this commit is unnecessary
for a target device that does not have a previous M variant.

DO NOT upstream. Android only.

Motivation:

This commit mitigates a mismatch between selinux kernel and
selinux userspace. The selinux ioctl white-listing binary policy
format that was accepted into Android M differs slightly from what
was later accepted into the upstream kernel. This leaves Android
master branch kernels incompatible with Android M releases. This
patch restores backwards compatibility. This is important because:

1. kernels may be updated on a different cycle than the rest of the
   OS e.g. security patching.
2. Android M bringup may still be ongoing for some devices. The
   same kernel should work for both M and master.

Backwards compatibility is achieved by checking for an Android M
policy characteristic during initial policy read and converting to
upstream policy format. The inverse conversion is done for policy
write as required for CTS testing.

Bug: 22846070
Change-Id: I2f1ee2eee402f37cf3c9df9f9e03c1b9ddec1929
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
2016-10-29 23:12:40 +08:00
Jeff Vander Stoep
1e0918b3e9 selinux: extended permissions for ioctls
(cherry picked from commit fa1aa143ac4a682c7f5fd52a3cf05f5a6fe44a0a)

Add extended permissions logic to selinux. Extended permissions
provides additional permissions in 256 bit increments. Extend the
generic ioctl permission check to use the extended permissions for
per-command filtering. Source/target/class sets including the ioctl
permission may additionally include a set of commands. Example:

allowxperm <source> <target>:<class> ioctl unpriv_app_socket_cmds
auditallowxperm <source> <target>:<class> ioctl priv_gpu_cmds

Where unpriv_app_socket_cmds and priv_gpu_cmds are macros
representing commonly granted sets of ioctl commands.

When ioctl commands are omitted only the permissions are checked.
This feature is intended to provide finer granularity for the ioctl
permission that may be too imprecise. For example, the same driver
may use ioctls to provide important and benign functionality such as
driver version or socket type as well as dangerous capabilities such
as debugging features, read/write/execute to physical memory or
access to sensitive data. Per-command filtering provides a mechanism
to reduce the attack surface of the kernel, and limit applications
to the subset of commands required.

The format of the policy binary has been modified to include ioctl
commands, and the policy version number has been incremented to
POLICYDB_VERSION_XPERMS_IOCTL=30 to account for the format
change.

The extended permissions logic is deliberately generic to allow
components to be reused e.g. netlink filters

Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Nick Kralevich <nnk@google.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Bug: 22846070
Change-Id: I299dc776d2f98d593ecc051707110c92a085350f
2016-10-29 23:12:40 +08:00
Jeff Vander Stoep
baaef70cff selinux: remove unnecessary pointer reassignment
(cherry pick from commit 83d4a806ae46397f606de7376b831524bd3a21e5)

Commit f01e1af445 ("selinux: don't pass in NULL avd to avc_has_perm_noaudit")
made this pointer reassignment unnecessary. Avd should continue to reference
the stack-based copy.

Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: tweaked subject line]
Signed-off-by: Paul Moore <pmoore@redhat.com>
Bug: 22846070
Change-Id: Ie33688d163870705272607309a27fb7c8f870748
2016-10-29 23:12:40 +08:00
Jeff Vander Stoep
2db0d24e13 Revert "SELinux: per-command whitelisting of ioctls"
This reverts commit bc84b4adb1469e3d05ad76c304a4c545feaf1f88.

Bug: 22846070
Change-Id: Ib4cb130b2225ea2e22556ff852313e0de7dddcab
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
2016-10-29 23:12:40 +08:00
Jeff Vander Stoep
2dc8579bce Revert "SELinux: use deletion-safe iterator to free list"
This reverts commit c9a8571249fa3a55a0490bd571eaf0cea097fab0.

Bug: 22846070
Change-Id: I85e2b6322f98bd584ed523b0bd0291375dbc35dc
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
2016-10-29 23:12:40 +08:00
Jeff Vander Stoep
ff628d0d4b Revert "SELinux: ss: Fix policy write for ioctl operations"
This reverts commit c06168226f5eaaaad93af5b2811f213b01382363.

Bug: 22846070
Change-Id: I665c1f2350e10ce890e7c4be1a06e666929d5d7a
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
2016-10-29 23:12:40 +08:00
dcashman
1ebb3a40bf FROMLIST: arm: mm: support ARCH_MMAP_RND_BITS.
(cherry picked from commit https://lkml.org/lkml/2015/12/21/341)

arm: arch_mmap_rnd() uses a hard-code value of 8 to generate the
random offset for the mmap base address.  This value represents a
compromise between increased ASLR effectiveness and avoiding
address-space fragmentation. Replace it with a Kconfig option, which
is sensibly bounded, so that platform developers may choose where to
place this compromise. Keep 8 as the minimum acceptable value.

Bug: 24047224
Signed-off-by: Daniel Cashman <dcashman@android.com>
Signed-off-by: Daniel Cashman <dcashman@google.com>
Change-Id: I2f6c18a0060e1c21b53200ecdcfde9a8c2e3db98
2016-10-29 23:12:40 +08:00
dcashman
dcc94e7ac7 FROMLIST: mm: mmap: Add new /proc tunable for mmap_base ASLR.
(cherry picked from commit https://lkml.org/lkml/2015/12/21/337)

ASLR  only uses as few as 8 bits to generate the random offset for the
mmap base address on 32 bit architectures. This value was chosen to
prevent a poorly chosen value from dividing the address space in such
a way as to prevent large allocations. This may not be an issue on all
platforms. Allow the specification of a minimum number of bits so that
platforms desiring greater ASLR protection may determine where to place
the trade-off.

Bug: 24047224
Signed-off-by: Daniel Cashman <dcashman@android.com>
Signed-off-by: Daniel Cashman <dcashman@google.com>
Change-Id: Ic74424e07710cd9ccb4a02871a829d14ef0cc4bc
2016-10-29 23:12:40 +08:00
Paul E. McKenney
2f8470379e cpu: Handle smpboot_unpark_threads() uniformly
Commit 00df35f99191 (cpu: Defer smpboot kthread unparking until CPU known
to scheduler) put the online path's call to smpboot_unpark_threads()
into a CPU-hotplug notifier.  This commit places the offline-failure
paths call into the same notifier for the sake of uniformity.

Note that it is not currently possible to place the offline path's call to
smpboot_park_threads() into an existing notifier because the CPU_DYING
notifiers run in a restricted environment, and the CPU_UP_PREPARE
notifiers run too soon.

Change-Id: I27fff8de4ec3f0193c1c9cf9ccb5701d08faa1ca
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-10-29 23:12:40 +08:00
Paul E. McKenney
b7703ca834 cpu: Defer smpboot kthread unparking until CPU known to scheduler
Currently, smpboot_unpark_threads() is invoked before the incoming CPU
has been added to the scheduler's runqueue structures.  This might
potentially cause the unparked kthread to run on the wrong CPU, since the
correct CPU isn't fully set up yet.

That causes a sporadic, hard to debug boot crash triggering on some
systems, reported by Borislav Petkov, and bisected down to:

  2a442c9c6453 ("x86: Use common outgoing-CPU-notification code")

This patch places smpboot_unpark_threads() in a CPU hotplug
notifier with priority set so that these kthreads are unparked just after
the CPU has been added to the runqueues.

Change-Id: I8921987de9c2a2f475cc63dc82662d6ebf6e8725
Reported-and-tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Git-commit: 00df35f991914db6b8bde8cf09808e19a9cffc3d
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Matt Wagantall <mattw@codeaurora.org>
2016-10-29 23:12:40 +08:00
Vignesh Radhakrishnan
061336f2bb smpboot: use kmemleak_not_leak for smpboot_thread_data
Kmemleak reports the following memory leak :

    [<ffffffc0002faef8>] create_object+0x140/0x274
    [<ffffffc000cc3598>] kmemleak_alloc+0x80/0xbc
    [<ffffffc0002f707c>] kmem_cache_alloc_trace+0x148/0x1d8
    [<ffffffc00024504c>] __smpboot_create_thread.part.2+0x2c/0xec
    [<ffffffc0002452b4>] smpboot_register_percpu_thread+0x90/0x118
    [<ffffffc0016067c0>] spawn_ksoftirqd+0x1c/0x30
    [<ffffffc000200824>] do_one_initcall+0xb0/0x14c
    [<ffffffc001600820>] kernel_init_freeable+0x84/0x1e0
    [<ffffffc000cc273c>] kernel_init+0x10/0xcc
    [<ffffffc000203bbc>] ret_from_fork+0xc/0x50

This memory allocated here points to smpboot_thread_data.
Data is used as an argument for this kthread.

This will be used when smpboot_thread_fn runs. Therefore,
is not a leak.

Call kmemleak_not_leak for smpboot_thread_data pointer
to ensure that kmemleak doesn't report it as a memory
leak.

Change-Id: I02b0a7debea3907b606856e069d63d7991b67cd9
Signed-off-by: Vignesh Radhakrishnan <vigneshr@codeaurora.org>
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
2016-10-29 23:12:40 +08:00
Lai Jiangshan
13ad0c7bc0 smpboot: Add missing get_online_cpus() in smpboot_register_percpu_thread()
The following race exists in the smpboot percpu threads management:

CPU0	      	   	     CPU1
cpu_up(2)
  get_online_cpus();
  smpboot_create_threads(2);
			     smpboot_register_percpu_thread();
			     for_each_online_cpu();
			       __smpboot_create_thread();
  __cpu_up(2);

This results in a missing per cpu thread for the newly onlined cpu2 and
in a NULL pointer dereference on a consecutive offline of that cpu.

Proctect smpboot_register_percpu_thread() with get_online_cpus() to
prevent that.

[ tglx: Massaged changelog and removed the change in
        smpboot_unregister_percpu_thread() because that's an
        optimization and therefor not stable material. ]

Change-Id: I8c92a64bf35c3e77c8dd81761e9c8f71b2f94817
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1406777421-12830-1-git-send-email-laijs@cn.fujitsu.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:40 +08:00
Subbaraman Narayanamurthy
c9e9aa8c95 kthread: Fix the race condition when kthread is parked
While stressing the CPU hotplug path, sometimes we hit a problem
as shown below.

[57056.416774] ------------[ cut here ]------------
[57056.489232] ksoftirqd/1 (14): undefined instruction: pc=c01931e8
[57056.489245] Code: e594a000 eb085236 e15a0000 0a000000 (e7f001f2)
[57056.489259] ------------[ cut here ]------------
[57056.492840] kernel BUG at kernel/kernel/smpboot.c:134!
[57056.513236] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[57056.519055] Modules linked in: wlan(O) mhi(O)
[57056.523394] CPU: 0 PID: 14 Comm: ksoftirqd/1 Tainted: G        W  O 3.10.0-g3677c61-00008-g180c060 #1
[57056.532595] task: f0c8b000 ti: f0e78000 task.ti: f0e78000
[57056.537991] PC is at smpboot_thread_fn+0x124/0x218
[57056.542750] LR is at smpboot_thread_fn+0x11c/0x218
[57056.547528] pc : [<c01931e8>]    lr : [<c01931e0>]    psr: 200f0013
[57056.547528] sp : f0e79f30  ip : 00000000  fp : 00000000
[57056.558983] r10: 00000001  r9 : 00000000  r8 : f0e78000
[57056.564192] r7 : 00000001  r6 : c1195758  r5 : f0e78000  r4 : f0e5fd00
[57056.570701] r3 : 00000001  r2 : f0e79f20  r1 : 00000000  r0 : 00000000

This issue was always seen in the context of "ksoftirqd". It seems to
be happening because of a potential race condition in __kthread_parkme
where just after completing the parked completion, before the
ksoftirqd task has been scheduled again, it can go into running state.

Fix this by waiting for the task state to parked after waiting the
parked completion.

CRs-Fixed: 659674
Change-Id: If3f0e9b706eeb5d30d5a32f84378d35bb03fe794
Signed-off-by: Subbaraman Narayanamurthy <subbaram@codeaurora.org>
2016-10-29 23:12:39 +08:00
Thomas Gleixner
34ba3c370d kthread: Prevent unpark race which puts threads on the wrong cpu
The smpboot threads rely on the park/unpark mechanism which binds per
cpu threads on a particular core. Though the functionality is racy:

CPU0	       	 	CPU1  	     	    CPU2
unpark(T)				    wake_up_process(T)
  clear(SHOULD_PARK)	T runs
			leave parkme() due to !SHOULD_PARK
  bind_to(CPU2)		BUG_ON(wrong CPU)

We cannot let the tasks move themself to the target CPU as one of
those tasks is actually the migration thread itself, which requires
that it starts running on the target cpu right away.

The solution to this problem is to prevent wakeups in park mode which
are not from unpark(). That way we can guarantee that the association
of the task to the target cpu is working correctly.

Add a new task state (TASK_PARKED) which prevents other wakeups and
use this state explicitly for the unpark wakeup.

Peter noticed: Also, since the task state is visible to userspace and
all the parked tasks are still in the PID space, its a good hint in ps
and friends that these tasks aren't really there for the moment.

The migration thread has another related issue.

CPU0	      	     	 CPU1
Bring up CPU2
create_thread(T)
park(T)
 wait_for_completion()
			 parkme()
			 complete()
sched_set_stop_task()
			 schedule(TASK_PARKED)

The sched_set_stop_task() call is issued while the task is on the
runqueue of CPU1 and that confuses the hell out of the stop_task class
on that cpu. So we need the same synchronizaion before
sched_set_stop_task().

Change-Id: I9ad6fbe65992ad5b5cb9a252470a56ec51a4ff4f
Reported-by: Dave Jones <davej@redhat.com>
Reported-and-tested-by: Dave Hansen <dave@sr71.net>
Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Acked-by: Peter Ziljstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: dhillf@gmail.com
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304091635430.21884@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Thomas Gleixner
abd57a700e stop_machine: Mark per cpu stopper enabled early
commit 14e568e78 (stop_machine: Use smpboot threads) introduced the
following regression:

Before this commit the stopper enabled bit was set in the online
notifier.

CPU0				CPU1
cpu_up
				cpu online
hotplug_notifier(ONLINE)
  stopper(CPU1)->enabled = true;
...
stop_machine()

The conversion to smpboot threads moved the enablement to the wakeup
path of the parked thread. The majority of users seem to have the
following working order:

CPU0				CPU1
cpu_up
				cpu online
unpark_threads()
  wakeup(stopper[CPU1])
....
				stopper thread runs
				  stopper(CPU1)->enabled = true;
stop_machine()

But Konrad and Sander have observed:

CPU0				CPU1
cpu_up
				cpu online
unpark_threads()
  wakeup(stopper[CPU1])
....
stop_machine()
				stopper thread runs
				  stopper(CPU1)->enabled = true;

Now the stop machinery kicks CPU0 into the stop loop, where it gets
stuck forever because the queue code saw stopper(CPU1)->enabled ==
false, so CPU0 waits for CPU1 to enter stomp_machine, but the CPU1
stopper work got discarded due to enabled == false.

Add a pre_unpark function to the smpboot thread descriptor and call it
before waking the thread.

This fixes the problem at hand, but the stop_machine code should be
more robust. The stopper->enabled flag smells fishy at best.

Thanks to Konrad for going through a loop of debug patches and
providing the information to decode this issue.

Change-Id: I636875cf71ea5c5315eb0eb8599a8ebb9eadabf8
Reported-and-tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reported-and-tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1302261843240.22263@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Thomas Gleixner
9b73a14959 stop_machine: Use smpboot threads
Use the smpboot thread infrastructure. Mark the stopper thread
selfparking and park it after it has finished the take_cpu_down()
work.

Change-Id: If478c40c32a6a1c41c6f73da422d0f2401ee8d17
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Arjan van de Veen <arjan@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Richard Weinberger <rw@linutronix.de>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130131120741.686315164@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Thomas Gleixner
87943c7833 stop_machine: Store task reference in a separate per cpu variable
To allow the stopper thread being managed by the smpboot thread
infrastructure separate out the task storage from the stopper data
structure.

Change-Id: I5270d00c93125618ddde65c1d753c9623740d184
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Arjan van de Veen <arjan@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Richard Weinberger <rw@linutronix.de>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130131120741.626690384@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Thomas Gleixner
738c3ffd5f smpboot: Allow selfparking per cpu threads
The stop machine threads are still killed when a cpu goes offline. The
reason is that the thread is used to bring the cpu down, so it can't
be parked along with the other per cpu threads.

Allow a per cpu thread to be excluded from automatic parking, so it
can park itself once it's done

Add a create callback function as well.

Change-Id: I6c7496b9da7984cfd513d2e7ee681f0df3206c26
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Arjan van de Veen <arjan@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Richard Weinberger <rw@linutronix.de>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130131120741.553993267@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Thomas Gleixner
a73b106a40 softirq: Use hotplug thread infrastructure
[ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]

Change-Id: I87f9f3a3fb1856f0d5c5712be39b4c2c6ecd468f
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20120716103948.456416747@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Paul E. McKenney
c93d48c1a3 rcu: Use smp_hotplug_thread facility for RCUs per-CPU kthread
Bring RCU into the new-age CPU-hotplug fold by modifying RCU's per-CPU
kthread code to use the new smp_hotplug_thread facility.

[ tglx: Adapted it to use callbacks and to the simplified rcu yield ]

Change-Id: I55d4c5448eb4ea05debb75c3442316ce45728647
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20120716103948.673354828@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00