android_kernel_samsung_msm8976/kernel
Peter Zijlstra 0bab0a32f6 sched/core: Fix TASK_DEAD race in finish_task_switch()
commit 95913d97914f44db2b81271c2e2ebd4d2ac2df83 upstream.

So the problem this patch is trying to address is as follows:

        CPU0                            CPU1

        context_switch(A, B)
                                        ttwu(A)
                                          LOCK A->pi_lock
                                          A->on_cpu == 0
        finish_task_switch(A)
          prev_state = A->state  <-.
          WMB                      |
          A->on_cpu = 0;           |
          UNLOCK rq0->lock         |
                                   |    context_switch(C, A)
                                   `--  A->state = TASK_DEAD
          prev_state == TASK_DEAD
            put_task_struct(A)
                                        context_switch(A, C)
                                        finish_task_switch(A)
                                          A->state == TASK_DEAD
                                            put_task_struct(A)

The argument being that the WMB will allow the load of A->state on CPU0
to cross over and observe CPU1's store of A->state, which will then
result in a double-drop and use-after-free.
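
The double-drop can be illustrated with a minimal user-space sketch. This is not kernel code: `struct sk_ref`, `sk_put()` and `sk_double_drop_demo()` are hypothetical names, and a single reference stands in for the one that finish_task_switch() drops for a dead task. The object lives on the stack, so the second drop only underflows the counter; in the kernel it would touch memory that has already been freed.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a reference-counted object such as a task. */
struct sk_ref {
    _Atomic int usage;  /* references currently held */
    int freed;          /* how many times the "free" path ran */
};

/* Drop one reference; whoever takes usage to zero "frees" the object.
 * Returns the count left after the drop. */
static int sk_put(struct sk_ref *r)
{
    int left = atomic_fetch_sub_explicit(&r->usage, 1,
                                         memory_order_acq_rel) - 1;
    if (left == 0)
        r->freed++;
    return left;
}

/* Both CPUs believe they saw the dead state and each drops the same
 * single reference. Returns 1 if the second drop underflowed. */
static int sk_double_drop_demo(void)
{
    struct sk_ref a = { .freed = 0 };
    atomic_init(&a.usage, 1);

    int first = sk_put(&a);   /* legitimate drop: count reaches 0 */
    int second = sk_put(&a);  /* second drop: count goes negative */

    assert(a.freed == 1);
    return first == 0 && second == -1;
}
```

The second call models the use-after-free: by then the count has already reached zero once and the object has been released.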

Now the comment states (and this was true once upon a long time ago)
that we need to observe A->state while holding rq->lock because that
will order us against the wakeup; however the wakeup will not in fact
acquire (that) rq->lock; it takes A->pi_lock these days.

We can obviously fix this by upgrading the WMB to an MB, but that is
expensive, so we'd rather avoid that.

The alternative this patch takes is: smp_store_release(&A->on_cpu, 0),
which avoids the MB on some archs, but not important ones like ARM.
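
The ordering that the release store provides can be sketched with C11 atomics as a rough user-space analogue. This is an assumption-laden illustration, not the patch itself: `struct sk_task`, the `SK_TASK_*` values and the `sk_*` helpers are hypothetical, and `memory_order_release` stands in for smp_store_release(). A release store keeps the earlier load of the state from being reordered after the store that clears on_cpu.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical task states; the values are illustrative only. */
enum { SK_TASK_RUNNING = 0, SK_TASK_DEAD = 1 };

/* Hypothetical stand-in for the two task fields discussed above. */
struct sk_task {
    _Atomic int state;
    _Atomic int on_cpu;
};

/* Models the tail of finish_task_switch(): sample the state first,
 * then clear on_cpu with release semantics so the earlier load cannot
 * be reordered past the store. */
static int sk_finish_task_switch(struct sk_task *prev)
{
    int prev_state = atomic_load_explicit(&prev->state,
                                          memory_order_relaxed);
    atomic_store_explicit(&prev->on_cpu, 0, memory_order_release);
    return prev_state;
}

/* Models the waker's check that the task has left its old CPU. */
static int sk_task_off_cpu(struct sk_task *p)
{
    return atomic_load_explicit(&p->on_cpu, memory_order_acquire) == 0;
}

/* Single-threaded walk through the sequence; returns the sampled state. */
static int sk_demo(void)
{
    struct sk_task a;
    atomic_init(&a.state, SK_TASK_RUNNING);
    atomic_init(&a.on_cpu, 1);

    assert(!sk_task_off_cpu(&a));          /* still on its CPU */
    int sampled = sk_finish_task_switch(&a);
    assert(sk_task_off_cpu(&a));           /* now visible as off-CPU */
    return sampled;
}
```

With this ordering, a waker that observes on_cpu == 0 and then lets the task run and die cannot have its later store of the dead state observed by the earlier sample on the other CPU.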

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: manfred@colorfullife.com
Cc: will.deacon@arm.com
Fixes: e4a52bcb9a ("sched: Remove rq->lock from the first half of ttwu()")
Link: http://lkml.kernel.org/r/20150929124509.GG3816@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
2019-07-27 22:09:34 +02:00
cpu sched/idle: Add missing checks to the exit condition of cpu_idle_poll() 2019-07-27 21:50:45 +02:00
debug mm: per-thread vma caching 2019-07-27 22:08:06 +02:00
events perf/core: Fix perf_pmu_unregister() locking 2019-07-27 21:53:14 +02:00
gcov
irq This is the 3.10.99 stable release 2017-04-18 17:17:46 +02:00
locking locking/lockdep: Use for_each_process_thread() for debug_show_all_locks() 2019-07-27 22:09:22 +02:00
power PM: convert do_each_thread to for_each_process_thread 2019-07-27 22:09:18 +02:00
rcu
sched sched/core: Fix TASK_DEAD race in finish_task_switch() 2019-07-27 22:09:34 +02:00
time nohz: Fix local_timer_softirq_pending() 2019-07-27 21:52:58 +02:00
trace ring-buffer: Allow for rescheduling when removing pages 2019-07-27 21:51:54 +02:00
.gitignore
acct.c
async.c kernel/async.c: revert "async: simplify lowest_in_progress()" 2019-07-27 21:49:48 +02:00
audit.c BACKPORT: audit: consistently record PIDs with task_tgid_nr() 2019-07-27 21:50:56 +02:00
audit.h Import latest Samsung release 2017-04-18 03:43:52 +02:00
audit_tree.c
audit_watch.c audit: fix use-after-free in audit_add_watch 2019-07-27 21:51:40 +02:00
auditfilter.c
auditsc.c BACKPORT: audit: consistently record PIDs with task_tgid_nr() 2019-07-27 21:50:56 +02:00
backtracetest.c
bounds.c
capability.c
cgroup.c cgroup: prefer %pK to %p 2016-12-06 09:24:09 -08:00
cgroup_freezer.c
compat.c
configs.c
context_tracking.c
cpu.c cpu: send KOBJ_ONLINE event when enabling cpus 2017-07-24 01:09:04 -07:00
cpu_pm.c
cpuset.c sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs 2019-07-27 22:08:19 +02:00
crash_dump.c
cred.c
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c exit: fix race between wait_consider_task() and wait_task_zombie() 2019-07-27 22:09:34 +02:00
extable.c kernel/extable.c: mark core_kernel_text notrace 2019-07-27 21:44:25 +02:00
fork.c memcg: kill CONFIG_MM_OWNER 2019-07-27 22:09:28 +02:00
freezer.c freezer: set PF_SUSPEND_TASK flag on tasks that call freeze_processes 2019-07-27 22:09:18 +02:00
futex.c futex: Ensure that futex address is aligned in handle_futex_death() 2019-07-27 22:08:53 +02:00
futex_compat.c
groups.c kernel: make groups_sort calling a responsibility group_info allocators 2019-07-27 21:46:18 +02:00
hrtimer.c hrtimer: Ensure POSIX compliance (relative CLOCK_REALTIME hrtimers) 2019-07-27 21:49:51 +02:00
hung_task.c kernel/hung_task.c: change hung_task.c to use for_each_process_thread() 2019-07-27 22:09:21 +02:00
irq_work.c
itimer.c
jump_label.c
kallsyms.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks Import latest Samsung release 2017-04-18 03:43:52 +02:00
Kconfig.preempt
kexec.c
kmod.c
kprobes.c
ksysfs.c
kthread.c
latencytop.c
Makefile
modsign_pubkey.c
module-internal.h
module.c module: Invalidate signatures on force-loaded modules 2019-07-27 21:42:00 +02:00
module_signing.c
notifier.c
nsproxy.c
padata.c padata: avoid race in reordering 2019-07-27 21:44:05 +02:00
panic.c printk: do cond_resched() between lines while outputting to consoles 2019-07-27 21:41:46 +02:00
params.c kernel/params.c: align add_sysfs_param documentation with code 2019-07-27 21:45:35 +02:00
pid.c BACKPORT: FROMLIST: pids: make task_tgid_nr_ns() safe 2018-05-26 00:39:33 +02:00
pid_namespace.c
posix-cpu-timers.c posix-timers: Sanitize overrun handling 2019-07-27 21:53:21 +02:00
posix-timers.c posix-timers: Sanitize overrun handling 2019-07-27 21:53:21 +02:00
printk.c printk: use rcuidle console tracepoint 2019-07-27 21:44:09 +02:00
profile.c
ptrace.c ptrace: change __ptrace_unlink() to clear ->ptrace under ->siglock 2019-07-27 21:45:46 +02:00
range.c
relay.c kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE 2019-07-27 21:49:13 +02:00
res_counter.c
resource.c /proc/iomem: only expose physical resource addresses to privileged users 2019-07-27 22:05:58 +02:00
seccomp.c UPSTREAM: seccomp: always propagate NO_NEW_PRIVS on tsync 2019-07-27 21:51:01 +02:00
signal.c signals: mv {dis,}allow_signal() from sched.h/exit.c to signal.[ch] 2019-07-27 22:09:29 +02:00
smp.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
smpboot.c
smpboot.h
softirq.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
stacktrace.c
stop_machine.c
sys.c exit.c: unexport __set_special_pids() 2019-07-27 22:09:28 +02:00
sys_ni.c
sysctl.c pipe: reject F_SETPIPE_SZ with size over UINT_MAX 2019-07-27 21:49:46 +02:00
sysctl_binary.c
system_certificates.S
system_keyring.c
task_work.c
taskstats.c
test_kprobes.c
time.c time: Make sure jiffies_to_msecs() preserves non-zero time periods 2019-07-27 21:52:48 +02:00
timeconst.bc
timer.c timers: Use proper base migration in add_timer_on() 2019-07-27 21:42:23 +02:00
tracepoint.c tracing: syscall_regfunc() should not skip kernel threads 2019-07-27 22:09:17 +02:00
tsacct.c
uid16.c kernel: make groups_sort calling a responsibility group_info allocators 2019-07-27 21:46:18 +02:00
up.c
user-return-notifier.c
user.c
user_namespace.c userns: move user access out of the mutex 2019-07-27 21:51:26 +02:00
utsname.c
utsname_sysctl.c
watchdog.c
workqueue.c workqueue: trigger WARN if queue_delayed_work() is called with NULL @wq 2019-07-27 21:45:23 +02:00
workqueue_internal.h