android_kernel_samsung_msm8976/kernel
Balasubramani Vivekanandan 4bf0441ae5 tick: broadcast-hrtimer: Fix a race in bc_set_next
commit b9023b91dd020ad7e093baa5122b6968c48cc9e0 upstream.

When a cpu requests broadcasting, before starting the tick broadcast
hrtimer, bc_set_next() checks if the timer callback (bc_handler) is active
using hrtimer_try_to_cancel(). But hrtimer_try_to_cancel() does not provide
the required synchronization when the callback is active on another core.

The callback could have already executed tick_handle_oneshot_broadcast()
and could have also returned. But still there is a small time window where
the hrtimer_try_to_cancel() returns -1. In that case bc_set_next() returns
without doing anything, but the next_event of the tick broadcast clock
device is already set to a timeout value.

In the race condition diagram below, CPU #1 is running the timer callback
and CPU #2 is entering idle state and so calls bc_set_next().

In the worst case, the next_event will contain an expiry time but the
hrtimer will not be started, which happens when the racing callback
returns HRTIMER_NORESTART. The hrtimer might never recover if all further
requests from the CPUs to subscribe to tick broadcast have a timeout
greater than the next_event of the tick broadcast clock device. This leads
to a cascade of failures that is finally noticed as RCU stall warnings.

Here is a depiction of the race condition:

CPU #1 (Running timer callback)                   CPU #2 (Enter idle
                                                  and subscribe to
                                                  tick broadcast)
---------------------                             ---------------------

__run_hrtimer()                                   tick_broadcast_enter()

  bc_handler()                                      __tick_broadcast_oneshot_control()

    tick_handle_oneshot_broadcast()

      raw_spin_lock(&tick_broadcast_lock);

      dev->next_event = KTIME_MAX;                  //wait for tick_broadcast_lock
      // next_event for tick broadcast clock
      // set to KTIME_MAX since no other cores
      // subscribed to tick broadcasting

      raw_spin_unlock(&tick_broadcast_lock);

    if (dev->next_event == KTIME_MAX)
      return HRTIMER_NORESTART;
    // callback function exits without
    // restarting the hrtimer                      //tick_broadcast_lock acquired
                                                   raw_spin_lock(&tick_broadcast_lock);

                                                   tick_broadcast_set_event()

                                                     clockevents_program_event()

                                                       dev->next_event = expires;

                                                       bc_set_next()

                                                         hrtimer_try_to_cancel()
                                                         // returns -1 since the timer
                                                         // callback is active. Exits without
                                                         // restarting the timer
  cpu_base->running = NULL;

The comment that the hrtimer cannot be armed from within the callback is
wrong. It is fine to start the hrtimer from within the callback. It is also
safe to start the hrtimer from the enter/exit idle code while the broadcast
handler is active, because the enter/exit idle code and the broadcast
handler are synchronized using tick_broadcast_lock. So there is no need for
the existing try-to-cancel logic. All of it can be removed, which
eliminates the race condition as well.

Fixes: 5d1638acb9f6 ("tick: Introduce hrtimer based broadcast")
Change-Id: I4f5d95fad77d252df9334c2bbf997342ecc19d41
Originally-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Balasubramani Vivekanandan <balasubramani_vivekanandan@mentor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190926135101.12102-2-balasubramani_vivekanandan@mentor.com
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2019-12-21 20:02:36 +01:00
cpu sched/idle: Add missing checks to the exit condition of cpu_idle_poll() 2019-07-27 21:50:45 +02:00
debug mm: per-thread vma caching 2019-07-27 22:08:06 +02:00
events perf/core: Fix impossible ring-buffer sizes warning 2019-07-27 22:10:22 +02:00
gcov
irq This is the 3.10.99 stable release 2017-04-18 17:17:46 +02:00
locking locking/lockdep: Use for_each_process_thread() for debug_show_all_locks() 2019-07-27 22:09:22 +02:00
power workqueues: Introduce new flag WQ_POWER_EFFICIENT for power oriented workqueues 2019-07-27 22:11:01 +02:00
rcu rcu: Do RCU GP kthread self-wakeup from softirq and interrupt 2019-07-27 22:11:18 +02:00
sched sched/rt: Reduce rq lock contention by eliminating locking of non-feasible target 2019-07-27 22:11:08 +02:00
time tick: broadcast-hrtimer: Fix a race in bc_set_next 2019-12-21 20:02:36 +01:00
trace trace: Fix preempt_enable_no_resched() abuse 2019-07-27 22:10:40 +02:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.locks Import latest Samsung release 2017-04-18 03:43:52 +02:00
Kconfig.preempt
Makefile UPSTREAM: KEYS: Separate the kernel signature checking keyring from module signing 2016-05-18 14:36:10 +05:30
acct.c
async.c kernel/async.c: revert "async: simplify lowest_in_progress()" 2019-07-27 21:49:48 +02:00
audit.c BACKPORT: audit: consistently record PIDs with task_tgid_nr() 2019-07-27 21:50:56 +02:00
audit.h Import latest Samsung release 2017-04-18 03:43:52 +02:00
audit_tree.c
audit_watch.c audit: fix use-after-free in audit_add_watch 2019-07-27 21:51:40 +02:00
auditfilter.c
auditsc.c BACKPORT: audit: consistently record PIDs with task_tgid_nr() 2019-07-27 21:50:56 +02:00
backtracetest.c
bounds.c
capability.c
cgroup.c cgroup: prefer %pK to %p 2016-12-06 09:24:09 -08:00
cgroup_freezer.c
compat.c
configs.c
context_tracking.c
cpu.c cpu: send KOBJ_ONLINE event when enabling cpus 2017-07-24 01:09:04 -07:00
cpu_pm.c
cpuset.c sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs 2019-07-27 22:08:19 +02:00
crash_dump.c
cred.c
delayacct.c
dma.c
elfcore.c
exec_domain.c ANDROID: exec_domains: Disable request_module() call for personalities 2016-05-18 14:34:40 +05:30
exit.c wait/ptrace: assume __WALL if the child is traced 2019-07-27 22:09:56 +02:00
extable.c kernel/extable.c: mark core_kernel_text notrace 2019-07-27 21:44:25 +02:00
fork.c fork: Allow CLONE_PARENT after setns(CLONE_NEWPID) 2019-07-27 22:10:35 +02:00
freezer.c freezer: set PF_SUSPEND_TASK flag on tasks that call freeze_processes 2019-07-27 22:09:18 +02:00
futex.c futex: Ensure that futex address is aligned in handle_futex_death() 2019-07-27 22:08:53 +02:00
futex_compat.c ptrace: use fsuid, fsgid, effective creds for fs access checks 2016-02-25 11:57:47 -08:00
groups.c kernel: make groups_sort calling a responsibility group_info allocators 2019-07-27 21:46:18 +02:00
hrtimer.c hrtimer: Store cpu-number in struct hrtimer_cpu_base 2019-12-21 20:01:04 +01:00
hung_task.c kernel/hung_task.c: change hung_task.c to use for_each_process_thread() 2019-07-27 22:09:21 +02:00
irq_work.c irq_work: Remove BUG_ON in irq_work_run() 2016-01-07 00:42:12 -08:00
itimer.c
jump_label.c
kallsyms.c kernel: use macros from compiler.h instead of __attribute__((...)) 2019-07-27 22:10:27 +02:00
kcmp.c ptrace: use fsuid, fsgid, effective creds for fs access checks 2016-02-25 11:57:47 -08:00
kexec.c kernel: use macros from compiler.h instead of __attribute__((...)) 2019-07-27 22:10:27 +02:00
kmod.c
kprobes.c
ksysfs.c kernel: use macros from compiler.h instead of __attribute__((...)) 2019-07-27 22:10:27 +02:00
kthread.c kthread: Fix the race condition when kthread is parked 2015-06-04 17:43:41 -07:00
latencytop.c
modsign_pubkey.c
module-internal.h UPSTREAM: KEYS: Separate the kernel signature checking keyring from module signing 2016-05-18 14:36:10 +05:30
module.c param: hand arguments after -- straight to init 2019-07-27 22:10:44 +02:00
module_signing.c UPSTREAM: KEYS: Separate the kernel signature checking keyring from module signing 2016-05-18 14:36:10 +05:30
notifier.c
nsproxy.c Rename nsproxy.pid_ns to nsproxy.pid_ns_for_children 2019-07-27 22:10:33 +02:00
padata.c padata: avoid race in reordering 2019-07-27 21:44:05 +02:00
panic.c printk: do cond_resched() between lines while outputting to consoles 2019-07-27 21:41:46 +02:00
params.c param: hand arguments after -- straight to init 2019-07-27 22:10:44 +02:00
pid.c kernel/fork.c:copy_process(): don't add the uninitialized child to thread/task/pid lists 2019-07-27 22:10:32 +02:00
pid_namespace.c Rename nsproxy.pid_ns to nsproxy.pid_ns_for_children 2019-07-27 22:10:33 +02:00
posix-cpu-timers.c posix-timers: Sanitize overrun handling 2019-07-27 21:53:21 +02:00
posix-timers.c posix-timers: Sanitize overrun handling 2019-07-27 21:53:21 +02:00
printk.c printk: use rcuidle console tracepoint 2019-07-27 21:44:09 +02:00
profile.c
ptrace.c signal/ptrace: Don't leak unitialized kernel memory with PTRACE_PEEK_SIGINFO 2019-07-27 22:11:17 +02:00
range.c
relay.c treewide: Fix typo in Documentation/DocBook 2019-07-27 22:10:20 +02:00
res_counter.c
resource.c /proc/iomem: only expose physical resource addresses to privileged users 2019-07-27 22:05:58 +02:00
seccomp.c UPSTREAM: seccomp: always propagate NO_NEW_PRIVS on tsync 2019-07-27 21:51:01 +02:00
signal.c kernel: use macros from compiler.h instead of __attribute__((...)) 2019-07-27 22:10:27 +02:00
smp.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
smpboot.c smpboot: use kmemleak_not_leak for smpboot_thread_data 2015-05-11 17:07:29 +05:30
smpboot.h
softirq.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
stacktrace.c
stop_machine.c
sys.c exit.c: unexport __set_special_pids() 2019-07-27 22:09:28 +02:00
sys_ni.c seccomp: add "seccomp" syscall 2015-03-19 14:52:50 -07:00
sysctl.c kernel/sysctl.c: add missing range check in do_proc_dointvec_minmax_conv 2019-07-27 22:11:27 +02:00
sysctl_binary.c
system_certificates.S UPSTREAM: KEYS: Separate the kernel signature checking keyring from module signing 2016-05-18 14:36:10 +05:30
system_keyring.c UPSTREAM: KEYS: Separate the kernel signature checking keyring from module signing 2016-05-18 14:36:10 +05:30
task_work.c
taskstats.c
test_kprobes.c
time.c time: Make sure jiffies_to_msecs() preserves non-zero time periods 2019-07-27 21:52:48 +02:00
timeconst.bc
timer.c timers: Use proper base migration in add_timer_on() 2019-07-27 21:42:23 +02:00
tracepoint.c tracing: syscall_regfunc() should not skip kernel threads 2019-07-27 22:09:17 +02:00
tsacct.c
uid16.c kernel: make groups_sort calling a responsibility group_info allocators 2019-07-27 21:46:18 +02:00
up.c
user-return-notifier.c
user.c
user_namespace.c userns: move user access out of the mutex 2019-07-27 21:51:26 +02:00
utsname.c
utsname_sysctl.c
watchdog.c
workqueue.c workqueue: Add system wide power_efficient workqueues 2019-07-27 22:11:01 +02:00
workqueue_internal.h