android_kernel_samsung_msm8976/kernel
Thomas Gleixner 4d170043ef hrtimer: Reset hrtimer cpu base proper on CPU hotplug
commit d5421ea43d30701e03cadc56a38854c36a8b4433 upstream.

The hrtimer interrupt code contains a hang detection and mitigation
mechanism, which prevents that a long delayed hrtimer interrupt causes a
continous retriggering of interrupts which prevent the system from making
progress. If a hang is detected then the timer hardware is programmed with
a certain delay into the future and a flag is set in the hrtimer cpu base
which prevents newly enqueued timers from reprogramming the timer hardware
prior to the chosen delay. The subsequent hrtimer interrupt after the delay
clears the flag and resumes normal operation.

If such a hang happens in the last hrtimer interrupt before a CPU is
unplugged then the hang_detected flag is set and stays that way when the
CPU is plugged in again. At that point the timer hardware is not armed and
it cannot be armed because the hang_detected flag is still active, so
nothing clears that flag. As a consequence the CPU does not receive hrtimer
interrupts and no timers expire on that CPU which results in RCU stalls and
other malfunctions.

Clear the flag along with some other less critical members of the hrtimer
cpu base to ensure starting from a clean state when a CPU is plugged in.

Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the
root cause of that hard to reproduce heisenbug. Once understood it's
trivial and certainly justifies a brown paperbag.

Fixes: 41d2e49493 ("hrtimer: Tune hrtimer_interrupt hang logic")
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Sewior <bigeasy@linutronix.de>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos
[bwh: Backported to 3.2:
 - There's no next_timer field to reset
 - Adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2019-07-27 21:46:32 +02:00
..
cpu idle: add a check for need_resched() after rcu_idle_enter 2016-10-03 20:28:27 -07:00
debug
events perf/core: Fix group {cpu,task} validation 2019-07-27 21:45:02 +02:00
gcov
irq This is the 3.10.99 stable release 2017-04-18 17:17:46 +02:00
locking Import latest Samsung release 2017-04-18 03:43:52 +02:00
power PM / sleep: fix device reference leak in test_suspend 2019-07-27 21:42:48 +02:00
rcu rcu: Don't disable CPU hotplug during OOM notifiers 2016-01-06 23:11:06 -08:00
sched sched/topology: Fix overlapping sched_group_mask 2019-07-27 21:45:20 +02:00
time nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick() 2019-07-27 21:46:23 +02:00
trace tracing: Fix possible double free on failure of allocating trace buffer 2019-07-27 21:46:22 +02:00
.gitignore
acct.c
async.c
audit.c audit: Partially remove Samsung changes 2018-02-06 13:12:28 +01:00
audit.h Import latest Samsung release 2017-04-18 03:43:52 +02:00
audit_tree.c
audit_watch.c audit: Fix use after free in audit_remove_watch_rule() 2019-07-27 21:45:01 +02:00
auditfilter.c
auditsc.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
backtracetest.c
bounds.c
capability.c
cgroup.c cgroup: prefer %pK to %p 2016-12-06 09:24:09 -08:00
cgroup_freezer.c
compat.c
configs.c
context_tracking.c
cpu.c cpu: send KOBJ_ONLINE event when enabling cpus 2017-07-24 01:09:04 -07:00
cpu_pm.c
cpuset.c cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags 2019-07-27 21:44:59 +02:00
crash_dump.c
cred.c
delayacct.c
dma.c
elfcore.c
exec_domain.c ANDROID: exec_domains: Disable request_module() call for personalities 2016-05-18 14:34:40 +05:30
exit.c kernel: Only expose su when daemon is running 2017-05-15 14:43:52 +00:00
extable.c kernel/extable.c: mark core_kernel_text notrace 2019-07-27 21:44:25 +02:00
fork.c mm: migrate: prevent racy access to tlb_flush_pending 2019-07-27 21:45:21 +02:00
freezer.c
futex.c futex: Prevent overflow by strengthen input validation 2019-07-27 21:46:29 +02:00
futex_compat.c ptrace: use fsuid, fsgid, effective creds for fs access checks 2016-02-25 11:57:47 -08:00
groups.c kernel: make groups_sort calling a responsibility group_info allocators 2019-07-27 21:46:18 +02:00
hrtimer.c hrtimer: Reset hrtimer cpu base proper on CPU hotplug 2019-07-27 21:46:32 +02:00
hung_task.c
irq_work.c irq_work: Remove BUG_ON in irq_work_run() 2016-01-07 00:42:12 -08:00
itimer.c
jump_label.c
kallsyms.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
kcmp.c ptrace: use fsuid, fsgid, effective creds for fs access checks 2016-02-25 11:57:47 -08:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks Import latest Samsung release 2017-04-18 03:43:52 +02:00
Kconfig.preempt
kexec.c
kmod.c
kprobes.c
ksysfs.c
kthread.c
latencytop.c
Makefile UPSTREAM: KEYS: Separate the kernel signature checking keyring from module signing 2016-05-18 14:36:10 +05:30
modsign_pubkey.c
module-internal.h UPSTREAM: KEYS: Separate the kernel signature checking keyring from module signing 2016-05-18 14:36:10 +05:30
module.c module: Invalidate signatures on force-loaded modules 2019-07-27 21:42:00 +02:00
module_signing.c UPSTREAM: KEYS: Separate the kernel signature checking keyring from module signing 2016-05-18 14:36:10 +05:30
notifier.c
nsproxy.c
padata.c padata: avoid race in reordering 2019-07-27 21:44:05 +02:00
panic.c printk: do cond_resched() between lines while outputting to consoles 2019-07-27 21:41:46 +02:00
params.c kernel/params.c: align add_sysfs_param documentation with code 2019-07-27 21:45:35 +02:00
pid.c BACKPORT: FROMLIST: pids: make task_tgid_nr_ns() safe 2018-05-26 00:39:33 +02:00
pid_namespace.c
posix-cpu-timers.c
posix-timers.c posix-timer: Properly check sigevent->sigev_notify 2019-07-27 21:46:18 +02:00
printk.c printk: use rcuidle console tracepoint 2019-07-27 21:44:09 +02:00
profile.c
ptrace.c ptrace: change __ptrace_unlink() to clear ->ptrace under ->siglock 2019-07-27 21:45:46 +02:00
range.c
relay.c
res_counter.c
resource.c This is the 3.10.99 stable release 2017-04-18 17:17:46 +02:00
seccomp.c UPSTREAM: seccomp: cap SECCOMP_RET_ERRNO data to MAX_ERRNO 2016-05-18 14:36:06 +05:30
signal.c signal: Only reschedule timers on signals timers have sent 2019-07-27 21:44:51 +02:00
smp.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
smpboot.c
smpboot.h
softirq.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
stacktrace.c
stop_machine.c
sys.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
sys_ni.c
sysctl.c sched/sysctl: Check user input value of sysctl_sched_time_avg 2019-07-27 21:45:34 +02:00
sysctl_binary.c
system_certificates.S UPSTREAM: KEYS: Separate the kernel signature checking keyring from module signing 2016-05-18 14:36:10 +05:30
system_keyring.c UPSTREAM: KEYS: Separate the kernel signature checking keyring from module signing 2016-05-18 14:36:10 +05:30
task_work.c
taskstats.c
test_kprobes.c
time.c
timeconst.bc
timer.c timers: Use proper base migration in add_timer_on() 2019-07-27 21:42:23 +02:00
tracepoint.c
tsacct.c
uid16.c kernel: make groups_sort calling a responsibility group_info allocators 2019-07-27 21:46:18 +02:00
up.c
user-return-notifier.c
user.c
user_namespace.c UPSTREAM: capabilities: ambient capabilities 2018-02-06 13:12:16 +01:00
utsname.c
utsname_sysctl.c
watchdog.c
workqueue.c workqueue: trigger WARN if queue_delayed_work() is called with NULL @wq 2019-07-27 21:45:23 +02:00
workqueue_internal.h