Commit Graph

13857 Commits

Author SHA1 Message Date
Dmitry Shmidt 13e2b3277d PM: wakeup_reason: add check_wakeup_reason() to verify wakeup source irq
Wakeup reason is set before driver resume handlers are called.
It is cleared before driver suspend handlers are called, on
PM_SUSPEND_PREPARE.

Change-Id: I04218c9b0c115a7877e8029c73e6679ff82e0aa4
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
Signed-off-by: Iliyan Malchev <malchev@google.com>
2017-10-15 16:24:04 +03:00
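
For illustration, a sketch of how a driver could consume this in its resume
path is shown below. It is a fragment, not a complete driver: the header
<linux/wakeup_reason.h>, the signature bool check_wakeup_reason(int irq), and
the mydev_* names are assumptions based on this patch series.

  #include <linux/device.h>
  #include <linux/wakeup_reason.h>   /* assumed header for check_wakeup_reason() */

  static int mydev_irq;              /* hypothetical IRQ owned by this driver */

  static int mydev_resume(struct device *dev)
  {
          /* The wakeup reason is captured before resume handlers run, so the
           * driver can ask whether its own IRQ is what woke the system. */
          if (check_wakeup_reason(mydev_irq))
                  dev_info(dev, "device interrupt woke the system\n");
          return 0;
  }
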
Ruchi Kandoi e532467dab power: log the last suspend abort reason.
Extends the last_resume_reason to log suspend abort reason. The abort
reasons will have "Abort:" appended at the start to distinguish itself
from the resume reason.

Change-Id: Id3c62fc0cb86ca2e05a69e40de040b94f32be389
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Signed-off-by: Iliyan Malchev <malchev@google.com>
2017-10-15 16:23:56 +03:00
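
A minimal user-space reader of this node is sketched below; the path
/sys/kernel/wakeup_reasons/last_resume_reason is an assumption (it is where the
Android common kernel exposes the node), and a line starting with "Abort:"
marks an aborted suspend rather than a resume reason.

  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
          /* Assumed sysfs path; adjust if the node lives elsewhere. */
          FILE *f = fopen("/sys/kernel/wakeup_reasons/last_resume_reason", "r");
          char line[256];

          if (!f) {
                  perror("last_resume_reason");
                  return 1;
          }
          while (fgets(line, sizeof(line), f)) {
                  if (strncmp(line, "Abort:", 6) == 0)
                          printf("last suspend was aborted: %s", line);
                  else
                          printf("resume reason: %s", line);
          }
          fclose(f);
          return 0;
  }
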
Ruchi Kandoi 1154a48192 PM: wakeup_reason: add functionality to log the last suspend-abort reason.
Extends the last_resume_reason to log suspend abort reason. The abort
reasons will have "Abort:" appended at the start to distinguish itself
from the resume reason.

Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Signed-off-by: Iliyan Malchev <malchev@google.com>
Change-Id: I3207f1844e3d87c706dfc298fb10e1c648814c5f
2017-10-15 16:17:14 +03:00
Iliyan Malchev c9816de694 PM: wakeup_reason: add functions to query and clear wakeup reasons
The query results are valid until the next PM_SUSPEND_PREPARE.

Change-Id: I6bc2bd47c830262319576a001d39ac9a994916cf
Signed-off-by: Iliyan Malchev <malchev@google.com>
2017-10-15 16:17:14 +03:00
jinqian ec32c90b65 Power: Report suspend times from last_suspend_time
This node exports two values separated by a space.
From left to right:
1. time spent in the suspend/resume process
2. time spent asleep in the suspend state

Change-Id: I2cb9a9408a5fd12166aaec11b935a0fd6a408c63
2017-10-15 16:17:13 +03:00
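
A hedged user-space sketch of parsing the node follows. The path
/sys/kernel/wakeup_reasons/last_suspend_time is an assumption (alongside the
other wakeup-reason nodes), and the two values are read as seconds with a
fractional part.

  #include <stdio.h>

  int main(void)
  {
          /* Assumed sysfs path for the node described above. */
          FILE *f = fopen("/sys/kernel/wakeup_reasons/last_suspend_time", "r");
          double suspend_resume_s, asleep_s;

          if (!f) {
                  perror("last_suspend_time");
                  return 1;
          }
          if (fscanf(f, "%lf %lf", &suspend_resume_s, &asleep_s) == 2)
                  printf("suspend/resume path: %.3f s, asleep: %.3f s\n",
                         suspend_resume_s, asleep_s);
          fclose(f);
          return 0;
  }
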
Ruchi Kandoi 92b3e919c4 Power: Add guard condition for maximum wakeup reasons
Ensure the array for the wakeup reason IRQs does not overflow.

Change-Id: Iddc57a3aeb1888f39d4e7b004164611803a4d37c
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
2017-10-15 16:17:13 +03:00
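
The guard amounts to a bounds check before recording another IRQ. A generic
sketch of the pattern, with hypothetical names (the real array, counter and
limit in the wakeup-reason code may be named differently):

  #define MAX_WAKEUP_REASON_IRQS 32          /* hypothetical limit */

  static int irq_list[MAX_WAKEUP_REASON_IRQS];
  static int irqcount;

  void log_wakeup_irq(int irq)
  {
          /* Refuse to record more IRQs than the array can hold. */
          if (irqcount >= MAX_WAKEUP_REASON_IRQS)
                  return;
          irq_list[irqcount++] = irq;
  }
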
Stepan Moskovchenko 6365c6e304 PM / Sleep: Clean up remnants of workqueue-based sync
When legacy wakelock code was removed in commit
f85607a715a74c65db812cd3901022888257f966, some of the code
for moving calls to sys_sync() from suspend paths into a
workqueue item had not been properly removed. Specifically,
one of the call sites to suspend_sys_sync_wait() has been
mistakenly replaced with a call to sys_sync(), which is not
necessary because the corresponding instance of
suspend_sys_sync_queue() was already replaced with
sys_sync(). Clean up the remnants of the legacy wakelock
code by removing the extraneous call to sys_sync() and
restoring some of the surrounding printk statements that
had been moved to suspend_sys_sync_queue() and subsequently
lost.

CRs-Fixed: 498669
Change-Id: Ifb2ede7808560f456c824d3d6359a4541c51b73f
Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org>
2017-10-15 15:46:58 +03:00
Arve Hjønnevåg 516049d328 PM / Sleep: Fix a mistake in a conditional in autosleep_store()
The condition check in autosleep_store() is incorrect and prevents
/sys/power/autosleep from working as advertised.  Fix that.

[rjw: Added the changelog.]

Change-Id: I231cc24fc3f245003dcf5053ff6a71eb69ffa273
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Git-commit: 040e5bf65e
Git-repo: git://codeaurora.org/kernel/msm.git
Signed-off-by: Anurag Singh <anursing@codeaurora.org>
2017-10-15 15:46:58 +03:00
Amar Singhal 9801869142 power: main: Add conditional compilation for touch nodes
Add conditional compilation for touch event sysfs nodes. Otherwise,
if CONFIG_PM_SLEEP is not defined, there could be compilation errors.

Change-Id: I1ac7f284ec35eae2cfa076ef8e71c29ddc24817c
Signed-off-by: Amar Singhal <asinghal@codeaurora.org>
Signed-off-by: Anurag Singh <anursing@codeaurora.org>
2017-10-15 15:46:57 +03:00
Rafael J. Wysocki d364a08bcc PM / Sleep: User space wakeup sources garbage collector Kconfig option
Make it possible to configure out the user space wakeup sources
garbage collector for debugging and default Android builds.

Change-Id: I85ca6caa92c8e82d863f0fa58d8861b5571c1b4a
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Arve Hjønnevåg <arve@android.com>
Git-commit: 4e585d25e1
Git-repo: git://codeaurora.org/kernel/msm.git
Signed-off-by: Anurag Singh <anursing@codeaurora.org>
2017-10-15 15:46:57 +03:00
Rafael J. Wysocki 5265f5b2e8 PM / Sleep: Make the limit of user space wakeup sources configurable
Make it possible to configure out the check against the limit of
user space wakeup sources for debugging and default Android builds.

Change-Id: I8f74d7c8391627df970d2df666938069b012e2fe
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Arve Hjønnevåg <arve@android.com>
Git-commit: c73893e2ca
Git-repo: git://codeaurora.org/kernel/msm.git
Signed-off-by: Anurag Singh <anursing@codeaurora.org>
2017-10-15 15:46:56 +03:00
Anurag Singh e535fb5292 power: Remove unnecessary options from Kconfig
The config options HAS_WAKELOCK and WAKELOCK are
not needed any more since a wakeup sources-based
wakelock implementation will be used.

Change-Id: I4163f048c079ec3d10a02d9db16c3ca6fb5fd759
Signed-off-by: Anurag Singh <anursing@codeaurora.org>
2017-10-15 15:46:56 +03:00
Arve Hjønnevåg f484965275 PM / Sleep: Add wake lock api wrapper on top of wakeup sources
Change-Id: Icaad02fe1e8856fdc2e4215f380594a5dde8e002
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Git-commit: e9911f4efd
Git-repo: git://codeaurora.org/kernel/msm.git
[anursing@codeaurora.org: replace existing implementation, resolve
merge conflicts]
Signed-off-by: Anurag Singh <anursing@codeaurora.org>
2017-10-15 15:46:56 +03:00
Rafael J. Wysocki 9f3de1bd53 PM / Sleep: Add user space interface for manipulating wakeup sources, v3
Android allows user space to manipulate wakelocks using two
sysfs files located in /sys/power/, wake_lock and wake_unlock.
Writing a wakelock name and optionally a timeout to the wake_lock
file causes the wakelock whose name was written to be acquired (it
is created beforehand if necessary), optionally with the given timeout.
Writing the name of a wakelock to wake_unlock causes that wakelock
to be released.

Implement an analogous interface for user space using wakeup sources.
Add the /sys/power/wake_lock and /sys/power/wake_unlock files
allowing user space to create, activate and deactivate wakeup
sources, such that writing a name and optionally a timeout to
wake_lock causes the wakeup source of that name to be activated,
optionally with the given timeout.  If that wakeup source doesn't
exist, it will be created and then activated.  Writing a name to
wake_unlock causes the wakeup source of that name, if there is one,
to be deactivated.  Wakeup sources created with the help of
wake_lock that haven't been used for more than 5 minutes are garbage
collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
wakeup sources created with the help of wake_lock present at a time.

The data type used to track wakeup sources created by user space is
called "struct wakelock" to indicate the origins of this feature.

This version of the patch includes an rbtree manipulation fix from John Stultz.

Change-Id: Icb452cfd54362b49dcb1cff88345928a2528ad97
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Git-commit: b86ff9820f
Git-repo: git://codeaurora.org/kernel/msm.git
[anursing@codeaurora.org: replace existing implementation, resolve
merge conflicts]
Signed-off-by: Anurag Singh <anursing@codeaurora.org>
2017-10-15 15:46:55 +03:00
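
A minimal user-space sketch of this interface: acquire a wakeup source named
"example_lock" with a timeout, then release it. The lock name is arbitrary and
the timeout unit is assumed to be nanoseconds.

  #include <stdio.h>

  static int write_str(const char *path, const char *s)
  {
          FILE *f = fopen(path, "w");

          if (!f)
                  return -1;
          fputs(s, f);
          return fclose(f);
  }

  int main(void)
  {
          /* Acquire "example_lock" for 5 seconds (timeout assumed to be in ns). */
          if (write_str("/sys/power/wake_lock", "example_lock 5000000000") != 0)
                  perror("wake_lock");

          /* ... work that must not be interrupted by opportunistic sleep ... */

          /* Release it explicitly (letting the timeout expire also works). */
          if (write_str("/sys/power/wake_unlock", "example_lock") != 0)
                  perror("wake_unlock");
          return 0;
  }
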
Rafael J. Wysocki aeeab22b61 PM / Sleep: Add "prevent autosleep time" statistics to wakeup sources
Android uses one wakelock statistic that is only necessary for
opportunistic sleep.  Namely, the prevent_suspend_time field
accumulates the total time the given wakelock has been locked
while "automatic suspend" was enabled.  Add an analogous field,
prevent_sleep_time, to wakeup sources and make it behave in a similar
way.

Change-Id: I4b9719d05da020757d7cc21ed3b52b7c32261bea
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit: 55850945e8
Git-repo: git://codeaurora.org/kernel/msm.git
Signed-off-by: Anurag Singh <anursing@codeaurora.org>
2017-10-15 15:46:55 +03:00
Rafael J. Wysocki 18c69d40cd PM / Sleep: Implement opportunistic sleep, v2
Introduce a mechanism by which the kernel can trigger global
transitions to a sleep state chosen by user space if there are no
active wakeup sources.

It consists of a new sysfs attribute, /sys/power/autosleep, that
can be written one of the strings returned by reads from
/sys/power/state, an ordered workqueue and a work item carrying out
the "suspend" operations.  If a string representing the system's
sleep state is written to /sys/power/autosleep, the work item
triggering transitions to that state is queued up and it requeues
itself after every execution until user space writes "off" to
/sys/power/autosleep.

That work item enables the detection of wakeup events using the
functions already defined in drivers/base/power/wakeup.c (with one
small modification) and calls either pm_suspend(), or hibernate() to
put the system into a sleep state.  If a wakeup event is reported
while the transition is in progress, it will abort the transition and
the "system suspend" work item will be queued up again.

Change-Id: Ic3214de009c64feab606e93811bd442ccfc49d86
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Git-commit: 7483b4a4d9
Git-repo: git://codeaurora.org/kernel/msm.git
[anursing@codeaurora.org: replace existing implementation, resolve
merge conflicts]
Signed-off-by: Anurag Singh <anursing@codeaurora.org>
2017-10-15 15:46:54 +03:00
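
For illustration, enabling and disabling opportunistic suspend from user space
(root required); "mem" is used on the assumption that it is one of the states
reported by /sys/power/state on the target.

  #include <stdio.h>

  static int write_str(const char *path, const char *s)
  {
          FILE *f = fopen(path, "w");

          if (!f)
                  return -1;
          fputs(s, f);
          return fclose(f);
  }

  int main(void)
  {
          /* Suspend to RAM whenever no wakeup sources are active. */
          if (write_str("/sys/power/autosleep", "mem") != 0)
                  perror("enable autosleep");

          /* ... later, turn opportunistic sleep back off ... */
          if (write_str("/sys/power/autosleep", "off") != 0)
                  perror("disable autosleep");
          return 0;
  }
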
Anurag Singh 960ea7ac6c power: Remove legacy wakelock code.
Remove parts of code that use the legacy wakelock
implementation as we make the switch to the upstream
kernel's wakeup sources-based implementation.

Change-Id: Idab9e3dd54e8a256b059c88606c9368f7ddb1c1b
Signed-off-by: Anurag Singh <anursing@codeaurora.org>
2017-10-10 15:41:57 +03:00
Anurag Singh 23ad8549f7 Revert "power: main: Add conditional compilation for touch nodes"
This reverts commit 8c880ff58fd257d3e9f20e04704cd9d1f3371fc7.
Reverting the change is required to allow for cleaner conflict
resolution as we move to the upstream kernel's wakeup sources
based wakelock implementation.

Change-Id: Ie14fe26a0fa244815dde7845330351beeef66f99
Signed-off-by: Anurag Singh <anursing@codeaurora.org>
2017-10-10 15:40:54 +03:00
Al Viro be084ce9f1 move d_rcu from overlapping d_child to overlapping d_alias
commit 946e51f2bf37f1656916eb75bd0742ba33983c28 upstream.

Change-Id: I85366e6ce0423ec9620bcc9cd3e7695e81aa1171
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[bwh: Backported to 3.2:
 - Apply name changes in all the different places we use d_alias and d_child
 - Move the WARN_ON() in __d_free() to d_free() as we don't have dentry_free()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
[lizf: Backported to 3.4:
 - adjust context
 - need one more name change in debugfs]
2017-09-22 19:11:55 +03:00
Al Viro f787204b2f get rid of kern_path_parent()
all callers want the same thing, actually - a kinda-sorta analog of
kern_path_create().  I.e. they want parent vfsmount/dentry (with
->i_mutex held, to make sure the child dentry is still their child)
+ the child dentry.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Change-Id: I58cc7b0a087646516db9af69962447d27fb3ee8b
2017-09-22 19:11:54 +03:00
Eric W. Biederman d621e5dc9c userns: make each net (net_ns) belong to a user_ns
The user namespace which creates a new network namespace owns that
namespace and all resources created in it.  This way we can target
capability checks for privileged operations against network resources to
the user_ns which created the network namespace in which the resource
lives.  Privilege to the user namespace which owns the network
namespace, or any parent user namespace thereof, provides the same
privilege to the network resource.

This patch is reworked from a version originally by
Serge E. Hallyn <serge.hallyn@canonical.com>

Change-Id: Ifa426537c47cce669099cc96e80b17e1d814457b
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-09-01 13:38:07 +03:00
Eric W. Biederman 8b4596c91f userns: Add kuid_t and kgid_t and associated infrastructure in uidgid.h
Start distinguishing between internal kernel uids and gids and
values that userspace can use.  This is done by introducing two
new types: kuid_t and kgid_t.  These types and their associated
functions and infrastructure are declared in the new header
uidgid.h.

Ultimately there will be a different implementation of the mapping
functions for use with user namespaces.  But to keep it simple
we introduce the mapping functions first to separate the meat
from the mechanical code conversions.

Export overflowuid and overflowgid so we can use from_kuid_munged
and from_kgid_munged in modular code.

Change-Id: Id934f981c91aa9c656e045d4f45f958c7fc2f1b0
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-09-01 13:38:06 +03:00
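
A hedged in-kernel fragment showing the intended usage: a raw uid from user
space is mapped into a kuid_t with make_kuid(), kuid_t values are compared only
through the helpers, and from_kuid_munged() converts back for user space. The
example_* function names and surrounding context are hypothetical; only the
uidgid.h helpers come from this patch.

  #include <linux/uidgid.h>
  #include <linux/user_namespace.h>

  bool example_uid_matches(uid_t userspace_uid, kuid_t owner)
  {
          /* Map the raw value from user space into the kernel-internal type. */
          kuid_t kuid = make_kuid(&init_user_ns, userspace_uid);

          return uid_valid(kuid) && uid_eq(kuid, owner);
  }

  uid_t example_report_uid(kuid_t owner)
  {
          /* Convert back for user space; unmappable ids become the overflow uid. */
          return from_kuid_munged(&init_user_ns, owner);
  }
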
John Stultz 5d231460e1 hrtimer: Provide clock_was_set_delayed()
This is a backport of f55a6faa38

clock_was_set() cannot be called from hard interrupt context because
it calls on_each_cpu().

For fixing the widely reported leap seconds issue it is necessary to
call it from hard interrupt context, i.e. the timer tick code, which
does the timekeeping updates.

Provide a new function which denotes it in the hrtimer cpu base
structure of the cpu on which it is called and raise the hrtimer
softirq. We then execute the clock_was_set() notification from
softirq context in run_hrtimer_softirq(). The hrtimer softirq is
rarely used, so polling the flag there is not a performance issue.

[ tglx: Made it depend on CONFIG_HIGH_RES_TIMERS. We really should get
  rid of all this ifdeffery ASAP ]

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-2-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 20:00:23 +03:00
Thomas Gleixner 0cf6cbf5c2 tick: Upstream fixes
* Addresses the issue of timers being scheduled on offline CPUs.

tick: Don't invoke tick_nohz_stop_sched_tick() if the cpu is offline

commit 5b39939a4 (nohz: Move ts->idle_calls incrementation into strict
idle logic) moved code out of tick_nohz_stop_sched_tick() and failed
to bail out when the cpu is offline. That's causing subsequent
failures as an offline CPU is supposed to die and not to fiddle with
nohz magic.

Return false in can_stop_idle_tick() if the cpu is offline.

Reported-and-tested-by: Jiri Kosina <jkosina@suse.cz>
Reported-and-tested-by: Prarit Bhargava <prarit@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1305132138160.2863@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

timers: Consolidate base->next_timer update

Another bunch of mindlessly copied code. All callers of
internal_add_timer(), except the recascading code, update
base->next_timer.

Move this into internal_add_timer() and let the cascading code call
__internal_add_timer().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20120525214819.189946224@linutronix.de

timers: Create detach_if_pending() and use it

Most callers of detach_timer() have the same pattern around
them. Check whether the timer is pending and eventually updating
base->next_timer.

Create detach_if_pending() and replace the duplicated code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20120525214819.131246037@linutronix.de

timers: Add accounting of non deferrable timers

The code in get_next_timer_interrupt() is suboptimal as it has to run
through the cascade to find the next expiring timer. On a completely
idle core we should only do that when there is an active timer
enqueued and base->next_timer does not give us a fast answer.

Add accounting of the active timers to the now consolidated
attach/detach code. I deliberately avoided sanity checks because the
code is fully symmetric and any fiddling with timers w/o using the API
functions will lead to cute explosions anyway. ulong is big enough
even on 32bit and if we really run into the situation to have more
than 1<<32 timers enqueued there, then we are definitely not in a
state to go idle and run through that code.

This allows us to fix another shortcoming of get_next_timer_interrupt().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20120525214819.236377028@linutronix.de

timers: Improve get_next_timer_interrupt()

Gilad reported at

 http://lkml.kernel.org/r/1336056962-10465-2-git-send-email-gilad@benyossef.com

"Current timer code fails to correctly return a value meaning that
 there is no future timer event, with the result that the timer keeps
 getting re-armed in HZ one shot mode even when we could turn it off,
 generating unneeded interrupts.

 What is happening is that when __next_timer_interrupt() wishes
 to return a value that signifies "there is no future timer
 event", it returns (base->timer_jiffies + NEXT_TIMER_MAX_DELTA).

 However, the code in tick_nohz_stop_sched_tick(), which called
 __next_timer_interrupt() via get_next_timer_interrupt(),
 compares the return value to (last_jiffies + NEXT_TIMER_MAX_DELTA)
 to see if the timer needs to be re-armed.

 base->timer_jiffies != last_jiffies and so tick_nohz_stop_sched_tick()
 interprets the return value as an indication that there is a distant
 future event 12 days from now and programs the timer to fire next
 after KTIME_MAX nsecs instead of avoiding to arm it. This ends up
 causing a needless interrupt once every KTIME_MAX nsecs."

Fix this by using the new active timer accounting. This avoids scans
when no active timer is enqueued completely, so we don't have to rely
on base->timer_next and base->timer_jiffies anymore.

Change-Id: I874ee5e5f837a228cebf5e6a084baf520d33cd0f
Reported-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20120525214819.317535385@linutronix.de
2017-08-25 20:00:22 +03:00
Bart Van Assche 6529714b12 timer: Fix jiffies wrap behavior of round_jiffies_common()
commit 9e04d3804d upstream.

Direct compare of jiffies related values does not work in the wrap
around case. Replace it with time_is_after_jiffies().

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/519BC066.5080600@acm.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 20:00:22 +03:00
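
The underlying problem is easy to reproduce in plain C: with an unsigned
counter that wraps, a direct </> comparison gives the wrong answer near the
wrap point, while the signed-difference idiom behind time_is_after_jiffies()
keeps working. A self-contained demonstration (jiffies modelled as an unsigned
long counter):

  #include <limits.h>
  #include <stdio.h>

  /* Same idiom as the kernel's time_after(): true if a is after b,
   * even when the counter wrapped around in between. */
  static int after(unsigned long a, unsigned long b)
  {
          return (long)(b - a) < 0;
  }

  int main(void)
  {
          unsigned long jiffies = ULONG_MAX - 5;   /* just before the wrap */
          unsigned long deadline = jiffies + 10;   /* wraps to a small value */

          printf("direct compare thinks the deadline is in the future: %d\n",
                 deadline > jiffies);              /* 0: wrong after the wrap */
          printf("wrap-safe compare thinks so: %d\n",
                 after(deadline, jiffies));        /* 1: correct */
          return 0;
  }
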
John Stultz e4d3890d6a time: Prevent early expiry of hrtimers[CLOCK_REALTIME] at the leap second edge
commit 833f32d763028c1bb371c64f457788b933773b3e upstream.

Currently, leapsecond adjustments are done at tick time. As a result,
the leapsecond was applied at the first timer tick *after* the
leapsecond (~1-10ms late depending on HZ), rather than exactly on the
second edge.

This was in part historical from back when we were always tick based,
but correcting this since has been avoided since it adds extra
conditional checks in the gettime fastpath, which has performance
overhead.

However, it was recently pointed out that ABS_TIME CLOCK_REALTIME
timers set for right after the leapsecond could fire a second early,
since some timers may be expired before we trigger the timekeeping
timer, which then applies the leapsecond.

This isn't quite as bad as it sounds, since behaviorally it is similar
to what is possible with ntpd-made leapsecond adjustments done without
using the kernel discipline, where, due to latencies, timers may fire
just prior to the settimeofday call. (Also, one should note that all
applications using CLOCK_REALTIME timers should always be careful,
since they are prone to quirks from settimeofday() disturbances.)

However, the purpose of having the kernel do the leap adjustment is to
avoid such latencies, so I think this is worth fixing.

So in order to properly keep those timers from firing a second early,
this patch modifies the ntp and timekeeping logic so that we keep
enough state so that the update_base_offsets_now accessor, which
provides the hrtimer core the current time, can check and apply the
leapsecond adjustment on the second edge. This prevents the hrtimer
core from expiring timers too early.

This patch does not modify any other time read path, so no additional
overhead is incurred. However, this also means that the leap-second
continues to be applied at tick time for all other read-paths.

Apologies to Richard Cochran, who pushed for similar changes years
ago, which I resisted due to the concerns about the performance
overhead.

While I suspect this isn't extremely critical, folks who care about
strict leap-second correctness will likely want to watch
this. Potentially a -stable candidate eventually.

Originally-suggested-by: Richard Cochran <richardcochran@gmail.com>
Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Reported-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jiri Bohac <jbohac@suse.cz>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1434063297-28657-4-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[Yadi: Move do_adjtimex to timekeeping.c and solve context issues]
Signed-off-by: Hu <yadi.hu@windriver.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
2017-08-25 20:00:21 +03:00
John Stultz 3bfcd801f4 timekeeping: Avoid possible deadlock from clock_was_set_delayed
commit 6fdda9a9c5 upstream.

As part of normal operations, the hrtimer subsystem frequently calls
into the timekeeping code, creating a locking order of
  hrtimer locks -> timekeeping locks

clock_was_set_delayed() was supposed to allow us to avoid deadlocks
between the timekeeping and hrtimer subsystems, so that we could
notify the hrtimer subsystem that the time had changed while holding
the timekeeping locks. This was done by scheduling delayed work
that would run later once we were out of the timekeeping code.

But unfortunately the lock chains are complex enough that in
scheduling delayed work, we end up eventually trying to grab
an hrtimer lock.

Sasha Levin noticed this in testing when the new seqlock lockdep
enablement triggered the following (somewhat abbreviated) message:

[  251.100221] ======================================================
[  251.100221] [ INFO: possible circular locking dependency detected ]
[  251.100221] 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053 Not tainted
[  251.101967] -------------------------------------------------------
[  251.101967] kworker/10:1/4506 is trying to acquire lock:
[  251.101967]  (timekeeper_seq){----..}, at: [<ffffffff81160e96>] retrigger_next_event+0x56/0x70
[  251.101967]
[  251.101967] but task is already holding lock:
[  251.101967]  (hrtimer_bases.lock#11){-.-...}, at: [<ffffffff81160e7c>] retrigger_next_event+0x3c/0x70
[  251.101967]
[  251.101967] which lock already depends on the new lock.
[  251.101967]
[  251.101967]
[  251.101967] the existing dependency chain (in reverse order) is:
[  251.101967]
-> #5 (hrtimer_bases.lock#11){-.-...}:
[snipped]
-> #4 (&rt_b->rt_runtime_lock){-.-...}:
[snipped]
-> #3 (&rq->lock){-.-.-.}:
[snipped]
-> #2 (&p->pi_lock){-.-.-.}:
[snipped]
-> #1 (&(&pool->lock)->rlock){-.-...}:
[  251.101967]        [<ffffffff81194803>] validate_chain+0x6c3/0x7b0
[  251.101967]        [<ffffffff81194d9d>] __lock_acquire+0x4ad/0x580
[  251.101967]        [<ffffffff81194ff2>] lock_acquire+0x182/0x1d0
[  251.101967]        [<ffffffff84398500>] _raw_spin_lock+0x40/0x80
[  251.101967]        [<ffffffff81153e69>] __queue_work+0x1a9/0x3f0
[  251.101967]        [<ffffffff81154168>] queue_work_on+0x98/0x120
[  251.101967]        [<ffffffff81161351>] clock_was_set_delayed+0x21/0x30
[  251.101967]        [<ffffffff811c4bd1>] do_adjtimex+0x111/0x160
[  251.101967]        [<ffffffff811e2711>] compat_sys_adjtimex+0x41/0x70
[  251.101967]        [<ffffffff843a4b49>] ia32_sysret+0x0/0x5
[  251.101967]
-> #0 (timekeeper_seq){----..}:
[snipped]
[  251.101967] other info that might help us debug this:
[  251.101967]
[  251.101967] Chain exists of:
  timekeeper_seq --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock#11

[  251.101967]  Possible unsafe locking scenario:
[  251.101967]
[  251.101967]        CPU0                    CPU1
[  251.101967]        ----                    ----
[  251.101967]   lock(hrtimer_bases.lock#11);
[  251.101967]                                lock(&rt_b->rt_runtime_lock);
[  251.101967]                                lock(hrtimer_bases.lock#11);
[  251.101967]   lock(timekeeper_seq);
[  251.101967]
[  251.101967]  *** DEADLOCK ***
[  251.101967]
[  251.101967] 3 locks held by kworker/10:1/4506:
[  251.101967]  #0:  (events){.+.+.+}, at: [<ffffffff81154960>] process_one_work+0x200/0x530
[  251.101967]  #1:  (hrtimer_work){+.+...}, at: [<ffffffff81154960>] process_one_work+0x200/0x530
[  251.101967]  #2:  (hrtimer_bases.lock#11){-.-...}, at: [<ffffffff81160e7c>] retrigger_next_event+0x3c/0x70
[  251.101967]
[  251.101967] stack backtrace:
[  251.101967] CPU: 10 PID: 4506 Comm: kworker/10:1 Not tainted 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053
[  251.101967] Workqueue: events clock_was_set_work

So the best solution is to avoid calling clock_was_set_delayed() while
holding the timekeeping lock, and instead using a flag variable to
decide if we should call clock_was_set() once we've released the locks.

This works for the case here, where do_adjtimex() was the deadlock
trigger point. Unfortunately, in update_wall_time() we still hold
the jiffies lock, which would deadlock with the ipi triggered by
clock_was_set(), preventing us from calling it even after we drop the
timekeeping lock. So instead call clock_was_set_delayed() at that point.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 20:00:20 +03:00
Dan Carpenter 086c0e4cc9 timekeeping: Cast raw_interval to u64 to avoid shift overflow
commit 5b3900cd40 upstream.

We fixed a bunch of integer overflows in timekeeping code during the 3.6
cycle.  I did an audit based on that and found this potential overflow.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20121009071823.GA19159@elgon.mountain
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ben Hutchings <ben@decadent.org.uk>
[ herton: adapt for 3.5, timekeeper instead of tk pointer ]
Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 20:00:20 +03:00
John Stultz c15dcac835 time: Move ktime_t overflow checking into timespec_valid_strict
commit cee58483cf upstream

Andreas Bombe reported that the added ktime_t overflow checking added to
timespec_valid in commit 4e8b14526c ("time: Improve sanity checking of
timekeeping inputs") was causing problems with X.org because it caused
timeouts larger than KTIME_T to be invalid.

Previously, these large timeouts would be clamped to KTIME_MAX and would
never expire, which is valid.

This patch splits the ktime_t overflow checking into a new
timespec_valid_strict function, and converts the timekeeping code's
internal checking to use this more strict function.

Reported-and-tested-by: Andreas Bombe <aeb@debian.org>
Cc: Zhouping Liu <zliu@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 20:00:19 +03:00
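
Roughly, the split looks like the user-space approximation below (not the
kernel source): timespec_valid() only rejects negative seconds and
out-of-range nanoseconds, while the strict variant additionally rejects values
whose 64-bit nanosecond (ktime_t) representation would overflow.

  #include <stdint.h>
  #include <stdio.h>
  #include <time.h>

  #define NSEC_PER_SEC  1000000000L
  /* Largest whole-second value a signed 64-bit nanosecond count can hold. */
  #define KTIME_SEC_MAX (INT64_MAX / NSEC_PER_SEC)

  int timespec_valid(const struct timespec *ts)
  {
          if (ts->tv_sec < 0)
                  return 0;
          if ((unsigned long)ts->tv_nsec >= NSEC_PER_SEC)
                  return 0;
          return 1;
  }

  int timespec_valid_strict(const struct timespec *ts)
  {
          if (!timespec_valid(ts))
                  return 0;
          /* Also reject anything that would overflow ktime_t. */
          if ((uint64_t)ts->tv_sec >= KTIME_SEC_MAX)
                  return 0;
          return 1;
  }

  int main(void)
  {
          /* Assumes a 64-bit time_t for the demonstration value. */
          struct timespec far_future = { .tv_sec = (time_t)KTIME_SEC_MAX + 1 };

          printf("valid: %d  valid_strict: %d\n",
                 timespec_valid(&far_future), timespec_valid_strict(&far_future));
          return 0;
  }
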
John Stultz 41d8c950a0 time: Avoid making adjustments if we haven't accumulated anything
commit bf2ac31219 upstream.

If update_wall_time() is called and the current offset isn't large
enough to accumulate, avoid re-calling timekeeping_adjust which may
change the clock freq and can cause 1ns inconsistencies with
CLOCK_REALTIME_COARSE/CLOCK_MONOTONIC_COARSE.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1345595449-34965-5-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 20:00:19 +03:00
John Stultz 91ff746d37 time: Improve sanity checking of timekeeping inputs
commit 4e8b14526c upstream.

Unexpected behavior could occur if the time is set to a value large
enough to overflow a 64bit ktime_t (which is something larger than the
year 2262).

Also unexpected behavior could occur if large negative offsets are
injected via adjtimex.

So this patch improves the sanity checking of timekeeping inputs by
improving the timespec_valid() check, and then makes better use of
timespec_valid() to make sure we don't set the time to an invalid
negative value or one that overflows ktime_t.

Note: This does not protect from setting the time close to overflowing
ktime_t and then letting natural accumulation cause the overflow.

Reported-by: CAI Qian <caiqian@redhat.com>
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Zhouping Liu <zliu@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1344454580-17031-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 20:00:18 +03:00
Thomas Gleixner 92429b155c timekeeping: Add missing update call in timekeeping_resume()
This is a backport of 3e997130bd

The leap second rework unearthed another issue of inconsistent data.

On timekeeping_resume() the timekeeper data is updated, but nothing
calls timekeeping_update(), so now the update code in the timer
interrupt sees stale values.

This has been the case before those changes, but then the timer
interrupt was using stale data as well so this went unnoticed for quite
some time.

Add the missing update call, so all the data is consistent everywhere.

Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Reported-and-tested-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Reported-and-tested-by: Martin Steigerwald <Martin@lichtvoll.de>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 20:00:17 +03:00
Thomas Gleixner a3b55f8dcb timekeeping: Provide hrtimer update function
This is a backport of f6c06abfb3

To finally fix the infamous leap second issue and other race windows
caused by functions which change the offsets between the various time
bases (CLOCK_MONOTONIC, CLOCK_REALTIME and CLOCK_BOOTTIME) we need a
function which atomically gets the current monotonic time and updates
the offsets of CLOCK_REALTIME and CLOCK_BOOTTIME with minimalistic
overhead. The previous patch which provides ktime_t offsets allows us
to make this function almost as cheap as ktime_get() which is going to
be replaced in hrtimer_interrupt().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/1341960205-56738-7-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 20:00:17 +03:00
Thomas Gleixner b49c7e2fcb timekeeping: Maintain ktime_t based offsets for hrtimers
This is a backport of 5b9fe759a6

We need to update the hrtimer clock offsets from the hrtimer interrupt
context. To avoid conversions from timespec to ktime_t maintain a
ktime_t based representation of those offsets in the timekeeper. This
puts the conversion overhead into the code which updates the
underlying offsets and provides fast accessible values in the hrtimer
interrupt.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-4-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 20:00:16 +03:00
John Stultz 4a1101d81d timekeeping: Fix leapsecond triggered load spike issue
This is a backport of 4873fa070a

The timekeeping code misses an update of the hrtimer subsystem after a
leap second happened. Due to that timers based on CLOCK_REALTIME are
either expiring a second early or late depending on whether a leap
second has been inserted or deleted until an operation is initiated
which causes that update. Unless the update happens by some other
means this discrepancy between the timekeeping and the hrtimer data
stays forever and timers are expired either early or late.

The reported immediate workaround - $ date -s "`date`" - is causing a
call to clock_was_set() which updates the hrtimer data structures.
See: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix

Add the missing clock_was_set() call to update_wall_time() in case of
a leap second event. The actual update is deferred to softirq context
as the necessary smp function call cannot be invoked from hard
interrupt context.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-3-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 20:00:16 +03:00
John Stultz b6bbe327be timekeeping: Fix CLOCK_MONOTONIC inconsistency during leapsecond
commit fad0c66c4b upstream.

Commit 6b43ae8a61 (ntp: Fix leap-second hrtimer livelock) broke the
leapsecond update of CLOCK_MONOTONIC. The missing leapsecond update to
wall_to_monotonic causes discontinuities in CLOCK_MONOTONIC.

Adjust wall_to_monotonic when NTP inserted a leapsecond.

Reported-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Richard Cochran <richardcochran@gmail.com>
Link: http://lkml.kernel.org/r/1338400497-12420-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 20:00:15 +03:00
John Stultz 3e8aa46fd1 ntp: Fix STA_INS/DEL clearing bug
commit 6b1859dba0 upstream.

In commit 6b43ae8a61, I
introduced a bug that kept the STA_INS or STA_DEL bit
from being cleared from time_status via adjtimex()
without forcing STA_PLL first.

Usually once the STA_INS is set, it isn't cleared
until the leap second is applied, so it's unlikely this
affected anyone. However during testing I noticed it
took some effort to cancel a leap second once STA_INS
was set.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1342156917-25092-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 20:00:14 +03:00
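
The user-visible interface here is adjtimex(2). A hedged sketch of the
operation the fix concerns - cancelling a pending leap-second insertion -
might look like this (modifying the status bits requires root):

  #include <stdio.h>
  #include <sys/timex.h>

  int main(void)
  {
          struct timex tx = { .modes = 0 };

          /* modes == 0 only reads the current NTP state. */
          if (adjtimex(&tx) < 0) {
                  perror("adjtimex read");
                  return 1;
          }
          if (tx.status & STA_INS) {
                  /* Clear the pending leap-second insertion.  With the bug
                   * described above, this only took effect after also
                   * forcing STA_PLL. */
                  tx.modes = ADJ_STATUS;
                  tx.status &= ~STA_INS;
                  if (adjtimex(&tx) < 0)
                          perror("adjtimex clear STA_INS");
          }
          return 0;
  }
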
Richard Cochran 18de8efa26 ntp: Correct TAI offset during leap second
commit dd48d708ff upstream.

When repeating a UTC time value during a leap second (when the UTC
time should be 23:59:60), the TAI timescale should not stop. The kernel
NTP code increments the TAI offset one second too late. This patch fixes
the issue by incrementing the offset during the leap second itself.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 20:00:14 +03:00
John Stultz a3e27f99cb ntp: Fixup adjtimex freq validation on 32-bit systems
commit 29183a70b0b828500816bd794b3fe192fce89f73 upstream.

Additional validation of adjtimex freq values to avoid
potential multiplication overflows were added in commit
5e5aeb4367b (time: adjtimex: Validate the ADJ_FREQUENCY values)

Unfortunately the patch used LONG_MAX/MIN instead of
LLONG_MAX/MIN, which was fine on 64-bit systems, but being
much smaller on 32-bit systems caused false positives
resulting in most direct frequency adjustments to fail w/
EINVAL.

ntpd only does direct frequency adjustments at startup, so
the issue was not as easily observed there, but other time
sync applications like ptpd and chrony were more affected by
the bug.

See bugs:

  https://bugzilla.kernel.org/show_bug.cgi?id=92481
  https://bugzilla.redhat.com/show_bug.cgi?id=1188074

This patch changes the checks to use LLONG_MAX for
clarity, and additionally the checks are disabled
on 32-bit systems since LLONG_MAX/PPM_SCALE is always
larger than the 32-bit long freq value, so multiplication
overflows aren't possible there.

Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Reported-by: George Joseph <george.joseph@fairview5.com>
Tested-by: George Joseph <george.joseph@fairview5.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Link: http://lkml.kernel.org/r/1423553436-29747-1-git-send-email-john.stultz@linaro.org
[ Prettified the changelog and the comments a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
2017-08-25 20:00:13 +03:00
Sasha Levin 08f43c9f5e time: adjtimex: Validate the ADJ_FREQUENCY values
commit 5e5aeb4367b450a28f447f6d5ab57d8f2ab16a5f upstream.

Verify that the frequency value from userspace is valid and makes sense.

Unverified values can cause overflows later on.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
[jstultz: Fix up bug for negative values and drop redundant cap check]
Signed-off-by: John Stultz <john.stultz@linaro.org>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lizefan@huawei.com>
2017-08-25 20:00:12 +03:00
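
For context, the freq value being validated arrives via adjtimex(2) in "scaled
ppm" (parts per million with a 16-bit binary fraction); that encoding is
documented in adjtimex(2) rather than stated in the commit. A sketch of a
direct frequency adjustment:

  #include <stdio.h>
  #include <sys/timex.h>

  int main(void)
  {
          struct timex tx = { 0 };
          double ppm = 12.5;                 /* desired clock correction */

          tx.modes = ADJ_FREQUENCY;
          tx.freq = (long)(ppm * 65536.0);   /* scaled ppm: 2^16 units per ppm */

          /* Values far outside a sane range are what the validation above
           * now rejects with EINVAL. */
          if (adjtimex(&tx) < 0)
                  perror("adjtimex ADJ_FREQUENCY");
          return 0;
  }
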
John Stultz fd9599139b clocksource: Fix abs() usage w/ 64bit values
commit 67dfae0cd72fec5cd158b6e5fb1647b7dbe0834c upstream.

This patch fixes one case where abs() was being used with 64-bit
nanosecond values, where the result may be capped at 32-bits.

This potentially could cause watchdog false negatives on 32-bit
systems, so this patch addresses the issue by using abs64().

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1442279124-7309-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lizefan@huawei.com>
2017-08-25 20:00:11 +03:00
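
The failure mode is easy to show in user space: the C library abs() takes and
returns int, so a 64-bit nanosecond delta is truncated before its sign is
stripped. In user space llabs() plays the role abs64() does in the kernel.

  #include <inttypes.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <stdlib.h>

  int main(void)
  {
          /* A negative delta of ~4.3 seconds, in nanoseconds. */
          int64_t delta = -4300000000LL;

          /* abs() truncates its argument to int, so the result is garbage. */
          printf("abs()  : %d\n", abs((int)delta));
          /* llabs() keeps the full 64-bit width, like the kernel's abs64(). */
          printf("llabs(): %" PRId64 "\n", (int64_t)llabs(delta));
          return 0;
  }
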
Stephen Boyd 32bbf0ac86 clocksource: Extract max nsec calculation into separate function
We need to calculate the same number in the clocksource code and
the sched_clock code, so extract this code into its own function.
We also drop the min_t and just use min() because the two types
are the same.

Change-Id: I65265c90f8d8e01dba82833ecac7b42af3a66414
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Git-commit: 87d8b9eb7e
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
2017-08-25 20:00:11 +03:00
AdrianDC 39aa2bafdf kernel: alarm: Increase power-on alarm before alarm time to 300 sec
* Makes sure the device has completely booted under all circumstances
 * Prevents missed alarms if the user has a very large number of apps

Change-Id: Ie33c98662a39d59922577a089ce0835d066f98a1
Signed-off-by: AdrianDC <radian.dc@gmail.com>
2017-08-25 20:00:10 +03:00
Kumar Gala bdbf6eda53 rtc: alarm: init power_on_alarm_lock mutex in alarmtimer_rtc_timer_init
Moved mutex_init of power_on_alarm_lock into alarmtimer_rtc_timer_init
so that if CONFIG_RTC_CLASS is not enabled, we don't try to initialize
the mutex.  This fixes a build issue when !CONFIG_RTC_CLASS.

Change-Id: I270564b94fd54162aa177a2c9767655098662259
Signed-off-by: Kumar Gala <galak@codeaurora.org>
2017-08-25 20:00:09 +03:00
Matthew Qin a85a8220c5 rtc: alarm: Add power-on alarm feature
Android does not support powering up the phone through an alarm.
Add a shutdown hook in the alarm driver which sets an alarm while the phone
is going down, so as to power up the phone after the alarm expires.

Change-Id: Ic2611e33ae9c1f8e83f21efdb93e26ac9f9499de
Signed-off-by: Matthew Qin <yqin@codeaurora.org>
2017-08-25 20:00:09 +03:00
Mohit Aggarwal 4ea7106deb rtc: alarm: Change wake-up source
Currently, RTC_ALARM is used to wake up the target from
the suspend state and is also used for the power-off alarm
feature. This patch uses the qtimer to wake up from
the suspend state.

Change-Id: Ia42cfecd573309be2f03c18b4f1c321be8202d7d
Signed-off-by: Mohit Aggarwal <maggarwa@codeaurora.org>
2017-08-25 20:00:08 +03:00
Mao Jinlong 285f9a2ec1 alarmtimer: add verification for rtc dev in power_on_alarm_init
Call alarmtimer_get_rtcdev() to get rtcdev instead of using the global
variable rtcdev to avoid the potential risk that rtcdev is not initialized
before using it.

Change-Id: Id186db22849b3667b553f34236a694f9ec6725a7
Signed-off-by: Mao Jinlong <c_jmao@codeaurora.org>
2017-08-25 20:00:04 +03:00
Mao Jinlong 8a4f17f18b alarm: init power_on_alarm in alarm_dev_init
The power_on alarm might be missed during suspend/resume if
power_on_alarm was not set in some cases, such as power-off charging
using a wall charger. So init power_on_alarm from the RTC alarm
register during boot to guarantee the power_on alarm will not be missed
by accident.

Change-Id: If4e84b5d34fd199897b178081e764f96d2b64194
Signed-off-by: Mao Jinlong <c_jmao@codeaurora.org>
2017-08-25 20:00:04 +03:00
Kumar Gala 3409d259aa rtc: alarm: init power_on_alarm_lock mutex in alarmtimer_rtc_timer_init
Moved mutex_init of power_on_alarm_lock into alarmtimer_rtc_timer_init
so that if CONFIG_RTC_CLASS is not enabled, we don't try to initialize
the mutex.  This fixes a build issue when !CONFIG_RTC_CLASS.

Change-Id: I270564b94fd54162aa177a2c9767655098662259
Signed-off-by: Kumar Gala <galak@codeaurora.org>
2017-08-25 20:00:03 +03:00
Mao Jinlong fa1ecf6a50 rtc: alarm: set power_on_alarm again when calling alarmtimer_resume
The power_on alarm might be replaced or cleared during the alarm
suspend and resume process. So restore the power_on alarm into the RTC
during alarm resume.

Change-Id: I1eefb082f9dfbd78517f3da1cc60a989853555fd
Signed-off-by: Mao Jinlong <c_jmao@codeaurora.org>
2017-08-25 20:00:03 +03:00
Xiaocheng Li 16b3099434 alarmtimer: add rtc irq support for alarm
Add rtc irq support to the alarmtimer to wake up the
alarm during system suspend.

Change-Id: I41b774ed4e788359321e1c6a564551cc9cd40c8e
Signed-off-by: Xiaocheng Li <lix@codeaurora.org>
2017-08-25 20:00:02 +03:00
John Stultz 1bc537874e alarmtimers: Squash upstream changes
staging: android-alarm: Switch from wakelocks to wakeup sources

In their current AOSP tree, the Android in-kernel wakelock
infrastructure has been reimplemented in terms of wakeup
sources:
http://git.linaro.org/gitweb?p=people/jstultz/android.git;a=commitdiff;h=e9911f4efdc55af703b8b3bb8c839e6f5dd173bb

The Android alarm driver currently has stubbed out calls
to wakelock functionality. So this patch simply converts
the stubbed out wakelock calls to wakeup source calls, and
removes the empty wakelock macros

Greg, would you mind queuing this in staging-next?

CC: Colin Cross <ccross@android.com>
CC: Arve Hjønnevåg <arve@android.com>
CC: Greg KH <gregkh@linuxfoundation.org>
CC: Android Kernel Team <kernel-team@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Staging: android: alarm: Rename pr_alarm to alarm_dbg

Rename a macro to make it explicit it's for debugging.

Use %s: __func__ instead of embedding function names.
Coalesce formats, align arguments.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: Android: Fix NULL pointer related warning in alarm-dev.c file

Fixes the following sparse warning:
drivers/staging/android/alarm-dev.c:259:35: warning: Using plain integer as NULL pointer

Cc: Brian Swetland <swetland@google.com>
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: android: alarm: remove unnecessary goto statement

Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Staging: android: Alarm driver cleanups

Little cleanups. Enum value ANDROID_ALARM_TYPE_COUNT was treated as
an alarm type within a switch statement. That condition was unreachable
though.

Signed-off-by: Dae S. Kim <dae@velatum.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: alarm-dev: Drop pre Android 1.0 _OLD ioctls

Per Colin's comment:
"The "support old userspace code" comment for those two ioctls has
been there since pre-Android 1.0.  Those apis are not exposed to
Android apps, I don't see any problem deleting them."

Thus this patch removes the ANDROID_ALARM_SET_OLD and
ANDROID_ALARM_SET_AND_WAIT_OLD ioctl compatability
logic.

Change-Id: I5138aaa3cdbfb758aaef2cc7591cb5340b3640a0
Cc: Serban Constantinescu <serban.constantinescu@arm.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Colin Cross <ccross@google.com>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: alarm-dev: Refactor alarm-dev ioctl code in prep for compat_ioctl

Cleanup the Android alarm-dev driver's ioctl code to refactor it
in preparation for compat_ioctl support.

Cc: Serban Constantinescu <serban.constantinescu@arm.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Colin Cross <ccross@google.com>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: alarm-dev: Implement compat_ioctl support

Implement compat_ioctl support for the alarm-dev ioctl.

Cc: Serban Constantinescu <serban.constantinescu@arm.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Colin Cross <ccross@google.com>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: alarm-dev: information leak in alarm_ioctl()

Smatch complains that if we pass an invalid clock type then "ts" is
never set.  We need to check for errors earlier, otherwise we end up
passing uninitialized stack data to userspace.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: alarm-dev: information leak in alarm_compat_ioctl()

If we pass an invalid clock type then "ts" is never set.  We need to
check for errors earlier, otherwise we end up passing uninitialized
stack data to userspace.

Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rtc: alarm: Add power-on alarm feature

Android does not support powering up the phone through an alarm.
Add a shutdown hook in the alarm driver which sets an alarm while the phone
is going down, so as to power up the phone after the alarm expires.

Change-Id: Ic2611e33ae9c1f8e83f21efdb93e26ac9f9499de
Signed-off-by: Matthew Qin <yqin@codeaurora.org>

qpnp-rtc: clear alarm register when rtc irq is disabled

The rtc alarm register should be cleared when the rtc irq is
disabled

Change-Id: I97a8bf989ff610093240a6b308a297702da6cb89
Signed-off-by: Xiaocheng Li <lix@codeaurora.org>
Signed-off-by: Matthew Qin <yqin@codeaurora.org>

alarm : Fix the race conditions in alarm-dev.c

There will be race conditions between alarm set and alarm clear if
set_power_on_alarm is outside the lock. But set_power_on_alarm can't
be called under a spin_lock, so protect it with a mutex_lock instead.

CRs-Fixed: 639115
Change-Id: I4226a95e499211c0d50ff7ce269467a57a410dc7
Signed-off-by: Mao Jinlong <c_jmao@codeaurora.org>

switch timerfd_[sg]ettime(2) to fget_light()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

time: Enable alarmtimers

Change-Id: I4d1ea553aa707ab0467e00dea86cedc8f6797b78
2017-08-25 20:00:02 +03:00
Venkatesh Yadav Abbarapu fad9fdf0cc rtc: alarm: Fix data handling issue with alarm->type
Restructure data handling in alarm_start_relative and
alarm_try_to_cancel to handle the possibility of alarm->type
being referenced incorrectly.

Change-Id: I053fcaff28e4746aeef299c9605d757df2dea532
Signed-off-by: Venkatesh Yadav Abbarapu <quicvenkat@codeaurora.org>
2017-08-25 20:00:01 +03:00
Richard Larocque fcc5b97ac3 alarmtimer: Lock k_itimer during timer callback
commit 474e941bed upstream.

Locks the k_itimer's it_lock member when handling the alarm timer's
expiry callback.

The regular posix timers defined in posix-timers.c have this lock held
during timout processing because their callbacks are routed through
posix_timer_fn().  The alarm timers follow a different path, so they
ought to grab the lock somewhere else.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sharvil Nanavati <sharvil@google.com>
Signed-off-by: Richard Larocque <rlarocque@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
2017-08-25 20:00:00 +03:00
Richard Larocque 2d8f450be3 alarmtimer: Do not signal SIGEV_NONE timers
commit 265b81d23a upstream.

Avoids sending a signal to alarm timers created with sigev_notify set to
SIGEV_NONE by checking for that special case in the timeout callback.

The regular posix timers avoid sending signals to SIGEV_NONE timers by
not scheduling any callbacks for them in the first place.  Although it
would be possible to do something similar for alarm timers, it's simpler
to handle this as a special case in the timeout.

Prior to this patch, the alarm timer would ignore the sigev_notify value
and try to deliver signals to the process anyway.  Even worse, the
sanity check for the value of sigev_signo is skipped when SIGEV_NONE was
specified, so the signal number could be bogus.  If sigev_signo was an
uninitialized value (as it often would be if SIGEV_NONE is used), then
it's hard to predict which signal will be sent.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sharvil Nanavati <sharvil@google.com>
Signed-off-by: Richard Larocque <rlarocque@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
2017-08-25 20:00:00 +03:00
Richard Larocque f141250132 alarmtimer: Return relative times in timer_gettime
commit e86fea7649 upstream.

Returns the time remaining for an alarm timer, rather than the time at
which it is scheduled to expire.  If the timer has already expired or it
is not currently scheduled, the it_value's members are set to zero.

This new behavior matches that of the other posix-timers and the POSIX
specifications.

This is a change in user-visible behavior, and may break existing
applications.  Hopefully, few users rely on the old incorrect behavior.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sharvil Nanavati <sharvil@google.com>
Signed-off-by: Richard Larocque <rlarocque@google.com>
[jstultz: minor style tweak]
Signed-off-by: John Stultz <john.stultz@linaro.org>
[lizf: Backported to 3.4:
 - add alarm_expires_remaining() introduced by commit 6cffe00f7d]
Signed-off-by: Zefan Li <lizefan@huawei.com>
2017-08-25 19:59:59 +03:00
John Stultz 29d39ef2fc alarmtimer: Fix bug where relative alarm timers were treated as absolute
commit 16927776ae upstream.

Sharvil noticed with the posix timer_settime interface, using the
CLOCK_REALTIME_ALARM or CLOCK_BOOTTIME_ALARM clockid, if the users
tried to specify a relative time timer, it would incorrectly be
treated as absolute regardless of the state of the flags argument.

This patch corrects this, properly checking the absolute/relative flag,
as well as adds further error checking that no invalid flag bits are set.

Reported-by: Sharvil Nanavati <sharvil@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sharvil Nanavati <sharvil@google.com>
Link: http://lkml.kernel.org/r/1404767171-6902-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 19:59:59 +03:00
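A hedged sketch of the two arming modes the commit above distinguishes (illustrative, not from the patch): flags == 0 means relative to now, TIMER_ABSTIME means an absolute point on the timer's clock.

#include <signal.h>
#include <stdio.h>
#include <time.h>

int main(void)
{
	struct sigevent sev = { .sigev_notify = SIGEV_NONE };
	struct itimerspec rel = { .it_value = { .tv_sec = 5 } };
	struct itimerspec abst = { { 0, 0 }, { 0, 0 } };
	timer_t t;

	if (timer_create(CLOCK_BOOTTIME_ALARM, &sev, &t) == -1)
		return 1;			/* typically needs CAP_WAKE_ALARM */

	/* flags == 0: "5 seconds from now".  Before this fix the alarm
	 * clockids wrongly treated this value as an absolute time. */
	timer_settime(t, 0, &rel, NULL);

	/* TIMER_ABSTIME: an explicit point on CLOCK_BOOTTIME. */
	clock_gettime(CLOCK_BOOTTIME, &abst.it_value);
	abst.it_value.tv_sec += 5;
	timer_settime(t, TIMER_ABSTIME, &abst, NULL);
	return 0;
}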
KOSAKI Motohiro 66c2ba6db2 alarmtimer: return EINVAL instead of ENOTSUPP if rtcdev doesn't exist
commit 98d6f4dd84 upstream.

The Fedora Ruby maintainer reported that the latest Ruby doesn't work on
Fedora Rawhide on ARM. (http://bugs.ruby-lang.org/issues/9008)

This is because commit 1c6b39ad3f (alarmtimers: Return -ENOTSUPP if no
RTC device is present) introduced a return of ENOTSUPP when
clock_get{time,res} can't find an RTC device. However, this is incorrect.

First, ENOTSUPP isn't exported to userland (ENOTSUP or EOPNOTSUPP are the
closest userland equivalents).

Second, POSIX and Linux man pages agree that clock_gettime and
clock_getres should return EINVAL if the clk_id argument is invalid.
While the argument could be made that the clockid is valid but just not
supported on this hardware, that is just a technicality that doesn't
help userspace applications, and only complicates error handling.

Thus, this patch changes the code to use EINVAL.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Reported-by: Vit Ondruch <v.ondruch@tiscali.cz>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
[jstultz: Tweaks to commit message to include full rationale]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 19:59:58 +03:00
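A hedged userspace check of the error path described above (not from the patch): on a system whose alarm clocks are not backed by an RTC, clock_gettime() now fails with EINVAL instead of the kernel-internal ENOTSUPP.

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

int main(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_REALTIME_ALARM, &ts) == -1)
		/* with this fix: EINVAL, an error code userspace knows */
		printf("clock_gettime: errno=%d (%s)\n", errno, strerror(errno));
	else
		printf("alarm clock available: %ld s\n", (long)ts.tv_sec);
	return 0;
}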
Todd Poynor ec80dc9e14 alarmtimer: add alarm_expires_remaining
Similar to hrtimer_expires_remaining, return the amount of time
remaining until alarm expiry.

Change-Id: I8c57512d619ac66bcdaf2d9ccdf0d7f74af2ff66
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2017-08-25 19:59:57 +03:00
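A minimal sketch of such a helper, mirroring hrtimer_expires_remaining (a reconstruction from the description above, not the patch verbatim):

/* kernel/time/alarmtimer.c (sketch) */
ktime_t alarm_expires_remaining(const struct alarm *alarm)
{
	struct alarm_base *base = &alarm_bases[alarm->type];

	/* expiry minus the current time of the alarm's clock base */
	return ktime_sub(alarm->node.expires, base->gettime());
}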
Todd Poynor d2d0a8cb2c alarmtimer: add alarm_start_relative
Start an alarmtimer with an expires time relative to the current time
of the associated clock.

Change-Id: Ifb5309a15e0d502bb4d0209ca5510a56ee7fa88c
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2017-08-25 19:59:57 +03:00
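A minimal sketch under the same assumptions (reconstruction, return type follows the kernels of that era): add the relative offset to the base clock's current time and reuse alarm_start().

/* kernel/time/alarmtimer.c (sketch) */
int alarm_start_relative(struct alarm *alarm, ktime_t start)
{
	struct alarm_base *base = &alarm_bases[alarm->type];

	/* turn "delta from now" into an absolute expiry on this base */
	return alarm_start(alarm, ktime_add(start, base->gettime()));
}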
Todd Poynor f76497eb33 alarmtimer: add alarm_forward_now
Similar to hrtimer_forward_now, move the expires time forward to an
interval from the current time of the associated clock.

Change-Id: I73fed223321167507b6eddcb7a57d235ffcfc1be
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2017-08-25 19:59:56 +03:00
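Again a minimal sketch (reconstruction): delegate to alarm_forward() with the base clock's current time, just as hrtimer_forward_now() does for hrtimers.

/* kernel/time/alarmtimer.c (sketch) */
u64 alarm_forward_now(struct alarm *alarm, ktime_t interval)
{
	struct alarm_base *base = &alarm_bases[alarm->type];

	/* returns the number of overruns, like hrtimer_forward_now() */
	return alarm_forward(alarm, base->gettime(), interval);
}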
Todd Poynor 11e30d5ce1 alarmtimer: add alarm_restart
Analogous to hrtimer_restart, restart an alarmtimer after the expires
time has already been updated (as with alarm_forward).

Change-Id: Ia2613bbb467404cb2c35c11efa772bc56294963a
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2017-08-25 19:59:55 +03:00
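A minimal sketch of the restart path (reconstruction; locking details may differ): re-arm the per-alarm hrtimer on the already-updated expiry and put the alarm back on its base's timerqueue.

/* kernel/time/alarmtimer.c (sketch) */
void alarm_restart(struct alarm *alarm)
{
	struct alarm_base *base = &alarm_bases[alarm->type];
	unsigned long flags;

	spin_lock_irqsave(&base->lock, flags);
	hrtimer_set_expires(&alarm->timer, alarm->node.expires);
	hrtimer_restart(&alarm->timer);
	alarmtimer_enqueue(base, alarm);
	spin_unlock_irqrestore(&base->lock, flags);
}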
John Stultz ae8e0de35c alarmtimer: Use hrtimer per-alarm instead of per-base
Arve Hjønnevåg reported numerous crashes from the
"BUG_ON(timer->state != HRTIMER_STATE_CALLBACK)" check
in __run_hrtimer after it called alarmtimer_fired.

It turns out the alarmtimer code was not properly handling
possible failures of hrtimer_try_to_cancel, and because
these failures occur when the underlying base hrtimer is
being run, this limits the ability to properly handle
modifications to any alarmtimers on that base.

Because much of the logic duplicates the hrtimer logic,
it seems that we might as well have a per-alarmtimer
hrtimer, and avoid the extra complexity of trying to
multiplex many alarmtimers off of one hrtimer.

Thus this patch moves the hrtimer to the alarm structure
and simplifies the management logic.

Change-Id: I54b2d718ed1d39598e321eacf88c3335284d4cea
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Colin Cross <ccross@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-08-25 19:59:54 +03:00
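After this change each alarm embeds its own hrtimer; roughly (a sketch of the resulting structure, field order and extra members may differ):

/* include/linux/alarmtimer.h (sketch) */
struct alarm {
	struct timerqueue_node	node;
	struct hrtimer		timer;	/* one hrtimer per alarm, no multiplexing */
	enum alarmtimer_restart	(*function)(struct alarm *, ktime_t now);
	enum alarmtimer_type	type;
	int			state;
	void			*data;
};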
Todd Poynor a9df325c74 alarmtimer: implement minimum alarm interval for allowing suspend
Have alarmtimer suspend return -EBUSY if the next alarm will fire in less
than 2 seconds.  This allows one RTC seconds tick to occur subsequent
to this check before the alarm wakeup time is set, ensuring the wakeup
time is still in the future (assuming the RTC does not tick one more
second prior to setting the alarm).  If suspend is rejected, hold a
wakeup source for 2 seconds to process the alarm prior to reattempting
suspend.

Change-Id: If38e2568da0ea01dfee6e00323ce7e2c00f2f110
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2017-08-25 19:59:54 +03:00
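A hedged sketch of the check described above, as it might sit in alarmtimer_suspend() once the delta to the soonest expiry (min) has been computed; the wakeup-source variable name is an assumption, not the patch verbatim.

/* kernel/time/alarmtimer.c, inside alarmtimer_suspend() (sketch) */
if (ktime_to_ns(min) < 2 * NSEC_PER_SEC) {
	/* too close to expiry to safely program the RTC: refuse suspend
	 * and hold a wakeup source for 2s so the alarm is processed
	 * before suspend is retried */
	__pm_wakeup_event(ws, 2 * MSEC_PER_SEC);
	return -EBUSY;
}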
Steve Kondik 2cf668fdf2 Revert "sched: reinitialize rq->next_balance when a CPU is hot-added"
This reverts commit ff1a78c96a97a4a4b6fd3405f7be9543ef8332c2.

Revert "nohz: stat: Fix CPU idle time accounting"

This reverts commit 201ab603673f4007e26350e8ecb89ee7563c7ee1.

Revert "sched: remove redundant update_runtime notifier"

This reverts commit 755305a60021cd32736c876fce03dfaba52e6a84.

Revert "sched,rt: disable rt_runtime borrowing by default"

This reverts commit b7e4061cf5a93f3fe2702ec79aaf61b7495375e9.

Revert "sched/rt: Remove redundant nr_cpus_allowed test"

This reverts commit 18f87a01c955253487b8a6fca2e94ef0a57fac60.

Revert "sched: Micro-optimize the smart wake-affine logic"

This reverts commit 5219d51eca28a8e02d136e142d8188e14e4abc62.

Revert "sched: Implement smarter wake-affine logic"

This reverts commit c93bc0af21dc9921a23e8a77a207b81781896773.

Revert "sched: Reduce overestimating rq->avg_idle"

This reverts commit 80ca5758d20583282e33927c43293d618b55e41f.

Revert "sched: scale the busy and this queue's per-task load before compare"

This reverts commit 785aa0522f4cbec1eabc2c077f016db543684df4.

Revert "sched: reduce calculation effort in fix_small_imbalance"

This reverts commit 769ec1bfe483138827c8e225588c7b4228fba15e.

Conflicts:
	fs/proc/stat.c

Change-Id: I67ce80ef55893c71f01728069efa40d7e6373659
2017-08-25 16:04:32 +03:00
Bo Yan 1f0200aad7 nohz: stat: Fix CPU idle time accounting
Since cpustat[CPUTIME_IDLE] is never connected to ts->idle_sleeptime,
never read from cpustat[CPUTIME_IDLE] when reporting stats in
/proc/stat.

Note this was rejected by Michal Hocko when it was initially proposed
by Martin Schwidefsky in LKML, so if you want to upstream it, better
find an alternative (either completely disable cpustat[CPUTIME_IDLE]
for CONFIG_NO_HZ or somehow connect them to keep them in sync.)

bug 1190321

Conflicts:
	fs/proc/stat.c

Change-Id: Idc92488910b826aff850a010016d8326c7ab9e6c
Signed-off-by: Bo Yan <byan@nvidia.com>
Reviewed-on: http://git-master/r/214638
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Liang Cheng (SW) <licheng@nvidia.com>
Tested-by: Liang Cheng (SW) <licheng@nvidia.com>
2017-08-25 16:04:25 +03:00
Maria Yu ee27e49654 rtc: alarm: have dummy set_power_on_alarm function
When !CONFIG_RTC_CLASS we need to have a dummy set_power_on_alarm
function. Otherwise, it is easy to have build issues.

Change-Id: I39310241cc2e59f672964c8bc0ee4c3607aedf4e
Signed-off-by: Maria Yu <aiquny@codeaurora.org>
2017-08-25 16:04:17 +03:00
Kumar Gala 590dbc9a2e rtc: alarm: init power_on_alarm_lock mutex in alarmtimer_rtc_timer_init
Moved mutex_init of power_on_alarm_lock into alarmtimer_rtc_timer_init
so that if CONFIG_RTC_CLASS is not enabled, we don't try to initialize
the mutex.  This fixes a build issue when !CONFIG_RTC_CLASS.

Change-Id: I270564b94fd54162aa177a2c9767655098662259
Signed-off-by: Kumar Gala <galak@codeaurora.org>
2017-08-25 16:04:10 +03:00
Daniel Lezcano 062f9b42cd tick: Dynamically set broadcast irq affinity
When a cpu goes to a deep idle state where its local timer is
shut down, it notifies the time framework to use the broadcast timer
instead.  Unfortunately, the broadcast device could wake up any CPU,
including an idle one which is not concerned by the wake up at all. So
in the worst case an idle CPU will wake up to send an IPI to the CPU
whose timer expired.

Provide an opt-in feature CLOCK_EVT_FEAT_DYNIRQ which tells the core
that it should set the interrupt affinity of the broadcast interrupt
to the cpu which has the earliest expiry time. This avoids unnecessary
spurious wakeups and IPIs.

[ tglx: Adopted to cpumask rework, silenced an uninitialized warning,
  massaged changelog ]

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: viresh.kumar@linaro.org
Cc: jacob.jun.pan@linux.intel.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: santosh.shilimkar@ti.com
Cc: linaro-kernel@lists.linaro.org
Cc: patches@linaro.org
Cc: rickard.andersson@stericsson.com
Cc: vincent.guittot@linaro.org
Cc: linus.walleij@stericsson.com
Cc: john.stultz@linaro.org
Link: http://lkml.kernel.org/r/1362219013-18173-3-git-send-email-daniel.lezcano@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Change-Id: I6a880a39dd595526b80a6d72b88be74163513da9
Signed-off-by: Karthik Parsha <kparsha@codeaurora.org>
Signed-off-by: Mahesh Sivasubramanian <msivasub@codeaurora.org>
2017-08-25 16:04:02 +03:00
Feng Tang 0b5c47ff0b timekeeping: utilize the suspend-nonstop clocksource to count suspended time
There are some new processors whose TSC clocksource won't stop during
suspend. Currently, after system resumes, kernel will use persistent
clock or RTC to compensate the sleep time, but with these nonstop
clocksources, we could skip the special compensation from external
sources, and just use current clocksource for time recounting.

This can solve some time drift bugs caused by some not-so-accurate or
error-prone RTC devices.

The current way to count suspended time is to first try the persistent
clock, and then the RTC if the persistent clock can't be used. This
patch changes the order to:
	suspend-nonstop clocksource -> persistent clock -> RTC

When counting the sleep time with nonstop clocksource, use an accurate way
suggested by Jason Gunthorpe to cover very large delta cycles.

CRs-fixed: 536881
Change-Id: I59c21b1ce5ae81cebc8cf3e817d9b3f951302a3d
Signed-off-by: Feng Tang <feng.tang@intel.com>
[jstultz: Small optimization, avoiding re-reading the clocksource]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Git-commit: e445cf1c42
[sboyd: Took tk and clock local variables]
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2017-08-25 16:03:55 +03:00
AdrianDC 8e17455f73 Revert "HACK: time: Disable alarmtimer"
This reverts commit abbb445f65bbb139202fde5a66f9a249977058c9.
2017-08-25 16:03:46 +03:00
Colin Cross ae8a974101 timekeeping: fix 32-bit overflow in get_monotonic_boottime
fixed upstream in v3.6 by ec145babe7

get_monotonic_boottime adds three nanosecond values stored
in longs, followed by an s64.  If the long values are all
close to 1e9 the first three additions can overflow and
become negative when added to the s64.  Cast the first
value to s64 so that all additions are 64 bit.

Change-Id: Id90beaf652571841b33cc6613d4744df33f5f007
Signed-off-by: Colin Cross <ccross@android.com>
[jstultz: Fished this out of the AOSP common.git tree. This was
fixed upstream in v3.6 by ec145babe7]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25 16:03:40 +03:00
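The arithmetic is easy to reproduce in plain C (an illustrative userspace demo, using int32_t to stand in for a 32-bit long): three values near 1e9 wrap in the 32-bit sum before the 64-bit addition, and casting the first operand widens the whole chain.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* three nanosecond fields close to 1e9, as in the commit above */
	int32_t a = 999999999, b = 999999999, c = 999999999;
	int64_t nsecs = 5;

	/* the 32-bit sum overflows (wraps negative in practice) before
	 * it ever reaches the 64-bit addition */
	int64_t broken = a + b + c + nsecs;
	/* casting the first value promotes the whole chain to 64 bit */
	int64_t fixed = (int64_t)a + b + c + nsecs;

	printf("broken=%lld fixed=%lld\n", (long long)broken, (long long)fixed);
	return 0;
}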
Kees Cook 7b25909341 time: Remove CONFIG_TIMER_STATS
Currently CONFIG_TIMER_STATS exposes process information across namespaces:

kernel/time/timer_list.c print_timer():

        SEQ_printf(m, ", %s/%d", tmp, timer->start_pid);

/proc/timer_list:

 #11: <0000000000000000>, hrtimer_wakeup, S:01, do_nanosleep, cron/2570

Given that the tracer can give the same information, this patch entirely
removes CONFIG_TIMER_STATS.

Change-Id: I46f71dd592c2d241aacb1bfe7165c07254bc4298
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: linux-doc@vger.kernel.org
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Xing Gao <xgao01@email.wm.edu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jessica Frazelle <me@jessfraz.com>
Cc: kernel-hardening@lists.openwall.com
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-api@vger.kernel.org
Cc: Arjan van de Ven <arjan@linux.intel.com>
Link: http://lkml.kernel.org/r/20170208192659.GA32582@beast
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
haggertk: Backported to 3.4
Signed-off-by: Kevin F. Haggerty <haggertk@lineageos.org>
2017-07-02 13:03:26 +03:00
Steven Rostedt (Red Hat) 862c1fdf15 ring-buffer: Prevent overflow of size in ring_buffer_resize()
If the size passed to ring_buffer_resize() is greater than MAX_LONG - BUF_PAGE_SIZE
then the DIV_ROUND_UP() will return zero.

Here's the details:

  # echo 18014398509481980 > /sys/kernel/debug/tracing/buffer_size_kb

tracing_entries_write() processes this and converts kb to bytes.

 18014398509481980 << 10 = 18446744073709547520

and this is passed to ring_buffer_resize() as unsigned long size.

 size = DIV_ROUND_UP(size, BUF_PAGE_SIZE);

Where DIV_ROUND_UP(a, b) is (a + b - 1)/b

BUF_PAGE_SIZE is 4080 and here

 18446744073709547520 + 4080 - 1 = 18446744073709551599

where 18446744073709551599 is still smaller than 2^64

 2^64 - 18446744073709551599 = 17

But now 18446744073709551599 / 4080 = 4521260802379792

and size = size * 4080 = 18446744073709551360

This is checked to make sure it's still greater than 2 * 4080,
which it is.

Then we convert to the number of buffer pages needed.

 nr_page = DIV_ROUND_UP(size, BUF_PAGE_SIZE)

but this time size is 18446744073709551360 and

 2^64 - (18446744073709551360 + 4080 - 1) = -3823

Thus it overflows and the resulting number is less than 4080, which makes

  3823 / 4080 = 0

and nr_pages is set to this. As we already checked against the minimum that
nr_pages may be, this causes the logic to fail as well, and we crash the
kernel.

There's no reason to have the two DIV_ROUND_UP() (that's just result of
historical code changes), clean up the code and fix this bug.

Change-Id: I7744dfdd1c3be9676f767139002b5f57c41d87b2
Cc: stable@vger.kernel.org # 3.5+
Fixes: 83f40318da ("ring-buffer: Make removal of ring buffer pages atomic")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2017-06-26 21:35:28 +03:00
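The wrap-around is reproducible with the same arithmetic in userspace (illustrative demo, assuming a 64-bit unsigned long): the second DIV_ROUND_UP overflows and yields zero pages.

#include <stdio.h>

#define BUF_PAGE_SIZE		4080UL
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

int main(void)
{
	unsigned long size = 18014398509481980UL << 10;	/* kb -> bytes */
	unsigned long nr_pages;

	size = DIV_ROUND_UP(size, BUF_PAGE_SIZE) * BUF_PAGE_SIZE;
	nr_pages = DIV_ROUND_UP(size, BUF_PAGE_SIZE);	/* size + 4079 wraps */

	printf("size=%lu nr_pages=%lu\n", size, nr_pages);	/* nr_pages == 0 */
	return 0;
}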
Nick Desaulniers 6e29c7b713 cgroup: prefer %pK to %p
Prevents leaking kernel pointers when using kptr_restrict.

Bug: 30149174
Change-Id: I0fa3cd8d4a0d9ea76d085bba6020f1eda073c09b
Git-repo: https://android.googlesource.com/kernel/msm.git
Git-commit: 505e48f32f
Signed-off-by: Srinivasa Rao Kuppala <srkupp@codeaurora.org>
2017-06-26 19:42:11 +03:00
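For context, the difference is only the printk format specifier; a hedged one-line illustration (cgrp is an illustrative pointer, not the exact call site the patch touches):

/* %p prints the raw kernel address; %pK prints zeros or the real address
 * depending on the kptr_restrict sysctl and the reader's privileges */
pr_info("cgroup %pK released with non-zero refcount\n", cgrp);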
John Dias c746791c71 perf: don't leave group_entry on sibling list (use-after-free)
When perf_group_detach is called on a group leader,
it should empty its sibling list. Otherwise, when
a sibling is later deallocated, list_del_event()
removes the sibling's group_entry from its current
list, which can be the now-deallocated group leader's
sibling list (use-after-free bug).

Bug: 32402548
Change-Id: I99f6bc97c8518df1cb0035814368012ba72ab1f1
Signed-off-by: John Dias <joaodias@google.com>
2017-06-07 13:08:01 -06:00
Tom Marshall 75ec7fa33f kernel: Only expose su when daemon is running
Note: this is for the 3.4 kernel

It has been claimed that the PG implementation of 'su' has security
vulnerabilities even when disabled.  Unfortunately, the people that
find these vulnerabilities often like to keep them private so they
can profit from exploits while leaving users exposed to malicious
hackers.

In order to reduce the attack surface for vulnerabilities, it is
therefore necessary to make 'su' completely inaccessible when it
is not in use (except by the root and system users).

Change-Id: Ia7d50ba46c3d932c2b0ca5fc8e9ec69ec9045f85
2017-05-19 18:41:25 -06:00
Amey Telawane 31406fa807 trace: resolve stack corruption due to string copy
strcpy has no limit on the string being copied, which causes
stack corruption leading to a kernel panic. Use strlcpy to
resolve the issue by providing the length of the string to be copied.

CRs-fixed: 1048480
CAF-Change-Id: Ib290b25f7e0ff96927b8530e5c078869441d409f
Signed-off-by: Amey Telawane <ameyt@codeaurora.org>

CVE-2017-0605

Change-Id: I300bf476a38a15d515a2e1d795a53650b209a701
(cherry picked from commit 2161ae9a70b12cf18ac8e5952a20161ffbccb477)
2017-05-01 19:35:12 -06:00
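A hedged sketch of the pattern the fix applies (buffer and source names are illustrative, not the exact call site): bound the copy to the destination size so an over-long or corrupted source cannot run past the on-stack buffer.

char comm[TASK_COMM_LEN];

/* before: strcpy(comm, saved_cmdline);  -- unbounded, can smash the stack */
strlcpy(comm, saved_cmdline, sizeof(comm));	/* bounded, always NUL-terminated */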
Peter Zijlstra ec19de36ed perf: Tighten (and fix) the grouping condition
The fix from 9fc81d87420d ("perf: Fix events installation during
moving group") was incomplete in that it failed to recognise that
creating a group with events for different CPUs is semantically
broken -- they cannot be co-scheduled.

Furthermore, it leads to real breakage where, when we create an event
for CPU Y and then migrate it to form a group on CPU X, the code gets
confused about where the counter is programmed -- triggered in practice
as well by me via the perf fuzzer.

Fix this by tightening the rules for creating groups. Only allow
grouping of counters that can be co-scheduled in the same context.
This means for the same task and/or the same cpu.

Fixes: 9fc81d87420d ("perf: Fix events installation during moving group")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150123125834.090683288@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

CVE-2015-9004

Change-Id: I8775ff596bbf97f6eeaaa679c319618fb7a9c639
(cherry picked from commit c3c87e770458aa004bd7ed3f29945ff436fd6511)
2017-05-01 19:11:56 -06:00
Paul Moore 2b53ad2c75 BACKPORT: audit: fix a double fetch in audit_log_single_execve_arg()
(cherry picked from commit 43761473c254b45883a64441dd0bc85a42f3645c)

There is a double fetch problem in audit_log_single_execve_arg()
where we first check the execve(2) arguments for any "bad" characters
which would require hex encoding and then re-fetch the arguments for
logging in the audit record[1].  Of course this leaves a window of
opportunity for an unsavory application to munge with the data.

This patch reworks things by only fetching the argument data once[2]
into a buffer where it is scanned and logged into the audit
records(s).  In addition to fixing the double fetch, this patch
improves on the original code in a few other ways: better handling
of large arguments which require encoding, stricter record length
checking, and some performance improvements (completely unverified,
but we got rid of some strlen() calls, that's got to be a good
thing).

As part of the development of this patch, I've also created a basic
regression test for the audit-testsuite, the test can be tracked on
GitHub at the following link:

 * https://github.com/linux-audit/audit-testsuite/issues/25

[1] If you pay careful attention, there is actually a triple fetch
problem due to a strnlen_user() call at the top of the function.

[2] This is a tiny white lie, we do make a call to strnlen_user()
prior to fetching the argument data.  I don't like it, but due to the
way the audit record is structured we really have no choice unless we
copy the entire argument at once (which would require a rather
wasteful allocation).  The good news is that with this patch the
kernel no longer relies on this strnlen_user() value for anything
beyond recording it in the log, we also update it with a trustworthy
value whenever possible.

Reported-by: Pengfei Wang <wpengfeinudt@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Change-Id: I10e979e94605e3cf8d461e3e521f8f9837228aa5
Bug: 30956807
2016-11-11 13:44:38 +11:00
Peter Zijlstra 145365bfb6 perf: Fix race in swevent hash
There's a race on CPU unplug where we free the swevent hash array
while it can still have events on. This will result in a
use-after-free which is BAD.

Simply do not free the hash array on unplug. This leaves the thing
around and no use-after-free takes place.

When the last swevent dies, we do a for_each_possible_cpu() iteration
anyway to clean these up, at which time we'll free it, so no leakage
will occur.

Change-Id: I751faf3215bbdaa6b6358f3a752bdd24126cfa0b
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-11 13:37:45 +11:00
dcashman dcc94e7ac7 FROMLIST: mm: mmap: Add new /proc tunable for mmap_base ASLR.
(cherry picked from commit https://lkml.org/lkml/2015/12/21/337)

ASLR  only uses as few as 8 bits to generate the random offset for the
mmap base address on 32 bit architectures. This value was chosen to
prevent a poorly chosen value from dividing the address space in such
a way as to prevent large allocations. This may not be an issue on all
platforms. Allow the specification of a minimum number of bits so that
platforms desiring greater ASLR protection may determine where to place
the trade-off.

Bug: 24047224
Signed-off-by: Daniel Cashman <dcashman@android.com>
Signed-off-by: Daniel Cashman <dcashman@google.com>
Change-Id: Ic74424e07710cd9ccb4a02871a829d14ef0cc4bc
2016-10-29 23:12:40 +08:00
Paul E. McKenney 2f8470379e cpu: Handle smpboot_unpark_threads() uniformly
Commit 00df35f99191 (cpu: Defer smpboot kthread unparking until CPU known
to scheduler) put the online path's call to smpboot_unpark_threads()
into a CPU-hotplug notifier.  This commit places the offline-failure
paths call into the same notifier for the sake of uniformity.

Note that it is not currently possible to place the offline path's call to
smpboot_park_threads() into an existing notifier because the CPU_DYING
notifiers run in a restricted environment, and the CPU_UP_PREPARE
notifiers run too soon.

Change-Id: I27fff8de4ec3f0193c1c9cf9ccb5701d08faa1ca
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-10-29 23:12:40 +08:00
Paul E. McKenney b7703ca834 cpu: Defer smpboot kthread unparking until CPU known to scheduler
Currently, smpboot_unpark_threads() is invoked before the incoming CPU
has been added to the scheduler's runqueue structures.  This might
potentially cause the unparked kthread to run on the wrong CPU, since the
correct CPU isn't fully set up yet.

That causes a sporadic, hard to debug boot crash triggering on some
systems, reported by Borislav Petkov, and bisected down to:

  2a442c9c6453 ("x86: Use common outgoing-CPU-notification code")

This patch places smpboot_unpark_threads() in a CPU hotplug
notifier with priority set so that these kthreads are unparked just after
the CPU has been added to the runqueues.

Change-Id: I8921987de9c2a2f475cc63dc82662d6ebf6e8725
Reported-and-tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Git-commit: 00df35f991914db6b8bde8cf09808e19a9cffc3d
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Matt Wagantall <mattw@codeaurora.org>
2016-10-29 23:12:40 +08:00
Vignesh Radhakrishnan 061336f2bb smpboot: use kmemleak_not_leak for smpboot_thread_data
Kmemleak reports the following memory leak :

    [<ffffffc0002faef8>] create_object+0x140/0x274
    [<ffffffc000cc3598>] kmemleak_alloc+0x80/0xbc
    [<ffffffc0002f707c>] kmem_cache_alloc_trace+0x148/0x1d8
    [<ffffffc00024504c>] __smpboot_create_thread.part.2+0x2c/0xec
    [<ffffffc0002452b4>] smpboot_register_percpu_thread+0x90/0x118
    [<ffffffc0016067c0>] spawn_ksoftirqd+0x1c/0x30
    [<ffffffc000200824>] do_one_initcall+0xb0/0x14c
    [<ffffffc001600820>] kernel_init_freeable+0x84/0x1e0
    [<ffffffc000cc273c>] kernel_init+0x10/0xcc
    [<ffffffc000203bbc>] ret_from_fork+0xc/0x50

The memory allocated here points to smpboot_thread_data, which is
used as an argument for this kthread.

It will be used when smpboot_thread_fn runs. Therefore, it is not a
leak.

Call kmemleak_not_leak for smpboot_thread_data pointer
to ensure that kmemleak doesn't report it as a memory
leak.

Change-Id: I02b0a7debea3907b606856e069d63d7991b67cd9
Signed-off-by: Vignesh Radhakrishnan <vigneshr@codeaurora.org>
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
2016-10-29 23:12:40 +08:00
Lai Jiangshan 13ad0c7bc0 smpboot: Add missing get_online_cpus() in smpboot_register_percpu_thread()
The following race exists in the smpboot percpu threads management:

CPU0	      	   	     CPU1
cpu_up(2)
  get_online_cpus();
  smpboot_create_threads(2);
			     smpboot_register_percpu_thread();
			     for_each_online_cpu();
			       __smpboot_create_thread();
  __cpu_up(2);

This results in a missing per cpu thread for the newly onlined cpu2 and
in a NULL pointer dereference on a consecutive offline of that cpu.

Protect smpboot_register_percpu_thread() with get_online_cpus() to
prevent that.

[ tglx: Massaged changelog and removed the change in
        smpboot_unregister_percpu_thread() because that's an
        optimization and therefor not stable material. ]

Change-Id: I8c92a64bf35c3e77c8dd81761e9c8f71b2f94817
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1406777421-12830-1-git-send-email-laijs@cn.fujitsu.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:40 +08:00
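A hedged sketch of the shape of the fix (reconstruction; error handling and the unpark step are simplified): hold the hotplug lock across the for_each_online_cpu() walk so no CPU can come online between thread creation and registration.

/* kernel/smpboot.c (sketch) */
int smpboot_register_percpu_thread(struct smp_hotplug_thread *plug_thread)
{
	unsigned int cpu;
	int ret = 0;

	get_online_cpus();	/* block cpu_up() while we walk the online mask */
	mutex_lock(&smpboot_threads_lock);
	for_each_online_cpu(cpu) {
		ret = __smpboot_create_thread(plug_thread, cpu);
		if (ret)
			goto out;
		smpboot_unpark_thread(plug_thread, cpu);
	}
	list_add(&plug_thread->list, &hotplug_threads);
out:
	mutex_unlock(&smpboot_threads_lock);
	put_online_cpus();
	return ret;
}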
Subbaraman Narayanamurthy c9e9aa8c95 kthread: Fix the race condition when kthread is parked
While stressing the CPU hotplug path, sometimes we hit a problem
as shown below.

[57056.416774] ------------[ cut here ]------------
[57056.489232] ksoftirqd/1 (14): undefined instruction: pc=c01931e8
[57056.489245] Code: e594a000 eb085236 e15a0000 0a000000 (e7f001f2)
[57056.489259] ------------[ cut here ]------------
[57056.492840] kernel BUG at kernel/kernel/smpboot.c:134!
[57056.513236] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[57056.519055] Modules linked in: wlan(O) mhi(O)
[57056.523394] CPU: 0 PID: 14 Comm: ksoftirqd/1 Tainted: G        W  O 3.10.0-g3677c61-00008-g180c060 #1
[57056.532595] task: f0c8b000 ti: f0e78000 task.ti: f0e78000
[57056.537991] PC is at smpboot_thread_fn+0x124/0x218
[57056.542750] LR is at smpboot_thread_fn+0x11c/0x218
[57056.547528] pc : [<c01931e8>]    lr : [<c01931e0>]    psr: 200f0013
[57056.547528] sp : f0e79f30  ip : 00000000  fp : 00000000
[57056.558983] r10: 00000001  r9 : 00000000  r8 : f0e78000
[57056.564192] r7 : 00000001  r6 : c1195758  r5 : f0e78000  r4 : f0e5fd00
[57056.570701] r3 : 00000001  r2 : f0e79f20  r1 : 00000000  r0 : 00000000

This issue was always seen in the context of "ksoftirqd". It seems to
be happening because of a potential race condition in __kthread_parkme
where just after completing the parked completion, before the
ksoftirqd task has been scheduled again, it can go into running state.

Fix this by waiting for the task state to become parked after waiting
for the parked completion.

CRs-Fixed: 659674
Change-Id: If3f0e9b706eeb5d30d5a32f84378d35bb03fe794
Signed-off-by: Subbaraman Narayanamurthy <subbaram@codeaurora.org>
2016-10-29 23:12:39 +08:00
Thomas Gleixner 34ba3c370d kthread: Prevent unpark race which puts threads on the wrong cpu
The smpboot threads rely on the park/unpark mechanism which binds per
cpu threads on a particular core. Though the functionality is racy:

CPU0	       	 	CPU1  	     	    CPU2
unpark(T)				    wake_up_process(T)
  clear(SHOULD_PARK)	T runs
			leave parkme() due to !SHOULD_PARK
  bind_to(CPU2)		BUG_ON(wrong CPU)

We cannot let the tasks move themselves to the target CPU as one of
those tasks is actually the migration thread itself, which requires
that it starts running on the target cpu right away.

The solution to this problem is to prevent wakeups in park mode which
are not from unpark(). That way we can guarantee that the association
of the task to the target cpu is working correctly.

Add a new task state (TASK_PARKED) which prevents other wakeups and
use this state explicitly for the unpark wakeup.

Peter noticed: Also, since the task state is visible to userspace and
all the parked tasks are still in the PID space, it's a good hint in ps
and friends that these tasks aren't really there for the moment.

The migration thread has another related issue.

CPU0	      	     	 CPU1
Bring up CPU2
create_thread(T)
park(T)
 wait_for_completion()
			 parkme()
			 complete()
sched_set_stop_task()
			 schedule(TASK_PARKED)

The sched_set_stop_task() call is issued while the task is on the
runqueue of CPU1 and that confuses the hell out of the stop_task class
on that cpu. So we need the same synchronization before
sched_set_stop_task().

Change-Id: I9ad6fbe65992ad5b5cb9a252470a56ec51a4ff4f
Reported-by: Dave Jones <davej@redhat.com>
Reported-and-tested-by: Dave Hansen <dave@sr71.net>
Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Acked-by: Peter Ziljstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: dhillf@gmail.com
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304091635430.21884@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Thomas Gleixner abd57a700e stop_machine: Mark per cpu stopper enabled early
commit 14e568e78 (stop_machine: Use smpboot threads) introduced the
following regression:

Before this commit the stopper enabled bit was set in the online
notifier.

CPU0				CPU1
cpu_up
				cpu online
hotplug_notifier(ONLINE)
  stopper(CPU1)->enabled = true;
...
stop_machine()

The conversion to smpboot threads moved the enablement to the wakeup
path of the parked thread. The majority of users seem to have the
following working order:

CPU0				CPU1
cpu_up
				cpu online
unpark_threads()
  wakeup(stopper[CPU1])
....
				stopper thread runs
				  stopper(CPU1)->enabled = true;
stop_machine()

But Konrad and Sander have observed:

CPU0				CPU1
cpu_up
				cpu online
unpark_threads()
  wakeup(stopper[CPU1])
....
stop_machine()
				stopper thread runs
				  stopper(CPU1)->enabled = true;

Now the stop machinery kicks CPU0 into the stop loop, where it gets
stuck forever because the queue code saw stopper(CPU1)->enabled ==
false, so CPU0 waits for CPU1 to enter stop_machine, but the CPU1
stopper work got discarded due to enabled == false.

Add a pre_unpark function to the smpboot thread descriptor and call it
before waking the thread.

This fixes the problem at hand, but the stop_machine code should be
more robust. The stopper->enabled flag smells fishy at best.

Thanks to Konrad for going through a loop of debug patches and
providing the information to decode this issue.

Change-Id: I636875cf71ea5c5315eb0eb8599a8ebb9eadabf8
Reported-and-tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reported-and-tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1302261843240.22263@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Thomas Gleixner 9b73a14959 stop_machine: Use smpboot threads
Use the smpboot thread infrastructure. Mark the stopper thread
selfparking and park it after it has finished the take_cpu_down()
work.

Change-Id: If478c40c32a6a1c41c6f73da422d0f2401ee8d17
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Arjan van de Veen <arjan@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Richard Weinberger <rw@linutronix.de>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130131120741.686315164@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Thomas Gleixner 87943c7833 stop_machine: Store task reference in a separate per cpu variable
To allow the stopper thread being managed by the smpboot thread
infrastructure separate out the task storage from the stopper data
structure.

Change-Id: I5270d00c93125618ddde65c1d753c9623740d184
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Arjan van de Veen <arjan@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Richard Weinberger <rw@linutronix.de>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130131120741.626690384@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Thomas Gleixner 738c3ffd5f smpboot: Allow selfparking per cpu threads
The stop machine threads are still killed when a cpu goes offline. The
reason is that the thread is used to bring the cpu down, so it can't
be parked along with the other per cpu threads.

Allow a per cpu thread to be excluded from automatic parking, so it
can park itself once it's done.

Add a create callback function as well.

Change-Id: I6c7496b9da7984cfd513d2e7ee681f0df3206c26
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Arjan van de Veen <arjan@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Richard Weinberger <rw@linutronix.de>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130131120741.553993267@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Thomas Gleixner a73b106a40 softirq: Use hotplug thread infrastructure
[ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]

Change-Id: I87f9f3a3fb1856f0d5c5712be39b4c2c6ecd468f
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20120716103948.456416747@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Paul E. McKenney c93d48c1a3 rcu: Use smp_hotplug_thread facility for RCUs per-CPU kthread
Bring RCU into the new-age CPU-hotplug fold by modifying RCU's per-CPU
kthread code to use the new smp_hotplug_thread facility.

[ tglx: Adapted it to use callbacks and to the simplified rcu yield ]

Change-Id: I55d4c5448eb4ea05debb75c3442316ce45728647
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20120716103948.673354828@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Paul E. McKenney 6b4b576cd5 hotplug: Fix UP bug in smpboot hotplug code
Because kernel subsystems need their per-CPU kthreads on UP systems as
well as on SMP systems, the smpboot hotplug kthread functions must be
provided in UP builds as well as in SMP builds.  This commit therefore
adds smpboot.c to UP builds and excludes irrelevant code via #ifdef.

Change-Id: Idaaa4943bd35d389ad6e9e4bd807ae2c067c1931
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Thomas Gleixner 7f8c125206 smpboot: Provide infrastructure for percpu hotplug threads
Provide a generic interface for setting up and tearing down percpu
threads.

On registration the threads for already online cpus are created and
started. On deregistration (modules) the threads are stopped.

During hotplug operations the threads are created, started, parked and
unparked. The datastructure for registration provides a pointer to
percpu storage space and optional setup, cleanup, park, unpark
functions. These functions are called when the thread state changes.

Each implementation has to provide a function which is queried and
returns whether the thread should run and the thread function itself.

The core code handles all state transitions and avoids duplicated code
in the call sites.

[ paulmck: Preemption leak fix ]

Change-Id: Ibd0993d9e7f95c47aee75836632b2cb950aa777c
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20120716103948.352501068@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
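A hedged sketch of how a client registers with this infrastructure (the callback bodies and names are illustrative; the softirq and RCU conversions in this series are the real users):

static DEFINE_PER_CPU(struct task_struct *, my_thread);

static int my_thread_should_run(unsigned int cpu)
{
	/* return nonzero when there is per-cpu work pending (illustrative) */
	return 0;
}

static void my_thread_fn(unsigned int cpu)
{
	/* runs bound to @cpu; do one unit of work, then return */
}

static struct smp_hotplug_thread my_threads = {
	.store			= &my_thread,
	.thread_should_run	= my_thread_should_run,
	.thread_fn		= my_thread_fn,
	.thread_comm		= "my_worker/%u",
};

static int __init my_threads_init(void)
{
	return smpboot_register_percpu_thread(&my_threads);
}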
Thomas Gleixner c20b8a62d4 kthread: Implement park/unpark facility
To avoid the full teardown/setup of per cpu kthreads in the case of
cpu hot(un)plug, provide a facility which allows to put the kthread
into a park position and unpark it when the cpu comes online again.

Change-Id: Id76d714f0ca35744665f74bce4a8e9310abdb9ac
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20120716103948.236618824@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
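A hedged sketch of the park protocol from the kthread's side (the loop body is illustrative): the thread parks itself when asked, and the hotplug code later toggles it with kthread_park()/kthread_unpark() instead of tearing it down and recreating it.

/* sketch: main loop of a parkable per-cpu kthread */
static int my_kthread_fn(void *data)
{
	while (!kthread_should_stop()) {
		if (kthread_should_park())
			kthread_parkme();	/* sleeps until kthread_unpark() */

		/* ... do one round of per-cpu work ... */
		schedule_timeout_interruptible(HZ);
	}
	return 0;
}

/* hotplug side (sketch): kthread_park(task) when the cpu goes down,
 * kthread_unpark(task) when it comes back online */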
Thomas Gleixner 23df15c46c rcu: Yield simpler
The rcu_yield() code is amazing. It's there to avoid starvation of the
system when lots of (boosting) work is to be done.

Now looking at the code, its functionality is:

 Make the thread SCHED_OTHER and very nice, i.e. get it out of the way
 Arm a timer with 2 ticks
 schedule()

Now if the system goes idle the rcu task returns, regains SCHED_FIFO
and plugs on. If the system stays busy the timer fires and wakes a
per node kthread which in turn makes the per cpu thread SCHED_FIFO and
brings it back on the cpu. For the boosting thread the "make it FIFO"
bit is missing and it just runs some magic boost checks. Now this is a
lot of code with extra threads and complexity.

It's way simpler to let the tasks, when they detect overload, schedule
away for 2 ticks and defer the normal wakeup as long as they are in
the yielded state and the cpu is not idle.

That solves the same problem and the only difference is that when the
cpu goes idle it's not guaranteed that the thread returns right away,
but it won't be longer out than two ticks, so no harm is done. If
that's an issue then it is way simpler just to wake the task from
idle as RCU has callbacks there anyway.

Change-Id: I82e134c4dd765b6b01eeb099cdabd9325e2e049a
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20120716103948.131256723@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Thomas Gleixner aaea68e448 smpboot: Remove leftover declaration
Change-Id: I53a1ca8e0dc23f546bb3886b9d456ca3fc195693
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00
Srivatsa S. Bhat 6cc6bcde50 smpboot, idle: Fix comment mismatch over idle_threads_init()
The comment over idle_threads_init() really talks about the functionality
of idle_init(). Move that comment to idle_init(), and add a suitable
comment over idle_threads_init().

Change-Id: Ib4fa008f4a6154f7234728efda18bf61ced93206
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: suresh.b.siddha@intel.com
Cc: venki@google.com
Cc: nikunj@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/20120524151100.2549.66501.stgit@srivatsabhat.in.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-29 23:12:39 +08:00