android_kernel_google_msm

mirror of https://github.com/followmsi/android_kernel_google_msm.git synced 2024-11-06 23:17:41 +00:00

Author	SHA1	Message	Date
Jeff Layton	3df0a6646d	vfs: define struct filename and have getname() return it getname() is intended to copy pathname strings from userspace into a kernel buffer. The result is just a string in kernel space. It would however be quite helpful to be able to attach some ancillary info to the string. For instance, we could attach some audit-related info to reduce the amount of audit-related processing needed. When auditing is enabled, we could also call getname() on the string more than once and not need to recopy it from userspace. This patchset converts the getname()/putname() interfaces to return a struct instead of a string. For now, the struct just tracks the string in kernel space and the original userland pointer for it. Later, we'll add other information to the struct as it becomes convenient. Change-Id: Ib690c3dd4d56624f0ddb081e1c1d4f23c2dd0cd1 Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-12-07 22:28:48 +04:00
Kees Cook	2f549f9575	fs: add link restriction audit reporting Adds audit messages for unexpected link restriction violations so that system owners will have some sort of potentially actionable information about misbehaving processes. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Change-Id: I4a6ef885b0680e1d554e32b7cc3506f8e0ba0b8a	2018-12-07 22:28:48 +04:00
Kees Cook	ec7215ac09	fs: add link restrictions This adds symlink and hardlink restrictions to the Linux VFS. Symlinks: A long-standing class of security issues is the symlink-based time-of-check-time-of-use race, most commonly seen in world-writable directories like /tmp. The common method of exploitation of this flaw is to cross privilege boundaries when following a given symlink (i.e. a root process follows a symlink belonging to another user). For a likely incomplete list of hundreds of examples across the years, please see: http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=/tmp The solution is to permit symlinks to only be followed when outside a sticky world-writable directory, or when the uid of the symlink and follower match, or when the directory owner matches the symlink's owner. Some pointers to the history of earlier discussion that I could find: 1996 Aug, Zygo Blaxell http://marc.info/?l=bugtraq&m=87602167419830&w=2 1996 Oct, Andrew Tridgell http://lkml.indiana.edu/hypermail/linux/kernel/9610.2/0086.html 1997 Dec, Albert D Cahalan http://lkml.org/lkml/1997/12/16/4 2005 Feb, Lorenzo Hernández García-Hierro http://lkml.indiana.edu/hypermail/linux/kernel/0502.0/1896.html 2010 May, Kees Cook https://lkml.org/lkml/2010/5/30/144 Past objections and rebuttals could be summarized as: - Violates POSIX. - POSIX didn't consider this situation and it's not useful to follow a broken specification at the cost of security. - Might break unknown applications that use this feature. - Applications that break because of the change are easy to spot and fix. Applications that are vulnerable to symlink ToCToU by not having the change aren't. Additionally, no applications have yet been found that rely on this behavior. - Applications should just use mkstemp() or O_CREATE|O_EXCL. - True, but applications are not perfect, and new software is written all the time that makes these mistakes; blocking this flaw at the kernel is a single solution to the entire class of vulnerability. 
- This should live in the core VFS. - This should live in an LSM. (https://lkml.org/lkml/2010/5/31/135) - This should live in an LSM. - This should live in the core VFS. (https://lkml.org/lkml/2010/8/2/188) Hardlinks: On systems that have user-writable directories on the same partition as system files, a long-standing class of security issues is the hardlink-based time-of-check-time-of-use race, most commonly seen in world-writable directories like /tmp. The common method of exploitation of this flaw is to cross privilege boundaries when following a given hardlink (i.e. a root process follows a hardlink created by another user). Additionally, an issue exists where users can "pin" a potentially vulnerable setuid/setgid file so that an administrator will not actually upgrade a system fully. The solution is to permit hardlinks to only be created when the user is already the existing file's owner, or if they already have read/write access to the existing file. Many Linux users are surprised when they learn they can link to files they have no access to, so this change appears to follow the doctrine of "least surprise". Additionally, this change does not violate POSIX, which states "the implementation may require that the calling process has permission to access the existing file"[1]. This change is known to break some implementations of the "at" daemon, though the version used by Fedora and Ubuntu has been fixed[2] for a while. Otherwise, the change has been undisruptive while in use in Ubuntu for the last 1.5 years. [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/linkat.html [2] http://anonscm.debian.org/gitweb/?p=collab-maint/at.git;a=commitdiff;h=f4114656c3a6c6f6070e315ffdf940a49eda3279 This patch is based on the patches in Openwall and grsecurity, along with suggestions from Al Viro. I have added a sysctl to enable the protected behavior, and documentation. 
Change-Id: Ic4872c58e8a0672147c73b13175ea143e19915ba Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-12-07 22:28:48 +04:00
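The two rules described in the commit above can be read as plain predicates. The following is a minimal userspace sketch, not the kernel implementation: `struct symlink_ctx`, `may_follow_symlink()` and `may_create_hardlink()` are hypothetical names introduced here for illustration, and the real VFS code works on inodes and credentials rather than on a flat struct.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical, simplified view of the facts the symlink rule needs. */
struct symlink_ctx {
    bool  dir_sticky;          /* parent directory has the sticky bit set */
    bool  dir_world_writable;  /* parent directory is world-writable */
    uid_t dir_uid;             /* owner of the parent directory */
    uid_t link_uid;            /* owner of the symlink */
    uid_t follower_uid;        /* uid of the process following the link */
};

/*
 * Symlink rule as stated in the commit message: follow is permitted when
 * outside a sticky world-writable directory, or when the symlink owner and
 * the follower match, or when the directory owner matches the symlink owner.
 */
bool may_follow_symlink(const struct symlink_ctx *c)
{
    if (!(c->dir_sticky && c->dir_world_writable))
        return true;
    if (c->link_uid == c->follower_uid)
        return true;
    if (c->dir_uid == c->link_uid)
        return true;
    return false;
}

/*
 * Hardlink rule as stated in the commit message: creation is permitted when
 * the user already owns the existing file, or already has read/write access.
 */
bool may_create_hardlink(bool is_owner, bool has_read_write)
{
    return is_owner || has_read_write;
}
```

Under this sketch, a root process (uid 0) following a symlink owned by uid 1000 inside a root-owned sticky world-writable directory is refused, which is the /tmp scenario the commit message targets.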
Al Viro	66c4da2876	stop passing nameidata to ->lookup() Just the flags; only NFS cares even about that, but there are legitimate uses for such argument. And getting rid of that completely would require splitting ->lookup() into a couple of methods (at least), so let's leave that alone for now... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Change-Id: Id5a9a96c3202f724156c32fb266190334e7dbe48	2018-12-07 22:26:28 +04:00
Artem Borisov	e2c600a1f3	sched_clock: Squashed revert of the latest updates Revert "sched_clock: Avoid corrupting hrtimer tree during suspend" This reverts commit `8aad725c70`. Revert "sched_clock: Add support for >32 bit sched_clock" This reverts commit `657eb100e4`. Revert "sched_clock: Use an hrtimer instead of timer" This reverts commit `b2ee62ec51`. Revert "sched_clock: Use seqcount instead of rolling our own" This reverts commit `538b187b6e`. Revert "ARM: sched_clock: Load cycle count after epoch stabilizes" This reverts commit `8c7175ba39`. Revert "sched_clock: Make ARM's sched_clock generic for all architectures" This reverts commit `ebb97da74a`. Revert "ARM: 7699/1: sched_clock: Add more notrace to prevent recursion" This reverts commit `086da6a6c4`. Revert "ARM: make sched_clock just call a function pointer" This reverts commit `0dd4fad6c9`. Revert "ARM: sched_clock: allow changing to higher frequency counter" This reverts commit `4a3cf85432`. Change-Id: I98aaec7b554a2e11be4c551a864d952e0d8c3e22	2018-02-20 21:56:17 +03:00
Peter Zijlstra	3cffdb884f	perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' race commit 321027c1fe77f892f4ea07846aeae08cefbbb290 upstream. commit fe525a280e8b5f04c7666fe22d1a4ef592f7b953 in 3.16.40 bug: 37901413 Di Shen reported a race between two concurrent sys_perf_event_open() calls where both try and move the same pre-existing software group into a hardware context. The problem is exactly that described in commit: f63a8daa5812 ("perf: Fix event->ctx locking") ... where, while we wait for a ctx->mutex acquisition, the event->ctx relation can have changed under us. That very same commit failed to recognise sys_perf_event_context() as an external access vector to the events and thereby didn't apply the established locking rules correctly. So while one sys_perf_event_open() call is stuck waiting on mutex_lock_double(), the other (which owns said locks) moves the group about. So by the time the former sys_perf_event_open() acquires the locks, the context we've acquired is stale (and possibly dead). Apply the established locking rules as per perf_event_ctx_lock_nested() to the mutex_lock_double() for the 'move_group' case. This obviously means we need to validate state after we acquire the locks. 
Change-Id: I816a317dff3ce999c94d22b7e51152ad1dcc30a2 Reported-by: Di Shen (Keen Lab) Tested-by: John Dias <joaodias@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Min Chong <mchong@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: f63a8daa5812 ("perf: Fix event->ctx locking") Link: http://lkml.kernel.org/r/20170106131444.GZ3174@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> [bwh: Backported to 3.16: - Use ACCESS_ONCE() instead of READ_ONCE() - Test perf_event::group_flags instead of group_caps - Add the err_locked cleanup block, which we didn't need before - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Suren Baghdasaryan <surenb@google.com>	2018-01-13 17:13:44 +03:00
Peter Zijlstra	898386b287	BACKPORT: perf: Fix event->ctx locking There have been a few reported issues wrt. the lack of locking around changing event->ctx. This patch tries to address those. It avoids the whole rwsem thing; and while it appears to work, please give it some thought in review. What I did fail at is sensible runtime checks on the use of event->ctx, the RCU use makes it very hard. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20150123125834.209535886@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit f63a8daa5812afef4f06c962351687e1ff9ccb2b) Bug: 30955111 Bug: 31095224 Change-Id: I8dfc0aae8d1206c177454e0093dacd82b6129c55 Signed-off-by: Joao Dias <joaodias@google.com>	2018-01-13 17:13:44 +03:00
Yan, Zheng	b23629d405	perf: Introduce perf_pmu_migrate_context() Originally from Peter Zijlstra. The helper migrates perf events from one cpu to another cpu. Change-Id: I4d3c45b4594f3d26bbe7cc9e3fb79675ffac8b5e Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1339741902-8449-5-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-01-13 17:13:43 +03:00
Yan, Zheng	75e8341254	perf: Allow the PMU driver to choose the CPU on which to install events Allow the pmu->event_init callback to change event->cpu, so the PMU driver can choose the CPU on which to install events. Change-Id: Ie1f67c8b9fac650002f059081fe325eb799690c1 Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1339741902-8449-4-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-01-13 17:13:43 +03:00
Peter Zijlstra	29484ea618	UPSTREAM: perf: Fix race in swevent hash (cherry picked from commit 12ca6ad2e3a896256f086497a7c7406a547ee373) There's a race on CPU unplug where we free the swevent hash array while it can still have events on. This will result in a use-after-free which is BAD. Simply do not free the hash array on unplug. This leaves the thing around and no use-after-free takes place. When the last swevent dies, we do a for_each_possible_cpu() iteration anyway to clean these up, at which time we'll free it, so no leakage will occur. Reported-by: Sasha Levin <sasha.levin@oracle.com> Tested-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org> Change-Id: I14c0679a2934dccdbb052805e6430fe54b5978f0 Bug: 30952077	2018-01-13 17:13:43 +03:00
Oleg Nesterov	4d448bba95	BACKPORT: FROMLIST: pids: make task_tgid_nr_ns() safe This was reported many times, and this was even mentioned in commit `52ee2dfdd4` "pids: refactor vnr/nr_ns helpers to make them safe" but somehow nobody bothered to fix the obvious problem: task_tgid_nr_ns() is not safe because task->group_leader points to nowhere after the exiting task passes exit_notify(), rcu_read_lock() can not help. We really need to change __unhash_process() to nullify group_leader, parent, and real_parent, but this needs some cleanups. Until then we can turn task_tgid_nr_ns() into another user of __task_pid_nr_ns() and fix the problem. Reported-by: Troy Kensinger <tkensinger@google.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> (url: https://patchwork.kernel.org/patch/9913055/) Bug: 31495866 Change-Id: I5e67b02a77e805f71fa3a787249f13c1310f02e2	2018-01-13 17:13:41 +03:00
Stephen Boyd	8aad725c70	sched_clock: Avoid corrupting hrtimer tree during suspend During suspend we call sched_clock_poll() to update the epoch and accumulated time and reprogram the sched_clock_timer to fire before the next wrap-around time. Unfortunately, sched_clock_poll() doesn't restart the timer, instead it relies on the hrtimer layer to do that and during suspend we aren't calling that function from the hrtimer layer. Instead, we're reprogramming the expires time while the hrtimer is enqueued, which can cause the hrtimer tree to be corrupted. Furthermore, we restart the timer during suspend but we update the epoch during resume which seems counter-intuitive. Let's fix this by saving the accumulated state and canceling the timer during suspend. On resume we can update the epoch and restart the timer similar to what we would do if we were starting the clock for the first time. Change-Id: Iee2a1cca42e5b681347ea0607e9af420a63892d7 CRs-Fixed: 696826 Fixes: `a08ca5d108` "sched_clock: Use an hrtimer instead of timer" Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>	2018-01-02 22:36:45 +03:00
Stephen Boyd	657eb100e4	sched_clock: Add support for >32 bit sched_clock The ARM architected system counter has at least 56 usable bits. Add support for counters with more than 32 bits to the generic sched_clock implementation so we can increase the time between wakeups due to dealing with wrap-around on these devices while benefiting from the irqtime accounting and suspend/resume handling that the generic sched_clock code already has. On my system using 56 bits over 32 bits changes the wraparound time from a few minutes to an hour. For faster running counters (GHz range) this is even more important because we may not be able to execute the timer in time to deal with the wraparound if only 32 bits are used. We choose a maxsec value of 3600 seconds because we assume no system will go idle for more than an hour. In the future we may need to increase this value. Note: All users should switch over to the 64-bit read function so we can remove setup_sched_clock() in favor of sched_clock_register(). Change-Id: Ieea67ada51e6fce2637520bf5f60e789762d8694 Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Git-commit: `e7e3ff1bfe` Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Signed-off-by: Ian Maund <imaund@codeaurora.org>	2018-01-02 22:36:43 +03:00
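The wraparound reasoning in the commit above is simple arithmetic: a counter of a given width wraps after roughly 2^bits / rate seconds, and the commit caps the interval at a maxsec of 3600 seconds. The sketch below only illustrates that arithmetic; `wrap_seconds()` is a hypothetical helper, the 19.2 MHz rate used in the note afterwards is an assumed example value, and the kernel derives its wrap interval differently (from the mult/shift conversion factors).

```c
#include <stdint.h>

#define MAXSEC 3600ULL  /* cap quoted in the commit message */

/*
 * Seconds until a free-running counter of `bits` width, ticking at
 * `rate_hz`, wraps around, capped at MAXSEC. Returns 0 for a zero rate.
 */
uint64_t wrap_seconds(unsigned bits, uint64_t rate_hz)
{
    uint64_t span;
    uint64_t secs;

    if (rate_hz == 0)
        return 0;

    /* Largest value the counter can hold before wrapping. */
    span = (bits >= 64) ? UINT64_MAX : ((1ULL << bits) - 1);
    secs = span / rate_hz;

    return secs > MAXSEC ? MAXSEC : secs;
}
```

For an assumed 19.2 MHz counter, `wrap_seconds(32, 19200000)` gives 223 seconds (a few minutes), while `wrap_seconds(56, 19200000)` hits the 3600-second cap, matching the "few minutes to an hour" description.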
Stephen Boyd	b2ee62ec51	sched_clock: Use an hrtimer instead of timer In the next patch we're going to increase the number of bits that the generic sched_clock can handle to be greater than 32. With more than 32 bits the wraparound time can be larger than what can fit into the units that msecs_to_jiffies takes (unsigned int). Luckily, the wraparound is initially calculated in nanoseconds which we can easily use with hrtimers, so switch to using an hrtimer. Change-Id: Ia052d3d5d9c1ec7b59c440ceae852c12c61477e1 Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> [jstultz: Fixup hrtimer initialization order issue] Signed-off-by: John Stultz <john.stultz@linaro.org> Git-commit: `a08ca5d108` Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Signed-off-by: Ian Maund <imaund@codeaurora.org>	2018-01-02 22:36:42 +03:00
Stephen Boyd	538b187b6e	sched_clock: Use seqcount instead of rolling our own We're going to increase the cyc value to 64 bits in the near future. Doing that is going to break the custom seqcount implementation in the sched_clock code because 64 bit numbers aren't guaranteed to be atomic. Replace the cyc_copy with a seqcount to avoid this problem. Cc: Russell King <linux@arm.linux.org.uk> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Git-commit: `85c3d2dd15` Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git [imaund@codeaurora.org: resolve merge conflicts] Signed-off-by: Ian Maund <imaund@codeaurora.org> Change-Id: If7fae3228ff425c9f1763145cbd5c851141c7755	2018-01-02 22:36:39 +03:00
Stephen Boyd	8c7175ba39	ARM: sched_clock: Load cycle count after epoch stabilizes There is a small race between when the cycle count is read from the hardware and when the epoch stabilizes. Consider this scenario: CPU0 CPU1 ---- ---- cyc = read_sched_clock() cyc_to_sched_clock() update_sched_clock() ... cd.epoch_cyc = cyc; epoch_cyc = cd.epoch_cyc; ... epoch_ns + cyc_to_ns((cyc - epoch_cyc) The cyc on cpu0 was read before the epoch changed. But we calculate the nanoseconds based on the new epoch by subtracting the new epoch from the old cycle count. Since epoch is most likely larger than the old cycle count we calculate a large number that will be converted to nanoseconds and added to epoch_ns, causing time to jump forward too much. Fix this problem by reading the hardware after the epoch has stabilized. Change-Id: I2d81b8422814f209ca1e03a45eb370c9a0696c31 Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Git-commit: `336ae1180d` Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Signed-off-by: Ian Maund <imaund@codeaurora.org>	2018-01-02 22:35:58 +03:00
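The jump described in the commit above comes from unsigned subtraction: when a cycle count read before the epoch update is subtracted from the newer epoch, the result wraps to a very large delta. A minimal sketch of that arithmetic follows, assuming a 32-bit counter; `delta_cycles()` is a hypothetical helper standing in for the `(cyc - epoch_cyc)` term, not the kernel function.

```c
#include <stdint.h>

/*
 * Cycles elapsed since the epoch for an assumed 32-bit counter.
 * The subtraction is done in unsigned 32-bit arithmetic, so a cycle
 * count that is older than the epoch wraps to a huge value instead
 * of going negative.
 */
uint64_t delta_cycles(uint32_t cyc, uint32_t epoch_cyc)
{
    return (uint32_t)(cyc - epoch_cyc);
}
```

With a fresh read, `delta_cycles(1500, 1000)` is 500. With a stale read taken before the epoch moved, `delta_cycles(1000, 1500)` is 4294966796, which is the oversized delta that, once converted to nanoseconds and added to `epoch_ns`, makes time jump forward. Reading the hardware only after the epoch has stabilized avoids the stale case.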
Stephen Boyd	ebb97da74a	sched_clock: Make ARM's sched_clock generic for all architectures Nothing about the sched_clock implementation in the ARM port is specific to the architecture. Generalize the code so that other architectures can use it by selecting GENERIC_SCHED_CLOCK. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> [jstultz: Merge minor collisions with other patches in my tree] Signed-off-by: John Stultz <john.stultz@linaro.org> Git-commit: `38ff87f77a` Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git [imaund@codeaurora.org: resolve merge conflicts] Signed-off-by: Ian Maund <imaund@codeaurora.org> [flex1911: backport to 3.4] Signed-off-by: Artem Borisov <dedsa2002@gmail.com> Change-Id: I798c7c58dc9f476b07e60958a970aff7ceb4b797	2018-01-02 22:35:27 +03:00
Srivatsa S. Bhat	b96d7e4f12	CPU hotplug: Provide lockless versions of callback registration functions (cherry pick from commit `93ae4f978c`) The following method of CPU hotplug callback registration is not safe due to the possibility of an ABBA deadlock involving the cpu_add_remove_lock and the cpu_hotplug.lock. get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); The deadlock is shown below: CPU 0 CPU 1 ----- ----- Acquire cpu_hotplug.lock [via get_online_cpus()] CPU online/offline operation takes cpu_add_remove_lock [via cpu_maps_update_begin()] Try to acquire cpu_add_remove_lock [via register_cpu_notifier()] CPU online/offline operation tries to acquire cpu_hotplug.lock [via cpu_hotplug_begin()] * DEADLOCK! * The problem here is that callback registration takes the locks in one order whereas the CPU hotplug operations take the same locks in the opposite order. To avoid this issue and to provide a race-free method to register CPU hotplug callbacks (along with initialization of already online CPUs), introduce new variants of the callback registration APIs that simply register the callbacks without holding the cpu_add_remove_lock during the registration. That way, we can avoid the ABBA scenario. However, we will need to hold the cpu_add_remove_lock throughout the entire critical section, to protect updates to the callback/notifier chain. 
This can be achieved by writing the callback registration code as follows: cpu_maps_update_begin(); [ or cpu_notifier_register_begin(); see below ] for_each_online_cpu(cpu) init_cpu(cpu); /* This doesn't take the cpu_add_remove_lock */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_maps_update_done(); [ or cpu_notifier_register_done(); see below ] Note that we can't use get_online_cpus() here instead of cpu_maps_update_begin() because the cpu_hotplug.lock is dropped during the invocation of CPU_POST_DEAD notifiers, and hence get_online_cpus() cannot provide the necessary synchronization to protect the callback/notifier chains against concurrent reads and writes. On the other hand, since the cpu_add_remove_lock protects the entire hotplug operation (including CPU_POST_DEAD), we can use cpu_maps_update_begin/done() to guarantee proper synchronization. Also, since cpu_maps_update_begin/done() is like a super-set of get/put_online_cpus(), the former naturally protects the critical sections from concurrent hotplug operations. Since the names cpu_maps_update_begin/done() don't make much sense in CPU hotplug callback registration scenarios, we'll introduce new APIs named cpu_notifier_register_begin/done() and map them to cpu_maps_update_begin/done(). In summary, introduce the lockless variants of un/register_cpu_notifier() and also export the cpu_notifier_register_begin/done() APIs for use by modules. This way, we provide a race-free way to register hotplug callbacks as well as perform initialization for the CPUs that are already online. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Toshi Kani <toshi.kani@hp.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. 
Wysocki <rafael.j.wysocki@intel.com> Bug: 24810447 Change-Id: I5f85fcb5cfaa5f5f04a29eefc361851e9c345a99	2018-01-01 21:26:45 +03:00
Andrey Vagin	5f83d0f802	BACKPORT: signal: allow to send any siginfo to itself (cherry picked from commit `66dd34ad31`) The idea is simple. We need to get the siginfo for each signal on checkpointing dump, and then return it back on restore. The first problem is that the kernel doesn't report complete siginfos to userspace. In a signal handler the kernel strips SI_CODE from siginfo. When a siginfo is received from signalfd, it has a different format with fixed sizes of fields. The interface of signalfd was extended. If a signalfd is created with the flag SFD_RAW, it returns siginfo in a raw format. rt_sigqueueinfo looks suitable for restoring signals, but it can't send siginfo with a positive si_code, because these codes are reserved for the kernel. In the real world each person has right to do anything with himself, so I think a process should able to send any siginfo to itself. This patch: The kernel prevents sending of siginfo with positive si_code, because these codes are reserved for kernel. I think we can allow a task to send such a siginfo to itself. This operation should not be dangerous. This functionality is required for restoring signals in checkpoint/restart. Change-Id: I40101d87eeb53ae05cfa0949439577a8f3f58f94 Signed-off-by: Andrey Vagin <avagin@openvz.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-12-27 22:49:08 +03:00
John Stultz	ff9ff2f4b9	ANDROID: exec_domains: Disable request_module() call for personalities (cherry pick from commit `a9ac1262ce`) With Android M, Android environments use a separate execution domain for 32bit processes. See: https://android-review.googlesource.com/#/c/122131/ This results in systems that use kernel modules to see selinux audit noise like: type=1400 audit(28.989:15): avc: denied { module_request } for pid=1622 comm="app_process32" kmod="personality-8" scontext=u:r:zygote:s0 tcontext=u:r:kernel:s0 tclass=system While using kernel modules is unadvised, some systems do require them. Thus to avoid developers adding sepolicy exceptions to allow for request_module calls, this patch disables the logic which tries to call request_module for the 32bit personality (ie: personality-8), which doesn't actually exist. Signed-off-by: John Stultz <john.stultz@linaro.org> Change-Id: I9cb90bd1291f0a858befa7d347c85464346702db	2017-12-27 22:46:26 +03:00
Guenter Roeck	f57b91255d	seccomp: Replace BUG(!spin_is_locked()) with assert_spin_lock Current upstream kernel hangs with mips and powerpc targets in uniprocessor mode if SECCOMP is configured. Bisect points to commit `dbd952127d` ("seccomp: introduce writer locking"). Turns out that code such as BUG_ON(!spin_is_locked(&list_lock)); can not be used in uniprocessor mode because spin_is_locked() always returns false in this configuration, and that assert_spin_locked() exists for that very purpose and must be used instead. Fixes: `dbd952127d` ("seccomp: introduce writer locking") Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Kees Cook <keescook@chromium.org>	2017-12-27 22:42:09 +03:00
Artem Borisov	d7992e6feb	Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 All bluetooth-related changes were omitted because of our ancient incompatible bt stack. Change-Id: I96440b7be9342a9c1adc9476066272b827776e64	2017-12-27 17:13:15 +03:00
Liu ShuoX	3da71738d2	PM / Sleep: avoid 'autosleep' in shutdown progress commit `e5248a111b` upstream. Prevent automatic system suspend from happening during system shutdown by making try_to_suspend() check system_state and return immediately if it is not SYSTEM_RUNNING. This prevents the following breakage from happening (scenario from Zhang Yanmin): Kernel starts shutdown and calls all device driver's shutdown callback. When a driver's shutdown is called, the last wakelock is released and suspend-to-ram starts. However, as some driver's shut down callbacks already shut down devices and disabled runtime pm, the suspend-to-ram calls driver's suspend callback without noticing that device is already off and causes crash. Change-Id: I09261fe136713cb6bdd66e061a9e886d077324c5 [rjw: Changelog] Signed-off-by: Liu ShuoX <shuox.liu@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 426b7d5074424aab388af948ba75a5e1c8b9a702)	2017-10-15 17:05:14 +03:00
Dmitry Shmidt	36f258a110	PM: Check dpm_suspend_start() return code during partial resume Bug: 24986869 Change-Id: Iea3e0f84e43827b365b96d34bc647e310523bd40 Signed-off-by: Dmitry Shmidt <dimitrysh@google.com> Signed-off-by: Thierry Strudel <tstrudel@google.com>	2017-10-15 17:05:13 +03:00
Ruchi Kandoi	0ff10ad812	wakeup_reason: use vsnprintf instead of snprintf for vargs. Bug: 22368519 Change-Id: I38f6f1ac6eaf9490bdc195c59e045b33ad154a72 Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>	2017-10-15 17:05:12 +03:00
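The fix named in the commit above reflects a general C rule: a function that receives its variable arguments as a `va_list` has to forward them to `vsnprintf()`, because `snprintf()` expects the arguments themselves. The following is a minimal userspace sketch of the pattern; `format_reason()` is a hypothetical helper and is not taken from the wakeup_reason driver.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Format a message into buf using printf-style arguments.
 * The variadic arguments are captured in a va_list and handed to
 * vsnprintf(); passing the va_list to snprintf() instead would be
 * undefined behavior. Returns what vsnprintf() returns.
 */
int format_reason(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);

    return n;
}
```

A call such as `format_reason(buf, sizeof(buf), "irq %d %s", 42, "gpio")` fills `buf` with `irq 42 gpio`.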
Dmitry Shmidt	f6d4e6286e	Power: Add wakeup reasons counters from boot in suspend_since_boot From left to right: 1. Amount of no-wait cycles 2. Amount of timeout cycles 3. Max waiting time in ms Change-Id: Ibc0bb1c4ea591d005cdbb095b6d21c0734d2eb8b Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>	2017-10-15 17:05:12 +03:00
Dmitry Shmidt	dc58d3c933	PM: Reduce waiting for wakeup reasons to 100 ms In 80% of cases there is no need to wait, and in case of timeout we continue to resume. Change-Id: I6ae44e0ef6f7aa497f57fcd5f6e6bc83dc781852 Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>	2017-10-15 17:05:12 +03:00
Ruchi Kandoi	18bbd5eed4	suspend: Return error when pending wakeup source is found. Suspend is aborted if the wakeup_source is pending. These wakeup sources are checked multiple times before going to suspend. If it is found to be pending then suspend is aborted and -EBUSY is returned. This happens at all the places except the last time they are checked. In this case suspend is aborted but the error is not set. Since the error is not propagated the suspend accounting considers this as a successful suspend instead of a suspend abort. Change-Id: Ib63b4ead755127eaf03e3b303aab3c782ad02ed1 Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>	2017-10-15 17:05:11 +03:00
Iliyan Malchev	a50130f1be	PM: wakeup_reasons: disable wakeup-reason deduction by default Introduce a config item, CONFIG_DEDUCE_WAKEUP_REASONS, disabled by default. Make CONFIG_PARTUALRESUME select it. Change-Id: I7d831ff0a9dfe0a504824f4bc65ba55c4d92546b Signed-off-by: Iliyan Malchev <malchev@google.com>	2017-10-15 17:05:11 +03:00
Dmitry Shmidt	9ff16b88b8	PM: Replace WARN_ON on timeout with one line print Change-Id: Ia8b32b8ee225b7b62a327fecb10e9284ee4116df Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>	2017-10-15 17:05:10 +03:00
Ruchi Kandoi	124acbcfe3	power: Avoids bogus error messages for the suspend aborts. Avoids printing bogus error message "tasks refusing to freeze", in cases where pending wakeup source caused the suspend abort. Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com> Change-Id: I913ad290f501b31cd536d039834c8d24c6f16928	2017-10-15 17:05:10 +03:00
Iliyan Malchev	84612c4595	PM: wakeup_reasons: fix race condition log_possible_wakeup_reason() and stop_logging_wakeup_reasons() can race, as the latter can be called from process context, and both can run on separate cores. Change-Id: I306441d0be46dd4fe58c55cdc162f9d61a28c27d Signed-off-by: Iliyan Malchev <malchev@google.com>	2017-10-15 17:05:09 +03:00
Dmitry Shmidt	563e031bd3	Power: Report total suspend times from boot in suspend_since_boot This node exports five values separated by space. From left to right: 1. Amount of suspend/resume cycles 2. Amount of suspend abort cycles 3. Total time spent in suspend/resume process 4. Total time in suspend abort process 5. Total time spent sleep in suspend state Change-Id: Ife188fd8386dce35f95fa7ba09fbc9d7e152db62 Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>	2017-10-15 17:05:09 +03:00
Iliyan Malchev	953a4840dd	PM: extend suspend_again mechanism to use partialresume The old platform suspend_again callback overrides drivers' votes, such that if it is implemented and returns false, then we do not call the partialresume handlers. When it doesn't exist or returns true, then we also query the registered drivers for consensus. When a device resumes from suspend, the suspend/resume code invokes partialresume to check to see if the set of wakeup interrupts all have matching handlers. If this is not the case, the PM subsystem can continue to resume as before. If all of the wakeup sources have matching handlers, then those are invoked in turn (and can block), and if all of them reach consensus that the reason for the wakeup can be ignored, they say so to the PM subsystem, which goes right back into suspend. Signed-off-by: Iliyan Malchev <malchev@google.com> Change-Id: Iaeb9ed78c4b5fb815c6e9c701233e703f481f962	2017-10-15 17:05:08 +03:00
Iliyan Malchev	4e0c8780cc	power: add partial-resume framework Partial resume refers to the concept of not waking up userspace when the kernel comes out of suspend for certain types of events that we wish to discard. An example is a network packet that can be discarded in the kernel, or a spurious wakeup event that we wish to ignore. Partial resume allows drivers to register callbacks, on one hand, and provides hooks into the PM's suspend/resume mechanism, on the other. When a device resumes from suspend, the core suspend/resume code invokes partialresume to check to see if the set of wakeup interrupts all have matching handlers. If this is not the case, the PM subsystem can continue to resume as before. If all of the wakeup sources have matching handlers, then those are invoked in turn (and can block), and if all of them reach consensus that the reason for the wakeup can be ignored, they say so to the PM subsystem, which goes right back into suspend. This latter support is implemented in a separate change. Signed-off-by: Iliyan Malchev <malchev@google.com> Change-Id: Id50940bb22a550b413412264508d259f7121d442	2017-10-15 17:05:08 +03:00
Iliyan Malchev	7e87a4dc87	PM: wakeup_reason: correctly deduce wakeup interrupts The wakeup_reason driver works by having a callback, log_wakeup_reason(), be called by the resume path once for each wakeup interrupt, with the irq number as argument. It then saves this interrupt in an array, and reports it when requested (via /sys/kernel/wakeup_reasons/last_resume_reason) and also prints the information out in kmsg. This approach works, but it has the deficiency that often the reported wakeup interrupt, while correct, is not the interrupt truly responsible for the wakeup. This is due to chained interrupt controllers (whether in hardware or simulated in software). It could be, for example, that the power button is wired to a GPIO handled by a single interrupt for all GPIOs, which interrupt then determines the GPIO and maps this to a software interrupt. Whether this is done in software, or by chaining interrupt controllers, the end result is that the wakeup reason will show not the interrupt associated with the power button, but the base-GPIO interrupt instead. This patch reworks the wakeup_sources driver such that it reports those final interrupts we are interested in, and not the intermediate (and not the base) ones. It does so as follows: -- The assumption is that generic_handle_irq() is called to dispatch all interrupts; due to this, chained interrupts result in recursive calls of generic_handle_irq(). -- We reconstruct the chains of interrupts that originate with the base wakeup interrupt and terminate with the interrupt we are interested in by tracing the calls to generic_handle_irq() -- The tracing works by maintaining a per-cpu counter that is incremented with each call to generic_handle_irq(); that counter is reported to the wakeup_sources driver by a pair of functions, called log_possible_wakeup_reason_start() and log_possible_wakeup_reason_complete(). The former is called before generic_handle_irq() handles the interrupt (thereby potentially calling itself recursively) and the latter afterward. -- The two functions mentioned above are complemented by log_base_wake_reason() (renamed from log_wakeup_reason()), which is used to report the base wakeup interrupts to the wakeup_reason driver. -- The three functions work together to build a set of trees, one per base wakeup reason, the leaves of which correspond to the interrupts we are interested in; these trees can be arbitrarily complex, though in reality they most often are a single node, or a chain of two nodes. The complexity supports arbitrarily involved interrupt dispatch. -- On resume, we build the tree; once the tree is completed, we walk it recursively, and print out to kmsg the (more useful) list of wakeup sources; similarly, we walk the tree and print the leaves when /sys/kernel/wakeup_reasons/last_resume_reason is read. Signed-off-by: Iliyan Malchev <malchev@google.com> Change-Id: If8acb2951b61d2c6bcf4d011fe04d7f91057d139	2017-10-15 17:05:07 +03:00
Iliyan Malchev	4dbec3e7db	irq_flow_handler_t now returns bool Alter the signature of irq_flow_handler_t to return true for those interrupts whose handlers were invoked, and false otherwise. Also rework the actual handlers, handle_.*_irq, to support the new signature. Change-Id: I8a50410c477692bbcd39a0fefdac14253602d1f5 Signed-off-by: Iliyan Malchev <malchev@google.com>	2017-10-15 17:05:07 +03:00
Dmitry Shmidt	13e2b3277d	PM: wakeup_reason: add check_wakeup_reason() to verify wakeup source irq Wakeup reason is set before driver resume handlers are called. It is cleared before driver suspend handlers are called, on PM_SUSPEND_PREPARE. Change-Id: I04218c9b0c115a7877e8029c73e6679ff82e0aa4 Signed-off-by: Dmitry Shmidt <dimitrysh@google.com> Signed-off-by: Iliyan Malchev <malchev@google.com>	2017-10-15 16:24:04 +03:00
Ruchi Kandoi	e532467dab	power: log the last suspend abort reason. Extends the last_resume_reason to log suspend abort reason. The abort reasons will have "Abort:" prepended to distinguish them from the resume reason. Change-Id: Id3c62fc0cb86ca2e05a69e40de040b94f32be389 Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com> Signed-off-by: Iliyan Malchev <malchev@google.com>	2017-10-15 16:23:56 +03:00
Ruchi Kandoi	1154a48192	PM: wakeup_reason: add functionality to log the last suspend-abort reason. Extends the last_resume_reason to log suspend abort reason. The abort reasons will have "Abort:" prepended to distinguish them from the resume reason. Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com> Signed-off-by: Iliyan Malchev <malchev@google.com> Change-Id: I3207f1844e3d87c706dfc298fb10e1c648814c5f	2017-10-15 16:17:14 +03:00
Iliyan Malchev	c9816de694	PM: wakeup_reason: add functions to query and clear wakeup reasons The query results are valid until the next PM_SUSPEND_PREPARE. Change-Id: I6bc2bd47c830262319576a001d39ac9a994916cf Signed-off-by: Iliyan Malchev <malchev@google.com>	2017-10-15 16:17:14 +03:00
jinqian	ec32c90b65	Power: Report suspend times from last_suspend_time This node exports two values separated by a space. From left to right: 1. time spent in suspend/resume process 2. time spent sleeping in suspend state Change-Id: I2cb9a9408a5fd12166aaec11b935a0fd6a408c63	2017-10-15 16:17:13 +03:00
Ruchi Kandoi	92b3e919c4	Power: Add guard condition for maximum wakeup reasons Ensure the array for the wakeup reason IRQs does not overflow. Change-Id: Iddc57a3aeb1888f39d4e7b004164611803a4d37c Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>	2017-10-15 16:17:13 +03:00
Stepan Moskovchenko	6365c6e304	PM / Sleep: Clean up remnants of workqueue-based sync When legacy wakelock code was removed in commit f85607a715a74c65db812cd3901022888257f966, some of the code for moving calls to sys_sync() from suspend paths into a workqueue item had not been properly removed. Specifically, one of the call sites to suspend_sys_sync_wait() has been mistakenly replaced with a call to sys_sync(), which is not necessary because the corresponding instance of suspend_sys_sync_queue() was already replaced with sys_sync(). Clean up the remnants of the legacy wakelock code by removing the extraneous call to sys_sync() and restoring some of the surrounding printk statements that had been moved to suspend_sys_sync_queue() and subsequently lost. CRs-Fixed: 498669 Change-Id: Ifb2ede7808560f456c824d3d6359a4541c51b73f Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org>	2017-10-15 15:46:58 +03:00
Arve Hjønnevåg	516049d328	PM / Sleep: Fix a mistake in a conditional in autosleep_store() The condition check in autosleep_store() is incorrect and prevents /sys/power/autosleep from working as advertised. Fix that. [rjw: Added the changelog.] Change-Id: I231cc24fc3f245003dcf5053ff6a71eb69ffa273 Signed-off-by: Arve Hjønnevåg <arve@android.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Git-commit: `040e5bf65e` Git-repo: git://codeaurora.org/kernel/msm.git Signed-off-by: Anurag Singh <anursing@codeaurora.org>	2017-10-15 15:46:58 +03:00
Amar Singhal	9801869142	power: main: Add conditional compilation for touch nodes Add conditional compilation for touch event sysfs nodes. Otherwise, if CONFIG_PM_SLEEP is not defined, there could be compilation errors. Change-Id: I1ac7f284ec35eae2cfa076ef8e71c29ddc24817c Signed-off-by: Amar Singhal <asinghal@codeaurora.org> Signed-off-by: Anurag Singh <anursing@codeaurora.org>	2017-10-15 15:46:57 +03:00
Rafael J. Wysocki	d364a08bcc	PM / Sleep: User space wakeup sources garbage collector Kconfig option Make it possible to configure out the user space wakeup sources garbage collector for debugging and default Android builds. Change-Id: I85ca6caa92c8e82d863f0fa58d8861b5571c1b4a Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Arve Hjønnevåg <arve@android.com> Git-commit: `4e585d25e1` Git-repo: git://codeaurora.org/kernel/msm.git Signed-off-by: Anurag Singh <anursing@codeaurora.org>	2017-10-15 15:46:57 +03:00
Rafael J. Wysocki	5265f5b2e8	PM / Sleep: Make the limit of user space wakeup sources configurable Make it possible to configure out the check against the limit of user space wakeup sources for debugging and default Android builds. Change-Id: I8f74d7c8391627df970d2df666938069b012e2fe Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Arve Hjønnevåg <arve@android.com> Git-commit: `c73893e2ca` Git-repo: git://codeaurora.org/kernel/msm.git Signed-off-by: Anurag Singh <anursing@codeaurora.org>	2017-10-15 15:46:56 +03:00
Anurag Singh	e535fb5292	power: Remove unnecessary options from Kconfig The config options HAS_WAKELOCK and WAKELOCK are not needed any more since a wakeup sources-based wakelock implementation will be used. Change-Id: I4163f048c079ec3d10a02d9db16c3ca6fb5fd759 Signed-off-by: Anurag Singh <anursing@codeaurora.org>	2017-10-15 15:46:56 +03:00
Arve Hjønnevåg	f484965275	PM / Sleep: Add wake lock api wrapper on top of wakeup sources Change-Id: Icaad02fe1e8856fdc2e4215f380594a5dde8e002 Signed-off-by: Arve Hjønnevåg <arve@android.com> Git-commit: `e9911f4efd` Git-repo: git://codeaurora.org/kernel/msm.git [anursing@codeaurora.org: replace existing implementation, resolve merge conflicts] Signed-off-by: Anurag Singh <anursing@codeaurora.org>	2017-10-15 15:46:56 +03:00
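
The partial-resume decision described in 4e0c8780cc and 953a4840dd can be sketched roughly as follows. This is a minimal userspace illustration, not the kernel code from those commits: the names `partial_resume_handler`, `find_handler`, `may_suspend_again`, `ignore_always` and `ignore_never` are hypothetical, and two details are assumptions that the commit messages do not state: an empty wakeup set leads to a full resume, and evaluation stops at the first handler that refuses. The platform suspend_again override is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: returns true if this wakeup may be ignored. */
typedef bool (*partial_resume_fn)(int irq);

/* Hypothetical registration record: one handler per wakeup IRQ. */
struct partial_resume_handler {
    int irq;
    partial_resume_fn can_ignore;
};

/* Example handlers, for illustration only. */
static bool ignore_always(int irq) { (void)irq; return true; }
static bool ignore_never(int irq)  { (void)irq; return false; }

/* Look up the handler registered for an IRQ, or NULL if there is none. */
static const struct partial_resume_handler *
find_handler(const struct partial_resume_handler *tbl, size_t ntbl, int irq)
{
    for (size_t i = 0; i < ntbl; i++)
        if (tbl[i].irq == irq)
            return &tbl[i];
    return NULL;
}

/*
 * Returns true if the system may go straight back into suspend:
 * every wakeup IRQ has a matching handler AND every handler agrees
 * that its wakeup can be ignored. Otherwise the resume continues.
 */
static bool may_suspend_again(const struct partial_resume_handler *tbl,
                              size_t ntbl, const int *wakeup_irqs,
                              size_t nirqs)
{
    assert(nirqs == 0 || wakeup_irqs != NULL);

    /* Assumption: with no recorded wakeup reason, resume fully. */
    if (nirqs == 0)
        return false;

    /* Pass 1: all wakeup interrupts must have matching handlers. */
    for (size_t i = 0; i < nirqs; i++)
        if (!find_handler(tbl, ntbl, wakeup_irqs[i]))
            return false;

    /* Pass 2: invoke the handlers in turn; all must reach consensus. */
    for (size_t i = 0; i < nirqs; i++) {
        const struct partial_resume_handler *h =
            find_handler(tbl, ntbl, wakeup_irqs[i]);
        if (!h->can_ignore(wakeup_irqs[i]))
            return false;
    }
    return true;
}
```

Checking for matching handlers in a separate first pass means no handler is invoked (and possibly blocks) when the wakeup set contains an interrupt that nobody has registered for.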

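A userspace reader of the suspend_since_boot node from 563e031bd3 might parse its five space-separated values along these lines. This is only a sketch: the struct and function names are hypothetical, and the assumption that the three time fields are printed as decimal numbers is not confirmed by the commit message, which states only the order of the fields.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the five values, in left-to-right order. */
struct suspend_since_boot {
    unsigned int cycles;   /* 1. suspend/resume cycles */
    unsigned int aborts;   /* 2. suspend abort cycles */
    double total_time;     /* 3. total time in suspend/resume process */
    double abort_time;     /* 4. total time in suspend abort process */
    double sleep_time;     /* 5. total time spent sleeping in suspend */
};

/*
 * Parse one line of the node's output.
 * Returns 0 on success, -1 if fewer than five fields could be read.
 * Assumption: time fields are decimal numbers; units are not given here.
 */
static int parse_suspend_since_boot(const char *line,
                                    struct suspend_since_boot *out)
{
    assert(line != NULL && out != NULL);

    int n = sscanf(line, "%u %u %lf %lf %lf",
                   &out->cycles, &out->aborts,
                   &out->total_time, &out->abort_time, &out->sleep_time);
    return n == 5 ? 0 : -1;
}
```

Returning an error when fewer than five fields are present keeps a caller from using a partially filled struct if the node's format differs from the one assumed here.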

13844 commits