Add the BFQ-v7r8 I/O scheduler to 3.10.8+.
The general structure is borrowed from CFQ, as much of the code for
handling I/O contexts Over time, several useful features have been
ported from CFQ as well (details in the changelog in README.BFQ). A
(bfq_)queue is associated to each task doing I/O on a device, and each
time a scheduling decision has to be made a queue is selected and served
until it expires.
- Slices are given in the service domain: tasks are assigned
budgets, measured in number of sectors. Once got the disk, a task
must however consume its assigned budget within a configurable
maximum time (by default, the maximum possible value of the
budgets is automatically computed to comply with this timeout).
This allows the desired latency vs "throughput boosting" tradeoff
to be set.
- Budgets are scheduled according to a variant of WF2Q+, implemented
using an augmented rb-tree to take eligibility into account while
preserving an O(log N) overall complexity.
- A low-latency tunable is provided; if enabled, both interactive
and soft real-time applications are guaranteed a very low latency.
- Latency guarantees are preserved also in the presence of NCQ.
- Also with flash-based devices, a high throughput is achieved
while still preserving latency guarantees.
- BFQ features Early Queue Merge (EQM), a sort of fusion of the
cooperating-queue-merging and the preemption mechanisms present
in CFQ. EQM is in fact a unified mechanism that tries to get a
sequential read pattern, and hence a high throughput, with any
set of processes performing interleaved I/O over a contiguous
sequence of sectors.
- BFQ supports full hierarchical scheduling, exporting a cgroups
interface. Since each node has a full scheduler, each group can
be assigned its own weight.
- If the cgroups interface is not used, only I/O priorities can be
assigned to processes, with ioprio values mapped to weights
with the relation weight = IOPRIO_BE_NR - ioprio.
- ioprio classes are served in strict priority order, i.e., lower
priority queues are not served as long as there are higher
priority queues. Among queues in the same class the bandwidth is
distributed in proportion to the weight of each queue. A very
thin extra bandwidth is however guaranteed to the Idle class, to
prevent it from starving.
Change-Id: Iebf9be399041b89d79b54077da1a34a81d4e4238
Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: Arianna Avanzini <avanzini.arianna@gmail.com>
Update Kconfig.iosched and do the related Makefile changes to include
kernel configuration options for BFQ. Also add the bfqio controller
to the cgroups subsystem.
Change-Id: I41b0fe61f036d59b641205ab21902401e7a704c0
Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: Arianna Avanzini <avanzini.arianna@gmail.com>
Signed-off-by: Josue Rivera <prbassplayer@gmail.com>
The core_ctl module takes input from userspace and CPU load information to
decide how many CPUs to keep online. User space has the following tunables:
- min_cpus: Minimum number of CPUs to keep online. This overrides other
heuristics.
- max_cpus: Maximum number of CPUs to keep online. This overrides other
heuristics.
- additional_cpus: Additional idle CPUs to keep ready for use.
- busy_up_thres: The normalized load% threshold that the CPU load should
exceeded for the CPU to be go from not busy to busy.
It could be a single threshold for all CPUs in a group, or num_cpus
thresholds separated by spaces to specify different thresholds based on
the current number of online CPUs.
- busy_down_thres: The normalized load% threshold that the CPU load should
be lower than for the CPU to go from busy to not busy.
It could be a single threshold for all CPUs in a group, or num_cpus
thresholds separated by spaces to specify different thresholds based on
the current number of online CPUs.
- offline_delay_ms: The time to wait for before offline cores when the
number of needed CPUs goes down.
Mot-CRs-fixed: (CR)
Change-Id: Ied1d5bcbb8da5bbd5f3d1a3f042599babace6b65
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Ravi Chebolu <arc095@motorola.com>
Reviewed-on: http://gerrit.mot.com/866560
SME-Granted: SME Approvals Granted
SLTApproved: Slta Waiver <sltawvr@motorola.com>
Tested-by: Jira Key <jirakey@motorola.com>
Reviewed-by: Lian-Wei Wang <lian-wei.wang@motorola.com>
Reviewed-by: Christopher Fries <cfries@motorola.com>
Submit-Approved: Jira Key <jirakey@motorola.com>
This deliberately changes the behavior of the per-cpuset
cpus file to not be effected by hotplug. When a cpu is offlined,
it will be removed from the cpuset/cpus file. When a cpu is onlined,
if the cpuset originally requested that that cpu was part of the cpuset, that
cpu will be restored to the cpuset. The cpus files still
have to be hierachical, but the ranges no longer have to be out of
the currently online cpus, just the physically present cpus.
Change-Id: I3efbae24a1f6384be1e603fb56f0d3baef61d924
When moving a group_leader perf event from a software-context
to a hardware-context, there's a race in checking and
updating that context. The existing locking solution
doesn't work; note that it tries to grab a lock inside
the group_leader's context object, which you can only
get at by going through a pointer that should be protected
from these races. To avoid that problem, and to produce
a simple solution, we can just use a lock per group_leader
to protect all checks on the group_leader's context.
The new lock is grabbed and released when no context locks
are held.
Bug: 30955111
Bug: 31095224
Change-Id: If37124c100ca6f4aa962559fba3bd5dbbec8e052
Git-repo: https://android.googlesource.com/kernel/msm.git
Git-commit: 5b87e00be9ca28ea32cab49b92c0386e4a91f730
Signed-off-by: Dennis Cagle <d-cagle@codeaurora.org>
Make change in __qseecom_load_fw() and qseecom_load_commonlib_image()
to check buffer size before copying img to buffer.
CRs-fixed: 1080290
Change-Id: I0f48666ac948a9571e249598ae7cc19df9036b1d
Signed-off-by: Zhen Kong <zkong@codeaurora.org>
There have been a few reported issues wrt. the lack of locking around
changing event->ctx. This patch tries to address those.
It avoids the whole rwsem thing and while it appears to work, please
give it some thought in review.
What I did fail at is sensible runtime checks on the use of
event->ctx, the RCU use makes it very hard.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150123125834.209535886@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit f63a8daa5812afef4f06c962351687e1ff9ccb2b)
Bug: 30955111
Bug: 31095224
Signed-off-by: Joao Dias <joaodias@google.com>
Change-Id: I8dfc0aae8d1206c177454e0093dacd82b6129c55
Git-repo: https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git
Git-commit: f63a8daa5812afef4f06c962351687e1ff9ccb2b
[rsiddoji@codeaurora.org: resloved some trival confilits]
Signed-off-by: Ravi Kumar Siddojigari <rsiddoji@codeaurora.org>
Signed-off-by: Pradosh Das <prados@codeaurora.org>
Signed-off-by: Kishor PK <kpbhat@codeaurora.org>
Enable config IP_NF_MATCH_RPFILTER.
This option allows you to match packets whose replies would go out via
the interface the packet came in
Change-Id: I2a23346e726a8df8487aeb664d6316b3cf2b9d77
Signed-off-by: Bharat Pawar <bpawar@codeaurora.org>
If __key_link_begin() failed then "edit" would be uninitialized. I've
added a check to fix that.
This allows a random user to crash the kernel, though it's quite
difficult to achieve. There are three ways it can be done as the user
would have to cause an error to occur in __key_link():
(1) Cause the kernel to run out of memory. In practice, this is difficult
to achieve without ENOMEM cropping up elsewhere and aborting the
attempt.
(2) Revoke the destination keyring between the keyring ID being looked up
and it being tested for revocation. In practice, this is difficult to
time correctly because the KEYCTL_REJECT function can only be used
from the request-key upcall process. Further, users can only make use
of what's in /sbin/request-key.conf, though this does including a
rejection debugging test - which means that the destination keyring
has to be the caller's session keyring in practice.
(3) Have just enough key quota available to create a key, a new session
keyring for the upcall and a link in the session keyring, but not then
sufficient quota to create a link in the nominated destination keyring
so that it fails with EDQUOT.
The bug can be triggered using option (3) above using something like the
following:
echo 80 >/proc/sys/kernel/keys/root_maxbytes
keyctl request2 user debug:fred negate @t
The above sets the quota to something much lower (80) to make the bug
easier to trigger, but this is dependent on the system. Note also that
the name of the keyring created contains a random number that may be
between 1 and 10 characters in size, so may throw the test off by
changing the amount of quota used.
Assuming the failure occurs, something like the following will be seen:
kfree_debugcheck: out of range ptr 6b6b6b6b6b6b6b68h
------------[ cut here ]------------
kernel BUG at ../mm/slab.c:2821!
...
RIP: 0010:[<ffffffff811600f9>] kfree_debugcheck+0x20/0x25
RSP: 0018:ffff8804014a7de8 EFLAGS: 00010092
RAX: 0000000000000034 RBX: 6b6b6b6b6b6b6b68 RCX: 0000000000000000
RDX: 0000000000040001 RSI: 00000000000000f6 RDI: 0000000000000300
RBP: ffff8804014a7df0 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8804014a7e68 R11: 0000000000000054 R12: 0000000000000202
R13: ffffffff81318a66 R14: 0000000000000000 R15: 0000000000000001
...
Call Trace:
kfree+0xde/0x1bc
assoc_array_cancel_edit+0x1f/0x36
__key_link_end+0x55/0x63
key_reject_and_link+0x124/0x155
keyctl_reject_key+0xb6/0xe0
keyctl_negate_key+0x10/0x12
SyS_keyctl+0x9f/0xe7
do_syscall_64+0x63/0x13a
entry_SYSCALL64_slow_path+0x25/0x25
Fixes: f70e2e0619 ('KEYS: Do preallocation for __key_link()')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dennis Cagle <d-cagle@codeaurora.org>
Git-repo: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
Git-commit: 38327424b40bcebe2de92d07312c89360ac9229a
(cherry picked from commit 38327424b40bcebe2de92d07312c89360ac9229a)
Change-Id: I07568c78448b9d4bcc19b506ac0cbeb3d8af6961
This was found during userspace fuzzing test when a large size dma cma
allocation is made by driver(like ion) through userspace.
show_stack+0x10/0x1c
dump_stack+0x74/0xc8
kasan_report_error+0x2b0/0x408
kasan_report+0x34/0x40
__asan_storeN+0x15c/0x168
memset+0x20/0x44
__dma_alloc_coherent+0x114/0x18c
Change-Id: Ia0c4def2ec27ec56e9faf43ed1b8012381e3b253
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 67a2e213e7e937c41c52ab5bc46bf3f4de469f6e
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[shashim@codeaurora.org: replace %p by %pK in print format]
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
Adding user passed parameters without check might
lead to Integer overflow and unpredictable system
behaviour.
Change-Id: Iaf8259e3c4a157e1790f1447b1b62a646988b7c4
Signed-off-by: Neeraj Soni <neersoni@codeaurora.org>
Signed-off-by: Yang Guang <guyang@codeaurora.org>
Usage of %p exposes the kernel addresses, an easy target to
kernel write vulnerabilities. With this patch currently
%pK prints only Zeros as address. If you need actual address
echo 0 > /proc/sys/kernel/kptr_restrict
addressing the info leak issue under following CVEs
CVE-2016-8401, CVE-2016-8402, CVE-2016-8403,
CVE-2016-8404, CVE-2016-8405, CVE-2016-8406,
CVE-2016-8407
Change-Id: Iefe0639416275cfeca6e90b6f88cd0412bb76414
Signed-off-by: Ravi Kumar Siddojigari <rsiddoji@codeaurora.org>
adding validation of the len variable in ping_common_sendmsg()
to check if it is less than icmph_len which canleading to
an overflow issue.
Addressing issue reported under CVE-2016-8399.
Change-Id: I98f7b070b41312832b6a347ea1c11b9c700159a7
Signed-off-by: Ravi Kumar Siddojigari <rsiddoji@codeaurora.org>
commit 480546c9b2b8 ("ecryptfs: forbid opening
Files without mmap handler")
It fixed a local root exploit but also introduced a dependency on
the lower file system implementing an mmap operation just to open a file,
which is a bit of a heavy hammer. The right fix is to have mmap depend
on the existence of the mmap handler instead.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Git-repo:https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git
Git-commit: 78c4e172412de5d0456dc00d2b34050aa0b683b5
Signed-off-by: Srinivasa Rao Kuppala <srkupp@codeaurora.org>
Change-Id: I7c11ae0215d9b3a5944bb3be73d5423a4d008fcd
There are legitimate reasons to disallow mmap on certain files, notably
in sysfs or procfs. We shouldn't emulate mmap support on file systems
that don't offer support natively.
CVE-2016-1583
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: stable@vger.kernel.org
[tyhicks: clean up f_op check by using ecryptfs_file_to_lower()]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Git-repo: https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/
Git-commit: f0fe970df3838c202ef6c07a4c2b36838ef0a88b
[rsiddoji@codeaurora.org: backport and resolved trivial confits ]
Change-Id: I6127049027975ca20a38ee4fb500478db8b3fd5e
Signed-off-by: Ravi Kumar Siddojigari <rsiddoji@codeaurora.org>
As with x86, mark the sys_call_table const such that it will be placed
in the .rodata section. This will cause attempts to modify the table
(accidental or deliberate) to fail when strict page permissions are in
place. In the absence of strict page permissions, there should be no
functional change.
Change-Id: I5b8fcf486c59cb1d83c117c5246eeb2447ccfb65
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Git-repo: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
Git-commit: c623b33b4e9599c6ac5076f7db7369eb9869aa04
Signed-off-by: Ravi Kumar Siddojigari <rsiddoji@codeaurora.org>
Initialize dp_fw to NULL so that we don't try to release it in the
error path err_invalid_fw.
CRs-Fixed: 1095243
Change-Id: I18f549102e626dc788e8fa56d6bb1ea28efe4f88
Signed-off-by: Puja Gupta <pujag@codeaurora.org>