android_kernel_samsung_msm8976

Commit Graph

Author	SHA1	Message	Date
Keith Busch	3710e26e8c	block: Fix dev_t minor allocation lifetime commit 2da78092dda13f1efd26edbbf99a567776913750 upstream. Releases the dev_t minor when all references are closed to prevent another device from acquiring the same major/minor. Since the partition's release may be invoked from call_rcu's soft-irq context, the ext_dev_idr's mutex had to be replaced with a spinlock so as not to sleep. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-10-05 14:54:12 -07:00
Toshiaki Makita	fe63ce5175	cfq-iosched: Fix wrong children_weight calculation commit e15693ef18e13e3e6bffe891fe140f18b8ff6d07 upstream. cfq_group_service_tree_add() is applying new_weight at the beginning of the function via cfq_update_group_weight(). This actually allows weight to change between adding it to and subtracting it from children_weight, and triggers WARN_ON_ONCE() in cfq_group_service_tree_del(), or even causes oops by divide error during vfr calculation in cfq_group_service_tree_add(). The detailed scenario is as follows: 1. Create blkio cgroups X and Y as a child of X. Set X's weight to 500 and perform some I/O to apply new_weight. This X's I/O completes before starting Y's I/O. 2. Y starts I/O and cfq_group_service_tree_add() is called with Y. 3. cfq_group_service_tree_add() walks up the tree during children_weight calculation and adds parent X's weight (500) to children_weight of root. children_weight becomes 500. 4. Set X's weight to 1000. 5. X starts I/O and cfq_group_service_tree_add() is called with X. 6. cfq_group_service_tree_add() applies its new_weight (1000). 7. I/O of Y completes and cfq_group_service_tree_del() is called with Y. 8. I/O of X completes and cfq_group_service_tree_del() is called with X. 9. cfq_group_service_tree_del() subtracts X's weight (1000) from children_weight of root. children_weight becomes -500. This triggers WARN_ON_ONCE(). 10. Set X's weight to 500. 11. X starts I/O and cfq_group_service_tree_add() is called with X. 12. cfq_group_service_tree_add() applies its new_weight (500) and adds it to children_weight of root. children_weight becomes 0. Calculation of vfr triggers oops by divide error. weight should be updated right before adding it to children_weight. 
Reported-by: Ruki Sekiya <sekiya.ruki@lab.ntt.co.jp> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-10-05 14:54:08 -07:00
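The twelve-step scenario in the cfq-iosched commit above can be replayed with a small model. This is only a sketch: the `Group` class, `tree_add`, `tree_del`, and `run_scenario` are hypothetical names invented for illustration, Y's weight of 500 is an assumption (the commit message does not state it), and leaf_weight and the vfr computation are left out. It tracks only nr_active and children_weight, and compares applying new_weight up front (the behaviour the commit describes as buggy) with applying it right before the weight is added to the parent's children_weight.

```python
class Group:
    """Simplified stand-in for a CFQ group; not the kernel's data structure."""

    def __init__(self, weight, parent=None):
        self.weight = weight          # weight currently in effect
        self.new_weight = None        # pending weight set by the user, if any
        self.parent = parent
        self.nr_active = 0            # active children, plus the group itself
        self.children_weight = 0      # sum of the weights of active children


def apply_new_weight(group):
    """Make a pending weight the effective one."""
    if group.new_weight is not None:
        group.weight = group.new_weight
        group.new_weight = None


def tree_add(group, fixed):
    """Activate a group and propagate its weight towards the root."""
    if not fixed:
        # Buggy ordering: the weight changes even if the old weight is
        # already accounted for in the parent's children_weight.
        apply_new_weight(group)
    pos = group
    propagate = pos.nr_active == 0
    pos.nr_active += 1
    while pos.parent is not None:
        parent = pos.parent
        if propagate:
            if fixed:
                # Fixed ordering: update the weight right before adding it.
                apply_new_weight(pos)
            propagate = parent.nr_active == 0
            parent.nr_active += 1
            parent.children_weight += pos.weight
        pos = parent


def tree_del(group):
    """Deactivate a group and remove its weight from inactive ancestors."""
    pos = group
    pos.nr_active -= 1
    propagate = pos.nr_active == 0
    while propagate and pos.parent is not None:
        parent = pos.parent
        parent.nr_active -= 1
        propagate = parent.nr_active == 0
        parent.children_weight -= pos.weight
        pos = parent


def run_scenario(fixed):
    """Return root.children_weight after steps 3, 9 and 12 of the scenario."""
    root = Group(weight=0)            # the root's own weight is not used here
    x = Group(weight=500, parent=root)
    y = Group(weight=500, parent=x)   # assumed weight for Y
    seen = []
    tree_add(y, fixed)                # steps 2-3
    seen.append(root.children_weight)
    x.new_weight = 1000               # step 4
    tree_add(x, fixed)                # steps 5-6
    tree_del(y)                       # step 7
    tree_del(x)                       # steps 8-9
    seen.append(root.children_weight)
    x.new_weight = 500                # step 10
    tree_add(x, fixed)                # steps 11-12
    seen.append(root.children_weight)
    return seen


if __name__ == "__main__":
    print("buggy:", run_scenario(fixed=False))
    print("fixed:", run_scenario(fixed=True))
```

In this model the up-front ordering drives the root's children_weight negative and then to zero, which corresponds to the WARN_ON_ONCE() and the divide error described in the commit message, while the deferred ordering keeps additions and subtractions balanced.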
Tejun Heo	f5b48b7a3d	blkcg: don't call into policy draining if root_blkg is already gone commit 2a1b4cf2331d92bc009bf94fa02a24604cdaf24c upstream. While a queue is being destroyed, all the blkgs are destroyed and its ->root_blkg pointer is set to NULL. If someone else starts to drain while the queue is in this state, the following oops happens. NULL pointer dereference at 0000000000000028 IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230 PGD e4a1067 PUD b773067 PMD 0 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched] CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000 RIP: 0010:[<ffffffff8144e944>] [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230 RSP: 0018:ffff88000efd7bf0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001 R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450 R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28 FS: 00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0 Stack: ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80 ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58 ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450 Call Trace: [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60 [<ffffffff81427641>] __blk_drain_queue+0x71/0x180 [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0 [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120 [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50 [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40 [<ffffffff8142d476>] blk_release_queue+0x26/0xd0 
[<ffffffff81454968>] kobject_cleanup+0x38/0x70 [<ffffffff81454848>] kobject_put+0x28/0x60 [<ffffffff81427505>] blk_put_queue+0x15/0x20 [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0 [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0 [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20 [<ffffffff817930e2>] device_release+0x32/0xa0 [<ffffffff81454968>] kobject_cleanup+0x38/0x70 [<ffffffff81454848>] kobject_put+0x28/0x60 [<ffffffff817934d7>] put_device+0x17/0x20 [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0 [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40 [<ffffffff817d1257>] sdev_store_delete+0x27/0x30 [<ffffffff81792ca8>] dev_attr_store+0x18/0x30 [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50 [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170 [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0 [<ffffffff811f69bd>] SyS_write+0x4d/0xc0 [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b 776687bce42b ("block, blk-mq: draining can't be skipped even if bypass_depth was non-zero") made it easier to trigger this bug by making blk_queue_bypass_start() drain even when it loses the first bypass test to blk_cleanup_queue(); however, the bug has always been there even before the commit as blk_queue_bypass_start() could race against queue destruction, win the initial bypass test but perform the actual draining after blk_cleanup_queue() already destroyed all blkgs. Fix it by skippping calling into policy draining if all the blkgs are already gone. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Shirish Pargaonkar <spargaonkar@suse.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Reported-by: Jet Chen <jet.chen@intel.com> Tested-by: Shirish Pargaonkar <spargaonkar@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-09-17 09:04:02 -07:00
Ian Maund	6440f462f9	Merge upstream tag 'v3.10.49' into msm-3.10 * commit 'v3.10.49': (529 commits) Linux 3.10.49 ACPI / battery: Retry to get battery information if failed during probing x86, ioremap: Speed up check for RAM pages Score: Modify the Makefile of Score, remove -mlong-calls for compiling Score: The commit is for compiling successfully. Score: Implement the function csum_ipv6_magic score: normalize global variables exported by vmlinux.lds rtmutex: Plug slow unlock race rtmutex: Handle deadlock detection smarter rtmutex: Detect changes in the pi lock chain rtmutex: Fix deadlock detector for real ring-buffer: Check if buffer exists before polling drm/radeon: stop poisoning the GART TLB drm/radeon: fix typo in golden register setup on evergreen ext4: disable synchronous transaction batching if max_batch_time==0 ext4: clarify error count warning messages ext4: fix unjournalled bg descriptor while initializing inode bitmap dm io: fix a race condition in the wake up code for sync_io Drivers: hv: vmbus: Fix a bug in the channel callback dispatch code clk: spear3xx: Use proper control register offset ... In addition to bringing in upstream commits, this merge also makes minor changes to maintain compatibility with upstream: The definition of list_next_entry in qcrypto.c and ipa_dp.c has been removed, as upstream has moved the definition to list.h. The implementation of list_next_entry was identical between the two. irq.c, for both arm and arm64 architecture, has had its calls to __irq_set_affinity_locked updated to reflect changes to the API upstream. Finally, as we have removed the sleep_length member variable of the tick_sched struct, all changes made by upstream commit `ec804bd` do not apply to our tree and have been removed from this merge. Only kernel/time/tick-sched.c is impacted. Change-Id: I63b7e0c1354812921c94804e1f3b33d1ad6ee3f1 Signed-off-by: Ian Maund <imaund@codeaurora.org>	2014-08-20 13:23:09 -07:00
Peter Zijlstra	c5ac12693f	arch: Mass conversion of smp_mb__() Mostly scripted conversion of the smp_mb__ barriers. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-arch@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Git-commit: 4e857c58efeb99393cba5a5d0d8ec7117183137c [joonwoop@codeaurora.org: fixed trivial merge conflict.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2014-08-15 11:45:28 -07:00
Tejun Heo	cebdb6fa24	blkcg: don't call into policy draining if root_blkg is already gone commit 0b462c89e31f7eb6789713437eb551833ee16ff3 upstream. While a queue is being destroyed, all the blkgs are destroyed and its ->root_blkg pointer is set to NULL. If someone else starts to drain while the queue is in this state, the following oops happens. NULL pointer dereference at 0000000000000028 IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230 PGD e4a1067 PUD b773067 PMD 0 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched] CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000 RIP: 0010:[<ffffffff8144e944>] [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230 RSP: 0018:ffff88000efd7bf0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001 R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450 R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28 FS: 00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0 Stack: ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80 ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58 ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450 Call Trace: [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60 [<ffffffff81427641>] __blk_drain_queue+0x71/0x180 [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0 [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120 [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50 [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40 [<ffffffff8142d476>] blk_release_queue+0x26/0xd0 
[<ffffffff81454968>] kobject_cleanup+0x38/0x70 [<ffffffff81454848>] kobject_put+0x28/0x60 [<ffffffff81427505>] blk_put_queue+0x15/0x20 [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0 [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0 [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20 [<ffffffff817930e2>] device_release+0x32/0xa0 [<ffffffff81454968>] kobject_cleanup+0x38/0x70 [<ffffffff81454848>] kobject_put+0x28/0x60 [<ffffffff817934d7>] put_device+0x17/0x20 [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0 [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40 [<ffffffff817d1257>] sdev_store_delete+0x27/0x30 [<ffffffff81792ca8>] dev_attr_store+0x18/0x30 [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50 [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170 [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0 [<ffffffff811f69bd>] SyS_write+0x4d/0xc0 [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b 776687bce42b ("block, blk-mq: draining can't be skipped even if bypass_depth was non-zero") made it easier to trigger this bug by making blk_queue_bypass_start() drain even when it loses the first bypass test to blk_cleanup_queue(); however, the bug has always been there even before the commit as blk_queue_bypass_start() could race against queue destruction, win the initial bypass test but perform the actual draining after blk_cleanup_queue() already destroyed all blkgs. Fix it by skippping calling into policy draining if all the blkgs are already gone. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Shirish Pargaonkar <spargaonkar@suse.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Reported-by: Jet Chen <jet.chen@intel.com> Tested-by: Shirish Pargaonkar <spargaonkar@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-07-31 12:53:49 -07:00
Christoph Hellwig	cb454b6d31	block: don't assume last put of shared tags is for the host commit d45b3279a5a2252cafcd665bbf2db8c9b31ef783 upstream. There is no inherent reason why the last put of a tag structure must be the one for the Scsi_Host, as device model objects can be held for arbitrary periods. Merge blk_free_tags and __blk_free_tags into a single function that just releases a reference and get rid of the BUG() when the host reference wasn't the last. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-07-31 12:53:48 -07:00
Mikulas Patocka	668b7a05f2	block: provide compat ioctl for BLKZEROOUT commit 3b3a1814d1703027f9867d0f5cbbfaf6c7482474 upstream. This patch provides the compat BLKZEROOUT ioctl. The argument is a pointer to two uint64_t values, so there is no need to translate it. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-07-31 12:53:48 -07:00
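The BLKZEROOUT commit above says the ioctl argument is a pointer to two uint64_t values, so no compat translation is needed. As a rough illustration of why, the sketch below packs a (start, length) byte range with Python's `struct` module and checks that the buffer has the same size whether the platform's native alignment rules or the padding-free standard layout are used. The helper name `pack_zeroout_range` and the sample values are hypothetical; no ioctl is issued.

```python
import struct


def pack_zeroout_range(start, length):
    """Pack a byte range as two unsigned 64-bit values in native byte order.

    Hypothetical helper for illustration; it only builds the buffer.
    """
    return struct.pack("=QQ", start, length)


# Two consecutive 64-bit fields need no padding between them, so the native
# layout ("@") and the standard, padding-free layout ("=") have the same size.
NATIVE_SIZE = struct.calcsize("@QQ")
STANDARD_SIZE = struct.calcsize("=QQ")

if __name__ == "__main__":
    buf = pack_zeroout_range(0, 1 << 20)
    print(len(buf), NATIVE_SIZE, STANDARD_SIZE)
```

Because the layout holds only fixed-width 64-bit fields and no pointers or longs, a 32-bit caller and a 64-bit kernel agree on it, which is the property the commit relies on.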
Tanya Brokhman	01635f0715	block: row: Fix crash when adding a new field in bio struct When adding new field to struct bio there is a crash in the removed code lines. This issue was introduced by commit `80a8f0f87b` "block: row-iosched idling triggered by readahead pages" (Partly) reverting this patch till root cause is fixed (on FS level). Change-Id: Idce180802227aaab495bf0723768ba4cb437bcab Signed-off-by: Tanya Brokhman <tlinder@codeaurora.org>	2014-06-22 16:18:04 +03:00
Roman Pen	e9d9339415	blktrace: fix accounting of partially completed requests commit af5040da01ef980670b3741b3e10733ee3e33566 upstream. trace_block_rq_complete does not take into account that request can be partially completed, so we can get the following incorrect output of blkparser: C R 232 + 240 [0] C R 240 + 232 [0] C R 248 + 224 [0] C R 256 + 216 [0] but should be: C R 232 + 8 [0] C R 240 + 8 [0] C R 248 + 8 [0] C R 256 + 8 [0] Also, the whole output summary statistics of completed requests and final throughput will be incorrect. This patch takes into account real completion size of the request and fixes wrong completion accounting. Signed-off-by: Roman Pen <r.peniaev@gmail.com> CC: Steven Rostedt <rostedt@goodmis.org> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@redhat.com> CC: linux-kernel@vger.kernel.org Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-30 21:52:11 -07:00
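The blkparse output quoted in the blktrace commit above can be reproduced with a small model of a request that completes in several partial steps. This is a sketch under assumptions: `completion_events` is a hypothetical helper, and the request shape (starting at sector 232, 240 sectors long, completed 8 sectors at a time) is inferred from the first quoted line rather than stated in the commit. In the `fixed=False` mode each event reports the sectors still outstanding, as in the incorrect output; in the `fixed=True` mode it reports the sectors actually completed by that step.

```python
def completion_events(start_sector, total_sectors, chunk_sectors, fixed):
    """Return blkparse-style 'C R <sector> + <count> [0]' lines.

    fixed=False: count is everything still outstanding (the incorrect output).
    fixed=True:  count is what this partial completion actually finished.
    """
    lines = []
    pos = start_sector
    remaining = total_sectors
    while remaining > 0:
        done = min(chunk_sectors, remaining)
        reported = done if fixed else remaining
        lines.append(f"C R {pos} + {reported} [0]")
        pos += done
        remaining -= done
    return lines


if __name__ == "__main__":
    for line in completion_events(232, 240, 8, fixed=False)[:4]:
        print(line)
    for line in completion_events(232, 240, 8, fixed=True)[:4]:
        print(line)
```

Summing the counts in the first mode overstates the completed volume many times over, which is why the commit notes that the summary statistics and the final throughput are also wrong.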
Amir Samuelov	dc43671ff7	security: selinux: Add Per-File-Encryption hooks Add hooks for tagging/detecting Per-File-Encryption files. Change-Id: I9d1f791b68d3552b1a508c21ff8336182e8527fa Signed-off-by: Amir Samuelov <amirs@codeaurora.org>	2014-05-21 15:56:51 +03:00
Venkat Gopalakrishnan	12d982a03b	block/fs: keep track of the task that dirtied the page Background writes happen in the context of a background thread. It is very useful to identify the actual task that generated the request instead of the background task that submitted the request. Hence keep track of the task when a page gets dirtied and dump this task info while tracing. Not all the pages in the bio are dirtied by the same task, but most likely they will be, since the sectors accessed on the device must be adjacent. Change-Id: I6afba85a2063dd3350a0141ba87cf8440ce9f777 Signed-off-by: Venkat Gopalakrishnan <venkatg@codeaurora.org>	2014-05-06 12:08:26 -07:00
Ian Maund	356fb13538	Merge upstream linux-stable v3.10.36 into msm-3.10 * commit 'v3.10.36': (494 commits) Linux 3.10.36 netfilter: nf_conntrack_dccp: fix skb_header_pointer API usages mm: close PageTail race net: mvneta: rename MVNETA_GMAC2_PSC_ENABLE to MVNETA_GMAC2_PCS_ENABLE x86: fix boot on uniprocessor systems Input: cypress_ps2 - don't report as a button pads Input: synaptics - add manual min/max quirk for ThinkPad X240 Input: synaptics - add manual min/max quirk Input: mousedev - fix race when creating mixed device ext4: atomically set inode->i_flags in ext4_set_inode_flags() Linux 3.10.35 sched/autogroup: Fix race with task_groups list e100: Fix "disabling already-disabled device" warning xhci: Fix resume issues on Renesas chips in Samsung laptops Input: wacom - make sure touch_max is set for touch devices KVM: VMX: fix use after free of vmx->loaded_vmcs KVM: x86: handle invalid root_hpa everywhere KVM: MMU: handle invalid root_hpa at __direct_map Input: elantech - improve clickpad detection ARM: highbank: avoid L2 cache smc calls when PL310 is not present ... Change-Id: Ib68f565291702c53df09e914e637930c5d3e5310 Signed-off-by: Ian Maund <imaund@codeaurora.org>	2014-04-23 16:23:49 -07:00
Ian Maund	f1b32d4e47	Merge upstream linux-stable v3.10.28 into msm-3.10 The following commits have been reverted from this merge, as they are known to introduce new bugs and are currently incompatible with our audio implementation. Investigation of these commits is ongoing, and they are expected to be brought in at a later time: `86e6de7` ALSA: compress: fix drain calls blocking other compress functions (v6) `16442d4` ALSA: compress: fix drain calls blocking other compress functions This merge commit also includes a change in block, necessary for compilation. Upstream has modified elevator_init_fn to prevent race conditions, requring updates to row_init_queue and test_init_queue. * commit 'v3.10.28': (1964 commits) Linux 3.10.28 ARM: 7938/1: OMAP4/highbank: Flush L2 cache before disabling drm/i915: Don't grab crtc mutexes in intel_modeset_gem_init() serial: amba-pl011: use port lock to guard control register access mm: Make {,set}page_address() static inline if WANT_PAGE_VIRTUAL md/raid5: Fix possible confusion when multiple write errors occur. md/raid10: fix two bugs in handling of known-bad-blocks. md/raid10: fix bug when raid10 recovery fails to recover a block. md: fix problem when adding device to read-only array with bitmap. 
drm/i915: fix DDI PLLs HW state readout code nilfs2: fix segctor bug that causes file system corruption thp: fix copy_page_rep GPF by testing is_huge_zero_pmd once only ftrace/x86: Load ftrace_ops in parameter not the variable holding it SELinux: Fix possible NULL pointer dereference in selinux_inode_permission() writeback: Fix data corruption on NFS hwmon: (coretemp) Fix truncated name of alarm attributes vfs: In d_path don't call d_dname on a mount point staging: comedi: adl_pci9111: fix incorrect irq passed to request_irq() staging: comedi: addi_apci_1032: fix subdevice type/flags bug mm/memory-failure.c: recheck PageHuge() after hugetlb page migrate successfully GFS2: Increase i_writecount during gfs2_setattr_chown perf/x86/amd/ibs: Fix waking up from S3 for AMD family 10h perf scripting perl: Fix build error on Fedora 12 ARM: 7815/1: kexec: offline non panic CPUs on Kdump panic Linux 3.10.27 sched: Guarantee new group-entities always have weight sched: Fix hrtimer_cancel()/rq->lock deadlock sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining sched: Fix race on toggling cfs_bandwidth_used x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround netfilter: nf_nat: fix access to uninitialized buffer in IRC NAT helper SCSI: sd: Reduce buffer size for vpd request intel_pstate: Add X86_FEATURE_APERFMPERF to cpu match parameters. 
mac80211: move "bufferable MMPDU" check to fix AP mode scan ACPI / Battery: Add a _BIX quirk for NEC LZ750/LS ACPI / TPM: fix memory leak when walking ACPI namespace mfd: rtsx_pcr: Disable interrupts before cancelling delayed works clk: exynos5250: fix sysmmu_mfc{l,r} gate clocks clk: samsung: exynos5250: Add CLK_IGNORE_UNUSED flag for the sysreg clock clk: samsung: exynos4: Correct SRC_MFC register clk: clk-divider: fix divisor > 255 bug ahci: add PCI ID for Marvell 88SE9170 SATA controller parisc: Ensure full cache coherency for kmap/kunmap drm/nouveau/bios: make jump conditional ARM: shmobile: mackerel: Fix coherent DMA mask ARM: shmobile: armadillo: Fix coherent DMA mask ARM: shmobile: kzm9g: Fix coherent DMA mask ARM: dts: exynos5250: Fix MDMA0 clock number ARM: fix "bad mode in ... handler" message for undefined instructions ARM: fix footbridge clockevent device net: Loosen constraints for recalculating checksum in skb_segment() bridge: use spin_lock_bh() in br_multicast_set_hash_max netpoll: Fix missing TXQ unlock and and OOPS. net: llc: fix use after free in llc_ui_recvmsg virtio-net: fix refill races during restore virtio_net: don't leak memory or block when too many frags virtio-net: make all RX paths handle errors consistently virtio_net: fix error handling for mergeable buffers vlan: Fix header ops passthru when doing TX VLAN offload. 
net: rose: restore old recvmsg behavior rds: prevent dereference of a NULL device ipv6: always set the new created dst's from in ip6_rt_copy net: fec: fix potential use after free hamradio/yam: fix info leak in ioctl drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl() net: inet_diag: zero out uninitialized idiag_{src,dst} fields ip_gre: fix msg_name parsing for recvfrom/recvmsg net: unix: allow bind to fail on mutex lock ipv6: fix illegal mac_header comparison on 32bit netvsc: don't flush peers notifying work during setting mtu tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0 net: unix: allow set_peek_off to fail net: drop_monitor: fix the value of maxattr ipv6: don't count addrconf generated routes against gc limit packet: fix send path when running with proto == 0 virtio: delete napi structures from netdev before releasing memory macvtap: signal truncated packets tun: update file current position macvtap: update file current position macvtap: Do not double-count received packets rds: prevent BUG_ON triggered on congestion update to loopback net: do not pretend FRAGLIST support IPv6: Fixed support for blackhole and prohibit routes HID: Revert "Revert "HID: Fix logitech-dj: missing Unifying device issue"" gpio-rcar: R-Car GPIO IRQ share interrupt clocksource: em_sti: Set cpu_possible_mask to fix SMP broadcast irqchip: renesas-irqc: Fix irqc_probe error handling Linux 3.10.26 sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to sh_ksyms_32.c ext4: fix bigalloc regression arm64: Use Normal NonCacheable memory for writecombine arm64: Do not flush the D-cache for anonymous pages arm64: Avoid cache flushing in flush_dcache_page() ARM: KVM: arch_timers: zero CNTVOFF upon return to host ARM: hyp: initialize CNTVOFF to zero clocksource: arch_timer: use virtual counters arm64: Remove unused cpu_name ascii in arch/arm64/mm/proc.S arm64: dts: Reserve the memory used for secondary CPU release address arm64: check for number of arguments in 
syscall_get/set_arguments() arm64: fix possible invalid FPSIMD initialization state ... Change-Id: Ia0e5d71b536ab49ec3a1179d59238c05bdd03106 Signed-off-by: Ian Maund <imaund@codeaurora.org>	2014-03-24 14:28:34 -07:00
Jens Axboe	163d66d4fb	block: add cond_resched() to potentially long running ioctl discard loop commit c8123f8c9cb517403b51aa41c3c46ff5e10b2c17 upstream. When mkfs issues a full device discard and the device only supports discards of a smallish size, we can loop in blkdev_issue_discard() for a long time. If preempt isn't enabled, this can turn into a softlock situation and the kernel will start complaining. Add an explicit cond_resched() at the end of the loop to avoid that. Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-22 12:41:28 -08:00
Tejun Heo	404ced25b4	block: __elv_next_request() shouldn't call into the elevator if bypassing commit 556ee818c06f37b2e583af0363e6b16d0e0270de upstream. request_queue bypassing is used to suppress higher-level function of a request_queue so that they can be switched, reconfigured and shut down. A request_queue does the following while bypassing. * bypasses elevator and io_cq association and queues requests directly to the FIFO dispatch queue. * bypasses block cgroup request_list lookup and always uses the root request_list. Once confirmed to be bypassing, specific elevator and block cgroup policy implementations can assume that nothing is in flight for them and perform various operations which would be dangerous otherwise. Such confirmation is achieved by short-circuiting all new requests directly to the dispatch queue and waiting for all the requests which were issued before to finish. Unfortunately, while the request allocating and draining sides were properly handled, we forgot to actually plug the request dispatch path. Even after bypassing mode is confirmed, if the attached driver tries to fetch a request and the dispatch queue is empty, __elv_next_request() would invoke the current elevator's elevator_dispatch_fn() callback. As all in-flight requests were drained, the elevator wouldn't contain any request but once bypass is confirmed we don't even know whether the elevator is even there. It might be in the process of being switched and half torn down. Frank Mayhar reports that this actually happened while switching elevators, leading to an oops. Let's fix it by making __elv_next_request() avoid invoking the elevator_dispatch_fn() callback if the queue is bypassing. It already avoids invoking the callback if the queue is dying. As a dying queue is guaranteed to be bypassing, we can simply replace blk_queue_dying() check with blk_queue_bypass(). 
Reported-by: Frank Mayhar <fmayhar@google.com> References: http://lkml.kernel.org/g/1390319905.20232.38.camel@bobble.lax.corp.google.com Tested-by: Frank Mayhar <fmayhar@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-22 12:41:28 -08:00
Christoph Hellwig	d040642497	kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS We've switched over every architecture that supports SMP to it, so remove the new useless config variable. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Git-commit: 0a06ff068f1255bcd7965ab07bc0f4adc3eb639a Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git [imaund@codeaurora.org: resolve merge conflicts] Signed-off-by: Ian Maund <imaund@codeaurora.org>	2014-02-07 15:55:40 -08:00
Lee Susman	db776d551f	block: add test bio size define to test-iosched Add a define for the test bio size (which is the size of a page), this is used for allocating the right sized buffer for the bio during test request creation. Change-Id: I9505c85c4352009bdee442172eb8ae8f4254cfb0 Signed-off-by: Lee Susman <lsusman@codeaurora.org>	2014-01-29 18:29:29 +02:00
Dinesh K Garg	e6304b0351	dm: Request based dm-crypt dm-crypt provides a bio-based device mapper module. dm-crypt operates on 512-byte packets, which is not an efficient way to drive HW-based crypto blocks. dm-req-crypt was developed to address this. dm-req-crypt works on requests, which carry up to 512KB of data for unmerged requests. Change-Id: I7d6a63d516dc2dbe80f46c06dd0722847d55bc9f Signed-off-by: Dinesh K Garg <dineshg@codeaurora.org>	2014-01-18 14:11:36 -08:00
Konstantin Dorfman	552431dbf2	block: do not notify urgent request, when flush with data in flight The MMC device driver implements URGENT request execution with priority (using stop flow); as a result, the currently running (and prepared) request may be reinserted back into the I/O scheduler. This will break the block layer's flush logic (a flush request should not be inserted into the I/O scheduler). The block layer flush machinery keeps the q->flush_data_in_flight list updated with started but not completed flush requests with data (REQ_FUA). With this change, the underlying block device driver is not notified about a pending urgent request while flushes are in flight. Change-Id: I98113621223fe0c7d224de023db888a73bd62b48 Signed-off-by: Konstantin Dorfman <kdorfman@codeaurora.org>	2014-01-05 16:09:12 +02:00
Hong Zhiguo	950cda7f8e	Update of blkg_stat and blkg_rwstat may happen in bh context. While u64_stats_fetch_retry is only preempt_disable on 32bit UP system. This is not enough to avoid preemption by bh and may read strange 64 bit value. commit 2c575026fae6e63771bd2a4c1d407214a8096a89 upstream. Signed-off-by: Hong Zhiguo <zhiguohong@tencent.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-12-11 22:36:27 -08:00
Tomoki Sekiyama	72b9401c2f	elevator: acquire q->sysfs_lock in elevator_change() commit 7c8a3679e3d8e9d92d58f282161760a0e247df97 upstream. Add locking of q->sysfs_lock into elevator_change() (an exported function) to ensure it is held to protect q->elevator from elevator_init(), even if elevator_change() is called from non-sysfs paths. sysfs path (elv_iosched_store) uses __elevator_change(), non-locking version, as the lock is already taken by elv_iosched_store(). Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Cc: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-12-08 07:29:27 -08:00
Tomoki Sekiyama	6d53d39270	elevator: Fix a race in elevator switching and md device initialization commit eb1c160b22655fd4ec44be732d6594fd1b1e44f4 upstream. The soft lockup below happens at the boot time of the system using dm multipath and the udev rules to switch scheduler. [ 356.127001] BUG: soft lockup - CPU#3 stuck for 22s! [sh:483] [ 356.127001] RIP: 0010:[<ffffffff81072a7d>] [<ffffffff81072a7d>] lock_timer_base.isra.35+0x1d/0x50 ... [ 356.127001] Call Trace: [ 356.127001] [<ffffffff81073810>] try_to_del_timer_sync+0x20/0x70 [ 356.127001] [<ffffffff8118b08a>] ? kmem_cache_alloc_node_trace+0x20a/0x230 [ 356.127001] [<ffffffff810738b2>] del_timer_sync+0x52/0x60 [ 356.127001] [<ffffffff812ece22>] cfq_exit_queue+0x32/0xf0 [ 356.127001] [<ffffffff812c98df>] elevator_exit+0x2f/0x50 [ 356.127001] [<ffffffff812c9f21>] elevator_change+0xf1/0x1c0 [ 356.127001] [<ffffffff812caa50>] elv_iosched_store+0x20/0x50 [ 356.127001] [<ffffffff812d1d09>] queue_attr_store+0x59/0xb0 [ 356.127001] [<ffffffff812143f6>] sysfs_write_file+0xc6/0x140 [ 356.127001] [<ffffffff811a326d>] vfs_write+0xbd/0x1e0 [ 356.127001] [<ffffffff811a3ca9>] SyS_write+0x49/0xa0 [ 356.127001] [<ffffffff8164e899>] system_call_fastpath+0x16/0x1b This is caused by a race between md device initialization by multipathd and shell script to switch the scheduler using sysfs. - multipathd: SyS_ioctl -> do_vfs_ioctl -> dm_ctl_ioctl -> ctl_ioctl -> table_load -> dm_setup_md_queue -> blk_init_allocated_queue -> elevator_init q->elevator = elevator_alloc(q, e); // not yet initialized - sh -c 'echo deadline > /sys/$DEVPATH/queue/scheduler': elevator_switch (in the call trace above) struct elevator_queue old = q->elevator; q->elevator = elevator_alloc(q, new_e); elevator_exit(old); // lockup! () - multipathd: (cont.) err = e->ops.elevator_init_fn(q); // init fails; q->elevator is modified (*) When del_timer_sync() is called, lock_timer_base() will loop infinitely while timer->base == NULL. 
In this case, as the timer will never be initialized, it results in a lockup. This patch introduces acquisition of q->sysfs_lock around elevator_init() in blk_init_allocated_queue(), to provide mutual exclusion between initialization of the q->scheduler and switching of the scheduler. This should fix this bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=902012 Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-12-08 07:29:27 -08:00
Mikulas Patocka	d8db1a5f31	blk-core: Fix memory corruption if blkcg_init_queue fails commit fff4996b7db7955414ac74386efa5e07fd766b50 upstream. If blkcg_init_queue fails, blk_alloc_queue_node doesn't call bdi_destroy to clean up structures allocated by the backing dev. ------------[ cut here ]------------ WARNING: at lib/debugobjects.c:260 debug_print_object+0x85/0xa0() ODEBUG: free active (active state 0) object type: percpu_counter hint: (null) Modules linked in: dm_loop dm_mod ip6table_filter ip6_tables uvesafb cfbcopyarea cfbimgblt cfbfillrect fbcon font bitblit fbcon_rotate fbcon_cw fbcon_ud fbcon_ccw softcursor fb fbdev ipt_MASQUERADE iptable_nat nf_nat_ipv4 msr nf_conntrack_ipv4 nf_defrag_ipv4 xt_state ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc tun ipv6 cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_ondemand cpufreq_conservative spadfs fuse hid_generic usbhid hid raid0 md_mod dmi_sysfs nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack lm85 hwmon_vid snd_usb_audio snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_hwdep snd_usbmidi_lib snd_rawmidi snd soundcore acpi_cpufreq freq_table mperf sata_svw serverworks kvm_amd ide_core ehci_pci ohci_hcd libata ehci_hcd kvm usbcore tg3 usb_common libphy k10temp pcspkr ptp i2c_piix4 i2c_core evdev microcode hwmon rtc_cmos pps_core e100 skge floppy mii processor button unix CPU: 0 PID: 2739 Comm: lvchange Tainted: G W 3.10.15-devel #14 Hardware name: empty empty/S3992-E, BIOS 'V1.06 ' 06/09/2009 0000000000000009 ffff88023c3c1ae8 ffffffff813c8fd4 ffff88023c3c1b20 ffffffff810399eb ffff88043d35cd58 ffffffff81651940 ffff88023c3c1bf8 ffffffff82479d90 0000000000000005 ffff88023c3c1b80 ffffffff81039a67 Call Trace: [<ffffffff813c8fd4>] dump_stack+0x19/0x1b [<ffffffff810399eb>] warn_slowpath_common+0x6b/0xa0 [<ffffffff81039a67>] warn_slowpath_fmt+0x47/0x50 [<ffffffff8122aaaf>] ? 
debug_check_no_obj_freed+0xcf/0x250 [<ffffffff81229a15>] debug_print_object+0x85/0xa0 [<ffffffff8122abe3>] debug_check_no_obj_freed+0x203/0x250 [<ffffffff8113c4ac>] kmem_cache_free+0x20c/0x3a0 [<ffffffff811f6709>] blk_alloc_queue_node+0x2a9/0x2c0 [<ffffffff811f672e>] blk_alloc_queue+0xe/0x10 [<ffffffffa04c0093>] dm_create+0x1a3/0x530 [dm_mod] [<ffffffffa04c6bb0>] ? list_version_get_info+0xe0/0xe0 [dm_mod] [<ffffffffa04c6c07>] dev_create+0x57/0x2b0 [dm_mod] [<ffffffffa04c6bb0>] ? list_version_get_info+0xe0/0xe0 [dm_mod] [<ffffffffa04c6bb0>] ? list_version_get_info+0xe0/0xe0 [dm_mod] [<ffffffffa04c6528>] ctl_ioctl+0x268/0x500 [dm_mod] [<ffffffff81097662>] ? get_lock_stats+0x22/0x70 [<ffffffffa04c67ce>] dm_ctl_ioctl+0xe/0x20 [dm_mod] [<ffffffff81161aad>] do_vfs_ioctl+0x2ed/0x520 [<ffffffff8116cfc7>] ? fget_light+0x377/0x4e0 [<ffffffff81161d2b>] SyS_ioctl+0x4b/0x90 [<ffffffff813cff16>] system_call_fastpath+0x1a/0x1f ---[ end trace 4b5ff0d55673d986 ]--- ------------[ cut here ]------------ This fix should be backported to stable kernels starting with 2.6.37. Note that in the kernels prior to 3.5 the affected code is different, but the bug is still there - bdi_init is called and bdi_destroy isn't. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-12-04 10:56:46 -08:00
Mike Snitzer	0deb6f9cb8	block: properly stack underlying max_segment_size to DM device commit d82ae52e68892338068e7559a0c0657193341ce4 upstream. Without this patch all DM devices will default to BLK_MAX_SEGMENT_SIZE (65536) even if the underlying device(s) have a larger value -- this is due to blk_stack_limits() using min_not_zero() when stacking the max_segment_size limit. 1073741824 before patch: 65536 after patch: 1073741824 Reported-by: Lukasz Flis <l.flis@cyfronet.pl> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-11-29 11:11:51 -08:00
Jeff Moyer	869d4e7f52	block: fix race between request completion and timeout handling commit 4912aa6c11e6a5d910264deedbec2075c6f1bb73 upstream. crocode i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma dca be2net sg ses enclosure ext4 mbcache jbd2 sd_mod crc_t10dif ahci megaraid_sas(U) dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan] Pid: 491, comm: scsi_eh_0 Tainted: G W ---------------- 2.6.32-220.13.1.el6.x86_64 #1 IBM -[8722PAX]-/00D1461 RIP: 0010:[<ffffffff8124e424>] [<ffffffff8124e424>] blk_requeue_request+0x94/0xa0 RSP: 0018:ffff881057eefd60 EFLAGS: 00010012 RAX: ffff881d99e3e8a8 RBX: ffff881d99e3e780 RCX: ffff881d99e3e8a8 RDX: ffff881d99e3e8a8 RSI: ffff881d99e3e780 RDI: ffff881d99e3e780 RBP: ffff881057eefd80 R08: ffff881057eefe90 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff881057f92338 R13: 0000000000000000 R14: ffff881057f92338 R15: ffff883058188000 FS: 0000000000000000(0000) GS:ffff880040200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00000000006d3ec0 CR3: 000000302cd7d000 CR4: 00000000000406b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process scsi_eh_0 (pid: 491, threadinfo ffff881057eee000, task ffff881057e29540) Stack: 0000000000001057 0000000000000286 ffff8810275efdc0 ffff881057f16000 <0> ffff881057eefdd0 ffffffff81362323 ffff881057eefe20 ffffffff8135f393 <0> ffff881057e29af8 ffff8810275efdc0 ffff881057eefe78 ffff881057eefe90 Call Trace: [<ffffffff81362323>] __scsi_queue_insert+0xa3/0x150 [<ffffffff8135f393>] ? scsi_eh_ready_devs+0x5e3/0x850 [<ffffffff81362a23>] scsi_queue_insert+0x13/0x20 [<ffffffff8135e4d4>] scsi_eh_flush_done_q+0x104/0x160 [<ffffffff8135fb6b>] scsi_error_handler+0x35b/0x660 [<ffffffff8135f810>] ? scsi_error_handler+0x0/0x660 [<ffffffff810908c6>] kthread+0x96/0xa0 [<ffffffff8100c14a>] child_rip+0xa/0x20 [<ffffffff81090830>] ? 
kthread+0x0/0xa0 [<ffffffff8100c140>] ? child_rip+0x0/0x20 Code: 00 00 eb d1 4c 8b 2d 3c 8f 97 00 4d 85 ed 74 bf 49 8b 45 00 49 83 c5 08 48 89 de 4c 89 e7 ff d0 49 8b 45 00 48 85 c0 75 eb eb a4 <0f> 0b eb fe 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 RIP [<ffffffff8124e424>] blk_requeue_request+0x94/0xa0 RSP <ffff881057eefd60> The RIP is this line: BUG_ON(blk_queued_rq(rq)); After digging through the code, I think there may be a race between the request completion and the timer handler running. A timer is started for each request put on the device's queue (see blk_start_request->blk_add_timer). If the request does not complete before the timer expires, the timer handler (blk_rq_timed_out_timer) will mark the request complete atomically: static inline int blk_mark_rq_complete(struct request rq) { return test_and_set_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags); } and then call blk_rq_timed_out. The latter function will call scsi_times_out, which will return one of BLK_EH_HANDLED, BLK_EH_RESET_TIMER or BLK_EH_NOT_HANDLED. If BLK_EH_RESET_TIMER is returned, blk_clear_rq_complete is called, and blk_add_timer is again called to simply wait longer for the request to complete. Now, if the request happens to complete while this is going on, what happens? Given that we know the completion handler will bail if it finds the REQ_ATOM_COMPLETE bit set, we need to focus on the completion handler running after that bit is cleared. So, from the above paragraph, after the call to blk_clear_rq_complete. If the completion sets REQ_ATOM_COMPLETE before the BUG_ON in blk_add_timer, we go boom there (I haven't seen this in the cores). Next, if we get the completion before the call to list_add_tail, then the timer will eventually fire for an old req, which may either be freed or reallocated (there is evidence that this might be the case). Finally, if the completion comes in after* the addition to the timeout list, I think it's harmless. 
The request will be removed from the timeout list, req_atom_complete will be set, and all will be well. This will only actually explain the coredumps IF the request structure was freed, reallocated and queued before the error handler thread had a chance to process it. That is possible, but it may make sense to keep digging for another race. I think that if this is what was happening, we would see other instances of this problem showing up as null pointer or garbage pointer dereferences, for example when the request structure was not re-used. It looks like we actually do run into that situation in other reports. This patch moves the BUG_ON(test_bit(REQ_ATOM_COMPLETE, &req->atomic_flags)); from blk_add_timer to the only caller that could trip over it (blk_start_request). It then inverts the calls to blk_clear_rq_complete and blk_add_timer in blk_rq_timed_out to address the race. I've boot tested this patch, but nothing more. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Acked-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-11-29 11:11:50 -08:00
Konstantin Dorfman	0e9d14e308	mmc: Unit test fix for logging Update logging with: - prefix with module name - add '\n' in the end - test_pr_* removed Change-Id: I465c9809def9d294dcbb3f7cf7f474c189f5fdbf Signed-off-by: Konstantin Dorfman <kdorfman@codeaurora.org>	2013-11-05 14:21:36 +02:00
Dolev Raviv	0730ae1d60	Revert "block: prevent access to NULL pointer in req->part" This reverts commit `f97d4f6148`. Although this check prevents NULL reference, it hides the real problem. Requests that wish to avoid statistics update have to disable the REQ_IO_STAT flag, otherwise req->part is expected to be initialized. Change-Id: I680b95ab9aa668612d948770347929ffde30aeab Signed-off-by: Dolev Raviv <draviv@codeaurora.org>	2013-10-27 13:39:57 +02:00
Dolev Raviv	163f46e7fe	block: test-iosched: disable statistic flag on request The flag REQ_IO_STAT is enabled by default; this assumes statistics are initialized and might cause NULL references in the kernel. To avoid this, the flag is cleared in the request and stats are not updated. Change-Id: I6a1890dde51dfa8ffdd376b13f4466c9db0ae05b Signed-off-by: Dolev Raviv <draviv@codeaurora.org>	2013-10-27 13:26:08 +02:00
Dolev Raviv	f97d4f6148	block: prevent access to NULL pointer in req->part The block layer accesses req->part without checking for a NULL pointer; this happens when statistics are not fully initialized. Preventing the NULL pointer access affects only the statistics update. Change-Id: I45c91c074ecec1c3849f4f36185edcc6db35383c Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org> Signed-off-by: Dolev Raviv <draviv@codeaurora.org>	2013-10-24 17:48:45 +03:00
Dolev Raviv	a34d47b53d	scsi: ufs: mixed long sequential The test will verify correctness of a sequential data pattern written to the device while new data (with the same pattern) is written simultaneously. First, this test will run a long sequential write scenario. This first stage will write the pattern that will be read later. Second, sequential read requests will read and compare the same data. The second-stage reads will be issued in parallel with write requests to the same LBA and size. NOTE: The test requires a long timeout. The purpose of this test is to mix read and write requests on the same LBA while checking for the read data correctness. Change-Id: I6a437ce689b66233af3055d07a7f62f1e7b40765 Signed-off-by: Dolev Raviv <draviv@codeaurora.org>	2013-10-10 11:16:04 +03:00
Dolev Raviv	6a3202515b	scsi: ufs: add support for test specific completion check Introduce a new callback 'check_test_completion_fn' to test-iosched framework. This callback is necessary to determine if a test has completed or not in situation where the request queue is empty, but the test was not completed. Change-Id: I60bd8cccffacab11a5a7cba78caccf53fea3e1d8 Signed-off-by: Dolev Raviv <draviv@codeaurora.org>	2013-10-10 11:16:03 +03:00
Sujit Reddy Thumma	8cd640e8da	block: allow REQ_PM requests even when the device is suspended Sometimes, even though the block device is suspended by the block layer, the low-level driver might want to queue PM requests to the device. Allow such requests to be peeked, as blk_pm_add_request() has already added them to the I/O scheduler; otherwise the request would be stuck forever in the I/O scheduler without being fetched by the driver. Change-Id: I353943a7008ea1d92ff825d220cad1828fe37c27 Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>	2013-10-03 22:05:37 -07:00
Anatol Pomozov	85f58908c0	cfq: explicitly use 64bit divide operation for 64bit arguments commit f3cff25f05f2ac29b2ee355e611b0657482f6f1d upstream. 'samples' is a 64bit operand, but the second parameter of do_div() is 32bit. do_div() silently truncates the high 32 bits and the calculated result is invalid. If the low 32 bits of 'samples' are zero, do_div() crashes the kernel. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Cc: Jonghwan Choi <jhbird.choi@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-01 09:17:48 -07:00
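The truncation described in the commit above can be illustrated with a small user-space sketch. This is not the kernel code: `div_u64_by_u32`, `div_u64_by_u64`, and `low32` are hypothetical helpers that only model a divide routine whose divisor parameter is 32 bits wide, next to a full 64-bit division.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a divide helper with a 32-bit divisor parameter.
 * A 64-bit value passed here is reduced to its low 32 bits by the caller. */
uint64_t div_u64_by_u32(uint64_t dividend, uint32_t divisor)
{
    assert(divisor != 0); /* a zero divisor is the crash case in the commit */
    return dividend / divisor;
}

/* Full 64-bit by 64-bit division: no bits of the divisor are lost. */
uint64_t div_u64_by_u64(uint64_t dividend, uint64_t divisor)
{
    assert(divisor != 0);
    return dividend / divisor;
}

/* What a 32-bit parameter actually receives from a 64-bit argument. */
uint32_t low32(uint64_t value)
{
    return (uint32_t)value;
}
```

With a divisor of 2^32 + 2, the 32-bit path divides by 2 instead, and with a divisor of exactly 2^32 it would divide by zero. In the kernel, `div64_u64()` from `linux/math64.h` performs a full 64-bit by 64-bit division.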
Tatyana Brokhman	fbb81a523c	Revert "block: Add URGENT request notification support to CFQ scheduler" This reverts commit b410a82118cdaa1dc92759e7995c20dcce0d1f1a. The reverted commit was identified as the cause of the FS error mentioned in the CR below. It is reverted pending further analysis of the root cause of the FS error. Change-Id: Ia75216de8012a2491b87f33e8c21f75592d87c80 CRs-fixed: 531257 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-09-09 14:06:40 -07:00
Tatyana Brokhman	d5847837ef	block: Add URGENT request notification support to CFQ scheduler When the scheduler reports to the block layer that there is an urgent request pending, the device driver may decide to stop the transmission of the current request in order to handle the urgent one. This is done in order to reduce the latency of an urgent request. For example: long WRITE may be stopped to handle an urgent READ. Change-Id: I3072b8a1423870fed9c04c28d93caaf9557a7b89 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-09-04 17:08:10 -07:00
Lee Susman	8774b4dbbc	scsi: ufs: long sequential read/write tests This test adds the ability to test the UFS task management feature in the driver. It loads the queue with requests in order to allow the task management to operate in full capacity. Modify test-iosched infrastructure to support the new tests: - expose check_test_completion() Note: we submit 16-bio requests since the current HW is very slow and we don't want to exceed the timeout duration. Change-Id: I8ee752cba3c6838d8edc05747fa0288c4b347ef6 Signed-off-by: Dolev Raviv <draviv@codeaurora.org> Signed-off-by: Lee Susman <lsusman@codeaurora.org>	2013-09-04 17:06:59 -07:00
Lee Susman	3c36fc4ff4	mmc: card: change long_sequential_test time measurements to ktime Change time measurements in long_sequential_test from jiffies to ktime, and make the relevant change in test-iosched infrastructure. In long_sequential_test we measure throughput, and the jiffies resolution is not sensitive enough for this calculation. Change-Id: If7c9a03c687f61996609c014e056bcd7132b9012 Signed-off-by: Lee Susman <lsusman@codeaurora.org>	2013-09-04 17:06:58 -07:00
Tatyana Brokhman	1d1d2ee81c	block: Remove "requeuing urgent req" error messages It is possible for URGENT request to be requeued/reinserted if it was fetched during the creation of a packed list. This end case is rare and is not handled at the moment. This patch changes the messages notifying of the above to debug level (instead of error) in order to clear the dmesg log. Change-Id: Ie8bc067e61559a6f702077b95c5dbcc426404232 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-09-04 16:39:05 -07:00
Tatyana Brokhman	1abaa811a6	block: urgent: Fix dispatching of URGENT mechanism There are cases when blk_peek_request is called not from blk_fetch_request thus the URGENT request may be started but the flag q->dispatched_urgent is not updated. Change-Id: I4fb588823f1b2949160cbd3907f4729767932e12 CRs-fixed: 471736 CRs-fixed: 473036 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-09-04 16:30:17 -07:00
Tatyana Brokhman	78ed8ec260	block: urgent request: remove unnecessary urgent marking An urgent request is marked by the scheduler in rq->cmd_flags with the REQ_URGENT flag. There is no need to add an additional marking by the block layer. Change-Id: I05d5e9539d2f6c1bfa80240b0671db197a5d3b3f Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-09-04 16:07:06 -07:00
Dolev Raviv	07e8067e29	block: fix test crashing due to synchronization issue The __blk_run_queue function is called from several contexts. The fix is replacing it with blk_run_queue function, this function is guarded with a lock, thus making it thread safe and prevents the crashing. Change-Id: I3e12fa9c8b9e161375fffa3570abfa46b223a60b Signed-off-by: Dolev Raviv <draviv@codeaurora.org>	2013-09-04 15:58:07 -07:00
Lee Susman	dc97f3c488	mmc: enhance long_sequential_test for higher throughput Change the test design so that requests are dynamically created and freed. This enables running tests with more than 128 requests, therefore more than 50MiB can be written/read and makes it possible to measure driver write/read throughput more accurately. Change-Id: I56c9d6c1afba5c91a0621a16d97feafd4689521d Signed-off-by: Lee Susman <lsusman@codeaurora.org>	2013-09-04 15:49:04 -07:00
Dolev Raviv	7b52db505f	block: test-iosched: Add support for setting rq_disk Some block devices require the rq_disk field to be assigned. This patch exposes a new API to the block device test utility for getting the rq_disk assigned in the created request. Change-Id: I61dc4dad50eb7600728156a6cd08bb1ee134df0d Signed-off-by: Dolev Raviv <draviv@codeaurora.org>	2013-09-04 15:48:12 -07:00
Lee Susman	30984a303e	mmc: new request notification unit-test The new request notification test checks the following scenario: A new request arrives after a NULL request was sent to the mmc_queue, which is waiting for completion of a former request. Change-Id: I05db0959ded400e292eb5e84e1ecfc579b78ee62 Signed-off-by: Konstantin Dorfman <kdorfman@codeaurora.org> Signed-off-by: Lee Susman <lsusman@codeaurora.org>	2013-09-04 15:45:11 -07:00
Lee Susman	edd0f1bf35	block: test-iosched infrastructure enhancement Add functionality to test-iosched so that it could simulate the ROW scheduler behaviour. The main additions are: - 3 distinct requests queue with counters - support for urgent request pending - reinsert request implementation (callback + dispatch behavior) Change-Id: I83b5d9e3d2b8cd9a2353afa6a3e6a4cbc83b0cd4 Signed-off-by: Konstantin Dorfman <kdorfman@codeaurora.org> Signed-off-by: Lee Susman <lsusman@codeaurora.org>	2013-09-04 15:45:10 -07:00
Maya Erez	6c5c821a29	mmc: Enable eMMC unit-tests Enable the compilation of eMMC4.5 unit-tests, required by APT team. This will allow the APT team to test the storage activity on released builds. The storage tests are disabled in normal operation and in order to activate them a test I/O scheduler should be chosen and the test should be triggered via debugfs. Therefore they have no effect on normal eMMC driver operation. Change-Id: I179c567f67cc8fab9ed1edab8246483de18bc76a Signed-off-by: Maya Erez <merez@codeaurora.org>	2013-09-04 15:39:07 -07:00
Lee Susman	4d49de28ab	mmc: improve mmc_block_test printouts Change the printout format to be more readable. Specifically, add quotes around the test case name strings. Change-Id: I51b0c1b94389e4b51af84c5e993207b18efc2226 Signed-off-by: Lee Susman <lsusman@codeaurora.org>	2013-09-04 15:37:16 -07:00
Lee Susman	1b44812543	mmc: card: Add long sequential read test to test-iosched Long sequential read test measures read throughput at the driver level by reading large requests sequentially. Change-Id: I3b6d685930e1d0faceabbc7d20489111734cc9d4 Signed-off-by: Lee Susman <lsusman@codeaurora.org>	2013-09-04 15:30:39 -07:00
Stephen Boyd	38d8910730	Merge branch 'qandroid-3.10' into msm-3.10 * qandroid-3.10: (636 commits) netfilter: xt_qtaguid: Protect iface list access with necessary lock HID: magicmouse: Fix build warning USB: gadget: mtp: Fix OUT endpoint request length usage in read USB: gadget: f_mtp: Fix using tx buffer pointer msm: Fix race condition in domain lookup msm: Add null-pointer checks for domains base: sync: increase size of sync_timeline name USB: gadget: mtp: Add module parameters for Tx transfer length msm: iommu: Lock the genpool allocation gpu: ion: fix page offset in dma_buf_kmap() gpu: ion: Fix bug in ion_system_heap map_user gpu: ion: Only map as much of the vma as the user requested gpu: ion: use vmalloc to allocate page array to map kernel gpu: ion: Remove dead comments gpu: ion: Minimize allocation fallback delay mmc: sd: Set the card removed if card detect fails gpu: ion: don't fault in individual pages for the CP heap gpu: ion: do not ask for compound pages in system heap gpu: ion: Modify the system heap to try to allocate large/huge pages gpu: ion: Set the dma_address of the sg list at alloc time ... Conflicts: arch/arm/Kconfig arch/arm/include/asm/hardware/cache-l2x0.h arch/arm/mm/cache-l2x0.c drivers/mmc/card/block.c drivers/usb/gadget/udc-core.c	2013-09-04 14:46:18 -07:00
Tatyana Brokhman	a19db032ca	block: row: Remove warning massage from add_request A regular priority queue is marked as "starved" if it skipped a dispatch due to being empty. When a new request is added to a "starved" queue it will be marked as urgent. The removed WARN_ON was warning about a supposedly impossible case when a regular priority (read) queue was marked as starved but wasn't empty. This case is in fact possible due to the following: If the device driver fetched a read request that is pending for transmission and an URGENT request arrives, the fetched read will be reinserted back into the scheduler. It's possible that the queue it will be reinserted into was marked as "starved" in the meantime due to being empty. CRs-fixed: 517800 Change-Id: Iaae642ea0ed9c817c41745b0e8ae2217cc684f0c Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-08-22 18:08:59 -07:00
Tatyana Brokhman	ca231e14f3	block: row: change hrtimer_cancel to hrtimer_try_to_cancel Calling hrtimer_cancel with interrupts disabled can result in a livelock. When flushing the plug list in the block layer, interrupts are disabled and an hrtimer is used when adding requests from that plug list to the scheduler. In this code flow, if the hrtimer (which is used for idling) is set, it is canceled by calling hrtimer_cancel. hrtimer_cancel will perform the following in an endless loop: 1. try to cancel the timer 2. if that fails - rest_cpu. The cancellation can fail if the timer function has already started. Since interrupts are disabled, it can never complete. This patch reduces the number of times the hrtimer lock is taken while interrupts are disabled by calling hrtimer_try_to_cancel. The latter will try to cancel the timer just once and return with an error code if it fails. CRs-fixed: 499887 Change-Id: I25f79c357426d72ad67c261ce7cb503ae97dc7b9 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-08-22 18:08:36 -07:00
Lee Susman	80a8f0f87b	block: row-iosched idling triggered by readahead pages In the current implementation idling is triggered only by request insertion frequency. This heuristic is not very accurate and may hit random requests that shouldn't trigger idling. This patch uses the PG_readahead flag in struct page's flags, which indicates that the page is part of a readahead window, to start idling upon dispatch of a request associated with a readahead page. The above readahead flag is used together with the existing insertion-frequency trigger. The frequency timer will catch read requests which are not part of a readahead window, but are still part of a sequential stream (and therefore dispatched in small time intervals). Change-Id: Icb7145199c007408de3f267645ccb842e051fd00 Signed-off-by: Lee Susman <lsusman@codeaurora.org>	2013-08-22 18:08:28 -07:00
Tatyana Brokhman	852d8b48d9	block: urgent request: Update dispatch_urgent in case of requeue/reinsert The block layer implements a mechanism for verifying that the device driver won't be notified of an URGENT request if there is already an URGENT request in flight. This is due to the fact that interrupting an URGENT request isn't efficient. This patch fixes the above described mechanism in case the URGENT request was returned back to the block layer for some reason: by requeue or reinsert. CRs-fixed: 473376, 473036, 471736 Change-Id: Ie8b8208230a302d4526068531616984825f1050d Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-08-22 18:07:57 -07:00
Maya Erez	1183b42d40	block: row: Fix starvation tolerance values The current starvation tolerance values increase the boot time since high priority SW requests are delayed by regular priority requests. In order to overcome this, increase the starvation tolerance values. Change-Id: I9947fca9927cbd39a1d41d4bd87069df679d3103 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org> Signed-off-by: Maya Erez <merez@codeaurora.org>	2013-08-22 18:07:49 -07:00
Tatyana Brokhman	d01470a40d	block: row: Update sysfs functions All ROW (time related) configurable parameters are stored in ms so there is no need to convert from/to ms when reading/updating them via sysfs. Change-Id: Ib6a1de54140b5d25696743da944c076dd6fc02ae Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-08-22 18:07:46 -07:00
Tatyana Brokhman	36f707b17a	block: row: Prevent starvation of regular priority by high priority At the moment all REGULAR and LOW priority requests are starved as long as there are HIGH priority requests to dispatch. This patch prevents the above starvation by setting a starvation limit the REGULAR\LOW priority requests can tolerate. Change-Id: Ibe24207982c2c55d75c0b0230f67e013d1106017 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-08-22 18:07:45 -07:00
Jianpeng Ma	a6ad83fce0	elevator: Fix a race in elevator switching commit d50235b7bc3ee0a0427984d763ea7534149531b4 upstream. There's a race between elevator switching and normal io operation, because the allocation of struct elevator_queue and struct elevator_data is not done in one atomic operation, so there is a chance of using a NULL ->elevator_data. For example: Thread A: blk_queue_bio, spin_lock_irq(q->queue_lock), elv_merge. Thread B: elevator_switch, elevator_alloc, elevator_init_fn. Because elevator_alloc is called, thread B can't hold queue_lock, and ->elevator_data is NULL. If thread A calls elv_merge at the same time and needs some info from elevator_data, the crash happens. Move elevator_alloc into elevator_init_fn to make the operations atomic. The bug can be easily reproduced as follows: 1: dd if=/dev/sdb of=/dev/null 2: while true;do echo noop > scheduler;echo deadline > scheduler;done The same method was used for testing. Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Cc: Jonghwan Choi <jhbird.choi@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-08-20 08:43:03 -07:00
Tatyana Brokhman	313bd2fbf1	block: row: Re-design urgent request notification mechanism When ROW scheduler reports to the block layer that there is an urgent request pending, the device driver may decide to stop the transmission of the current request in order to handle the urgent one. This is done in order to reduce the latency of an urgent request. For example: long WRITE may be stopped to handle an urgent READ. This patch updates the ROW URGENT notification policy to comply with the following: - Don't notify URGENT if there is an un-completed URGENT request in driver - After notifying that URGENT request is present, the next request dispatched is the URGENT one. - At every given moment only 1 request can be marked as URGENT, independent of its location (driver or scheduler) Other changes to URGENT policy: - Only READ queues are allowed to notify of an URGENT request pending. CR fix: If a pending urgent request (A) gets merged with another request (B) A is removed from scheduler queue but is not removed from rd->pending_urgent_rq. CRs-Fixed: 453712 Change-Id: I321e8cf58e12a05b82edd2a03f52fcce7bc9a900 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-08-15 15:18:27 -07:00
Tatyana Brokhman	e6c4488c57	block: test-iosched: Sleep before each test In order to be sure that the packing statistics collected after the test reflect only requests issued by the test (and not real requests from the FS) - sleep before each test in order to give already dispatched requests time to complete. Change-Id: If2f40efad1d79084a8ea85afe93cce58e49ff698 CRs-Fixed: 453712 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-08-15 15:18:26 -07:00
Kees Cook	88ce7cf76c	block: do not pass disk names as format strings commit ffc8b30866879ed9ba62bd0a86fecdbd51cd3d19 upstream. Disk names may contain arbitrary strings, so they must not be interpreted as format strings. It seems that only md allows arbitrary strings to be used for disk names, but this could allow for a local memory corruption from uid 0 into ring 0. CVE-2013-2851 Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-07-13 11:42:26 -07:00
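The format-string issue in the commit above can be sketched in user space. This is a minimal illustration, not the kernel change itself: `format_disk_name` is a hypothetical helper, and `snprintf` stands in for whatever formatting routine receives the name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy an untrusted disk name into buf.
 * The name is passed as an argument to a fixed "%s" format, so any
 * '%' sequences it contains are copied literally, never interpreted.
 * The unsafe variant would be: snprintf(buf, len, name); */
int format_disk_name(char *buf, size_t len, const char *name)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s", name);
}
```

A name such as `md%d%n` is then stored as-is, whereas passing it as the format itself would make the formatter read, and for `%n` write through, arguments that were never supplied.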
Tatyana Brokhman	59d6580bf3	block: row: Update initial values of ROW data structures This patch sets the initial values of internal ROW parameters. Change-Id: I38132062a7fcbe2e58b9cc757e55caac64d013dc Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org> [smuckle@codeaurora.org: ported from msm-3.7] Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2013-07-08 05:55:11 -07:00
Tatyana Brokhman	72ca8dabb7	block: row: Don't notify URGENT if there are un-completed urgent req When ROW scheduler reports to the block layer that there is an urgent request pending, the device driver may decide to stop the transmission of the current request in order to handle the urgent one. If the current transmitted request is an urgent request - we don't want it to be stopped. Due to the above ROW scheduler won't notify of an urgent request if there are urgent requests in flight. Change-Id: I2fa186d911b908ec7611682b378b9cdc48637ac7 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:55:09 -07:00
Tatyana Brokhman	d3da5b5a78	block: row: Idling mechanism re-factoring At the moment idling in ROW is implemented by delayed work that uses jiffies granularity which is not very accurate. This patch replaces the current idling mechanism implementation with the hrtimer API, which gives nanosecond resolution (instead of jiffies). Change-Id: I86c7b1776d035e1d81571894b300228c8b8f2d92 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:55:09 -07:00
Tatyana Brokhman	880972cc41	block: row: Dispatch requests according to their io-priority This patch implements "application-hints" which is a way the issuing application can notify the scheduler on the priority of its request. This is done by setting the io-priority of the request. This patch reuses an already existing mechanism of io-priorities developed for CFQ. Please refer to kernel/Documentation/block/ioprio.txt for usage example and explanations. Change-Id: I228ec8e52161b424242bb7bb133418dc8b73925a Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:55:08 -07:00
Tatyana Brokhman	6d2d825ef6	block: row: Aggregate row_queue parameters to one structure Each ROW queue has several parameters whose default values are defined in separate arrays. This patch aggregates all default values into one array. The values in question are: - is idling enabled for the queue - queue quantum - can the queue notify on urgent request Change-Id: I3821b0a042542295069b340406a16b1000873ec6 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:55:06 -07:00
Tatyana Brokhman	4e832c28c7	block: row: fix sysfs functions - idle_time conversion idle_time was updated to be stored in msec instead of jiffies, so there is no need to convert the value when reading it from the user or displaying it to the user. Change-Id: I58e074b204e90a90536d32199ac668112966e9cf Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:55:05 -07:00
Tatyana Brokhman	fa56654524	block: row: Insert dispatch_quantum into struct row_queue There is really no point in keeping the dispatch quantum of a queue outside of it. By inserting it to the row_queue structure we spare extra level in accessing it. Change-Id: Ic77571818b643e71f9aafbb2ca93d0a92158b199 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:55:05 -07:00
Tatyana Brokhman	98e48591e3	block: row: Add some debug information on ROW queues 1. Add a counter for number of requests on queue. 2. Add function to print queues status (number requests currently on queue and number of already dispatched requests in current dispatch cycle). Change-Id: I1e98b9ca33853e6e6a8ddc53240f6cd6981e6024 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:55:05 -07:00
Tatyana Brokhman	5a209c1f4f	row: Add support for urgent request handling This patch adds support for handling urgent requests. ROW queue can be marked as "urgent" so if it was un-served in last dispatch cycle and a request was added to it - it will trigger issuing an urgent-request-notification to the block device driver. The block device driver may choose to stop the transmission of the current ongoing request to handle the urgent one. For example: long WRITE may be stopped to handle an urgent READ. This decreases READ latency. Change-Id: I84954c13f5e3b1b5caeadc9fe1f9aa21208cb35e Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:55:03 -07:00
Tatyana Brokhman	ade89fc539	row: Adding support for reinsert already dispatched req Add support for reinserting an already dispatched request back to the scheduler's internal data structures. The request will be reinserted back to the queue (head) it was dispatched from as if it was never dispatched. Change-Id: I70954df300774409c25b5821465fb3aa33d8feb5 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:55:03 -07:00
Sujit Reddy Thumma	0a4d72a6fb	cfq-iosched: Fix null pointer dereference NULL pointer dereference can happen in cfq_choose_cfqg() when there are no cfq groups to select other than the current serving group. Prevent this by adding a NULL check before dereferencing. Unable to handle kernel NULL pointer dereference at virtual address [<c02502cc>] (cfq_dispatch_requests+0x368/0x8c0) from [<c0243f30>] (blk_peek_request+0x220/0x25c) [<c0243f30>] (blk_peek_request+0x220/0x25c) from [<c0243f74>] (blk_fetch_request+0x8/0x1c) [<c0243f74>] (blk_fetch_request+0x8/0x1c) from [<c041cedc>] (mmc_queue_thread+0x58/0x120) [<c041cedc>] (mmc_queue_thread+0x58/0x120) from [<c00ad310>] (kthread+0x84/0x90) [<c00ad310>] (kthread+0x84/0x90) from [<c000eeac>] (kernel_thread_exit+0x0/0x8) CRs-Fixed: 416466 Change-Id: I1fab93a4334b53e1d7c5dcc8f93cff174bae0d5e Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>	2013-07-08 05:55:03 -07:00
Tatyana Brokhman	ed74a56687	block:row: fix idling mechanism in ROW This patch addresses the following issues found in the ROW idling mechanism: 1. Fix the delay passed to queue_delayed_work (pass actual delay and not the time when to start the work) 2. Change the idle time and the idling-trigger frequency to be HZ dependent (instead of using msec_to_jiffies()) 3. Destroy idle_workqueue() in queue_exit Change-Id: If86513ad6b4be44fb7a860f29bd2127197d8d5bf Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:55:00 -07:00
Tatyana Brokhman	fa4331e301	block: Add API for urgent request handling This patch adds support in block & elevator layers for handling urgent requests. The decision if a request is urgent or not is taken by the scheduler. Urgent request notification is passed to the underlying block device driver (eMMC for example). Block device driver may decide to interrupt the currently running low priority request to serve the new urgent request. By doing so READ latency is greatly reduced in read&write collision scenarios. Note that if the current scheduler doesn't implement the urgent request mechanism, this code path is never activated. Change-Id: I8aa74b9b45c0d3a2221bd4e82ea76eb4103e7cfa Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:55:00 -07:00
Tatyana Brokhman	0d0db8896e	block: Add support for reinsert a dispatched req Add support for reinserting a dispatched request back to the scheduler's internal data structures. This capability is used by the device driver when it chooses to interrupt the current request transmission and execute another (more urgent) pending request. For example: interrupting long write in order to handle pending read. The device driver re-inserts the remaining write request back to the scheduler, to be rescheduled for transmission later on. Add API for verifying whether the current scheduler supports reinserting requests mechanism. If reinsert mechanism isn't supported by the scheduler, this code path will never be activated. Change-Id: I5c982a66b651ebf544aae60063ac8a340d79e67f Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:55:00 -07:00
Tatyana Brokhman	971c289cd3	block: ROW: Fix forced dispatch This patch fixes forced dispatch in the ROW scheduling algorithm. When the dispatch function is called with the forced flag on, we can't delay the dispatch of the requests that are in scheduler queues. Thus, when dispatch is called with forced turned on, we need to cancel idling, or not to idle at all. Change-Id: I3aa0da33ad7b59c0731c696f1392b48525b52ddc Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:54:53 -07:00
Tatyana Brokhman	e021b54faf	block: ROW: Correct minimum values of ROW tunable parameters The ROW scheduling algorithm exposes several tunable parameters. This patch updates the minimum allowed values for those parameters. Change-Id: I5ec19d54b694e2e83ad5376bd99cc91f084967f5 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:54:52 -07:00
Tatyana Brokhman	d8fa466602	block: Adding ROW scheduling algorithm This patch adds the implementation of a new scheduling algorithm - ROW. The policy of this algorithm is to prioritize READ requests over WRITE as much as possible without starving the WRITE requests. Change-Id: I4ed52ea21d43b0e7c0769b2599779a3d3869c519 Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>	2013-07-08 05:54:50 -07:00
Maya Erez	31d40e487a	block: test-iosched error handling fixes - Fix test-iosched crash when running multiple tests - Free the BIOs memory when a request is not completed Change-Id: I1baa916c04ae73c809dee7e67ec63f4546dc71aa Signed-off-by: Maya Erez <merez@codeaurora.org>	2013-07-08 05:52:31 -07:00
Maya Erez	f88cc0d55c	block: Add test-iosched scheduler The test scheduler allows testing a block device by dispatching specific requests according to the test case and declaring PASS/FAIL according to the requests' completion error code. Change-Id: Ief91f9fed6e3c3c75627d27264d5252ea14f10ad Signed-off-by: Maya Erez <merez@codeaurora.org>	2013-07-08 05:52:28 -07:00
San Mehat	ac0949ebe8	block: genhd: Add disk/partition specific uevent callbacks for partition info For disk devices, a new uevent parameter 'NPARTS' specifies the number of partitions detected by the kernel. Partition devices get 'PARTN', which specifies the partition's index in the table, and 'PARTNAME', which specifies the partition name of a partition device. Signed-off-by: Dima Zavin <dima@android.com>	2013-07-01 13:40:28 -07:00
Aaron Lu	c60855cdb9	blkpm: avoid sleep when holding queue lock In blk_post_runtime_resume, an autosuspend request will be initiated for the device. Since we are holding the queue lock, we can't sleep and thus we should use the async version to initiate an autosuspend, i.e. pm_request_suspend instead of pm_runtime_suspend, which might sleep. Signed-off-by: Aaron Lu <aaron.lu@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-05-17 10:00:43 +02:00
Linus Torvalds	4de13d7aa8	Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block Pull block core updates from Jens Axboe: - Major bit is Kents prep work for immutable bio vecs. - Stable candidate fix for a scheduling-while-atomic in the queue bypass operation. - Fix for the hang on exceeded rq->datalen 32-bit unsigned when merging discard bios. - Tejuns changes to convert the writeback thread pool to the generic workqueue mechanism. - Runtime PM framework, SCSI patches exists on top of these in James' tree. - A few random fixes. * 'for-3.10/core' of git://git.kernel.dk/linux-block: (40 commits) relay: move remove_buf_file inside relay_close_buf partitions/efi.c: replace useless kzalloc's by kmalloc's fs/block_dev.c: fix iov_shorten() criteria in blkdev_aio_read() block: fix max discard sectors limit blkcg: fix "scheduling while atomic" in blk_queue_bypass_start Documentation: cfq-iosched: update documentation help for cfq tunables writeback: expose the bdi_wq workqueue writeback: replace custom worker pool implementation with unbound workqueue writeback: remove unused bdi_pending_list aoe: Fix unitialized var usage bio-integrity: Add explicit field for owner of bip_buf block: Add an explicit bio flag for bios that own their bvec block: Add bio_alloc_pages() block: Convert some code to bio_for_each_segment_all() block: Add bio_for_each_segment_all() bounce: Refactor __blk_queue_bounce to not use bi_io_vec raid1: use bio_copy_data() pktcdvd: Use bio_reset() in disabled code to kill bi_idx usage pktcdvd: use bio_copy_data() block: Add bio_copy_data() ...	2013-05-08 10:13:35 -07:00
Kent Overstreet	a27bb332c0	aio: don't include aio.h in sched.h Faster kernel compiles by way of fewer unnecessary includes. [akpm@linux-foundation.org: fix fallout] [akpm@linux-foundation.org: fix build] Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 20:16:25 -07:00
Linus Torvalds	736a2dd257	Lots of virtio work which wasn't quite ready for last merge window. Plus I dived into lguest again, reworking the pagetable code so we can move the switcher page: our fixmaps sometimes take more than 2MB now... Cheers, Rusty. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJRga7lAAoJENkgDmzRrbjx/yIQAKpqIBtxOJeYH3SY+Uoe7Cfp toNYcpJEldvb0UcWN8M2cSZpHoxl1SUoq9djwcM29tcKa7EZAjHaGtb/Q1qMTDgv +B3WAfiGU2pmXFxLAkbrlLNGnysy24JspqJQ5hcYV84EiBxQdZp+nCYgOphd+GMK ww16vo9ya8jFjzt3GeRp/Heb3vEzV4Cp6BC3i0m8A3WNpEpbRb66pqXNk5o8ggJO SxQOKSXmUM+0m+jKSul5xn3e2Ls2LOrZZ8/DIHA+gW66N4Zab7n2/j1Q9VRxb4lh FqnR7KwgBX8OCh9IsBDqQYS7MohvMYge6eUdLtFrq84jvMleMEhrC8q9v2tucFUb 5t18CLwvyK7Gdg6UCKiZ7YSPcuURAILO16al9bh5IseeBDsuX+43VsvQoBmFn9k6 cLOVTZ6BlOmahK5PyRYFSvLa9Rxzr/05Mr7oYq9UgshD9io78dnqczFYIORF53rW zD7C4HuTZfYJFfNd0wAJ0RfVXnf8QvDlMdo7zPC26DSXNWqj8OexCY0qqSWUB+2F vcfJP6NkV4fZB8aawWIFUVwc64yqtt2uPVLa7ATZWqk16PgKrchGewmw3tiEwOgu 1l7xgffTRRUIJsqaCZoXdgw3yezcKRjuUBcOxL09lDAAhc+NxWNvzZBsKp66DwDk yZQKn0OdXnuf0CeEOfFf =1tYL -----END PGP SIGNATURE----- Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio & lguest updates from Rusty Russell: "Lots of virtio work which wasn't quite ready for last merge window. Plus I dived into lguest again, reworking the pagetable code so we can move the switcher page: our fixmaps sometimes take more than 2MB now..." Ugh. Annoying conflicts with the tcm_vhost -> vhost_scsi rename. Hopefully correctly resolved. * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (57 commits) caif_virtio: Remove bouncing email addresses lguest: improve code readability in lg_cpu_start. virtio-net: fill only rx queues which are being used lguest: map Switcher below fixmap. lguest: cache last cpu we ran on. lguest: map Switcher text whenever we allocate a new pagetable. lguest: don't share Switcher PTE pages between guests. 
lguest: expost switcher_pages array (as lg_switcher_pages). lguest: extract shadow PTE walking / allocating. lguest: make check_gpte et. al return bool. lguest: assume Switcher text is a single page. lguest: rename switcher_page to switcher_pages. lguest: remove RESERVE_MEM constant. lguest: check vaddr not pgd for Switcher protection. lguest: prepare to make SWITCHER_ADDR a variable. virtio: console: replace EMFILE with EBUSY for already-open port virtio-scsi: reset virtqueue affinity when doing cpu hotplug virtio-scsi: introduce multiqueue support virtio-scsi: push vq lock/unlock into virtscsi_vq_done virtio-scsi: pass struct virtio_scsi to virtqueue completion function ...	2013-05-02 14:14:04 -07:00
Philippe De Muyter	ea56505bed	partitions/efi.c: replace useless kzalloc's by kmalloc's In alloc_read_gpt_entries and alloc_read_gpt_header, the kzalloc'ated zones are either totally overwritten by the following read_lba call, or freed. As kmalloc is cheaper than kzalloc, use kmalloc. Signed-off-by: Philippe De Muyter <phdm@macqel.be> Cc: Matt Domsch <Matt_Domsch@dell.com> Cc: Panagiotis Issaris <takis@issaris.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-04-30 08:34:25 +02:00
Linus Torvalds	191a712090	Merge branch 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - Fixes and a lot of cleanups. Locking cleanup is finally complete. cgroup_mutex is no longer exposed to individual controlelrs which used to cause nasty deadlock issues. Li fixed and cleaned up quite a bit including long standing ones like racy cgroup_path(). - device cgroup now supports proper hierarchy thanks to Aristeu. - perf_event cgroup now supports proper hierarchy. - A new mount option "__DEVEL__sane_behavior" is added. As indicated by the name, this option is to be used for development only at this point and generates a warning message when used. Unfortunately, cgroup interface currently has too many brekages and inconsistencies to implement a consistent and unified hierarchy on top. The new flag is used to collect the behavior changes which are necessary to implement consistent unified hierarchy. It's likely that this flag won't be used verbatim when it becomes ready but will be enabled implicitly along with unified hierarchy. The option currently disables some of broken behaviors in cgroup core and also .use_hierarchy switch in memcg (will be routed through -mm), which can be used to make very unusual hierarchy where nesting is partially honored. It will also be used to implement hierarchy support for blk-throttle which would be impossible otherwise without introducing a full separate set of control knobs. This is essentially versioning of interface which isn't very nice but at this point I can't see any other options which would allow keeping the interface the same while moving towards hierarchy behavior which is at least somewhat sane. The planned unified hierarchy is likely to require some level of adaptation from userland anyway, so I think it'd be best to take the chance and update the interface such that it's supportable in the long term. 
Maintaining the existing interface does complicate cgroup core but shouldn't put too much strain on individual controllers and I think it'd be manageable for the foreseeable future. Maybe we'll be able to drop it in a decade. Fix up conflicts (including a semantic one adding a new #include to ppc that was uncovered by header the file changes) as per Tejun. * 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (45 commits) cpuset: fix compile warning when CONFIG_SMP=n cpuset: fix cpu hotplug vs rebuild_sched_domains() race cpuset: use rebuild_sched_domains() in cpuset_hotplug_workfn() cgroup: restore the call to eventfd->poll() cgroup: fix use-after-free when umounting cgroupfs cgroup: fix broken file xattrs devcg: remove parent_cgroup. memcg: force use_hierarchy if sane_behavior cgroup: remove cgrp->top_cgroup cgroup: introduce sane_behavior mount option move cgroupfs_root to include/linux/cgroup.h cgroup: convert cgroupfs_root flag bits to masks and add CGRP_ prefix cgroup: make cgroup_path() not print double slashes Revert "cgroup: remove bind() method from cgroup_subsys." perf: make perf_event cgroup hierarchical cgroup: implement cgroup_is_descendant() cgroup: make sure parent won't be destroyed before its children cgroup: remove bind() method from cgroup_subsys. devcg: remove broken_hierarchy tag cgroup: remove cgroup_lock_is_held() ...	2013-04-29 19:14:20 -07:00
Linus Torvalds	2794b5d408	Driver core update for 3.10-rc1 Here's the merge request for the driver core tree for 3.10-rc1 It's pretty small, just a number of driver core and sysfs updates and fixes, all of which have been in linux-next for a while now. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iEYEABECAAYFAlF+m4cACgkQMUfUDdst+ymp+wCgv/F7zAhZsKW5YT9A/FsTNl3m Ge8AnRlfYPwxM1Zt4kIuDAwfKuLTYV/B =swS7 -----END PGP SIGNATURE----- Merge tag 'driver-core-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core update from Greg Kroah-Hartman: "Here's the merge request for the driver core tree for 3.10-rc1 It's pretty small, just a number of driver core and sysfs updates and fixes, all of which have been in linux-next for a while now. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" Fixed conflict in kernel/rtmutex-tester.c, the locking tree had a better fix for the same sysfs file mode problem. 
* tag 'driver-core-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: PM / Runtime: Idle devices asynchronously after probe\|release driver core: handle user namespaces properly with the uid/gid devtmpfs change driver core: devtmpfs: fix compile failure with CONFIG_UIDGID_STRICT_TYPE_CHECKS devtmpfs: add base.h include driver core: add uid and gid to devtmpfs sysfs: check if one entry has been removed before freeing sysfs: fix crash_notes_size build warning sysfs: fix use after free in case of concurrent read/write and readdir rtmutex-tester: fix mode of sysfs files Documentation: Add ABI entry for crash_notes and crash_notes_size sysfs: Add crash_notes_size to export percpu note size driver core: platform_device.h: fix checkpatch errors and warnings driver core: platform.c: fix checkpatch errors and warnings driver core: warn that platform_driver_probe can not use deferred probing sysfs: use atomic_inc_unless_negative in sysfs_get_active base: core: WARN() about bogus permissions on device attributes device: separate all subsys mutexes	2013-04-29 11:31:50 -07:00
Linus Torvalds	0a82a8d132	Revert "block: add missing block_bio_complete() tracepoint" This reverts commit `3a366e614d`. Wanlong Gao reports that it causes a kernel panic on his machine several minutes after boot. Reverting it removes the panic. Jens says: "It's not quite clear why that is yet, so I think we should just revert the commit for 3.9 final (which I'm assuming is pretty close). The wifi is crap at the LSF hotel, so sending this email instead of queueing up a revert and pull request." Reported-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Requested-by: Jens Axboe <axboe@kernel.dk> Cc: Tejun Heo <tj@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-18 09:00:26 -07:00
Greg Kroah-Hartman	0d1d392f01	Merge 3.9-rc7 into driver-core-next Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-04-14 18:37:05 -07:00
Greg Kroah-Hartman	4e4098a3e0	driver core: handle user namespaces properly with the uid/gid devtmpfs change Now that devtmpfs is caring about uid/gid, we need to use the correct internal types so users who have USER_NS enabled will have things work properly for them. Thanks to Eric for pointing this out, and the patch review. Reported-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Kay Sievers <kay@vrfy.org> Cc: Ming Lei <ming.lei@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-04-11 11:43:29 -07:00
Jun'ichi Nomura	e5072664f8	blkcg: fix "scheduling while atomic" in blk_queue_bypass_start Since `749fefe677` in v3.7 ("block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue()"), the following warning appears when multipath is used with CONFIG_PREEMPT=y. This patch moves blk_queue_bypass_start() before radix_tree_preload() to avoid the sleeping call while preemption is disabled. BUG: scheduling while atomic: multipath/2460/0x00000002 1 lock held by multipath/2460: #0: (&md->type_lock){......}, at: [<ffffffffa019fb05>] dm_lock_md_type+0x17/0x19 [dm_mod] Modules linked in: ... Pid: 2460, comm: multipath Tainted: G W 3.7.0-rc2 #1 Call Trace: [<ffffffff810723ae>] __schedule_bug+0x6a/0x78 [<ffffffff81428ba2>] __schedule+0xb4/0x5e0 [<ffffffff814291e6>] schedule+0x64/0x66 [<ffffffff8142773a>] schedule_timeout+0x39/0xf8 [<ffffffff8108ad5f>] ? put_lock_stats+0xe/0x29 [<ffffffff8108ae30>] ? lock_release_holdtime+0xb6/0xbb [<ffffffff814289e3>] wait_for_common+0x9d/0xee [<ffffffff8107526c>] ? try_to_wake_up+0x206/0x206 [<ffffffff810c0eb8>] ? kfree_call_rcu+0x1c/0x1c [<ffffffff81428aec>] wait_for_completion+0x1d/0x1f [<ffffffff810611f9>] wait_rcu_gp+0x5d/0x7a [<ffffffff81061216>] ? wait_rcu_gp+0x7a/0x7a [<ffffffff8106fb18>] ? complete+0x21/0x53 [<ffffffff810c0556>] synchronize_rcu+0x1e/0x20 [<ffffffff811dd903>] blk_queue_bypass_start+0x5d/0x62 [<ffffffff811ee109>] blkcg_activate_policy+0x73/0x270 [<ffffffff81130521>] ? kmem_cache_alloc_node_trace+0xc7/0x108 [<ffffffff811f04b3>] cfq_init_queue+0x80/0x28e [<ffffffffa01a1600>] ? dm_blk_ioctl+0xa7/0xa7 [dm_mod] [<ffffffff811d8c41>] elevator_init+0xe1/0x115 [<ffffffff811e229f>] ? blk_queue_make_request+0x54/0x59 [<ffffffff811dd743>] blk_init_allocated_queue+0x8c/0x9e [<ffffffffa019ffcd>] dm_setup_md_queue+0x36/0xaa [dm_mod] [<ffffffffa01a60e6>] table_load+0x1bd/0x2c8 [dm_mod] [<ffffffffa01a7026>] ctl_ioctl+0x1d6/0x236 [dm_mod] [<ffffffffa01a5f29>] ? 
table_clear+0xaa/0xaa [dm_mod] [<ffffffffa01a7099>] dm_ctl_ioctl+0x13/0x17 [dm_mod] [<ffffffff811479fc>] do_vfs_ioctl+0x3fb/0x441 [<ffffffff811b643c>] ? file_has_perm+0x8a/0x99 [<ffffffff81147aa0>] sys_ioctl+0x5e/0x82 [<ffffffff812010be>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff814310d9>] system_call_fastpath+0x16/0x1b Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alasdair G Kergon <agk@redhat.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-04-09 15:01:21 +02:00
Kay Sievers	3c2670e651	driver core: add uid and gid to devtmpfs Some drivers want to tell userspace what uid and gid should be used for their device nodes, so allow that information to percolate through the driver core to userspace in order to make this happen. This means that some systems (i.e. Android and friends) will not need to even run a udev-like daemon for their device node manager and can just rely in devtmpfs fully, reducing their footprint even more. Signed-off-by: Kay Sievers <kay@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-04-08 08:21:48 -07:00
Jens Axboe	c2fccc1c9f	Revert "loop: cleanup partitions when detaching loop device" This reverts commit `8761a3dc1f`. There are situations where the destruction path is called with the bdev->bd_mutex already held, which then deadlocks in loop_clr_fd(). The normal partition cleanup does a trylock() on the mutex, but it'd be nice to have a more bullet proof method in loop. So punt this more involved fix to the next merge window, and just back out this buggy fix for now. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-04-08 10:12:11 +02:00
Arnd Bergmann	c678ef5286	block: avoid using uninitialized value in from queue_var_store As found by gcc-4.8, the QUEUE_SYSFS_BIT_FNS macro creates functions that use a value generated by queue_var_store independent of whether that value was set or not. block/blk-sysfs.c: In function 'queue_store_nonrot': block/blk-sysfs.c:244:385: warning: 'val' may be used uninitialized in this function [-Wmaybe-uninitialized] Unlike most other such warnings, this one is not a false positive, writing any non-number string into the sysfs files indeed has an undefined result, rather than returning an error. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-04-03 21:53:57 +02:00
Jens Axboe	64f8de4da7	Merge branch 'writeback-workqueue' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq into for-3.10/core Tejun writes: ----- This is the pull request for the earlier patchset[1] with the same name. It's only three patches (the first one was committed to workqueue tree) but the merge strategy is a bit involved due to the dependencies. * Because the conversion needs features from wq/for-3.10, block/for-3.10/core is based on rc3, and wq/for-3.10 has conflicts with rc3, I pulled mainline (rc5) into wq/for-3.10 to prevent those workqueue conflicts from flaring up in block tree. * Resolving the issue that Jan and Dave raised about debugging requires arch-wide changes. The patchset is being worked on[2] but it'll have to go through -mm after these changes show up in -next, and not included in this pull request. The three commits are located in the following git branch. git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git writeback-workqueue Pulling it into block/for-3.10/core produces a conflict in drivers/md/raid5.c between the following two commits. `e3620a3ad5` ("MD RAID5: Avoid accessing gendisk or queue structs when not available") `2f6db2a707` ("raid5: use bio_reset()") The conflict is trivial - one removes an "if ()" conditional while the other removes "rbi->bi_next = NULL" right above it. We just need to remove both. The merged branch is available at git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git block-test-merge so that you can use it for verification. The test merge commit has proper merge description. While these changes are a bit of pain to route, they make code simpler and even have, while minute, measureable performance gain[3] even on a workload which isn't particularly favorable to showing the benefits of this conversion. ---- Fixed up the conflict. Conflicts: drivers/md/raid5.c Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-04-02 10:04:39 +02:00
Jens Axboe	705cd0ea1c	Merge branch 'for-jens' of http://evilpiepirate.org/git/linux-bcache into for-3.10/core This contains Kent's prep work for the immutable bio_vecs.	2013-03-24 21:38:59 -06:00
Kent Overstreet	f73a1c7d11	block: Add bio_end_sector() Just a little convenience macro - main reason to add it now is preparing for immutable bio vecs, it'll reduce the size of the patch that puts bi_sector/bi_size/bi_idx into a struct bvec_iter. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: Lars Ellenberg <drbd-dev@lists.linbit.com> CC: Jiri Kosina <jkosina@suse.cz> CC: Alasdair Kergon <agk@redhat.com> CC: dm-devel@redhat.com CC: Neil Brown <neilb@suse.de> CC: Martin Schwidefsky <schwidefsky@de.ibm.com> CC: Heiko Carstens <heiko.carstens@de.ibm.com> CC: linux-s390@vger.kernel.org CC: Chris Mason <chris.mason@fusionio.com> CC: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>	2013-03-23 14:15:29 -07:00
Kent Overstreet	f79ea41614	block: Refactor blk_update_request() Converts it to use bio_advance(), simplifying it quite a bit in the process. Note that req_bio_endio() now always calls bio_advance() - which means it always loops over the biovec, not just on partial completions. Don't expect it to affect performance, but worth noting. Tested it by forcing partial updates, and dumping before and after on various bio/bvec fields when doing a partial update. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk>	2013-03-23 14:15:28 -07:00
Lin Ming	c8158819d5	block: implement runtime pm strategy When a request is added: If device is suspended or is suspending and the request is not a PM request, resume the device. When the last request finishes: Call pm_runtime_mark_last_busy(). When picking a request: If device is resuming/suspending, then only PM requests are allowed to go. The idea and API were designed by Alan Stern and are described here: http://marc.info/?l=linux-scsi&m=133727953625963&w=2 Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-22 22:22:15 -06:00
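
The three runtime PM rules listed in commit c8158819d5 above (on add, on completion of the last request, on pick) can be modelled in a few lines of user-space C. This is a minimal sketch, not the kernel implementation: every `model_*` name, the struct layout, and the `resume_requested` / `last_busy_marked` flags are hypothetical stand-ins. Treating the fully suspended state like the resuming/suspending states in `model_may_dispatch()` is an assumption, since the commit message only names the resuming/suspending case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device power states for this model only. */
enum model_rpm_state {
    MODEL_RPM_ACTIVE,
    MODEL_RPM_RESUMING,
    MODEL_RPM_SUSPENDING,
    MODEL_RPM_SUSPENDED
};

/* Hypothetical stand-in for a request queue with runtime PM bookkeeping. */
struct model_queue {
    enum model_rpm_state state;
    int nr_pending;          /* non-PM requests added but not yet completed */
    bool resume_requested;   /* stands in for asking the PM core to resume */
    bool last_busy_marked;   /* stands in for pm_runtime_mark_last_busy() */
};

/* Hypothetical stand-in for a request; only the PM flag matters here. */
struct model_request {
    bool is_pm;
};

/* Rule 1: adding a non-PM request to a suspended or suspending
 * device triggers a resume. */
static void model_add_request(struct model_queue *q,
                              const struct model_request *rq)
{
    if (rq->is_pm)
        return;
    q->nr_pending++;
    if (q->state == MODEL_RPM_SUSPENDED || q->state == MODEL_RPM_SUSPENDING)
        q->resume_requested = true;
}

/* Rule 2: when the last non-PM request finishes, record the time the
 * device was last busy so that autosuspend can be scheduled from it. */
static void model_complete_request(struct model_queue *q,
                                   const struct model_request *rq)
{
    if (rq->is_pm)
        return;
    assert(q->nr_pending > 0);
    if (--q->nr_pending == 0)
        q->last_busy_marked = true;
}

/* Rule 3: while the device is not fully active, only PM requests may
 * be dispatched. Including MODEL_RPM_SUSPENDED here is an assumption. */
static bool model_may_dispatch(const struct model_queue *q,
                               const struct model_request *rq)
{
    if (q->state == MODEL_RPM_ACTIVE)
        return true;
    return rq->is_pm;
}
```

To exercise the model, start a queue in `MODEL_RPM_SUSPENDED`, add an ordinary request, and check that `resume_requested` is set; while the state is `MODEL_RPM_RESUMING`, `model_may_dispatch()` should hold the ordinary request back but let a PM request through. Keeping the three rules in separate functions mirrors the three hook points the commit message names.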


2034 Commits