android_kernel_samsung_msm8976/block
Tomoki Sekiyama 6d53d39270 elevator: Fix a race in elevator switching and md device initialization
commit eb1c160b22655fd4ec44be732d6594fd1b1e44f4 upstream.

The soft lockup below happens at boot time on a system that uses dm
multipath together with udev rules that switch the I/O scheduler.

[  356.127001] BUG: soft lockup - CPU#3 stuck for 22s! [sh:483]
[  356.127001] RIP: 0010:[<ffffffff81072a7d>]  [<ffffffff81072a7d>] lock_timer_base.isra.35+0x1d/0x50
...
[  356.127001] Call Trace:
[  356.127001]  [<ffffffff81073810>] try_to_del_timer_sync+0x20/0x70
[  356.127001]  [<ffffffff8118b08a>] ? kmem_cache_alloc_node_trace+0x20a/0x230
[  356.127001]  [<ffffffff810738b2>] del_timer_sync+0x52/0x60
[  356.127001]  [<ffffffff812ece22>] cfq_exit_queue+0x32/0xf0
[  356.127001]  [<ffffffff812c98df>] elevator_exit+0x2f/0x50
[  356.127001]  [<ffffffff812c9f21>] elevator_change+0xf1/0x1c0
[  356.127001]  [<ffffffff812caa50>] elv_iosched_store+0x20/0x50
[  356.127001]  [<ffffffff812d1d09>] queue_attr_store+0x59/0xb0
[  356.127001]  [<ffffffff812143f6>] sysfs_write_file+0xc6/0x140
[  356.127001]  [<ffffffff811a326d>] vfs_write+0xbd/0x1e0
[  356.127001]  [<ffffffff811a3ca9>] SyS_write+0x49/0xa0
[  356.127001]  [<ffffffff8164e899>] system_call_fastpath+0x16/0x1b

This is caused by a race between md device initialization by multipathd and
a shell script that switches the scheduler through sysfs.

 - multipathd:
   SyS_ioctl -> do_vfs_ioctl -> dm_ctl_ioctl -> ctl_ioctl -> table_load
   -> dm_setup_md_queue -> blk_init_allocated_queue -> elevator_init
    q->elevator = elevator_alloc(q, e); // not yet initialized

 - sh -c 'echo deadline > /sys/$DEVPATH/queue/scheduler':
   elevator_switch (in the call trace above)
    struct elevator_queue *old = q->elevator;
    q->elevator = elevator_alloc(q, new_e);
    elevator_exit(old);                 // lockup! (*)

 - multipathd: (cont.)
    err = e->ops.elevator_init_fn(q);   // init fails; q->elevator is modified

(*) When del_timer_sync() is called, lock_timer_base() loops indefinitely
while timer->base == NULL. In this case the timer is never initialized,
so the loop never exits and the CPU locks up.
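
The spin described in (*) can be modelled in userspace. The following is a
minimal sketch, not the kernel's lock_timer_base(): the struct and function
names are hypothetical, and the max_spins bound is an assumption added only
so that the example terminates, whereas the loop described above has no such
bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's timer structures. */
struct model_timer_base {
    int unused;
};

struct model_timer {
    struct model_timer_base *base;  /* NULL until the timer is initialized */
};

/*
 * Model of the retry loop: poll timer->base until it is non-NULL.
 * max_spins is an assumption of this sketch so that it can give up.
 * Returns true if a base was found, false if the spin budget ran out.
 */
bool model_lock_timer_base(const struct model_timer *timer,
                           unsigned long max_spins)
{
    assert(timer != NULL);

    for (unsigned long i = 0; i < max_spins; i++) {
        if (timer->base != NULL)
            return true;
    }
    return false;  /* the unbounded version would still be spinning here */
}
```

With a timer whose base stays NULL, the model exhausts its budget and
returns false, which corresponds to the point where an unbounded loop
would be reported as a soft lockup.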

This patch takes q->sysfs_lock around the elevator_init() call in
blk_init_allocated_queue(), to provide mutual exclusion between
initialization of q->elevator and switching of the scheduler.
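
The locking idea can be illustrated with a small userspace model. This is a
sketch under stated assumptions, not the patch itself: a pthread mutex stands
in for q->sysfs_lock, and every model_* name and the "initialized" flag are
hypothetical. It shows only that once both paths take the same lock, the
switch path can no longer observe an elevator that has been allocated but
not yet initialized.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
struct model_elevator {
    bool initialized;                 /* set once the init step has run */
};

struct model_queue {
    pthread_mutex_t sysfs_lock;       /* plays the role of q->sysfs_lock */
    struct model_elevator *elevator;  /* plays the role of q->elevator */
};

void model_queue_setup(struct model_queue *q)
{
    pthread_mutex_init(&q->sysfs_lock, NULL);
    q->elevator = NULL;
}

/* Init path: allocate and initialize while holding the lock. */
int model_elevator_init(struct model_queue *q)
{
    int ret = -1;

    pthread_mutex_lock(&q->sysfs_lock);
    struct model_elevator *e = calloc(1, sizeof(*e));
    if (e != NULL) {
        q->elevator = e;        /* published, but not yet initialized */
        e->initialized = true;  /* init finishes before the lock is dropped */
        ret = 0;
    }
    pthread_mutex_unlock(&q->sysfs_lock);
    return ret;
}

/*
 * Switch path: replace the elevator under the same lock.
 * Returns 1 if the old elevator was fully initialized when torn down
 * (or if there was none), 0 if it was not, and -1 on allocation failure.
 */
int model_elevator_switch(struct model_queue *q)
{
    int old_ok = 1;

    pthread_mutex_lock(&q->sysfs_lock);
    struct model_elevator *new_e = calloc(1, sizeof(*new_e));
    if (new_e == NULL) {
        pthread_mutex_unlock(&q->sysfs_lock);
        return -1;
    }
    new_e->initialized = true;

    struct model_elevator *old = q->elevator;
    q->elevator = new_e;
    if (old != NULL) {
        old_ok = old->initialized ? 1 : 0;
        free(old);              /* stands in for elevator_exit(old) */
    }
    pthread_mutex_unlock(&q->sysfs_lock);
    return old_ok;
}
```

Without the lock in model_elevator_init(), a concurrent switch could run
between the two assignments and tear down an elevator whose initialized
flag is still false, which is the situation marked (*) above.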

This should fix this bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=902012

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-08 07:29:27 -08:00
partitions partitions/efi.c: replace useless kzalloc's by kmalloc's 2013-04-30 08:34:25 +02:00
blk-cgroup.c blkcg: fix "scheduling while atomic" in blk_queue_bypass_start 2013-04-09 15:01:21 +02:00
blk-cgroup.h cgroup: fix cgroup_path() vs rename() race 2013-03-04 09:50:08 -08:00
blk-core.c elevator: Fix a race in elevator switching and md device initialization 2013-12-08 07:29:27 -08:00
blk-exec.c Merge branch 'for-3.9/core' of git://git.kernel.dk/linux-block 2013-02-28 12:52:24 -08:00
blk-flush.c Block: blk-flush: Fixed indent code style 2013-03-22 12:22:51 -06:00
blk-integrity.c scatterlist: introduce sg_unmark_end 2013-03-20 15:43:04 +10:30
blk-ioc.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
blk-iopoll.c
blk-lib.c block: account iowait time when waiting for completion of IO request 2013-02-15 16:45:07 +01:00
blk-map.c
blk-merge.c scatterlist: introduce sg_unmark_end 2013-03-20 15:43:04 +10:30
blk-settings.c block: properly stack underlying max_segment_size to DM device 2013-11-29 11:11:51 -08:00
blk-softirq.c
blk-sysfs.c block: avoid using uninitialized value in from queue_var_store 2013-04-03 21:53:57 +02:00
blk-tag.c
blk-throttle.c block: Rename queue dead flag 2012-12-06 14:30:58 +01:00
blk-timeout.c block: fix race between request completion and timeout handling 2013-11-29 11:11:50 -08:00
blk.h block,elevator: use new hashtable implementation 2013-01-11 14:43:13 +01:00
bsg-lib.c bsg: Remove unused function bsg_goose_queue() 2012-12-06 14:33:02 +01:00
bsg.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
cfq-iosched.c cfq: explicitly use 64bit divide operation for 64bit arguments 2013-10-01 09:17:48 -07:00
compat_ioctl.c
deadline-iosched.c elevator: Fix a race in elevator switching 2013-08-20 08:43:03 -07:00
elevator.c elevator: Fix a race in elevator switching and md device initialization 2013-12-08 07:29:27 -08:00
genhd.c block: do not pass disk names as format strings 2013-07-13 11:42:26 -07:00
ioctl.c Merge branch 'for-3.7/core' of git://git.kernel.dk/linux-block 2012-10-11 09:04:23 +09:00
Kconfig block: don't select PERCPU_RWSEM 2013-02-22 10:42:45 +01:00
Kconfig.iosched
Makefile
noop-iosched.c elevator: Fix a race in elevator switching 2013-08-20 08:43:03 -07:00
partition-generic.c Revert "loop: cleanup partitions when detaching loop device" 2013-04-08 10:12:11 +02:00
scsi_ioctl.c aio: don't include aio.h in sched.h 2013-05-07 20:16:25 -07:00