android_kernel_samsung_msm8976/fs
Jerry Hoemann 1784d4766a fsnotify: next_i is freed during fsnotify_unmount_inodes.
commit 6424babfd68dd8a83d9c60a5242d27038856599f upstream.

During file system stress testing on 3.10 and 3.12 based kernels, the
umount command occasionally hung in fsnotify_unmount_inodes in the
section of code:

                spin_lock(&inode->i_lock);
                if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) {
                        spin_unlock(&inode->i_lock);
                        continue;
                }

As this section of code holds the global inode_sb_list_lock, eventually
the system hangs trying to acquire the lock.

Multiple crash dumps showed:

The inode->i_state == 0x60 and i_count == 0 and i_sb_list would point
back at itself.  As this is not the value of list upon entry to the
function, the kernel never exits the loop.

To help narrow down problem, the call to list_del_init in
inode_sb_list_del was changed to list_del.  This poisons the pointers in
the i_sb_list and causes a kernel to panic if it transverse a freed
inode.

Subsequent stress testing paniced in fsnotify_unmount_inodes at the
bottom of the list_for_each_entry_safe loop showing next_i had become
free.

We believe the root cause of the problem is that next_i is being freed
during the window of time that the list_for_each_entry_safe loop
temporarily releases inode_sb_list_lock to call fsnotify and
fsnotify_inode_delete.

The code in fsnotify_unmount_inodes attempts to prevent the freeing of
inode and next_i by calling __iget.  However, the code doesn't do the
__iget call on next_i

	if i_count == 0 or
	if i_state & (I_FREEING | I_WILL_FREE)

The patch addresses this issue by advancing next_i in the above two cases
until we either find a next_i which we can __iget or we reach the end of
the list.  This makes the handling of next_i more closely match the
handling of the variable "inode."

The time to reproduce the hang is highly variable (from hours to days.) We
ran the stress test on a 3.10 kernel with the proposed patch for a week
without failure.

During list_for_each_entry_safe, next_i is becoming free causing
the loop to never terminate.  Advance next_i in those cases where
__iget is not done.

Signed-off-by: Jerry Hoemann <jerry.hoemann@hp.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ken Helias <kenhelias@firemail.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-27 07:52:33 -08:00
..
9p aio: don't include aio.h in sched.h 2013-05-07 20:16:25 -07:00
adfs
affs
afs aio: don't include aio.h in sched.h 2013-05-07 20:16:25 -07:00
autofs4 autofs - remove autofs dentry mount check 2013-05-06 13:06:59 -07:00
befs befs_readdir(): do not increment ->f_pos if filldir tells us to stop 2013-05-31 15:17:56 -04:00
bfs
btrfs Btrfs: don't delay inode ref updates during log replay 2015-01-16 06:59:02 -08:00
cachefiles lift sb_start_write() out of ->write() 2013-04-09 14:12:56 -04:00
ceph ceph: allow sync_read/write return partial successed size of read/write. 2014-01-09 12:24:25 -08:00
cifs CIFS: Fix SMB2 readdir error handling 2014-10-05 14:54:11 -07:00
coda lift sb_start_write() out of ->write() 2013-04-09 14:12:56 -04:00
configfs configfs: fix race between dentry put and lookup 2013-11-29 11:11:53 -08:00
cramfs
debugfs debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs) 2013-08-14 22:59:10 -07:00
devpts devpts: plug the memory leak in kill_sb 2013-12-04 10:55:49 -08:00
dlm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-05-01 14:08:52 -07:00
ecryptfs eCryptfs: Remove buggy and unnecessary write in file name decode routine 2015-01-08 09:58:17 -08:00
efivarfs efivarfs: Never return ENOENT from firmware again 2013-05-13 20:12:10 +01:00
efs
exofs ore: Fix wrong math in allocation of per device BIO 2014-02-13 13:48:00 -08:00
exportfs
ext2 ext2: Fix oops in ext2_get_block() called from ext2_quota_write() 2014-12-16 09:09:43 -08:00
ext3 ext3: Don't check quota format when there are no quota files 2014-11-14 08:48:00 -08:00
ext4 ext4: fix oops when loading block bitmap failed 2014-11-14 08:47:58 -08:00
f2fs f2fs updates for v3.10 2013-05-08 15:11:48 -07:00
fat fat: fix possible overflow for fat_clusters 2013-05-24 16:22:50 -07:00
freevxfs
fscache fs/fscache/stats.c: fix memory leak 2013-04-29 15:54:27 -07:00
fuse fuse: handle large user and group ID 2014-07-28 08:00:02 -07:00
gfs2 GFS2: Increase i_writecount during gfs2_setattr_chown 2014-01-25 08:27:11 -08:00
hfs hfs: avoid crash in hfs_bnode_create 2013-05-24 16:22:51 -07:00
hfsplus aio: don't include aio.h in sched.h 2013-05-07 20:16:25 -07:00
hostfs hostfs: use kmalloc instead of kzalloc 2013-05-04 15:48:45 -04:00
hpfs hpfs: better test for errors 2013-07-13 11:42:26 -07:00
hppfs hppfs: get rid of ->fsync() 2013-04-29 15:41:42 -04:00
hugetlbfs cope with potentially long ->d_dname() output for shmem/hugetlb 2013-10-18 07:45:45 -07:00
isofs isofs: Fix unchecked printing of ER records 2015-01-08 09:58:15 -08:00
jbd Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2013-05-03 09:56:25 -07:00
jbd2 jbd2: free bh when descriptor block checksum fails 2014-11-14 08:47:57 -08:00
jffs2 kill wbuf_queued/wbuf_dwork_lock 2014-11-14 08:47:55 -08:00
jfs jfs: fix error path in ialloc 2013-11-13 12:05:31 +09:00
lockd LOCKD: Fix a race when initialising nlmsvc_timeout 2015-01-27 07:52:33 -08:00
logfs
minix
ncpfs ncpfs: return proper error from NCP_IOC_SETROOT ioctl 2015-01-08 09:58:17 -08:00
nfs NFSv4.1: Fix client id trunking on Linux 2015-01-27 07:52:32 -08:00
nfs_common
nfsd nfsd4: fix xdr4 inclusion of escaped char 2015-01-16 06:59:02 -08:00
nilfs2 nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races 2015-01-16 06:59:02 -08:00
nls
notify fsnotify: next_i is freed during fsnotify_unmount_inodes. 2015-01-27 07:52:33 -08:00
ntfs aio: don't include aio.h in sched.h 2013-05-07 20:16:25 -07:00
ocfs2 ocfs2: fix journal commit deadlock 2015-01-16 06:59:00 -08:00
omfs
openpromfs
proc userns: Add a knob to disable setgroups on a per user namespace basis 2015-01-08 09:58:16 -08:00
pstore pstore-ram: Allow optional mapping with pgprot_noncached 2015-01-16 06:59:00 -08:00
qnx4
qnx6 qnx6: qnx6_readdir() has a braino in pos calculation 2013-05-31 15:17:31 -04:00
quota quota: Properly return errors from dquot_writeback_dquots() 2014-11-14 08:48:00 -08:00
ramfs
reiserfs reiserfs: call truncate_setsize under tailpack mutex 2014-07-06 18:54:15 -07:00
romfs romfs: fix nommu map length to keep inside filesystem 2013-04-29 09:17:57 +10:00
squashfs
sysfs sysfs: check if one entry has been removed before freeing 2013-04-05 15:35:52 -07:00
sysv sysv: Add forgotten superblock lock init for v7 fs 2013-10-05 07:13:09 -07:00
ubifs UBIFS: fix free log space calculation 2014-11-14 08:47:54 -08:00
udf udf: Verify symlink size before loading it 2015-01-08 09:58:17 -08:00
ufs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2013-04-30 09:36:50 -07:00
xfs xfs: don't zero partial page cache pages during O_DIRECT writes 2014-09-17 09:04:01 -07:00
aio.c aio: fix kernel memory disclosure in io_getevents() introduced in v3.10 2014-06-30 20:09:45 -07:00
anon_inodes.c
attr.c fs,userns: Change inode_capable to capable_wrt_inode_uidgid 2014-06-16 13:42:52 -07:00
bad_inode.c
binfmt_aout.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
binfmt_elf.c fs/binfmt_elf.c: prevent a coredump with a large vm_map_count from Oopsing 2013-10-13 16:08:31 -07:00
binfmt_elf_fdpic.c Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc 2013-05-02 10:16:16 -07:00
binfmt_em86.c
binfmt_flat.c new helper: read_code() 2013-04-29 15:40:23 -04:00
binfmt_misc.c binfmt_misc: reuse string_unescape_inplace() 2013-04-30 17:04:03 -07:00
binfmt_script.c
binfmt_som.c
bio-integrity.c bio-integrity: Fix bio_integrity_verify segment start bug 2014-03-23 21:38:21 -07:00
bio.c block: Fix bio_copy_data() 2013-10-05 07:13:09 -07:00
block_dev.c writeback: Fix periodic writeback after fs mount 2013-07-28 16:29:40 -07:00
buffer.c vfs: fix data corruption when blocksize < pagesize for mmaped data 2014-11-14 08:47:54 -08:00
char_dev.c
compat.c aio: don't include aio.h in sched.h 2013-05-07 20:16:25 -07:00
compat_binfmt_elf.c
compat_ioctl.c Removed unused typedef to avoid "unused local typedef" warnings. 2013-05-04 15:03:05 -04:00
coredump.c coredump: fix the setting of PF_DUMPCORE 2014-07-31 12:53:50 -07:00
coredump.h
dcache.c vfs: fix bad hashing of dentries 2014-09-17 09:04:02 -07:00
dcookies.c fs/compat: fix lookup_dcookie() parameter handling 2014-02-13 13:48:00 -08:00
direct-io.c Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block 2013-05-08 10:13:35 -07:00
drop_caches.c
eventfd.c
eventpoll.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2013-05-01 07:21:43 -07:00
exec.c metag: Reduce maximum stack size to 256MB 2014-06-07 13:25:38 -07:00
fcntl.c
fhandle.c
file.c fs/file.c:fdtable: avoid triggering OOMs from alloc_fdmem 2014-02-22 12:41:25 -08:00
file_table.c don't bother with {get,put}_write_access() on non-regular files 2014-05-30 21:52:12 -07:00
filesystems.c
fs-writeback.c writeback: fix a subtle race condition in I_DIRTY clearing 2015-01-16 06:59:02 -08:00
fs_struct.c
generic_acl.c
inode.c fs,userns: Change inode_capable to capable_wrt_inode_uidgid 2014-06-16 13:42:52 -07:00
internal.h splice: don't pass the address of ->f_pos to methods 2013-06-20 19:02:45 +04:00
ioctl.c
ioprio.c block: Fix computation of merged request priority 2014-11-21 09:22:53 -08:00
Kconfig efivarfs: Move to fs/efivarfs 2013-04-17 13:25:09 +01:00
Kconfig.binfmt fs: make binfmt support for #! scripts modular and removable 2013-04-30 17:04:04 -07:00
libfs.c
locks.c locks: allow __break_lease to sleep even when break_time is 0 2014-05-13 13:59:44 +02:00
Makefile Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
mbcache.c
mount.h vfs: Is mounted should be testing mnt_ns for NULL or error. 2014-02-06 11:08:16 -08:00
mpage.c
namei.c vfs: fix bad hashing of dentries 2014-09-17 09:04:02 -07:00
namespace.c umount: Disallow unprivileged mount force 2015-01-08 09:58:16 -08:00
no-block.c
open.c don't bother with {get,put}_write_access() on non-regular files 2014-05-30 21:52:12 -07:00
pipe.c vfs: fix subtle use-after-free of pipe_inode_info 2013-12-11 22:36:26 -08:00
pnode.c vfs: Fix invalid ida_remove() call 2013-05-31 15:16:33 -04:00
pnode.h Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
posix_acl.c posix_acl: handle NULL ACL in posix_acl_equiv_mode 2014-06-07 13:25:33 -07:00
proc_namespace.c
read_write.c fs/compat: fix parameter handling for compat readv/writev syscalls 2014-02-13 13:48:00 -08:00
readdir.c
select.c
seq_file.c seq_file: always update file->f_pos in seq_lseek() 2013-11-13 12:05:34 +09:00
signalfd.c
splice.c fuse: fix pipe_buf_operations 2014-02-13 13:47:59 -08:00
stack.c
stat.c
statfs.c vfs: allow O_PATH file descriptors for fstatfs() 2013-10-18 07:45:44 -07:00
super.c fs: Fix theoretical division by 0 in super_cache_scan(). 2014-11-14 08:47:54 -08:00
sync.c
timerfd.c
utimes.c
xattr.c
xattr_acl.c