Commit graph

33312 commits

Author SHA1 Message Date
Andrew Bresticker f39176835c pstore/ram: Use memcpy_fromio() to save old buffer
commit d771fdf94180de2bd811ac90cba75f0f346abf8d upstream.

The ramoops buffer may be mapped as either I/O memory or uncached
memory.  On ARM64, this results in a device-type (strongly-ordered)
mapping.  Since unnaligned accesses to device-type memory will
generate an alignment fault (regardless of whether or not strict
alignment checking is enabled), it is not safe to use memcpy().
memcpy_fromio() is guaranteed to only use aligned accesses, so use
that instead.

Signed-off-by: Andrew Bresticker <abrestic@chromium.org>
Signed-off-by: Enric Balletbo Serra <enric.balletbo@collabora.com>
Reviewed-by: Puneet Kumar <puneetster@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:42:50 +02:00
Furquan Shaikh 9cfbce6803 pstore/ram: Use memcpy_toio instead of memcpy
commit 7e75678d23167c2527e655658a8ef36a36c8b4d9 upstream.

persistent_ram_update uses vmap / iomap based on whether the buffer is in
memory region or reserved region. However, both map it as non-cacheable
memory. For armv8 specifically, non-cacheable mapping requests use a
memory type that has to be accessed aligned to the request size. memcpy()
doesn't guarantee that.

Signed-off-by: Furquan Shaikh <furquan@google.com>
Signed-off-by: Enric Balletbo Serra <enric.balletbo@collabora.com>
Reviewed-by: Aaron Durbin <adurbin@chromium.org>
Reviewed-by: Olof Johansson <olofj@chromium.org>
Tested-by: Furquan Shaikh <furquan@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:42:50 +02:00
Sebastian Andrzej Siewior 3c1df29bd0 pstore/core: drop cmpxchg based updates
commit d5a9bf0b38d2ac85c9a693c7fb851f74fd2a2494 upstream.

I have here a FPGA behind PCIe which exports SRAM which I use for
pstore. Now it seems that the FPGA no longer supports cmpxchg based
updates and writes back 0xff…ff and returns the same.  This leads to
crash during crash rendering pstore useless.
Since I doubt that there is much benefit from using cmpxchg() here, I am
dropping this atomic access and use the spinlock based version.

Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Rabin Vincent <rabinv@axis.com>
Tested-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
[kees: remove "_locked" suffix since it's the only option now]
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:42:49 +02:00
Liu ShuoX 2a27a6f033 pstore: Fix buffer overflow while write offset equal to buffer size
commit 017321cf390045dd4c4afc4a232995ea50bcf66d upstream.

In case new offset is equal to prz->buffer_size, it won't wrap at this
time and will return old(overflow) value next time.

Signed-off-by: Liu ShuoX <shuox.liu@intel.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:42:35 +02:00
Jan Kara bfefe97700 isofs: Do not return EACCES for unknown filesystems
commit a2ed0b391dd9c3ef1d64c7c3e370f4a5ffcd324a upstream.

When isofs_mount() is called to mount a device read-write, it returns
EACCES even before it checks that the device actually contains an isofs
filesystem. This may confuse mount(8) which then tries to mount all
subsequent filesystem types in read-only mode.

Fix the problem by returning EACCES only once we verify that the device
indeed contains an iso9660 filesystem.

Fixes: 17b7f7cf58926844e1dd40f5eb5348d481deca6a
Reported-by: Kent Overstreet <kent.overstreet@gmail.com>
Reported-by: Karel Zak <kzak@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:42:21 +02:00
Vegard Nossum dd307a6776 fs/seq_file: fix out-of-bounds read
commit 088bf2ff5d12e2e32ee52a4024fec26e582f44d3 upstream.

seq_read() is a nasty piece of work, not to mention buggy.

It has (I think) an old bug which allows unprivileged userspace to read
beyond the end of m->buf.

I was getting these:

    BUG: KASAN: slab-out-of-bounds in seq_read+0xcd2/0x1480 at addr ffff880116889880
    Read of size 2713 by task trinity-c2/1329
    CPU: 2 PID: 1329 Comm: trinity-c2 Not tainted 4.8.0-rc1+ #96
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    Call Trace:
      kasan_object_err+0x1c/0x80
      kasan_report_error+0x2cb/0x7e0
      kasan_report+0x4e/0x80
      check_memory_region+0x13e/0x1a0
      kasan_check_read+0x11/0x20
      seq_read+0xcd2/0x1480
      proc_reg_read+0x10b/0x260
      do_loop_readv_writev.part.5+0x140/0x2c0
      do_readv_writev+0x589/0x860
      vfs_readv+0x7b/0xd0
      do_readv+0xd8/0x2c0
      SyS_readv+0xb/0x10
      do_syscall_64+0x1b3/0x4b0
      entry_SYSCALL64_slow_path+0x25/0x25
    Object at ffff880116889100, in cache kmalloc-4096 size: 4096
    Allocated:
    PID = 1329
      save_stack_trace+0x26/0x80
      save_stack+0x46/0xd0
      kasan_kmalloc+0xad/0xe0
      __kmalloc+0x1aa/0x4a0
      seq_buf_alloc+0x35/0x40
      seq_read+0x7d8/0x1480
      proc_reg_read+0x10b/0x260
      do_loop_readv_writev.part.5+0x140/0x2c0
      do_readv_writev+0x589/0x860
      vfs_readv+0x7b/0xd0
      do_readv+0xd8/0x2c0
      SyS_readv+0xb/0x10
      do_syscall_64+0x1b3/0x4b0
      return_from_SYSCALL_64+0x0/0x6a
    Freed:
    PID = 0
    (stack is not available)
    Memory state around the buggy address:
     ffff88011688a000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     ffff88011688a080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    >ffff88011688a100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
		       ^
     ffff88011688a180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
     ffff88011688a200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ==================================================================
    Disabling lock debugging due to kernel taint

This seems to be the same thing that Dave Jones was seeing here:

  https://lkml.org/lkml/2016/8/12/334

There are multiple issues here:

  1) If we enter the function with a non-empty buffer, there is an attempt
     to flush it. But it was not clearing m->from after doing so, which
     means that if we try to do this flush twice in a row without any call
     to traverse() in between, we are going to be reading from the wrong
     place -- the splat above, fixed by this patch.

  2) If there's a short write to userspace because of page faults, the
     buffer may already contain multiple lines (i.e. pos has advanced by
     more than 1), but we don't save the progress that was made so the
     next call will output what we've already returned previously. Since
     that is a much less serious issue (and I have a headache after
     staring at seq_read() for the past 8 hours), I'll leave that for now.

Link: http://lkml.kernel.org/r/1471447270-32093-1-git-send-email-vegard.nossum@oracle.com
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:42:21 +02:00
Theodore Ts'o f32c265fb4 ext4: sanity check the block and cluster size at mount time
commit 8cdf3372fe8368f56315e66bea9f35053c418093 upstream.

If the block size or cluster size is insane, reject the mount.  This
is important for security reasons (although we shouldn't be just
depending on this check).

Ref: http://www.securityfocus.com/archive/1/539661
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1332506
Reported-by: Borislav Petkov <bp@alien8.de>
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:42:20 +02:00
Ross Zwisler c2502ceffd ext4: allow DAX writeback for hole punch
commit cca32b7eeb4ea24fa6596650e06279ad9130af98 upstream.

Currently when doing a DAX hole punch with ext4 we fail to do a writeback.
This is because the logic around filemap_write_and_wait_range() in
ext4_punch_hole() only looks for dirty page cache pages in the radix tree,
not for dirty DAX exceptional entries.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:42:20 +02:00
Konstantin Khlebnikov 186036171e ext4: use __GFP_NOFAIL in ext4_free_blocks()
commit adb7ef600cc9d9d15ecc934cc26af5c1379777df upstream.

This might be unexpected but pages allocated for sbi->s_buddy_cache are
charged to current memory cgroup. So, GFP_NOFS allocation could fail if
current task has been killed by OOM or if current memory cgroup has no
free memory left. Block allocator cannot handle such failures here yet.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:42:20 +02:00
Daeho Jeong f8ccdbafd3 ext4: avoid modifying checksum fields directly during checksum verification
commit b47820edd1634dc1208f9212b7ecfb4230610a23 upstream.

We temporally change checksum fields in buffers of some types of
metadata into '0' for verifying the checksum values. By doing this
without locking the buffer, some metadata's checksums, which are
being committed or written back to the storage, could be damaged.
In our test, several metadata blocks were found with damaged metadata
checksum value during recovery process. When we only verify the
checksum value, we have to avoid modifying checksum fields directly.

Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
Signed-off-by: Youngjin Gil <youngjin.gil@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:42:19 +02:00
Theodore Ts'o 3f07ad0ad6 ext4: validate that metadata blocks do not overlap superblock
commit 829fa70dddadf9dd041d62b82cd7cea63943899d upstream.

A number of fuzzing failures seem to be caused by allocation bitmaps
or other metadata blocks being pointed at the superblock.

This can cause kernel BUG or WARNings once the superblock is
overwritten, so validate the group descriptor blocks to make sure this
doesn't happen.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:42:19 +02:00
Andrey Ryabinin 1694a7288d coredump: fix unfreezable coredumping task
commit 70d78fe7c8b640b5acfad56ad341985b3810998a upstream.

It could be not possible to freeze coredumping task when it waits for
'core_state->startup' completion, because threads are frozen in
get_signal() before they got a chance to complete 'core_state->startup'.

Inability to freeze a task during suspend will cause suspend to fail.
Also CRIU uses cgroup freezer during dump operation.  So with an
unfreezable task the CRIU dump will fail because it waits for a
transition from 'FREEZING' to 'FROZEN' state which will never happen.

Use freezer_do_not_count() to tell freezer to ignore coredumping task
while it waits for core_state->startup completion.

Link: http://lkml.kernel.org/r/1475225434-3753-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:42:15 +02:00
Wei Fang f3086244c8 fuse: fix wrong assignment of ->flags in fuse_send_init()
commit 9446385f05c9af25fed53dbed3cc75763730be52 upstream.

FUSE_HAS_IOCTL_DIR should be assigned to ->flags, it may be a typo.

Change-Id: I2fe3e8634e1f097a96abffe1b6f149bf70fdf139
Signed-off-by: Wei Fang <fangwei1@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 69fe05c90e ("fuse: add missing INIT flags")
Cc: <stable@vger.kernel.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:41:59 +02:00
Al Viro 2435954dd3 fix d_walk()/non-delayed __d_free() race
commit 3d56c25e3bb0726a5c5e16fc2d9e38f8ed763085 upstream.

Ascend-to-parent logics in d_walk() depends on all encountered child
dentries not getting freed without an RCU delay.  Unfortunately, in
quite a few cases it is not true, with hard-to-hit oopsable race as
the result.

Fortunately, the fix is simiple; right now the rule is "if it ever
been hashed, freeing must be delayed" and changing it to "if it
ever had a parent, freeing must be delayed" closes that hole and
covers all cases the old rule used to cover.  Moreover, pipes and
sockets remain _not_ covered, so we do not introduce RCU delay in
the cases which are the reason for having that delay conditional
in the first place.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[wt: add the required change to __d_materialise_dentry() for kernels
  older than v3.17]
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:41:52 +02:00
Vegard Nossum 22b8207fa3 ext4: fix reference counting bug on block allocation error
commit 554a5ccc4e4a20c5f3ec859de0842db4b4b9c77e upstream.

If we hit this error when mounted with errors=continue or
errors=remount-ro:

    EXT4-fs error (device loop0): ext4_mb_mark_diskspace_used:2940: comm ext4.exe: Allocating blocks 5090-6081 which overlap fs metadata

then ext4_mb_new_blocks() will call ext4_mb_release_context() and try to
continue. However, ext4_mb_release_context() is the wrong thing to call
here since we are still actually using the allocation context.

Instead, just error out. We could retry the allocation, but there is a
possibility of getting stuck in an infinite loop instead, so this seems
safer.

[ Fixed up so we don't return EAGAIN to userspace. --tytso ]

Fixes: 8556e8f3b6 ("ext4: Don't allow new groups to be added during block allocation")
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
[wt: 3.10 doesn't have EFSCORRUPTED, but XFS uses EUCLEAN as does 3.14
     on this patch so use this instead]

Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:41:52 +02:00
Vegard Nossum a81b8efcc3 ext4: short-cut orphan cleanup on error
commit c65d5c6c81a1f27dec5f627f67840726fcd146de upstream.

If we encounter a filesystem error during orphan cleanup, we should stop.
Otherwise, we may end up in an infinite loop where the same inode is
processed again and again.

    EXT4-fs (loop0): warning: checktime reached, running e2fsck is recommended
    EXT4-fs error (device loop0): ext4_mb_generate_buddy:758: group 2, block bitmap and bg descriptor inconsistent: 6117 vs 0 free clusters
    Aborting journal on device loop0-8.
    EXT4-fs (loop0): Remounting filesystem read-only
    EXT4-fs error (device loop0) in ext4_free_blocks:4895: Journal has aborted
    EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
    EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
    EXT4-fs error (device loop0) in ext4_ext_remove_space:3068: IO failure
    EXT4-fs error (device loop0) in ext4_ext_truncate:4667: Journal has aborted
    EXT4-fs error (device loop0) in ext4_orphan_del:2927: Journal has aborted
    EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
    EXT4-fs (loop0): Inode 16 (00000000618192a0): orphan list check failed!
    [...]
    EXT4-fs (loop0): Inode 16 (0000000061819748): orphan list check failed!
    [...]
    EXT4-fs (loop0): Inode 16 (0000000061819bf0): orphan list check failed!
    [...]

See-also: c9eb13a9105 ("ext4: fix hang when processing corrupted orphaned inode list")
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:41:51 +02:00
Vegard Nossum 98ae08efee ext4: don't call ext4_should_journal_data() on the journal inode
commit 6a7fd522a7c94cdef0a3b08acf8e6702056e635c upstream.

If ext4_fill_super() fails early, it's possible for ext4_evict_inode()
to call ext4_should_journal_data() before superblock options and flags
are fully set up.  In that case, the iput() on the journal inode can
end up causing a BUG().

Work around this problem by reordering the tests so we only call
ext4_should_journal_data() after we know it's not the journal inode.

Fixes: 2d859db3e4 ("ext4: fix data corruption in inodes with journalled data")
Fixes: 2b405bfa84 ("ext4: fix data=journal fast mount/umount hang")
Cc: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:41:51 +02:00
Vegard Nossum 5a67308996 ext4: check for extents that wrap around
commit f70749ca42943faa4d4dcce46dfdcaadb1d0c4b6 upstream.

An extent with lblock = 4294967295 and len = 1 will pass the
ext4_valid_extent() test:

	ext4_lblk_t last = lblock + len - 1;

	if (len == 0 || lblock > last)
		return 0;

since last = 4294967295 + 1 - 1 = 4294967295. This would later trigger
the BUG_ON(es->es_lblk + es->es_len < es->es_lblk) in ext4_es_end().

We can simplify it by removing the - 1 altogether and changing the test
to use lblock + len <= lblock, since now if len = 0, then lblock + 0 ==
lblock and it fails, and if len > 0 then lblock + len > lblock in order
to pass (i.e. it doesn't overflow).

Fixes: 5946d0893 ("ext4: check for overlapping extents in ext4_valid_extent_entries()")
Fixes: 2f974865f ("ext4: check for zero length extent explicitly")
Cc: Eryu Guan <guaneryu@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:41:51 +02:00
Vegard Nossum d5edde23cc ext4: verify extent header depth
commit 7bc9491645118c9461bd21099c31755ff6783593 upstream.

Although the extent tree depth of 5 should enough be for the worst
case of 2*32 extents of length 1, the extent tree code does not
currently to merge nodes which are less than half-full with a sibling
node, or to shrink the tree depth if possible.  So it's possible, at
least in theory, for the tree depth to be greater than 5.  However,
even in the worst case, a tree depth of 32 is highly unlikely, and if
the file system is maliciously corrupted, an insanely large eh_depth
can cause memory allocation failures that will trigger kernel warnings
(here, eh_depth = 65280):

    JBD2: ext4.exe wants too many credits credits:195849 rsv_credits:0 max:256
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 50 at fs/jbd2/transaction.c:293 start_this_handle+0x569/0x580
    CPU: 0 PID: 50 Comm: ext4.exe Not tainted 4.7.0-rc5+ #508
    Stack:
     604a8947 625badd8 0002fd09 00000000
     60078643 00000000 62623910 601bf9bc
     62623970 6002fc84 626239b0 900000125
    Call Trace:
     [<6001c2dc>] show_stack+0xdc/0x1a0
     [<601bf9bc>] dump_stack+0x2a/0x2e
     [<6002fc84>] __warn+0x114/0x140
     [<6002fdff>] warn_slowpath_null+0x1f/0x30
     [<60165829>] start_this_handle+0x569/0x580
     [<60165d4e>] jbd2__journal_start+0x11e/0x220
     [<60146690>] __ext4_journal_start_sb+0x60/0xa0
     [<60120a81>] ext4_truncate+0x131/0x3a0
     [<60123677>] ext4_setattr+0x757/0x840
     [<600d5d0f>] notify_change+0x16f/0x2a0
     [<600b2b16>] do_truncate+0x76/0xc0
     [<600c3e56>] path_openat+0x806/0x1300
     [<600c55c9>] do_filp_open+0x89/0xf0
     [<600b4074>] do_sys_open+0x134/0x1e0
     [<600b4140>] SyS_open+0x20/0x30
     [<6001ea68>] handle_syscall+0x88/0x90
     [<600295fd>] userspace+0x3fd/0x500
     [<6001ac55>] fork_handler+0x85/0x90

    ---[ end trace 08b0b88b6387a244 ]---

[ Commit message modified and the extent tree depath check changed
from 5 to 32 -- tytso ]

Cc: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:41:50 +02:00
Nicolai Stange 674cee9cdf ext4: silence UBSAN in ext4_mb_init()
commit 935244cd54b86ca46e69bc6604d2adfb1aec2d42 upstream.

Currently, in ext4_mb_init(), there's a loop like the following:

  do {
    ...
    offset += 1 << (sb->s_blocksize_bits - i);
    i++;
  } while (i <= sb->s_blocksize_bits + 1);

Note that the updated offset is used in the loop's next iteration only.

However, at the last iteration, that is at i == sb->s_blocksize_bits + 1,
the shift count becomes equal to (unsigned)-1 > 31 (c.f. C99 6.5.7(3))
and UBSAN reports

  UBSAN: Undefined behaviour in fs/ext4/mballoc.c:2621:15
  shift exponent 4294967295 is too large for 32-bit type 'int'
  [...]
  Call Trace:
   [<ffffffff818c4d25>] dump_stack+0xbc/0x117
   [<ffffffff818c4c69>] ? _atomic_dec_and_lock+0x169/0x169
   [<ffffffff819411ab>] ubsan_epilogue+0xd/0x4e
   [<ffffffff81941cac>] __ubsan_handle_shift_out_of_bounds+0x1fb/0x254
   [<ffffffff81941ab1>] ? __ubsan_handle_load_invalid_value+0x158/0x158
   [<ffffffff814b6dc1>] ? kmem_cache_alloc+0x101/0x390
   [<ffffffff816fc13b>] ? ext4_mb_init+0x13b/0xfd0
   [<ffffffff814293c7>] ? create_cache+0x57/0x1f0
   [<ffffffff8142948a>] ? create_cache+0x11a/0x1f0
   [<ffffffff821c2168>] ? mutex_lock+0x38/0x60
   [<ffffffff821c23ab>] ? mutex_unlock+0x1b/0x50
   [<ffffffff814c26ab>] ? put_online_mems+0x5b/0xc0
   [<ffffffff81429677>] ? kmem_cache_create+0x117/0x2c0
   [<ffffffff816fcc49>] ext4_mb_init+0xc49/0xfd0
   [...]

Observe that the mentioned shift exponent, 4294967295, equals (unsigned)-1.

Unless compilers start to do some fancy transformations (which at least
GCC 6.0.0 doesn't currently do), the issue is of cosmetic nature only: the
such calculated value of offset is never used again.

Silence UBSAN by introducing another variable, offset_incr, holding the
next increment to apply to offset and adjust that one by right shifting it
by one position per loop iteration.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=114701
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112161

Cc: stable@vger.kernel.org
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:41:50 +02:00
Nicolai Stange 38804c01f1 ext4: address UBSAN warning in mb_find_order_for_block()
commit b5cb316cdf3a3f5f6125412b0f6065185240cfdc upstream.

Currently, in mb_find_order_for_block(), there's a loop like the following:

  while (order <= e4b->bd_blkbits + 1) {
    ...
    bb += 1 << (e4b->bd_blkbits - order);
  }

Note that the updated bb is used in the loop's next iteration only.

However, at the last iteration, that is at order == e4b->bd_blkbits + 1,
the shift count becomes negative (c.f. C99 6.5.7(3)) and UBSAN reports

  UBSAN: Undefined behaviour in fs/ext4/mballoc.c:1281:11
  shift exponent -1 is negative
  [...]
  Call Trace:
   [<ffffffff818c4d35>] dump_stack+0xbc/0x117
   [<ffffffff818c4c79>] ? _atomic_dec_and_lock+0x169/0x169
   [<ffffffff819411bb>] ubsan_epilogue+0xd/0x4e
   [<ffffffff81941cbc>] __ubsan_handle_shift_out_of_bounds+0x1fb/0x254
   [<ffffffff81941ac1>] ? __ubsan_handle_load_invalid_value+0x158/0x158
   [<ffffffff816e93a0>] ? ext4_mb_generate_from_pa+0x590/0x590
   [<ffffffff816502c8>] ? ext4_read_block_bitmap_nowait+0x598/0xe80
   [<ffffffff816e7b7e>] mb_find_order_for_block+0x1ce/0x240
   [...]

Unless compilers start to do some fancy transformations (which at least
GCC 6.0.0 doesn't currently do), the issue is of cosmetic nature only: the
such calculated value of bb is never used again.

Silence UBSAN by introducing another variable, bb_incr, holding the next
increment to apply to bb and adjust that one by right shifting it by one
position per loop iteration.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=114701
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112161

Cc: stable@vger.kernel.org
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:41:50 +02:00
Theodore Ts'o 93bff884cc ext4: fix hang when processing corrupted orphaned inode list
commit c9eb13a9105e2e418f72e46a2b6da3f49e696902 upstream.

If the orphaned inode list contains inode #5, ext4_iget() returns a
bad inode (since the bootloader inode should never be referenced
directly).  Because of the bad inode, we end up processing the inode
repeatedly and this hangs the machine.

This can be reproduced via:

   mke2fs -t ext4 /tmp/foo.img 100
   debugfs -w -R "ssv last_orphan 5" /tmp/foo.img
   mount -o loop /tmp/foo.img /mnt

(But don't do this if you are using an unpatched kernel if you care
about the system staying functional.  :-)

This bug was found by the port of American Fuzzy Lop into the kernel
to find file system problems[1].  (Since it *only* happens if inode #5
shows up on the orphan list --- 3, 7, 8, etc. won't do it, it's not
surprising that AFL needed two hours before it found it.)

[1] http://events.linuxfoundation.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016_0.pdf

Cc: stable@vger.kernel.org
Reported by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-07-27 21:41:49 +02:00
Eric Rannaud f6ab8eed12 fs: allow open(dir, O_TMPFILE|..., 0) with mode 0
The man page for open(2) indicates that when O_CREAT is specified, the
'mode' argument applies only to future accesses to the file:

	Note that this mode applies only to future accesses of the newly
	created file; the open() call that creates a read-only file
	may well return a read/write file descriptor.

The man page for open(2) implies that 'mode' is treated identically by
O_CREAT and O_TMPFILE.

O_TMPFILE, however, behaves differently:

	int fd = open("/tmp", O_TMPFILE | O_RDWR, 0);
	assert(fd == -1);
	assert(errno == EACCES);

	int fd = open("/tmp", O_TMPFILE | O_RDWR, 0600);
	assert(fd > 0);

For O_CREAT, do_last() sets acc_mode to MAY_OPEN only:

	if (*opened & FILE_CREATED) {
		/* Don't check for write permission, don't truncate */
		open_flag &= ~O_TRUNC;
		will_truncate = false;
		acc_mode = MAY_OPEN;
		path_to_nameidata(path, nd);
		goto finish_open_created;
	}

But for O_TMPFILE, do_tmpfile() passes the full op->acc_mode to
may_open().

This patch lines up the behavior of O_TMPFILE with O_CREAT. After the
inode is created, may_open() is called with acc_mode = MAY_OPEN, in
do_tmpfile().

A different, but related glibc bug revealed the discrepancy:
https://sourceware.org/bugzilla/show_bug.cgi?id=17523

The glibc lazily loads the 'mode' argument of open() and openat() using
va_arg() only if O_CREAT is present in 'flags' (to support both the 2
argument and the 3 argument forms of open; same idea for openat()).
However, the glibc ignores the 'mode' argument if O_TMPFILE is in
'flags'.

On x86_64, for open(), it magically works anyway, as 'mode' is in
RDX when entering open(), and is still in RDX on SYSCALL, which is where
the kernel looks for the 3rd argument of a syscall.

But openat() is not quite so lucky: 'mode' is in RCX when entering the
glibc wrapper for openat(), while the kernel looks for the 4th argument
of a syscall in R10. Indeed, the syscall calling convention differs from
the regular calling convention in this respect on x86_64. So the kernel
sees mode = 0 when trying to use glibc openat() with O_TMPFILE, and
fails with EACCES.

Change-Id: Ib052bbc6fcc68d3060f91732a78ddbff6f71e0a6
Signed-off-by: Eric Rannaud <e@nanocritical.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-03 11:52:42 +01:00
Miklos Szeredi 1a75c3190f ext[34]: fix double put in tmpfile
d_tmpfile() already swallowed the inode ref.

Change-Id: I22411f145d675948cff55b5a8cc3c0cd3a0d484c
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-03 11:52:41 +01:00
Zheng Liu 2a5d64d372 vfs: add missing check for __O_TMPFILE in fcntl_init()
As comment in include/uapi/asm-generic/fcntl.h described, when
introducing new O_* bits, we need to check its uniqueness in
fcntl_init().  But __O_TMPFILE bit is missing.  So fix it.

Change-Id: I5c6ddf8ef05e96e4799bac823caa7dd60ce558dd
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-03 11:52:41 +01:00
Andy Lutomirski 134461f4c9 fs: Fix file mode for O_TMPFILE
O_TMPFILE, like O_CREAT, should respect the requested mode and should
create regular files.

This fixes two bugs: O_TMPFILE required privilege (because the mode
ended up as 000) and it produced bogus inodes with no type.

Change-Id: Ie4da9ede57e481c7edb113c5bc6329fefef41f4e
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-03 11:52:40 +01:00
Zheng Liu aa1d69d2ab ext3: fix a BUG when opening a file with O_TMPFILE flag
When we try to open a file with O_TMPFILE flag, we will trigger a bug.
The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and
this check always fails because we set ->i_nlink = 1 in
inode_init_always().  We can use the following program to trigger it:

int main(int argc, char *argv[])
{
	int fd;

	fd = open(argv[1], O_TMPFILE, 0666);
	if (fd < 0) {
		perror("open ");
		return -1;
	}
	close(fd);
	return 0;
}

The oops message looks like this:

kernel: kernel BUG at fs/ext3/namei.c:1992!
kernel: invalid opcode: 0000 [#1] SMP
kernel: Modules linked in: ext4 jbd2 crc16 cpufreq_ondemand ipv6 dm_mirror dm_region_hash dm_log dm_mod parport_pc parport serio_raw sg dcdbas pcspkr i2c_i801 ehci_pci ehci_hcd button acpi_cpufreq mperf e1000e ptp pps_core ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core ext3 jbd sd_mod ahci libahci libata scsi_mod uhci_hcd
kernel: CPU: 0 PID: 2882 Comm: tst_tmpfile Not tainted 3.11.0-rc1+ #4
kernel: Hardware name: Dell Inc. OptiPlex 780 /0V4W66, BIOS A05 08/11/2010
kernel: task: ffff880112d30050 ti: ffff8801124d4000 task.ti: ffff8801124d4000
kernel: RIP: 0010:[<ffffffffa00db5ae>] [<ffffffffa00db5ae>] ext3_orphan_add+0x6a/0x1eb [ext3]
kernel: RSP: 0018:ffff8801124d5cc8  EFLAGS: 00010202
kernel: RAX: 0000000000000000 RBX: ffff880111510128 RCX: ffff8801114683a0
kernel: RDX: 0000000000000000 RSI: ffff880111510128 RDI: ffff88010fcf65a8
kernel: RBP: ffff8801124d5d18 R08: 0080000000000000 R09: ffffffffa00d3b7f
kernel: R10: ffff8801114683a0 R11: ffff8801032a2558 R12: 0000000000000000
kernel: R13: ffff88010fcf6800 R14: ffff8801032a2558 R15: ffff8801115100d8
kernel: FS:  00007f5d172b5700(0000) GS:ffff880117c00000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
kernel: CR2: 00007f5d16df15d0 CR3: 0000000110b1d000 CR4: 00000000000407f0
kernel: Stack:
kernel: 000000000000000c ffff8801048a7dc8 ffff8801114685a8 ffffffffa00b80d7
kernel: ffff8801124d5e38 ffff8801032a2558 ffff88010ce24d68 0000000000000000
kernel: ffff88011146b300 ffff8801124d5d44 ffff8801124d5d78 ffffffffa00db7e1
kernel: Call Trace:
kernel: [<ffffffffa00b80d7>] ? journal_start+0x8c/0xbd [jbd]
kernel: [<ffffffffa00db7e1>] ext3_tmpfile+0xb2/0x13b [ext3]
kernel: [<ffffffff821076f8>] path_openat+0x11f/0x5e7
kernel: [<ffffffff821c86b4>] ? list_del+0x11/0x30
kernel: [<ffffffff82065fa2>] ?  __dequeue_entity+0x33/0x38
kernel: [<ffffffff82107cd5>] do_filp_open+0x3f/0x8d
kernel: [<ffffffff82112532>] ? __alloc_fd+0x50/0x102
kernel: [<ffffffff820f9296>] do_sys_open+0x13b/0x1cd
kernel: [<ffffffff820f935c>] SyS_open+0x1e/0x20
kernel: [<ffffffff82398c02>] system_call_fastpath+0x16/0x1b
kernel: Code: 39 c7 0f 85 67 01 00 00 0f b7 03 25 00 f0 00 00 3d 00 40 00 00 74 18 3d 00 80 00 00 74 11 3d 00 a0 00 00 74 0a 83 7b 48 00 74 04 <0f> 0b eb fe 49 8b 85 50 03 00 00 4c 89 f6 48 c7 c7 c0 99 0e a0
kernel: RIP  [<ffffffffa00db5ae>] ext3_orphan_add+0x6a/0x1eb [ext3]
kernel: RSP <ffff8801124d5cc8>

Here we couldn't call clear_nlink() directly because in d_tmpfile() we
will call inode_dec_link_count() to decrease ->i_nlink.  So this commit
tries to call d_tmpfile() before ext4_orphan_add() to fix this problem.

Change-Id: I6e953c0a1188d2099f9202e2f8ba8145fa3531b5
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
2018-12-03 11:52:39 +01:00
Zheng Liu 09a2084042 ext4: fix a BUG when opening a file with O_TMPFILE flag
When we try to open a file with O_TMPFILE flag, we will trigger a bug.
The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and
this check always fails because we set ->i_nlink = 1 in
inode_init_always().  We can use the following program to trigger it:

int main(int argc, char *argv[])
{
	int fd;

	fd = open(argv[1], O_TMPFILE, 0666);
	if (fd < 0) {
		perror("open ");
		return -1;
	}
	close(fd);
	return 0;
}

The oops message looks like this:

kernel BUG at fs/ext4/namei.c:2572!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: dlci bridge stp hidp cmtp kernelcapi l2tp_ppp l2tp_netlink l2tp_core sctp libcrc32c rfcomm tun fuse nfnetli
nk can_raw ipt_ULOG can_bcm x25 scsi_transport_iscsi ipx p8023 p8022 appletalk phonet psnap vmw_vsock_vmci_transport af_key vmw_vmci rose vsock atm can netrom ax25 af_rxrpc ir
da pppoe pppox ppp_generic slhc bluetooth nfc rfkill rds caif_socket caif crc_ccitt af_802154 llc2 llc snd_hda_codec_realtek snd_hda_intel snd_hda_codec serio_raw snd_pcm pcsp
kr edac_core snd_page_alloc snd_timer snd soundcore r8169 mii sr_mod cdrom pata_atiixp radeon backlight drm_kms_helper ttm
CPU: 1 PID: 1812571 Comm: trinity-child2 Not tainted 3.11.0-rc1+ #12
Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010
task: ffff88007dfe69a0 ti: ffff88010f7b6000 task.ti: ffff88010f7b6000
RIP: 0010:[<ffffffff8125ce69>]  [<ffffffff8125ce69>] ext4_orphan_add+0x299/0x2b0
RSP: 0018:ffff88010f7b7cf8  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8800966d3020 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88007dfe70b8 RDI: 0000000000000001
RBP: ffff88010f7b7d40 R08: ffff880126a3c4e0 R09: ffff88010f7b7ca0
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801271fd668
R13: ffff8800966d2f78 R14: ffff88011d7089f0 R15: ffff88007dfe69a0
FS:  00007f70441a3740(0000) GS:ffff88012a800000(0000) knlGS:00000000f77c96c0
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000002834000 CR3: 0000000107964000 CR4: 00000000000007e0
DR0: 0000000000780000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Stack:
 0000000000002000 00000020810b6dde 0000000000000000 ffff88011d46db00
 ffff8800966d3020 ffff88011d7089f0 ffff88009c7f4c10 ffff88010f7b7f2c
 ffff88007dfe69a0 ffff88010f7b7da8 ffffffff8125cfac ffff880100000004
Call Trace:
 [<ffffffff8125cfac>] ext4_tmpfile+0x12c/0x180
 [<ffffffff811cba78>] path_openat+0x238/0x700
 [<ffffffff8100afc4>] ? native_sched_clock+0x24/0x80
 [<ffffffff811cc647>] do_filp_open+0x47/0xa0
 [<ffffffff811db73f>] ? __alloc_fd+0xaf/0x200
 [<ffffffff811ba2e4>] do_sys_open+0x124/0x210
 [<ffffffff81010725>] ? syscall_trace_enter+0x25/0x290
 [<ffffffff811ba3ee>] SyS_open+0x1e/0x20
 [<ffffffff816ca8d4>] tracesys+0xdd/0xe2
 [<ffffffff81001001>] ? start_thread_common.constprop.6+0x1/0xa0
Code: 04 00 00 00 89 04 24 31 c0 e8 c4 77 04 00 e9 43 fe ff ff 66 25 00 d0 66 3d 00 80 0f 84 0e fe ff ff 83 7b 48 00 0f 84 04 fe ff ff <0f> 0b 49 8b 8c 24 50 07 00 00 e9 88 fe ff ff 0f 1f 84 00 00 00

Here we couldn't call clear_nlink() directly because in d_tmpfile() we
will call inode_dec_link_count() to decrease ->i_nlink.  So this commit
tries to call d_tmpfile() before ext4_orphan_add() to fix this problem.

Change-Id: Ie8a8009970d1e38c6863d94296f2738918da5429
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Tested-by: Darrick J. Wong <darrick.wong@oracle.com>
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-03 11:52:38 +01:00
Al Viro d585cdd4db allow O_TMPFILE to work with O_WRONLY
Change-Id: I90171d1b53a4c35bfa76757ecfdfb6f95330d107
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-03 11:52:37 +01:00
Al Viro 09b20ad836 Safer ABI for O_TMPFILE
[suggested by Rasmus Villemoes] make O_DIRECTORY | O_RDWR part of O_TMPFILE;
that will fail on old kernels in a lot more cases than what I came up with.
And make sure O_CREAT doesn't get there...

Change-Id: Iaa3c8b487d44515b539150bdb5d0b749b87d3ea2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-03 11:52:36 +01:00
Al Viro 0d62dee661 ext4: ->tmpfile() support
very similar to ext3 counterpart...

Change-Id: Ibb9de458c172ad50c4c202b971cb7243c8e43c82
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-03 11:52:35 +01:00
Al Viro e6c66b91e7 ext3 ->tmpfile() support
In this case we do need a bit more than usual, due to orphan
list handling.

Change-Id: I355e9c97e04a03f89bb760009fecc160b25caeb7
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-03 11:52:12 +01:00
Al Viro 51d8156285 allow the temp files created by open() to be linked to
O_TMPFILE | O_CREAT => linkat() with AT_SYMLINK_FOLLOW and /proc/self/fd/<n>
as oldpath (i.e. flink()) will create a link
O_TMPFILE | O_CREAT | O_EXCL => ENOENT on attempt to link those guys

Change-Id: I5e28485680c3320cd0fccc0ba1bea8b963fca7fe
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-03 11:52:08 +01:00
Al Viro dd8e7e379c it's still short a few helpers, but infrastructure should be OK now...
Change-Id: I0adb8fe9c5029bad3ac52629003c3b78e9442936
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-03 11:52:03 +01:00
Daniel Rosenberg b3f94d8e12 ANDROID: sdcardfs: Protect set_top
If the top is changed while we're attempting to use it, it's
possible that the reference will be put while we are in the
process of grabbing a reference.

Now we grab a spinlock to protect grabbing our reference count.

Additionally, we now set the inode_info's top value to point to
it's own data when initializing, which makes tracking changes
easier.

Change-Id: If15748c786ce4c0480ab8c5051a92523aff284d2
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2018-08-15 14:40:15 +02:00
Daniel Rosenberg 0b9b661574 Revert "ANDROID: sdcardfs: notify lower file of opens"
This reverts commit fd825dd8ffd9c4873f80438c3030dd21c204512d.

Instead of calling notify within sdcardfs, which reverse the
order of notifications during an open with truncate, we'll
make fs_notify worry about it.

Change-Id: Ic634401c0f223500066300a4df8b1453a0b35b60
Bug: 70706497
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2018-08-15 14:40:14 +02:00
Daniel Rosenberg 3dcf0addfd ANDROID: sdcardfs: Use lower getattr times/size
We now use the lower filesystem's getattr for time and size related
information.

Change-Id: I3dd05614a0c2837a13eeb033444fbdf070ddce2a
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 72007585
2018-08-15 14:40:13 +02:00
Daniel Rosenberg fb17543000 ANDROID: xattr: Pass EOPNOTSUPP to permission2
The permission call for xattr operations happens regardless of
whether or not the xattr functions are implemented.

The xattr functions currently don't have support for permission2.
Passing EOPNOTSUPP as the mount point in xattr_permission allows
us to return EOPNOTSUPP early in permission2, if the filesystem
supports it.

Change-Id: I9d07e4cd633cf40af60450ffbff7ac5c1b4e8c2c
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 35848445
2018-08-15 14:40:12 +02:00
Daniel Rosenberg 743fc1afaf ANDROID: sdcardfs: Move default_normal to superblock
Moving default_normal from mount info to superblock info
as it doesn't need to change between mount points.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 72158116
Change-Id: I16c6a0577c601b4f7566269f7e189fcf697afd4e
2018-08-15 14:40:11 +02:00
Daniel Rosenberg 79724adf67 ANDROID: sdcardfs: Fix missing break on default_normal
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 64672411
Change-Id: I98796df95dc9846adb77a11f49a1a254fb1618b1
2018-08-15 14:40:10 +02:00
Daniel Rosenberg 324eac605d ANDROID: sdcardfs: Add default_normal option
The default_normal option causes mounts with the gid set to
AID_SDCARD_RW to have user specific gids, as in the normal case.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Change-Id: I9619b8ac55f41415df943484dc8db1ea986cef6f
Bug: 64672411
2018-08-15 14:40:10 +02:00
Daniel Rosenberg 5a67a9da20 ANDROID: sdcardfs: notify lower file of opens
fsnotify_open is not called within dentry_open,
so we need to call it ourselves.

Change-Id: Ia7f323b3d615e6ca5574e114e8a5d7973fb4c119
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 70706497
2018-08-15 14:40:09 +02:00
LuK1337 65f8423215 Import T813XXS2BRC2 kernel source changes
Change-Id: I90bb6c013287c1edbf8ca607d1666cc4c62d504e
2018-05-26 00:39:42 +02:00
Daniel Rosenberg dd8a556a1b ANDROID: sdcardfs: Add missing break
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 63245673
Change-Id: I5fc596420301045895e5a9a7e297fd05434babf9
2018-02-06 13:12:27 +01:00
Daniel Rosenberg 1648fc202c ANDROID: Sdcardfs: Move gid derivation under flag
This moves the code to adjust the gid/uid of lower filesystem
files under the mount flag derive_gid.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Change-Id: I44eaad4ef67c7fcfda3b6ea3502afab94442610c
Bug: 63245673
2018-02-06 13:12:27 +01:00
Jaegeuk Kim d31b74b602 ANDROID: sdcardfs: override credential for ioctl to lower fs
Otherwise, lower_fs->ioctl() fails due to inode_owner_or_capable().

Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Bug: 63260873
Change-Id: I623a6c7c5f8a3cbd7ec73ef89e18ddb093c43805
2018-02-06 13:12:27 +01:00
Daniel Rosenberg a36b661e1b ANDROID: sdcardfs: Remove unnecessary lock
The mmap_sem lock does not appear to be protecting
anything, and has been removed in Samsung's more
recent versions of sdcardfs.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Change-Id: I76ff3e33002716b8384fc8be368028ed63dffe4e
Bug: 63785372
2018-02-06 13:12:27 +01:00
Gao Xiang 1ecae17890 ANDROID: sdcardfs: use mount_nodev and fix a issue in sdcardfs_kill_sb
Use the VFS mount_nodev instead of customized mount_nodev_with_options
and fix generic_shutdown_super to kill_anon_super because of set_anon_super

Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Change-Id: Ibe46647aa2ce49d79291aa9d0295e9625cfccd80
2018-02-06 13:12:27 +01:00
Greg Hackmann e5ef1e677a ANDROID: sdcardfs: remove dead function open_flags_to_access_mode()
smatch warns about the suspicious formatting in the last line of
open_flags_to_access_mode().  It turns out the only caller was deleted
over a year ago by "ANDROID: sdcardfs: Bring up to date with Android M
permissions:", so we can "fix" the function's formatting by deleting it.

Change-Id: Id85946f3eb01722eef35b1815f405a6fda3aa4ff
Signed-off-by: Greg Hackmann <ghackmann@google.com>
2018-02-06 13:12:27 +01:00
Andrew Chant 9c3fbd74fc sdcardfs: limit stacking depth
Limit filesystem stacking to prevent stack overflow.

Bug: 32761463
Change-Id: I8b1462b9c0d6c7f00cf110724ffb17e7f307c51e
Signed-off-by: Andrew Chant <achant@google.com>
2018-02-06 13:12:27 +01:00