Move actual pte filling for non-linear file mappings into the new special
vma operation: ->remap_pages().
Filesystems must implement this method to get non-linear mapping support,
if it uses filemap_fault() then generic_file_remap_pages() can be used.
Now device drivers can implement this method and obtain nonlinear vma support.
Change-Id: Ifbbbdfcdf871a8173856a13087400885357f95ee
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com> #arch/tile
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Venkatesh Pallipadi <venki@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
page allocated in fuse_dentry_canonical_path to be handled in
fuse_dev_do_write is allocated using __get_free_pages(GFP_KERNEL).
This may not return a page with data filled with 0. Now this
page may not have a null terminator at all.
If this happens and userspace fuse daemon screws up by passing a string
to kernel which is not NULL terminated (or did not fill anything),
then inside fuse driver in kernel when we try to do
strlen(fuse_dev_write->kern_path->getname_kernel)
on that page data -> it may give us issue with kernel paging request.
Unable to handle kernel paging request at virtual address
------------[ cut here ]------------
<..>
PC is at strlen+0x10/0x90
LR is at getname_kernel+0x2c/0xf4
<..>
strlen+0x10/0x90
kern_path+0x28/0x4c
fuse_dev_do_write+0x5b8/0x694
fuse_dev_write+0x74/0x94
do_iter_readv_writev+0x80/0xb8
do_readv_writev+0xec/0x1cc
vfs_writev+0x54/0x64
SyS_writev+0x64/0xe4
el0_svc_naked+0x24/0x28
To avoid this we should ensure in case of FUSE_CANONICAL_PATH,
the page is null terminated.
Change-Id: I33ca7cc76b4472eaa982c67bb20685df451121f5
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
Bug: 75984715
[Daniel - small edit, using args size ]
Signed-off-by: Daniel Rosenberg <drosen@google.com>
09f363c7 ("vmscan: fix shrinker callback bug in fs/super.c") fixed a
shrinker callback which was returning -1 when nr_to_scan is zero, which
caused excessive slab scanning. But 635697c6 ("vmscan: fix initial
shrinker size handling") fixed the problem, again so we can freely return
-1 although nr_to_scan is zero. So let's revert 09f363c7 because the
comment added in 09f363c7 made an unnecessary rule.
Change-Id: Ic57b698b97406b980e06bd213afa283868a779a2
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We already use them for openat() and friends, but fstat() also wants to
be able to use O_PATH file descriptors. This should make it more
directly comparable to the O_SEARCH of Solaris.
Note that you could already do the same thing with "fstatat()" and an
empty path, but just doing "fstat()" directly is simpler and faster, so
there is no reason not to just allow it directly.
See also commit 332a2e1244, which did the same thing for fchdir, for
the same reasons.
Reported-by: ольга крыжановская <olga.kryzhanovska@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org # O_PATH introduced in 3.0+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I98db77b7f3fcadb1f0ecdb5c50ff4a7ad9db8bb3
We already use them for openat() and friends, but fstat() also wants to
be able to use O_PATH file descriptors. This should make it more
directly comparable to the O_SEARCH of Solaris.
Note that you could already do the same thing with "fstatat()" and an
empty path, but just doing "fstat()" directly is simpler and faster, so
there is no reason not to just allow it directly.
See also commit 332a2e1244, which did the same thing for fchdir, for
the same reasons.
Reported-by: ольга крыжановская <olga.kryzhanovska@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org # O_PATH introduced in 3.0+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SafetyNet checks androidboot.veritymode in Nougat, so remove it.
Additionally, remove androidboot.enable_dm_verity and androidboot.secboot
in case SafetyNet will check them in the future.
Change-Id: Idd3422012aa62f39edadc1eee9b5b84879a87b33
Signed-off-by: Sultanxda <sultanxda@gmail.com>
Userspace parses this and sets the ro.boot.verifiedbootstate prop
according to the value that this flag has. When ro.boot.verifiedbootstate
is not 'green', SafetyNet is tripped and fails the CTS test.
Hide verifiedbootstate from /proc/cmdline in order to fix the failed
SafetyNet CTS check.
Change-Id: I530a1d1abb8f24c5a84dc423367715c353e005d3
Signed-off-by: Sultanxda <sultanxda@gmail.com>
Move actual pte filling for non-linear file mappings into the new special
vma operation: ->remap_pages().
Filesystems must implement this method to get non-linear mapping support,
if it uses filemap_fault() then generic_file_remap_pages() can be used.
Now device drivers can implement this method and obtain nonlinear vma support.
Change-Id: Ifbbbdfcdf871a8173856a13087400885357f95ee
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com> #arch/tile
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Venkatesh Pallipadi <venki@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
page allocated in fuse_dentry_canonical_path to be handled in
fuse_dev_do_write is allocated using __get_free_pages(GFP_KERNEL).
This may not return a page with data filled with 0. Now this
page may not have a null terminator at all.
If this happens and userspace fuse daemon screws up by passing a string
to kernel which is not NULL terminated (or did not fill anything),
then inside fuse driver in kernel when we try to do
strlen(fuse_dev_write->kern_path->getname_kernel)
on that page data -> it may give us issue with kernel paging request.
Unable to handle kernel paging request at virtual address
------------[ cut here ]------------
<..>
PC is at strlen+0x10/0x90
LR is at getname_kernel+0x2c/0xf4
<..>
strlen+0x10/0x90
kern_path+0x28/0x4c
fuse_dev_do_write+0x5b8/0x694
fuse_dev_write+0x74/0x94
do_iter_readv_writev+0x80/0xb8
do_readv_writev+0xec/0x1cc
vfs_writev+0x54/0x64
SyS_writev+0x64/0xe4
el0_svc_naked+0x24/0x28
To avoid this we should ensure in case of FUSE_CANONICAL_PATH,
the page is null terminated.
Change-Id: I33ca7cc76b4472eaa982c67bb20685df451121f5
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
Bug: 75984715
[Daniel - small edit, using args size ]
Signed-off-by: Daniel Rosenberg <drosen@google.com>
09f363c7 ("vmscan: fix shrinker callback bug in fs/super.c") fixed a
shrinker callback which was returning -1 when nr_to_scan is zero, which
caused excessive slab scanning. But 635697c6 ("vmscan: fix initial
shrinker size handling") fixed the problem, again so we can freely return
-1 although nr_to_scan is zero. So let's revert 09f363c7 because the
comment added in 09f363c7 made an unnecessary rule.
Change-Id: Ic57b698b97406b980e06bd213afa283868a779a2
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ Upstream commit f15133df088ecadd141ea1907f2c96df67c729f0 ]
path_openat() jumps to the wrong place after do_tmpfile() - it has
already done path_cleanup() (as part of path_lookupat() called by
do_tmpfile()), so doing that again can lead to double fput().
Change-Id: I83bb7f0a15db8d2202a010b75ade98f80e7270f2
Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The man page for open(2) indicates that when O_CREAT is specified, the
'mode' argument applies only to future accesses to the file:
Note that this mode applies only to future accesses of the newly
created file; the open() call that creates a read-only file
may well return a read/write file descriptor.
The man page for open(2) implies that 'mode' is treated identically by
O_CREAT and O_TMPFILE.
O_TMPFILE, however, behaves differently:
int fd = open("/tmp", O_TMPFILE | O_RDWR, 0);
assert(fd == -1);
assert(errno == EACCES);
int fd = open("/tmp", O_TMPFILE | O_RDWR, 0600);
assert(fd > 0);
For O_CREAT, do_last() sets acc_mode to MAY_OPEN only:
if (*opened & FILE_CREATED) {
/* Don't check for write permission, don't truncate */
open_flag &= ~O_TRUNC;
will_truncate = false;
acc_mode = MAY_OPEN;
path_to_nameidata(path, nd);
goto finish_open_created;
}
But for O_TMPFILE, do_tmpfile() passes the full op->acc_mode to
may_open().
This patch lines up the behavior of O_TMPFILE with O_CREAT. After the
inode is created, may_open() is called with acc_mode = MAY_OPEN, in
do_tmpfile().
A different, but related glibc bug revealed the discrepancy:
https://sourceware.org/bugzilla/show_bug.cgi?id=17523
The glibc lazily loads the 'mode' argument of open() and openat() using
va_arg() only if O_CREAT is present in 'flags' (to support both the 2
argument and the 3 argument forms of open; same idea for openat()).
However, the glibc ignores the 'mode' argument if O_TMPFILE is in
'flags'.
On x86_64, for open(), it magically works anyway, as 'mode' is in
RDX when entering open(), and is still in RDX on SYSCALL, which is where
the kernel looks for the 3rd argument of a syscall.
But openat() is not quite so lucky: 'mode' is in RCX when entering the
glibc wrapper for openat(), while the kernel looks for the 4th argument
of a syscall in R10. Indeed, the syscall calling convention differs from
the regular calling convention in this respect on x86_64. So the kernel
sees mode = 0 when trying to use glibc openat() with O_TMPFILE, and
fails with EACCES.
Change-Id: I4da221448695c2aca15818d8d4f44784ecdbdac6
Signed-off-by: Eric Rannaud <e@nanocritical.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Without this patch fanotify_init does not validate the value passed in
event_f_flags.
When a fanotify event is read from the fanotify file descriptor a new
file descriptor is created where file.f_flags = event_f_flags.
Internal and external open flags are stored together in field f_flags of
struct file. Hence, an application might create file descriptors with
internal flags like FMODE_EXEC, FMODE_NOCMTIME set.
Jan Kara and Eric Paris both aggreed that this is a bug and the value of
event_f_flags should be checked:
https://lkml.org/lkml/2014/4/29/522https://lkml.org/lkml/2014/4/29/539
This updated patch version considers the comments by Michael Kerrisk in
https://lkml.org/lkml/2014/5/4/10
With the patch the value of event_f_flags is checked.
When specifying an invalid value error EINVAL is returned.
Internal flags are disallowed.
File creation flags are disallowed:
O_CREAT, O_DIRECTORY, O_EXCL, O_NOCTTY, O_NOFOLLOW, O_TRUNC, and O_TTY_INIT.
Flags which do not make sense with fanotify are disallowed:
__O_TMPFILE, O_PATH, FASYNC, and O_DIRECT.
This leaves us with the following allowed values:
O_RDONLY, O_WRONLY, O_RDWR are basic functionality. The are stored in the
bits given by O_ACCMODE.
O_APPEND is working as expected. The value might be useful in a logging
application which appends the current status each time the log is opened.
O_LARGEFILE is needed for files exceeding 4GB on 32bit systems.
O_NONBLOCK may be useful when monitoring slow devices like tapes.
O_NDELAY is equal to O_NONBLOCK except for platform parisc.
To avoid code breaking on parisc either both flags should be
allowed or none. The patch allows both.
__O_SYNC and O_DSYNC may be used to avoid data loss on power disruption.
O_NOATIME may be useful to reduce disk activity.
O_CLOEXEC may be useful, if separate processes shall be used to scan files.
Once this patch is accepted, the fanotify_init.2 manpage has to be updated.
Change-Id: I0e3a23ccbb38fc612df14068164dde3cb7f94f86
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As comment in include/uapi/asm-generic/fcntl.h described, when
introducing new O_* bits, we need to check its uniqueness in
fcntl_init(). But __O_TMPFILE bit is missing. So fix it.
Change-Id: I914b76ab4282717b88afbbcde3c630726daef747
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
O_TMPFILE, like O_CREAT, should respect the requested mode and should
create regular files.
This fixes two bugs: O_TMPFILE required privilege (because the mode
ended up as 000) and it produced bogus inodes with no type.
Change-Id: I322c3f4a60bcae4f376898aee75ea838daa1c8d3
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[suggested by Rasmus Villemoes] make O_DIRECTORY | O_RDWR part of O_TMPFILE;
that will fail on old kernels in a lot more cases than what I came up with.
And make sure O_CREAT doesn't get there...
Change-Id: I90b6ad396a8053eadd5cb32501f55cbb1d4be2db
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Fix documentation of ->atomic_open() and related functions: finish_open()
and finish_no_open(). Also add details that seem to be unclear and a
source of bugs (some of which are fixed in the following series).
Cc-ing maintainers of all filesystems implementing ->atomic_open().
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Steve French <sfrench@samba.org>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: Ic3734901961cb69079189f7d4ded66af5a88d8f2
In this case we do need a bit more than usual, due to orphan
list handling.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I3a2da2b3f9bde5ac5a8158005a3068a6a67b7a83
O_TMPFILE | O_CREAT => linkat() with AT_SYMLINK_FOLLOW and /proc/self/fd/<n>
as oldpath (i.e. flink()) will create a link
O_TMPFILE | O_CREAT | O_EXCL => ENOENT on attempt to link those guys
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I1c10dfd653cb48f4e7a42344337601210779178a
Update proc_ns_follow_link to use nd_jump_link instead of just
manually updating nd.path.dentry.
This fixes the BUG_ON(nd->inode != parent->d_inode) reported by Dave
Jones and reproduced trivially with mkdir /proc/self/ns/uts/a.
Sigh it looks like the VFS change to require use of nd_jump_link
happend while proc_ns_follow_link was baking and since the common case
of proc_ns_follow_link continued to work without problems the need for
making this change was overlooked.
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Change-Id: I465f73b64069aca5b059bad28bfef098dddc1b99
It's "normal" - it can happen if the file descriptor you followed was
opened with O_NOFOLLOW.
Reported-by: Dave Jones <davej@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: Ic8bcf2195ef87b424c2121691ca8fe78c6f8eb73
In commit 800179c9b8 ("This adds symlink and hardlink restrictions to
the Linux VFS"), the new link protections were enabled by default, in
the hope that no actual application would care, despite it being
technically against legacy UNIX (and documented POSIX) behavior.
However, it does turn out to break some applications. It's rare, and
it's unfortunate, but it's unacceptable to break existing systems, so
we'll have to default to legacy behavior.
In particular, it has broken the way AFD distributes files, see
http://www.dwd.de/AFD/
along with some legacy scripts.
Distributions can end up setting this at initrd time or in system
scripts: if you have security problems due to link attacks during your
early boot sequence, you have bigger problems than some kernel sysctl
setting. Do:
echo 1 > /proc/sys/fs/protected_symlinks
echo 1 > /proc/sys/fs/protected_hardlinks
to re-enable the link protections.
Alternatively, we may at some point introduce a kernel config option
that sets these kinds of "more secure but not traditional" behavioural
options automatically.
Reported-by: Nick Bowler <nbowler@elliptictech.com>
Reported-by: Holger Kiehl <Holger.Kiehl@dwd.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # v3.6
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I0f626d9487972c6dcae2dd98d80f72c2e7727087
In the common case where a name is much smaller than PATH_MAX, an extra
allocation for struct filename is unnecessary. Before allocating a
separate one, try to embed the struct filename inside the buffer first. If
it turns out that that's not long enough, then fall back to allocating a
separate struct filename and redoing the copy.
Change-Id: I57df0c4e642cc7a76efaa621ba1ce10e717447ff
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
a couple of places got missed back when Linus has introduced that one...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I47ad6735f70d32e54a1ca9b15fa43b2fbcc6b999
...and fix up the callers. For do_file_open_root, just declare a
struct filename on the stack and fill out the .name field. For
do_filp_open, make it also take a struct filename pointer, and fix up its
callers to call it appropriately.
For filp_open, add a variant that takes a struct filename pointer and turn
filp_open into a wrapper around it.
Change-Id: Ibeb0479a22019e78b22990406d54c4ebed76a567
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
...and make the user_path callers use that variant instead.
Change-Id: I2d162b8859702febd366a4920b896b26bacf5136
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
getname() is intended to copy pathname strings from userspace into a
kernel buffer. The result is just a string in kernel space. It would
however be quite helpful to be able to attach some ancillary info to
the string.
For instance, we could attach some audit-related info to reduce the
amount of audit-related processing needed. When auditing is enabled,
we could also call getname() on the string more than once and not
need to recopy it from userspace.
This patchset converts the getname()/putname() interfaces to return
a struct instead of a string. For now, the struct just tracks the
string in kernel space and the original userland pointer for it.
Later, we'll add other information to the struct as it becomes
convenient.
Change-Id: Ib690c3dd4d56624f0ddb081e1c1d4f23c2dd0cd1
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
I see no callers in module code.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I88117f368a130770b6e4d4686cadde6723c1d7fc
The follow_link() function always initializes its *p argument,
or returns an error, but when building with 'gcc -s', the compiler
gets confused by the __always_inline attribute to the function
and can no longer detect where the cookie was initialized.
The solution is to always initialize the pointer from follow_link,
even in the error path. When building with -O2, this has zero impact
on generated code and adds a single instruction in the error path
for a -Os build on ARM.
Without this patch, building with gcc-4.6 through gcc-4.8 and
CONFIG_CC_OPTIMIZE_FOR_SIZE results in:
fs/namei.c: In function 'link_path_walk':
fs/namei.c:649:24: warning: 'cookie' may be used uninitialized in this function [-Wuninitialized]
fs/namei.c:1544:9: note: 'cookie' was declared here
fs/namei.c: In function 'path_lookupat':
fs/namei.c:649:24: warning: 'cookie' may be used uninitialized in this function [-Wuninitialized]
fs/namei.c:1934:10: note: 'cookie' was declared here
fs/namei.c: In function 'path_openat':
fs/namei.c:649:24: warning: 'cookie' may be used uninitialized in this function [-Wuninitialized]
fs/namei.c:2899:9: note: 'cookie' was declared here
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: Ib640b0c8b111da37b389ceb24f468497ad97622e
Commit "fs: add link restriction audit reporting" has added auditing of failed
attempts to follow symlinks. Unfortunately, the auditing was being done after
the struct path structure was released earlier.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: Id6639dd23f00eb29ee19c8c7c714769ba25efca7
get_write_access() is needed for nfsd, not binfmt_aout (the latter
has no business doing anything of that kind, of course)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I210f8b92bdd26966b4ca47f000b58433a8f8eca6
If ->atomic_open() returns -ENOENT, we take care to return the create
error (e.g., EACCES), if any. Do the same when ->atomic_open() returns 1
and provides a negative dentry.
This fixes a regression where an unprivileged open O_CREAT fails with
ENOENT instead of EACCES, introduced with the new atomic_open code. It
is tested by the open/08.t test in the pjd posix test suite, and was
observed on top of fuse (backed by ceph-fuse).
Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Change-Id: Ie92bf84be4469484b005d0ea9b9886a0bd36d922
Pass the umask-ed create mode to may_o_create() instead of the original one.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Change-Id: Ie873439e8135f579c91dba57e88665e96d646ae4
Don't mask S_ISREG off the create mode before passing to ->atomic_open(). Other
methods (->create, ->mknod) also get the complete file mode and filesystems
expect it.
Reported-by: Steve <steveamigauk@yahoo.co.uk>
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Change-Id: Idd21534c4124f2c7ade8b9afbd40b6fa303dbc4d
Currently, mnt_want_write() is sometimes called with i_mutex held and sometimes
without it. This isn't really a problem because mnt_want_write() is a
non-blocking operation (essentially has a trylock semantics) but when the
function starts to handle also frozen filesystems, it will get a full lock
semantics and thus proper lock ordering has to be established. So move
all mnt_want_write() calls outside of i_mutex.
One non-trivial case needing conversion is kern_path_create() /
user_path_create() which didn't include mnt_want_write() but now needs to
because it acquires i_mutex. Because there are virtual file systems which
don't bother with freeze / remount-ro protection we actually provide both
versions of the function - one which calls mnt_want_write() and one which does
not.
[AV: scratch the previous, mnt_want_write() has been moved to kern_path_create()
by now]
Change-Id: I460255fabb9bfcebe6974aabdcd0b5dca1856a9e
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The write ref to vfsmount taken in lookup_open()/atomic_open() is going to
be dropped; we take the one to stay in dentry_open(). Just grab the temporary
in caller if it looks like we are going to need it (create/truncate/writable open)
and pass (by value) "has it succeeded" flag. Instead of doing mnt_want_write()
inside, check that flag and treat "false" as "mnt_want_write() has just failed".
mnt_want_write() is cheap and the things get considerably simpler and more robust
that way - we get it and drop it in the same function, to start with, rather
than passing a "has something in the guts of really scary functions taken it"
back to caller.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: Icda3799935abd688cbad95d4a1f22563b1f653d5
O_EXCL without O_CREAT has different semantics; it's "fail if already opened",
not "fail if already exists". commit 71574865 broke that...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I59e7ab80df02e7fff2f4f9118d78921f60399a02
Adds audit messages for unexpected link restriction violations so that
system owners will have some sort of potentially actionable information
about misbehaving processes.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I4a6ef885b0680e1d554e32b7cc3506f8e0ba0b8a
This adds symlink and hardlink restrictions to the Linux VFS.
Symlinks:
A long-standing class of security issues is the symlink-based
time-of-check-time-of-use race, most commonly seen in world-writable
directories like /tmp. The common method of exploitation of this flaw
is to cross privilege boundaries when following a given symlink (i.e. a
root process follows a symlink belonging to another user). For a likely
incomplete list of hundreds of examples across the years, please see:
http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=/tmp
The solution is to permit symlinks to only be followed when outside
a sticky world-writable directory, or when the uid of the symlink and
follower match, or when the directory owner matches the symlink's owner.
Some pointers to the history of earlier discussion that I could find:
1996 Aug, Zygo Blaxell
http://marc.info/?l=bugtraq&m=87602167419830&w=2
1996 Oct, Andrew Tridgell
http://lkml.indiana.edu/hypermail/linux/kernel/9610.2/0086.html
1997 Dec, Albert D Cahalan
http://lkml.org/lkml/1997/12/16/4
2005 Feb, Lorenzo Hernández García-Hierro
http://lkml.indiana.edu/hypermail/linux/kernel/0502.0/1896.html
2010 May, Kees Cook
https://lkml.org/lkml/2010/5/30/144
Past objections and rebuttals could be summarized as:
- Violates POSIX.
- POSIX didn't consider this situation and it's not useful to follow
a broken specification at the cost of security.
- Might break unknown applications that use this feature.
- Applications that break because of the change are easy to spot and
fix. Applications that are vulnerable to symlink ToCToU by not having
the change aren't. Additionally, no applications have yet been found
that rely on this behavior.
- Applications should just use mkstemp() or O_CREATE|O_EXCL.
- True, but applications are not perfect, and new software is written
all the time that makes these mistakes; blocking this flaw at the
kernel is a single solution to the entire class of vulnerability.
- This should live in the core VFS.
- This should live in an LSM. (https://lkml.org/lkml/2010/5/31/135)
- This should live in an LSM.
- This should live in the core VFS. (https://lkml.org/lkml/2010/8/2/188)
Hardlinks:
On systems that have user-writable directories on the same partition
as system files, a long-standing class of security issues is the
hardlink-based time-of-check-time-of-use race, most commonly seen in
world-writable directories like /tmp. The common method of exploitation
of this flaw is to cross privilege boundaries when following a given
hardlink (i.e. a root process follows a hardlink created by another
user). Additionally, an issue exists where users can "pin" a potentially
vulnerable setuid/setgid file so that an administrator will not actually
upgrade a system fully.
The solution is to permit hardlinks to only be created when the user is
already the existing file's owner, or if they already have read/write
access to the existing file.
Many Linux users are surprised when they learn they can link to files
they have no access to, so this change appears to follow the doctrine
of "least surprise". Additionally, this change does not violate POSIX,
which states "the implementation may require that the calling process
has permission to access the existing file"[1].
This change is known to break some implementations of the "at" daemon,
though the version used by Fedora and Ubuntu has been fixed[2] for
a while. Otherwise, the change has been undisruptive while in use in
Ubuntu for the last 1.5 years.
[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/linkat.html
[2] http://anonscm.debian.org/gitweb/?p=collab-maint/at.git;a=commitdiff;h=f4114656c3a6c6f6070e315ffdf940a49eda3279
This patch is based on the patches in Openwall and grsecurity, along with
suggestions from Al Viro. I have added a sysctl to enable the protected
behavior, and documentation.
Change-Id: Ic4872c58e8a0672147c73b13175ea143e19915ba
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
One side effect - attempt to create a cross-device link on a read-only fs fails
with EROFS instead of EXDEV now. Makes more sense, POSIX allows, etc.
Change-Id: I264b03a230dbd310f3b3671d2da06ceb2930179b
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Note that applying umask can't affect their results. While
that affects errno in cases like
mknod("/no_such_directory/a", 030000)
yielding -EINVAL (due to impossible mode_t) instead of
-ENOENT (due to inexistent directory), IMO that makes a lot
more sense, POSIX allows to return either and any software
that relies on getting -ENOENT instead of -EINVAL in that
case deserves everything it gets.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I1abb3e8ad247f3f48bde931d70e6f546126c62d7
releases what needs to be released after {kern,user}_path_create()
Change-Id: If7fa7455e2ba8a6f4f4c4d2db502a38b4a60d7c7
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
locking/unlocking for rcu walk taken to a couple of inline helpers
Change-Id: I19f7f437641bb56f186f5d4c197425886f3625ca
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
really convoluted test in there has grown up during struct mount
introduction; what it checks is that we'd reached the root of
mount tree.
Change-Id: Ia48bdc985ae689345cfd409d8c81eb52fca6e014
No need to bother with lookup_one_len() here - it's an overkill
Signed-off-by Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I733256aed797e9c0ac52f9c7cbc17b40e5b151fe
Add a helper that abstracts out the jump to an already parsed struct path
from ->follow_link operation from procfs. Not only does this clean up
the code by moving the two sides of this game into a single helper, but
it also prepares for making struct nameidata private to namei.c
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: If2392e9a3db44877f3976b543b12d3402cd29c22
Currently the non-nd_set_link based versions of ->follow_link are expected
to do a path_put(&nd->path) on failure. This calling convention is unexpected,
undocumented and doesn't match what the nd_set_link-based instances do.
Move the path_put out of the only non-nd_set_link based ->follow_link
instance into the caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I6e06cf2be5425e752622a33eb63308bced33b0bb
Since commit 197e37d9, the banner comment on lookup_open() no longer matches
what the function returns. It used to return a struct file pointer or NULL and
now it returns an integer and is passed the struct file pointer it is to use
amongst its arguments. Update the comment to reflect this.
Also add a banner comment to atomic_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: Ia49cbec8cd15bd0b4af0b44bb16d79faa80947e0
all we want is a boolean flag, same as the method gets now
Change-Id: I0cbe220b96bbbec6d50228cac774a0439f6a29f2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
boolean "does it have to be exclusive?" flag is passed instead;
Local filesystem should just ignore it - the object is guaranteed
not to be there yet.
Change-Id: I25efea9892458f6f64070c62bd1adb5194dcd8c1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Just the flags; only NFS cares even about that, but there are
legitimate uses for such argument. And getting rid of that
completely would require splitting ->lookup() into a couple
of methods (at least), so let's leave that alone for now...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: Id5a9a96c3202f724156c32fb266190334e7dbe48
since the method wrapped by it doesn't need that anymore...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I2d0b8680f4ff4dd4d46e0e9b4673370081929137
Same conventions as for ->atomic_open(). Trimmed the
forest of labels a bit, while we are at it...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I94a25b547d3caaf3c20e2b6fbe4183ac5e1b87d7
namely, 1 ;-) That's what we want to return from ->atomic_open()
instances after finish_no_open().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: Id629fb7d43cca5a4ca91802ba13b61aa95288d47
Just pass struct file *. Methods are happier that way...
There's no need to return struct file * from finish_open() now,
so let it return int. Next: saner prototypes for parts in
namei.c
Change-Id: I984f0f992330c959a2f9703d9e7647ef340e2845
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
make put_filp() conditional on flag set by finish_open()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I79833cd7a54d635bad80c6fca31eb55631e96c8b
Change of calling conventions:
old new
NULL 1
file 0
ERR_PTR(-ve) -ve
Caller *knows* that struct file *; no need to return it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I883d67181a0100447a2e077ed537ee393e862e0b
... and let finish_open() report having opened the file via that sucker.
Next step: don't modify od->filp at all.
[AV: FILE_CREATE was already used by cifs; Miklos' fix folded]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I6ea0f871fab215a2901710392abbda88c80008c1
Add an ->atomic_open implementation which replaces the atomic lookup+open+create
operation implemented via ->lookup and ->create operations.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I9cd73db22147a760ee2f69b498aacd16689908b1
What was the purpose of this?
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I5f994a7aa9edc51f6e7ec4f746bf51332d5b496b
Add an ->atomic_open implementation which replaces the atomic open+create
operation implemented via ->create. No functionality is changed.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I0c8c0998fcf940f603963a876ef2be825babf6a7
is_atomic_open() is now only used by nfs4_lookup_revalidate() to check whether
it's okay to skip normal revalidation.
It does a racy check for mount read-onlyness and falls back to normal
revalidation if the open would fail. This makes little sense now that this
function isn't used for determining whether to actually open the file or not.
The d_mountpoint() check still makes sense since it is an indication that we
might be following a mount and so open may not revalidate the dentry.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: Ibad58603956e406c63b4b7c63243502eabe6febf
Instead check LOOKUP_EXCL in nd->flags, which is basically what the open intent
flags were used for.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I9afc0a255a45c8d976efdd17ab991b71fc3c41f3
Don't pass nfs_open_context() to ->create(). Only the NFS4 implementation
needed that and only because it wanted to return an open file using open
intents. That task has been replaced by ->atomic_open so it is not necessary
anymore to pass the context to the create rpc operation.
Despite nfs4_proc_create apparently being okay with a NULL context it Oopses
somewhere down the call chain. So allocate a context here.
Change-Id: I1241a81129cb80a7f06338969ea95f28e10d40f0
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Replace NFS4 specific ->lookup implementation with ->atomic_open impelementation
and use the generic nfs_lookup for other lookups.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I287cb7db22925d56e7f37c8ae8869086e9e17841
NFSv4 can't do reliable opens in d_revalidate, since it cannot know whether a
mount needs to be followed or not. It does check d_mountpoint() on the dentry,
which can result in a weird error if the VFS found that the mount does not in
fact need to be followed, e.g.:
# mount --bind /mnt/nfs /mnt/nfs-clone
# echo something > /mnt/nfs/tmp/bar
# echo x > /tmp/file
# mount --bind /tmp/file /mnt/nfs-clone/tmp/bar
# cat /mnt/nfs/tmp/bar
cat: /mnt/nfs/tmp/bar: Not a directory
Which should, by any sane filesystem, result in "something" being printed.
So instead do the open in f_op->open() and in the unlikely case that the cached
dentry turned out to be invalid, drop the dentry and return EOPENSTALE to let
the VFS retry.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I680d2732dd85ae175a2ad9142f42bb3db16dc533
Add an ->atomic_open implementation which replaces the atomic open+create
operation implemented via ->create. No functionality is changed.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I5d06b0b21d17a68854b2b2b22a15d25b75e07724
Add an ->atomic_open implementation which replaces the atomic lookup+open+create
operation implemented via ->lookup and ->create operations.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Steve French <sfrench@samba.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I8b39432768b5b6336ffb643026e565f1aefc40f5
Perform open_check_o_direct() in a common place in do_last after opening the
file.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: Icce292d477d2cb452acedd08ed35df5207294c07
Move the lookup retry logic to the bottom of the function to make the normal
case simpler to read.
Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I6f0d88aae8d935c1138b3170bf171a69d453ab32
Consistently use bool for boolean values in do_last().
Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I25241ab09fa97c0b27c9006c34691faf2ff53989
All users of open intents have been converted to use ->atomic_{open,create}.
This patch gets rid of nd->intent.open and related infrastructure.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change-Id: I9143c542eaa8f34e899e647d1e39fa289816639a
Add a new inode operation which is called on the last component of an open.
Using this the filesystem can look up, possibly create and open the file in one
atomic operation. If it cannot perform this (e.g. the file type turned out to
be wrong) it may signal this by returning NULL instead of an open struct file
pointer.
i_op->atomic_open() is only called if the last component is negative or needs
lookup. Handling cached positive dentries here doesn't add much value: these
can be opened using f_op->open(). If the cached file turns out to be invalid,
the open can be retried, this time using ->atomic_open() with a fresh dentry.
For now leave the old way of using open intents in lookup and revalidate in
place. This will be removed once all the users are converted.
David Howells noticed that if ->atomic_open() opens the file but does not create
it, handle_truncate() will be called on it even if it is not a regular file.
Fix this by checking the file type in this case too.
Change-Id: Ic8ccf5b34eb678bec8ab12ea2f5aebbf79f8473b
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Copy __lookup_hash() into lookup_open(). The next patch will insert the atomic
open call just before the real lookup.
Change-Id: I0764bfb794f152e95f3592a3c8171f3935ab7101
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Split out lookup + maybe create from do_last(). This is the part under i_mutex
protection.
The function is called lookup_open() and returns a filp even though the open
part is not used yet.
Change-Id: If0cfeff6b9a91306c1e3ad98ad7168e50308aea6
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi points out that we need to also worry about memory
odering when doing the dentry name comparison asynchronously with RCU.
In particular, doing a rename can do a memcpy() of one dentry name over
another, and we want to make sure that any unlocked reader will always
see the proper terminating NUL character, so that it won't ever run off
the allocation.
Rather than having to be extra careful with the name copy or at lookup
time for each character, this resolves the issue by making sure that all
names that are inlined in the dentry always have a NUL character at the
end of the name allocation. If we do that at dentry allocation time, we
know that no future name copy will ever change that final NUL to
anything else, so there are no memory ordering issues.
So even if a concurrent rename ends up overwriting the NUL character
that terminates the original name, we always know that there is one
final NUL at the end, and there is no worry about the lockless RCU
lookup traversing the name too far.
The out-of-line allocations are never copied over, so we can just make
sure that we write the name (with terminating NULL) and do a write
barrier before we expose the name to anything else by setting it in the
dentry.
Reported-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nick Piggin <npiggin@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I5890f1c8d9d207dcf508aa17b373a6d7e6d4e586
This allows comparing hash and len in one operation on 64-bit
architectures. Right now only __d_lookup_rcu() takes advantage of this,
since that is the case we care most about.
The use of anonymous struct/unions hides the alternate 64-bit approach
from most users, the exception being a few cases where we initialize a
'struct qstr' with a static initializer. This makes the problematic
cases use a new QSTR_INIT() helper function for that (but initializing
just the name pointer with a "{ .name = xyzzy }" initializer remains
valid, as does just copying another qstr structure).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I710810eb8717563264428e39c99e62599d31907e
All callers do want to check the dentry length, but some of them can
check the length and the hash together, so doing it in dentry_cmp() can
be counter-productive.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I19bda91a21594b0150f8eecef0f2738984c3b13e
Commit 12f8ad4b05 ("vfs: clean up __d_lookup_rcu() and dentry_cmp()
interfaces") did the careful ACCESS_ONCE() of the dentry name only for
the word-at-a-time case, even though the issue is generic.
Admittedly I don't really see gcc ever reloading the value in the middle
of the loop, so the ACCESS_ONCE() protects us from a fairly theoretical
issue. But better safe than sorry.
Also, this consolidates the common parts of the word-at-a-time and
bytewise logic, which includes checking the length. We'll be changing
that later.
Change-Id: Icd5773ff1910d0509e250eb65a469e99d0145795
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>