* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
reiserfs: Properly display mount options in /proc/mounts
vfs: prevent remount read-only if pending removes
vfs: count unlinked inodes
vfs: protect remounting superblock read-only
vfs: keep list of mounts for each superblock
vfs: switch ->show_options() to struct dentry *
vfs: switch ->show_path() to struct dentry *
vfs: switch ->show_devname() to struct dentry *
vfs: switch ->show_stats to struct dentry *
switch security_path_chmod() to struct path *
vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
vfs: trim includes a bit
switch mnt_namespace ->root to struct mount
vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
vfs: opencode mntget() mnt_set_mountpoint()
vfs: spread struct mount - remaining argument of next_mnt()
vfs: move fsnotify junk to struct mount
vfs: move mnt_devname
vfs: move mnt_list to struct mount
vfs: switch pnode.h macros to struct mount *
...
vfs_create() ignores everything outside of 16bit subset of its
mode argument; switching it to umode_t is obviously equivalent
and it's the only caller of the method
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
vfs_mkdir() gets int, but immediately drops everything that might not
fit into umode_t and that's the only caller of ->mkdir()...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The ultimate goal is to get the sock_diag module, that works in
family+protocol terms. Currently this is suitable to do on the
inet_diag basis, so rename parts of the code. It will be moved
to sock_diag.c later.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While parsing through IPv6 extension headers, fragment headers are
skipped making them invisible to the caller. This reports the
fragment offset of the last header in order to make it possible to
determine whether the packet is fragmented and, if so whether it is
a first or last fragment.
Signed-off-by: Jesse Gross <jesse@nicira.com>
C assignment can handle struct in6_addr copying.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The pervasive, but implicit presence of <linux/module.h> meant
that things like this file would happily compile as-is. But
with the desire to phase out the module.h being included everywhere,
point this file at export.h which will give it THIS_MODULE and
the EXPORT_SYMBOL variants.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
My @hp.com will no longer be valid starting August 5, 2011 so an update is
necessary. My new email address is employer independent so we don't have
to worry about doing this again any time soon.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>
Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (107 commits)
vfs: use ERR_CAST for err-ptr tossing in lookup_instantiate_filp
isofs: Remove global fs lock
jffs2: fix IN_DELETE_SELF on overwriting rename() killing a directory
fix IN_DELETE_SELF on overwriting rename() on ramfs et.al.
mm/truncate.c: fix build for CONFIG_BLOCK not enabled
fs:update the NOTE of the file_operations structure
Remove dead code in dget_parent()
AFS: Fix silly characters in a comment
switch d_add_ci() to d_splice_alias() in "found negative" case as well
simplify gfs2_lookup()
jfs_lookup(): don't bother with . or ..
get rid of useless dget_parent() in btrfs rename() and link()
get rid of useless dget_parent() in fs/btrfs/ioctl.c
fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers
drivers: fix up various ->llseek() implementations
fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek
Ext4: handle SEEK_HOLE/SEEK_DATA generically
Btrfs: implement our own ->llseek
fs: add SEEK_HOLE and SEEK_DATA flags
reiserfs: make reiserfs default to barrier=flush
...
Fix up trivial conflicts in fs/xfs/linux-2.6/xfs_super.c due to the new
shrinker callout for the inode cache, that clashed with the xfs code to
start the periodic workers later.
* 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc: (39 commits)
ptrace: do_wait(traced_leader_killed_by_mt_exec) can block forever
ptrace: fix ptrace_signal() && STOP_DEQUEUED interaction
connector: add an event for monitoring process tracers
ptrace: dont send SIGSTOP on auto-attach if PT_SEIZED
ptrace: mv send-SIGSTOP from do_fork() to ptrace_init_task()
ptrace_init_task: initialize child->jobctl explicitly
has_stopped_jobs: s/task_is_stopped/SIGNAL_STOP_STOPPED/
ptrace: make former thread ID available via PTRACE_GETEVENTMSG after PTRACE_EVENT_EXEC stop
ptrace: wait_consider_task: s/same_thread_group/ptrace_reparented/
ptrace: kill real_parent_is_ptracer() in in favor of ptrace_reparented()
ptrace: ptrace_reparented() should check same_thread_group()
redefine thread_group_leader() as exit_signal >= 0
do not change dead_task->exit_signal
kill task_detached()
reparent_leader: check EXIT_DEAD instead of task_detached()
make do_notify_parent() __must_check, update the callers
__ptrace_detach: avoid task_detached(), check do_notify_parent()
kill tracehook_notify_death()
make do_notify_parent() return bool
ptrace: s/tracehook_tracer_task()/ptrace_parent()/
...
tracehook.h is on the way out. Rename tracehook_tracer_task() to
ptrace_parent() and move it from tracehook.h to ptrace.h.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
This is a rather hot function that is called with a potentially NULL
"struct common_audit_data" pointer argument. And in that case it has to
provide and initialize its own dummy common_audit_data structure.
However, all the _common_ cases already pass it a real audit-data
structure, so that uncommon NULL case not only creates a silly run-time
test, more importantly it causes that function to have a big stack frame
for the dummy variable that isn't even used in the common case!
So get rid of that stupid run-time behavior, and make the (few)
functions that currently call with a NULL pointer just call a new helper
function instead (naturally called inode_has_perm_noapd(), since it has
no adp argument).
This makes the run-time test be a static code generation issue instead,
and allows for a much denser stack since none of the common callers need
the dummy structure. And a denser stack not only means less stack space
usage, it means better cache behavior. So we have a win-win-win from
this simplification: less code executed, smaller stack footprint, and
better cache behavior.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
New inodes are created in a two stage process. We first will compute the
label on a new inode in security_inode_create() and check if the
operation is allowed. We will then actually re-compute that same label and
apply it in security_inode_init_security(). The change to do new label
calculations based in part on the last component of the path name only
passed the path component information all the way down the
security_inode_init_security hook. Down the security_inode_create hook the
path information did not make it past may_create. Thus the two calculations
came up differently and the permissions check might not actually be against
the label that is created. Pass and use the same information in both places
to harmonize the calculations and checks.
Reported-by: Dominick Grift <domg472@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
We currently have inode_has_perm and dentry_has_perm. dentry_has_perm just
calls inode_has_perm with additional audit data. But dentry_has_perm can
take either a dentry or a path. Split those to make the code obvious and
to fix the previous problem where I thought dentry_has_perm always had a
valid dentry and mnt.
Signed-off-by: Eric Paris <eparis@redhat.com>
New inodes are created in a two stage process. We first will compute the
label on a new inode in security_inode_create() and check if the
operation is allowed. We will then actually re-compute that same label and
apply it in security_inode_init_security(). The change to do new label
calculations based in part on the last component of the path name only
passed the path component information all the way down the
security_inode_init_security hook. Down the security_inode_create hook the
path information did not make it past may_create. Thus the two calculations
came up differently and the permissions check might not actually be against
the label that is created. Pass and use the same information in both places
to harmonize the calculations and checks.
Reported-by: Dominick Grift <domg472@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Now that the security modules can decide whether they support the
dcache RCU walk or not it's possible to make selinux a bit more
RCU friendly. The SELinux AVC and security server access decision
code is RCU safe. A specific piece of the LSM audit code may not
be RCU safe.
This patch makes the VFS RCU walk retry if it would hit the non RCU
safe chunk of code. It will normally just work under RCU. This is
done simply by passing the VFS RCU state as a flag down into the
avc_audit() code and returning ECHILD there if it would have an issue.
Based-on-patch-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch separates and audit message that only contains a dentry from
one that contains a full path. This allows us to make it harder to
misuse the interfaces or for the interfaces to be implemented wrong.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
The lsm common audit code has wacky contortions making sure which pieces
of information are set based on if it was given a path, dentry, or
inode. Split this into path and inode to get rid of some of the code
complexity.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Now that the security modules can decide whether they support the
dcache RCU walk or not it's possible to make selinux a bit more
RCU friendly. The SELinux AVC and security server access decision
code is RCU safe. A specific piece of the LSM audit code may not
be RCU safe.
This patch makes the VFS RCU walk retry if it would hit the non RCU
safe chunk of code. It will normally just work under RCU. This is
done simply by passing the VFS RCU state as a flag down into the
avc_audit() code and returning ECHILD there if it would have an issue.
Based-on-patch-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Right now all RCU walks fall back to reference walk when CONFIG_SECURITY
is enabled, even though just the standard capability module is active.
This is because security_inode_exec_permission unconditionally fails
RCU walks.
Move this decision to the low level security module. This requires
passing the RCU flags down the security hook. This way at least
the capability module and a few easy cases in selinux/smack work
with RCU walks with CONFIG_SECURITY=y
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
If one builds a kernel without CONFIG_BUG there are a number of 'may be
used uninitialized' warnings. Silence these by returning after the BUG().
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Right now all RCU walks fall back to reference walk when CONFIG_SECURITY
is enabled, even though just the standard capability module is active.
This is because security_inode_exec_permission unconditionally fails
RCU walks.
Move this decision to the low level security module. This requires
passing the RCU flags down the security hook. This way at least
the capability module and a few easy cases in selinux/smack work
with RCU walks with CONFIG_SECURITY=y
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
And give it a kernel-doc comment.
[akpm@linux-foundation.org: btrfs changed in linux-next]
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Introduce ns_capable to test for a capability in a non-default
user namespace.
- Teach cap_capable to handle capabilities in a non-default
user namespace.
The motivation is to get to the unprivileged creation of new
namespaces. It looks like this gets us 90% of the way there, with
only potential uid confusion issues left.
I still need to handle getting all caps after creation but otherwise I
think I have a good starter patch that achieves all of your goals.
Changelog:
11/05/2010: [serge] add apparmor
12/14/2010: [serge] fix capabilities to created user namespaces
Without this, if user serge creates a user_ns, he won't have
capabilities to the user_ns he created. THis is because we
were first checking whether his effective caps had the caps
he needed and returning -EPERM if not, and THEN checking whether
he was the creator. Reverse those checks.
12/16/2010: [serge] security_real_capable needs ns argument in !security case
01/11/2011: [serge] add task_ns_capable helper
01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
02/16/2011: [serge] fix a logic bug: the root user is always creator of
init_user_ns, but should not always have capabilities to
it! Fix the check in cap_capable().
02/21/2011: Add the required user_ns parameter to security_capable,
fixing a compile failure.
02/23/2011: Convert some macros to functions as per akpm comments. Some
couldn't be converted because we can't easily forward-declare
them (they are inline if !SECURITY, extern if SECURITY). Add
a current_user_ns function so we can use it in capability.h
without #including cred.h. Move all forward declarations
together to the top of the #ifdef __KERNEL__ section, and use
kernel-doc format.
02/23/2011: Per dhowells, clean up comment in cap_capable().
02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.
(Original written and signed off by Eric; latest, modified version
acked by him)
[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
[serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1480 commits)
bonding: enable netpoll without checking link status
xfrm: Refcount destination entry on xfrm_lookup
net: introduce rx_handler results and logic around that
bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag
bonding: wrap slave state work
net: get rid of multiple bond-related netdevice->priv_flags
bonding: register slave pointer for rx_handler
be2net: Bump up the version number
be2net: Copyright notice change. Update to Emulex instead of ServerEngines
e1000e: fix kconfig for crc32 dependency
netfilter ebtables: fix xt_AUDIT to work with ebtables
xen network backend driver
bonding: Improve syslog message at device creation time
bonding: Call netif_carrier_off after register_netdevice
bonding: Incorrect TX queue offset
net_sched: fix ip_tos2prio
xfrm: fix __xfrm_route_forward()
be2net: Fix UDP packet detected status in RX compl
Phonet: fix aligned-mode pipe socket buffer header reserve
netxen: support for GbE port settings
...
Fix up conflicts in drivers/staging/brcm80211/brcmsmac/wl_mac80211.c
with the staging updates.
I intend to turn struct flowi into a union of AF specific flowi
structs. There will be a common structure that each variant includes
first, much like struct sock_common.
This is the first step to move in that direction.
Signed-off-by: David S. Miller <davem@davemloft.net>
For SELinux we do not allow security information to change during a remount
operation. Thus this hook simply strips the security module options from
the data and verifies that those are the same options as exist on the
current superblock.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
The security context for the newly created socket shares the same
user, role and MLS attribute as its creator but may have a different
type, which could be specified by a type_transition rule in the relevant
policy package.
Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
[fix call to security_transition_sid to include qstr, Eric Paris]
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Netlink message processing in the kernel is synchronous these days, the
session information can be collected when needed.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 242631c49d.
Conflicts:
security/selinux/hooks.c
SELinux used to recognize certain individual ioctls and check
permissions based on the knowledge of the individual ioctl. In commit
242631c49d the SELinux code stopped trying to understand
individual ioctls and to instead looked at the ioctl access bits to
determine in we should check read or write for that operation. This
same suggestion was made to SMACK (and I believe copied into TOMOYO).
But this suggestion is total rubbish. The ioctl access bits are
actually the access requirements for the structure being passed into the
ioctl, and are completely unrelated to the operation of the ioctl or the
object the ioctl is being performed upon.
Take FS_IOC_FIEMAP as an example. FS_IOC_FIEMAP is defined as:
FS_IOC_FIEMAP _IOWR('f', 11, struct fiemap)
So it has access bits R and W. What this really means is that the
kernel is going to both read and write to the struct fiemap. It has
nothing at all to do with the operations that this ioctl might perform
on the file itself!
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
The IPSKB_FORWARDED and IP6SKB_FORWARDED flags are used only in the
multicast forwarding case to indicate that a packet looped back after
forward. So these flags are not a good indicator for packet forwarding.
A better indicator is the incoming interface. If we have no socket context,
but an incoming interface and we see the packet in the ip postroute hook,
the packet is going to be forwarded.
With this patch we use the incoming interface as an indicator on packet
forwarding.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
selinux_sock_rcv_skb_compat and selinux_ip_postroute_compat are just
called if selinux_policycap_netpeer is not set. However in these
functions we check if selinux_policycap_netpeer is set. This leads
to some dead code and to the fact that selinux_xfrm_postroute_last
is never executed. This patch removes the dead code and the checks
for selinux_policycap_netpeer in the compatibility functions.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
In cred_alloc_blank() since 2.6.32, abort_creds(new) is called with
new->security == NULL and new->magic == 0 when security_cred_alloc_blank()
returns an error. As a result, BUG() will be triggered if SELinux is enabled
or CONFIG_DEBUG_CREDENTIALS=y.
If CONFIG_DEBUG_CREDENTIALS=y, BUG() is called from __invalid_creds() because
cred->magic == 0. Failing that, BUG() is called from selinux_cred_free()
because selinux_cred_free() is not expecting cred->security == NULL. This does
not affect smack_cred_free(), tomoyo_cred_free() or apparmor_cred_free().
Fix these bugs by
(1) Set new->magic before calling security_cred_alloc_blank().
(2) Handle null cred->security in creds_are_invalid() and selinux_cred_free().
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes an old (2007) selinux regression: filesystem labeling for
/proc/sys returned
-r--r--r-- unknown /proc/sys/fs/file-nr
instead of
-r--r--r-- system_u:object_r:sysctl_fs_t:s0 /proc/sys/fs/file-nr
Events that lead to breaking of /proc/sys/ selinux labeling:
1) sysctl was reimplemented to route all calls through /proc/sys/
commit 77b14db502
[PATCH] sysctl: reimplement the sysctl proc support
2) proc_dir_entry was removed from ctl_table:
commit 3fbfa98112
[PATCH] sysctl: remove the proc_dir_entry member for the sysctl tables
3) selinux still walked the proc_dir_entry tree to apply
labeling. Because ctl_tables don't have a proc_dir_entry, we did
not label /proc/sys/ inodes any more. To achieve this the /proc/sys/
inodes were marked private and private inodes were ignored by
selinux.
commit bbaca6c2e7
[PATCH] selinux: enhance selinux to always ignore private inodes
commit 86a71dbd3e
[PATCH] sysctl: hide the sysctl proc inodes from selinux
Access control checks have been done by means of a special sysctl hook
that was called for read/write accesses to any /proc/sys/ entry.
We don't have to do this because, instead of walking the
proc_dir_entry tree we can walk the dentry tree (as done in this
patch). With this patch:
* we don't mark /proc/sys/ inodes as private
* we don't need the sysclt security hook
* we walk the dentry tree to find the path to the inode.
We have to strip the PID in /proc/PID/ entries that have a
proc_dir_entry because selinux does not know how to label paths like
'/1/net/rpc/nfsd.fh' (and defaults to 'proc_t' labeling). Selinux does
know of '/net/rpc/nfsd.fh' (and applies the 'sysctl_rpc_t' label).
PID stripping from the path was done implicitly in the previous code
because the proc_dir_entry tree had the root in '/net' in the example
from above. The dentry tree has the root in '/1'.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Currently SELinux has rules which label new objects according to 3 criteria.
The label of the process creating the object, the label of the parent
directory, and the type of object (reg, dir, char, block, etc.) This patch
adds a 4th criteria, the dentry name, thus we can distinguish between
creating a file in an etc_t directory called shadow and one called motd.
There is no file globbing, regex parsing, or anything mystical. Either the
policy exactly (strcmp) matches the dentry name of the object or it doesn't.
This patch has no changes from today if policy does not implement the new
rules.
Signed-off-by: Eric Paris <eparis@redhat.com>
SELinux would like to implement a new labeling behavior of newly created
inodes. We currently label new inodes based on the parent and the creating
process. This new behavior would also take into account the name of the
new object when deciding the new label. This is not the (supposed) full path,
just the last component of the path.
This is very useful because creating /etc/shadow is different than creating
/etc/passwd but the kernel hooks are unable to differentiate these
operations. We currently require that userspace realize it is doing some
difficult operation like that and than userspace jumps through SELinux hoops
to get things set up correctly. This patch does not implement new
behavior, that is obviously contained in a seperate SELinux patch, but it
does pass the needed name down to the correct LSM hook. If no such name
exists it is fine to pass NULL.
Signed-off-by: Eric Paris <eparis@redhat.com>