android_kernel_google_msm

Commit Graph

Author	SHA1	Message	Date
Eric Biggers	1396cfea05	net: socket: don't set sk_uid to garbage value in ->setattr() ->setattr() was recently implemented for socket files to sync the socket inode's uid to the new 'sk_uid' member of struct sock. It does this by copying over the ia_uid member of struct iattr. However, ia_uid is actually only valid when ATTR_UID is set in ia_valid, indicating that the uid is being changed, e.g. by chown. Other metadata operations such as chmod or utimes leave ia_uid uninitialized. Therefore, sk_uid could be set to a "garbage" value from the stack. Fix this by only copying the uid over when ATTR_UID is set. [backport of net e1a3a60a2ebe991605acb14cd58e39c0545e174e] Bug: 16355602 Change-Id: I20e53848e54282b72a388ce12bfa88da5e3e9efe Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.") Signed-off-by: Eric Biggers <ebiggers@google.com> Tested-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kevin F. Haggerty <haggertk@lineageos.org>	2023-02-18 18:37:20 +01:00
Lorenzo Colitti	d3cd043531	net: core: Add a UID field to struct sock. Protocol sockets (struct sock) don't have UIDs, but most of the time, they map 1:1 to userspace sockets (struct socket) which do. Various operations such as the iptables xt_owner match need access to the "UID of a socket", and do so by following the backpointer to the struct socket. This involves taking sk_callback_lock and doesn't work when there is no socket because userspace has already called close(). Simplify this by adding a sk_uid field to struct sock whose value matches the UID of the corresponding struct socket. The semantics are as follows: 1. Whenever sk_socket is non-null: sk_uid is the same as the UID in sk_socket, i.e., matches the return value of sock_i_uid. Specifically, the UID is set when userspace calls socket(), fchown(), or accept(). 2. When sk_socket is NULL, sk_uid is defined as follows: - For a socket that no longer has a sk_socket because userspace has called close(): the previous UID. - For a cloned socket (e.g., an incoming connection that is established but on which userspace has not yet called accept): the UID of the socket it was cloned from. - For a socket that has never had an sk_socket: UID 0 inside the user namespace corresponding to the network namespace the socket belongs to. Kernel sockets created by sock_create_kern are a special case of #1 and sk_uid is the user that created them. For kernel sockets created at network namespace creation time, such as the per-processor ICMP and TCP sockets, this is the user that created the network namespace. [Backport of net-next 86741ec25462e4c8cdce6df2f41ead05568c7d5e] Bug: 16355602 Change-Id: I73e1a57dfeedf672f4c2dfc9ce6867838b55974b Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kevin F. Haggerty <haggertk@lineageos.org>	2023-02-18 18:37:04 +01:00
Masatake YAMATO	941eda2515	net: Providing protocol type via system.sockprotoname xattr of /proc/PID/fd entries lsof reports some of socket descriptors as "can't identify protocol" like: [yamato@localhost]/tmp% sudo lsof \| grep dbus \| grep iden dbus-daem 652 dbus 6u sock ... 17812 can't identify protocol dbus-daem 652 dbus 34u sock ... 24689 can't identify protocol dbus-daem 652 dbus 42u sock ... 24739 can't identify protocol dbus-daem 652 dbus 48u sock ... 22329 can't identify protocol ... lsof cannot resolve the protocol used in a socket because procfs doesn't provide the map between inode number on sockfs and protocol type of the socket. For improving the situation this patch adds an extended attribute named 'system.sockprotoname' in which the protocol name for /proc/PID/fd/SOCKET is stored. So lsof can know the protocol for a given /proc/PID/fd/SOCKET with getxattr system call. A few weeks ago I submitted a patch for the same purpose. The patch was introduced /proc/net/sockfs which enumerates inodes and protocols of all sockets alive on a system. However, it was rejected because (1) a global lock was needed, and (2) the layout of struct socket was changed with the patch. This patch doesn't use any global lock; and doesn't change the layout of any structs. In this patch, a protocol name is stored to dentry->d_name of sockfs when new socket is associated with a file descriptor. Before this patch dentry->d_name was not used; it was just filled with empty string. lsof may use an extended attribute named 'system.sockprotoname' to retrieve the value of dentry->d_name. It is nice if we can see the protocol name with ls -l /proc/PID/fd. However, "socket:[#INODE]", the name format returned from sockfs_dname() was already defined. To keep the compatibility between kernel and user land, the extended attribute is used to prepare the value of dentry->d_name. Change-Id: I04143ee6da5c236835a897086fc0de819abb0cdc Signed-off-by: Masatake YAMATO <yamato@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kevin F. Haggerty <haggertk@lineageos.org>	2023-02-18 18:37:00 +01:00
Daniel Borkmann	e9ff904465	BACKPORT: net: sock: make sock_tx_timestamp void Currently, sock_tx_timestamp() always returns 0. The comment that describes the sock_tx_timestamp() function wrongly says that it returns an error when an invalid argument is passed (from commit `20d4947353`, ``net: socket infrastructure for SO_TIMESTAMPING''). Make the function void, so that we can also remove all the unneeded if conditions that check for such a _non-existant_ error case in the output path. Change-Id: Ibdfd5071737190371d4abec5ae76046b5aa8de23 Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kevin F. Haggerty <haggertk@lineageos.org>	2023-02-18 18:32:19 +01:00
Artem Borisov	d7992e6feb	Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 All bluetooth-related changes were omitted because of our ancient incompatible bt stack. Change-Id: I96440b7be9342a9c1adc9476066272b827776e64	2017-12-27 17:13:15 +03:00
Arnaldo Carvalho de Melo	a84ab4f34b	net: Fix use after free in the recvmmsg exit path The syzkaller fuzzer hit the following use-after-free: Call Trace: [<ffffffff8175ea0e>] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:295 [<ffffffff851cc31a>] __sys_recvmmsg+0x6fa/0x7f0 net/socket.c:2261 [< inline >] SYSC_recvmmsg net/socket.c:2281 [<ffffffff851cc57f>] SyS_recvmmsg+0x16f/0x180 net/socket.c:2270 [<ffffffff86332bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185 And, as Dmitry rightly assessed, that is because we can drop the reference and then touch it when the underlying recvmsg calls return some packets and then hit an error, which will make recvmmsg to set sock->sk->sk_err, oops, fix it. Reported-and-Tested-by: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Sasha Levin <sasha.levin@oracle.com> Fixes: `a2e2725541` ("net: Introduce recvmmsg socket syscall") http://lkml.kernel.org/r/20160122211644.GC2470@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Change-Id: Ie3b6ee89ad3e8cd3a0fe8f50f74aaa4834d0b4ca	2016-10-31 23:25:23 +11:00
Arnaldo Carvalho de Melo	887cbce4a7	net: Fix use after free in the recvmmsg exit path commit 34b88a68f26a75e4fded796f1a49c40f82234b7d upstream. The syzkaller fuzzer hit the following use-after-free: Call Trace: [<ffffffff8175ea0e>] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:295 [<ffffffff851cc31a>] __sys_recvmmsg+0x6fa/0x7f0 net/socket.c:2261 [< inline >] SYSC_recvmmsg net/socket.c:2281 [<ffffffff851cc57f>] SyS_recvmmsg+0x16f/0x180 net/socket.c:2270 [<ffffffff86332bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185 And, as Dmitry rightly assessed, that is because we can drop the reference and then touch it when the underlying recvmsg calls return some packets and then hit an error, which will make recvmmsg to set sock->sk->sk_err, oops, fix it. Reported-and-Tested-by: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Sasha Levin <sasha.levin@oracle.com> Fixes: `a2e2725541` ("net: Introduce recvmmsg socket syscall") http://lkml.kernel.org/r/20160122211644.GC2470@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Zefan Li <lizefan@huawei.com>	2016-10-26 23:15:47 +08:00
Al Viro	b1d288e247	net: validate the range we feed to iov_iter_init() in sys_sendto/sys_recvfrom Bug: 28759139 Change-Id: I561a14b514d714838ef539a94275b117d7f475f4 Cc: stable@vger.kernel.org # v3.19 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-15 06:22:38 +00:00
Junling Zheng	dbccb188fd	net: socket: Fix the wrong returns for recvmsg and sendmsg Based on 08adb7dabd4874cc5666b4490653b26534702ce0 upstream. We found that after v3.10.73, recvmsg might return -EFAULT while -EINVAL was expected. We tested it through the recvmsg01 testcase come from LTP testsuit. It set msg->msg_namelen to -1 and the recvmsg syscall returned errno 14, which is unexpected (errno 22 is expected): recvmsg01 4 TFAIL : invalid socket length ; returned -1 (expected -1), errno 14 (expected 22) Linux mainline has no this bug for commit 08adb7dab fixes it accidentally. However, it is too large and complex to be backported to LTS 3.10. Commit `281c9c36` (net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour) made get_compat_msghdr() return error if msg_sys->msg_namelen was negative, which changed the behaviors of recvmsg and sendmsg syscall in a lib32 system: Before commit `281c9c36`, get_compat_msghdr() wouldn't fail and it would return -EINVAL in move_addr_to_user() or somewhere if msg_sys->msg_namelen was invalid and then syscall returned -EINVAL, which is correct. And now, when msg_sys->msg_namelen is negative, get_compat_msghdr() will fail and wants to return -EINVAL, however, the outer syscall will return -EFAULT directly, which is unexpected. This patch gets the return value of get_compat_msghdr() as well as copy_msghdr_from_user(), then returns this expected value if get_compat_msghdr() fails. Fixes: `281c9c36` (net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour) Signed-off-by: Junling Zheng <zhengjunling@huawei.com> Signed-off-by: Hanbing Xu <xuhanbing@huawei.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: David Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Zefan Li <lizefan@huawei.com>	2015-09-18 09:20:46 +08:00
Ani Sinha	adc507681d	net:socket: set msg_namelen to 0 if msg_name is passed as NULL in msghdr struct from userland. commit `6a2a2b3ae0` upstream. Linux manpage for recvmsg and sendmsg calls does not explicitly mention setting msg_namelen to 0 when msg_name passed set as NULL. When developers don't set msg_namelen member in msghdr, it might contain garbage value which will fail the validation check and sendmsg and recvmsg calls from kernel will return EINVAL. This will break old binaries and any code for which there is no access to source code. To fix this, we set msg_namelen to 0 when msg_name is passed as NULL from userland. Signed-off-by: Ani Sinha <ani@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: Zefan Li <lizefan@huawei.com>	2015-04-14 17:34:03 +08:00
Matthew Leach	d82c152b56	net: socket: error on a negative msg_namelen [ Upstream commit `dbb490b965` ] When copying in a struct msghdr from the user, if the user has set the msg_namelen parameter to a negative value it gets clamped to a valid size due to a comparison between signed and unsigned values. Ensure the syscall errors when the user passes in a negative value. Signed-off-by: Matthew Leach <matthew.leach@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-04-26 17:13:17 -07:00
Dan Carpenter	eb7eb184ac	net: clamp ->msg_namelen instead of returning an error [ Upstream commit `db31c55a6f` ] If kmsg->msg_namelen > sizeof(struct sockaddr_storage) then in the original code that would lead to memory corruption in the kernel if you had audit configured. If you didn't have audit configured it was harmless. There are some programs such as beta versions of Ruby which use too large of a buffer and returning an error code breaks them. We should clamp the ->msg_namelen value instead. Fixes: `1661bf364a` ("net: heap overflow in __audit_sockaddr()") Reported-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Tested-by: Eric Wong <normalperson@yhbt.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-12-08 07:29:42 -08:00
Hannes Frederic Sowa	ea9d7dc958	net: add BUG_ON if kernel advertises msg_namelen > sizeof(struct sockaddr_storage) [ Upstream commit `68c6beb373` ] In that case it is probable that kernel code overwrote part of the stack. So we should bail out loudly here. The BUG_ON may be removed in future if we are sure all protocols are conformant. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-12-08 07:29:41 -08:00
Hannes Frederic Sowa	18719a4c7a	net: rework recvmsg handler msg_name and msg_namelen logic [ Upstream commit `f3d3342602` ] This patch now always passes msg->msg_namelen as 0. recvmsg handlers must set msg_namelen to the proper size <= sizeof(struct sockaddr_storage) to return msg_name to the user. This prevents numerous uninitialized memory leaks we had in the recvmsg handlers and makes it harder for new code to accidentally leak uninitialized memory. Optimize for the case recvfrom is called with NULL as address. We don't need to copy the address at all, so set it to NULL before invoking the recvmsg handler. We can do so, because all the recvmsg handlers must cope with the case a plain read() is called on them. read() also sets msg_name to NULL. Also document these changes in include/linux/net.h as suggested by David Miller. Changes since RFC: Set msg->msg_name = NULL if user specified a NULL in msg_name but had a non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't affect sendto as it would bail out earlier while trying to copy-in the address. It also more naturally reflects the logic by the callers of verify_iovec. With this change in place I could remove " if (!uaddr \|\| msg_sys->msg_namelen == 0) msg->msg_name = NULL ". This change does not alter the user visible error logic as we ignore msg_namelen as long as msg_name is NULL. Also remove two unnecessary curly brackets in ___sys_recvmsg and change comments to netdev style. Cc: David Miller <davem@davemloft.net> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-12-08 07:29:41 -08:00
Dan Carpenter	5684fac30a	net: heap overflow in __audit_sockaddr() [ Upstream commit `1661bf364a` ] We need to cap ->msg_namelen or it leads to a buffer overflow when we to the memcpy() in __audit_sockaddr(). It requires CAP_AUDIT_CONTROL to exploit this bug. The call tree is: ___sys_recvmsg() move_addr_to_user() audit_sockaddr() __audit_sockaddr() Reported-by: Jüri Aedla <juri.aedla@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-11-04 04:23:40 -08:00
Andy Lutomirski	c0cc94f2f6	net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg [ Upstream commits `1be374a051` and `a7526eb5d0` ] MSG_CMSG_COMPAT is (AFAIK) not intended to be part of the API -- it's a hack that steals a bit to indicate to other networking code that a compat entry was used. So don't allow it from a non-compat syscall. This prevents an oops when running this code: int main() { int s; struct sockaddr_in addr; struct msghdr hdr; char highpage = mmap((void)(TASK_SIZE_MAX - 4096), 4096, PROT_READ \| PROT_WRITE, MAP_PRIVATE \| MAP_ANONYMOUS \| MAP_FIXED, -1, 0); if (highpage == MAP_FAILED) err(1, "mmap"); s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP); if (s == -1) err(1, "socket"); addr.sin_family = AF_INET; addr.sin_port = htons(1); addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); if (connect(s, (struct sockaddr)&addr, sizeof(addr)) != 0) err(1, "connect"); void *evil = highpage + 4096 - COMPAT_MSGHDR_SIZE; printf("Evil address is %p\n", evil); if (syscall(__NR_sendmmsg, s, evil, 1, MSG_CMSG_COMPAT) < 0) err(1, "sendmmsg"); return 0; } Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-06-27 11:27:32 -07:00
Mathias Krause	d09b3b2b11	net: fix info leak in compat dev_ifconf() [ Upstream commit `43da5f2e0d` ] The implementation of dev_ifconf() for the compat ioctl interface uses an intermediate ifc structure allocated in userland for the duration of the syscall. Though, it fails to initialize the padding bytes inserted for alignment and that for leaks four bytes of kernel stack. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-10-02 10:29:37 -07:00
Mikulas Patocka	43da476d7f	Fix order of arguments to compat_put_time[spec\|val] commit `ed6fe9d614` upstream. Commit `644595f896` ("compat: Handle COMPAT_USE_64BIT_TIME in net/socket.c") introduced a bug where the helper functions to take either a 64-bit or compat time[spec\|val] got the arguments in the wrong order, passing the kernel stack pointer off as a user pointer (and vice versa). Because of the user address range check, that in turn then causes an EFAULT due to the user pointer range checking failing for the kernel address. Incorrectly resuling in a failed system call for 32-bit processes with a 64-bit kernel. On odder architectures like HP-PA (with separate user/kernel address spaces), it can be used read kernel memory. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-09-14 10:00:23 -07:00
Mikulas Patocka	20855fe209	tun: fix a crash bug and a memory leak commit `b09e786bd1` upstream. This patch fixes a crash tun_chr_close -> netdev_run_todo -> tun_free_netdev -> sk_release_kernel -> sock_release -> iput(SOCK_INODE(sock)) introduced by commit `1ab5ecb90c` The problem is that this socket is embedded in struct tun_struct, it has no inode, iput is called on invalid inode, which modifies invalid memory and optionally causes a crash. sock_release also decrements sockets_in_use, this causes a bug that "sockets: used" field in /proc/*/net/sockstat keeps on decreasing when creating and closing tun devices. This patch introduces a flag SOCK_EXTERNALLY_ALLOCATED that instructs sock_release to not free the inode and not decrement sockets_in_use, fixing both memory corruption and sockets_in_use underflow. It should be backported to 3.3 an 3.4 stabke. Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-08-09 08:31:30 -07:00
Eric Dumazet	35f9c09fe9	tcp: tcp_sendpages() should call tcp_push() once commit `2f53384424` (tcp: allow splice() to build full TSO packets) added a regression for splice() calls using SPLICE_F_MORE. We need to call tcp_flush() at the end of the last page processed in tcp_sendpages(), or else transmits can be deferred and future sends stall. Add a new internal flag, MSG_SENDPAGE_NOTLAST, acting like MSG_MORE, but with different semantic. For all sendpage() providers, its a transparent change. Only sock_sendpage() and tcp_sendpages() can differentiate the two different flags provided by pipe_to_sendpage() Reported-by: Tom Herbert <therbert@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: H.K. Jerry Chu <hkchu@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: Eric Dumazet <eric.dumazet@gmail>com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-05 19:04:27 -04:00
Linus Torvalds	a591afc01d	Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x32 support for x86-64 from Ingo Molnar: "This tree introduces the X32 binary format and execution mode for x86: 32-bit data space binaries using 64-bit instructions and 64-bit kernel syscalls. This allows applications whose working set fits into a 32 bits address space to make use of 64-bit instructions while using a 32-bit address space with shorter pointers, more compressed data structures, etc." Fix up trivial context conflicts in arch/x86/{Kconfig,vdso/vma.c} * 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits) x32: Fix alignment fail in struct compat_siginfo x32: Fix stupid ia32/x32 inversion in the siginfo format x32: Add ptrace for x32 x32: Switch to a 64-bit clock_t x32: Provide separate is_ia32_task() and is_x32_task() predicates x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls x86/x32: Fix the binutils auto-detect x32: Warn and disable rather than error if binutils too old x32: Only clear TIF_X32 flag once x32: Make sure TS_COMPAT is cleared for x32 tasks fs: Remove missed ->fds_bits from cessation use of fd_set structs internally fs: Fix close_on_exec pointer in alloc_fdtable x32: Drop non-__vdso weak symbols from the x32 VDSO x32: Fix coding style violations in the x32 VDSO code x32: Add x32 VDSO support x32: Allow x32 to be configured x32: If configured, add x32 system calls to system call tables x32: Handle process creation x32: Signal-related system calls x86: Add #ifdef CONFIG_COMPAT to <asm/sys_ia32.h> ...	2012-03-29 18:12:23 -07:00
Maciej Żenczykowski	43db362d3a	net: get rid of some pointless casts to sockaddr The following 4 functions: move_addr_to_kernel move_addr_to_user verify_iovec verify_compat_iovec are always effectively called with a sockaddr_storage. Make this explicit by changing their signature. This removes a large number of casts from sockaddr_storage to sockaddr. Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-11 19:11:22 -07:00
H. Peter Anvin	644595f896	compat: Handle COMPAT_USE_64BIT_TIME in net/socket.c Use helper functions aware of COMPAT_USE_64BIT_TIME to write struct timeval and struct timespec to userspace in net/socket.c. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-02-20 12:48:48 -08:00
Eric Dumazet	cf778b00e9	net: reintroduce missing rcu_assign_pointer() calls commit `a9b3cd7f32` (rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER) did a lot of incorrect changes, since it did a complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x, y). We miss needed barriers, even on x86, when y is not NULL. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-12 12:26:56 -08:00
Linus Torvalds	9753dfe19a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1958 commits) net: pack skb_shared_info more efficiently net_sched: red: split red_parms into parms and vars net_sched: sfq: extend limits cnic: Improve error recovery on bnx2x devices cnic: Re-init dev->stats_addr after chip reset net_sched: Bug in netem reordering bna: fix sparse warnings/errors bna: make ethtool_ops and strings const xgmac: cleanups net: make ethtool_ops const vmxnet3" make ethtool ops const xen-netback: make ops structs const virtio_net: Pass gfp flags when allocating rx buffers. ixgbe: FCoE: Add support for ndo_get_fcoe_hbainfo() call netdev: FCoE: Add new ndo_get_fcoe_hbainfo() call igb: reset PHY after recovering from PHY power down igb: add basic runtime PM support igb: Add support for byte queue limits. e1000: cleanup CE4100 MDIO registers access e1000: unmap ce4100_gbe_mdio_base_virt in e1000_remove ...	2012-01-06 17:22:09 -08:00
Linus Torvalds	07d106d0a3	vfs: fix up ENOIOCTLCMD error handling We're doing some odd things there, which already messes up various users (see the net/socket.c code that this removes), and it was going to add yet more crud to the block layer because of the incorrect error code translation. ENOIOCTLCMD is not an error return that should be returned to user mode from the "ioctl()" system call, but it should not be translated as EINVAL ("Invalid argument"). It should be translated as ENOTTY ("Inappropriate ioctl for device"). That EINVAL confusion has apparently so permeated some code that the block layer actually checks for it, which is sad. We continue to do so for now, but add a big comment about how wrong that is, and we should remove it entirely eventually. In the meantime, this tries to keep the changes localized to just the EINVAL -> ENOTTY fix, and removing code that makes it harder to do the right thing. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-01-05 15:40:12 -08:00
Ben Hutchings	55664f324c	ethtool: Allow drivers to select RX NFC rule locations Define special location values for RX NFC that request the driver to select the actual rule location. This allows for implementation on devices that use hash-based filter lookup, whereas currently the API is more suited to devices with TCAM lookup or linear search. In ethtool_set_rxnfc() and the compat wrapper ethtool_ioctl(), copy the structure back to user-space after insertion so that the actual location is returned. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-04 14:09:10 -05:00
Neil Horman	5bc1421e34	net: add network priority cgroup infrastructure (v4) This patch adds in the infrastructure code to create the network priority cgroup. The cgroup, in addition to the standard processes file creates two control files: 1) prioidx - This is a read-only file that exports the index of this cgroup. This is a value that is both arbitrary and unique to a cgroup in this subsystem, and is used to index the per-device priority map 2) priomap - This is a writeable file. On read it reports a table of 2-tuples <name:priority> where name is the name of a network interface and priority is indicates the priority assigned to frames egresessing on the named interface and originating from a pid in this cgroup This cgroup allows for skb priority to be set prior to a root qdisc getting selected. This is benenficial for DCB enabled systems, in that it allows for any application to use dcb configured priorities so without application modification Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> CC: Robert Love <robert.w.love@intel.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-22 15:22:23 -05:00
Johannes Berg	6e3e939f3b	net: add wireless TX status socket option The 802.1X EAPOL handshake hostapd does requires knowing whether the frame was ack'ed by the peer. Currently, we fudge this pretty badly by not even transmitting the frame as a normal data frame but injecting it with radiotap and getting the status out of radiotap monitor as well. This is rather complex, confuses users (mon.wlan0 presence) and doesn't work with all hardware. To get rid of that hack, introduce a real wifi TX status option for data frame transmissions. This works similar to the existing TX timestamping in that it reflects the SKB back to the socket's error queue with a SCM_WIFI_STATUS cmsg that has an int indicating ACK status (0/1). Since it is possible that at some point we will want to have TX timestamping and wifi status in a single errqueue SKB (there's little point in not doing that), redefine SO_EE_ORIGIN_TIMESTAMPING to SO_EE_ORIGIN_TXSTATUS which can collect more than just the timestamp; keep the old constant as an alias of course. Currently the internal APIs don't make that possible, but it wouldn't be hard to split them up in a way that makes it possible. Thanks to Neil Horman for helping me figure out the functions that add the control messages. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-11-09 16:01:02 -05:00
David S. Miller	8decf86879	Merge branch 'master' of github.com:davem330/net Conflicts: MAINTAINERS drivers/net/Kconfig drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c drivers/net/ethernet/broadcom/tg3.c drivers/net/wireless/iwlwifi/iwl-pci.c drivers/net/wireless/iwlwifi/iwl-trans-tx-pcie.c drivers/net/wireless/rt2x00/rt2800usb.c drivers/net/wireless/wl12xx/main.c	2011-09-22 03:23:13 -04:00
Mathieu Desnoyers	bc909d9ddb	sendmmsg/sendmsg: fix unsafe user pointer access Dereferencing a user pointer directly from kernel-space without going through the copy_from_user family of functions is a bad idea. Two of such usages can be found in the sendmsg code path called from sendmmsg, added by commit `c71d8ebe7a` upstream. commit 5b47b8038f183b44d2d8ff1c7d11a5c1be706b34 in the 3.0-stable tree. Usages are performed through memcmp() and memcpy() directly. Fix those by using the already copied msg_sys structure instead of the __user *msg structure. Note that msg_sys can be set to NULL by verify_compat_iovec() or verify_iovec(), which requires additional NULL pointer checks. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: David Goulet <dgoulet@ev0ke.net> CC: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> CC: Anton Blanchard <anton@samba.org> CC: David S. Miller <davem@davemloft.net> CC: stable <stable@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-24 19:45:03 -07:00
David S. Miller	19fd61785a	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net	2011-08-07 23:20:26 -07:00
Tetsuo Handa	c71d8ebe7a	net: Fix security_socket_sendmsg() bypass problem. The sendmmsg() introduced by commit `228e548e` "net: Add sendmmsg socket system call" is capable of sending to multiple different destination addresses. SMACK is using destination's address for checking sendmsg() permission. However, security_socket_sendmsg() is called for only once even if multiple different destination addresses are passed to sendmmsg(). Therefore, we need to call security_socket_sendmsg() for each destination address rather than only the first destination address. Since calling security_socket_sendmsg() every time when only single destination address was passed to sendmmsg() is a waste of time, omit calling security_socket_sendmsg() unless destination address of previous datagram and that of current datagram differs. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Anton Blanchard <anton@samba.org> Cc: stable <stable@kernel.org> [3.0+] Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-05 03:31:03 -07:00
Anton Blanchard	98382f419f	net: Cap number of elements for sendmmsg To limit the amount of time we can spend in sendmmsg, cap the number of elements to UIO_MAXIOV (currently 1024). For error handling an application using sendmmsg needs to retry at the first unsent message, so capping is simpler and requires less application logic than returning EINVAL. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable <stable@kernel.org> [3.0+] Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-05 03:31:03 -07:00
Anton Blanchard	728ffb86f1	net: sendmmsg should only return an error if no messages were sent sendmmsg uses a similar error return strategy as recvmmsg but it turns out to be a confusing way to communicate errors. The current code stores the error code away and returns it on the next sendmmsg call. This means a call with completely valid arguments could get an error from a previous call. Change things so we only return an error if no datagrams could be sent. If less than the requested number of messages were sent, the application must retry starting at the first failed one and if the problem is persistent the error will be returned. This matches the behaviour of other syscalls like read/write - it is not an error if less than the requested number of elements are sent. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable <stable@kernel.org> [3.0+] Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-05 03:31:02 -07:00
Stephen Hemminger	a9b3cd7f32	rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER When assigning a NULL value to an RCU protected pointer, no barrier is needed. The rcu_assign_pointer, used to handle that but will soon change to not handle the special case. Convert all rcu_assign_pointer of NULL value. //smpl @@ expression P; @@ - rcu_assign_pointer(P, NULL) + RCU_INIT_POINTER(P, NULL) // </smpl> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-02 04:29:23 -07:00
Linus Torvalds	d5eab9152a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (32 commits) tg3: Remove 5719 jumbo frames and TSO blocks tg3: Break larger frags into 4k chunks for 5719 tg3: Add tx BD budgeting code tg3: Consolidate code that calls tg3_tx_set_bd() tg3: Add partial fragment unmapping code tg3: Generalize tg3_skb_error_unmap() tg3: Remove short DMA check for 1st fragment tg3: Simplify tx bd assignments tg3: Reintroduce tg3_tx_ring_info ASIX: Use only 11 bits of header for data size ASIX: Simplify condition in rx_fixup() Fix cdc-phonet build bonding: reduce noise during init bonding: fix string comparison errors net: Audit drivers to identify those needing IFF_TX_SKB_SHARING cleared net: add IFF_SKB_TX_SHARED flag to priv_flags net: sock_sendmsg_nosec() is static forcedeth: fix vlans gianfar: fix bug caused by `87c288c6e9` gro: Only reset frag0 when skb can be pulled ...	2011-07-28 05:58:19 -07:00
Eric Dumazet	894dc24ce7	net: sock_sendmsg_nosec() is static Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Anton Blanchard <anton@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-27 22:39:30 -07:00
Eric Dumazet	a209dfc7b0	vfs: dont chain pipe/anon/socket on superblock s_inodes list Workloads using pipes and sockets hit inode_sb_list_lock contention. superblock s_inodes list is needed for quota, dirty, pagecache and fsnotify management. pipe/anon/socket fs are clearly not candidates for these. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-07-26 12:57:09 -04:00
Linus Torvalds	06f4e926d2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1446 commits) macvlan: fix panic if lowerdev in a bond tg3: Add braces around 5906 workaround. tg3: Fix NETIF_F_LOOPBACK error macvlan: remove one synchronize_rcu() call networking: NET_CLS_ROUTE4 depends on INET irda: Fix error propagation in ircomm_lmp_connect_response() irda: Kill set but unused variable 'bytes' in irlan_check_command_param() irda: Kill set but unused variable 'clen' in ircomm_connect_indication() rxrpc: Fix set but unused variable 'usage' in rxrpc_get_transport() be2net: Kill set but unused variable 'req' in lancer_fw_download() irda: Kill set but unused vars 'saddr' and 'daddr' in irlan_provider_connect_indication() atl1c: atl1c_resume() is only used when CONFIG_PM_SLEEP is defined. rxrpc: Fix set but unused variable 'usage' in rxrpc_get_peer(). rxrpc: Kill set but unused variable 'local' in rxrpc_UDP_error_handler() rxrpc: Kill set but unused variable 'sp' in rxrpc_process_connection() rxrpc: Kill set but unused variable 'sp' in rxrpc_rotate_tx_window() pkt_sched: Kill set but unused variable 'protocol' in tc_classify() isdn: capi: Use pr_debug() instead of ifdefs. tg3: Update version to 3.119 tg3: Apply rx_discards fix to 5719/5720 ... Fix up trivial conflicts in arch/x86/Kconfig and net/mac80211/agg-tx.c as per Davem.	2011-05-20 13:43:21 -07:00
David S. Miller	9cbc94eabb	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/vmxnet3/vmxnet3_ethtool.c net/core/dev.c	2011-05-17 17:33:11 -04:00
Anton Blanchard	b9eb8b8752	net: recvmmsg: Strip MSG_WAITFORONE when calling recvmsg recvmmsg fails on a raw socket with EINVAL. The reason for this is packet_recvmsg checks the incoming flags: err = -EINVAL; if (flags & ~(MSG_PEEK\|MSG_DONTWAIT\|MSG_TRUNC\|MSG_CMSG_COMPAT\|MSG_ERRQUEUE)) goto out; This patch strips out MSG_WAITFORONE when calling recvmmsg which fixes the issue. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@kernel.org [2.6.34+] Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-17 15:38:57 -04:00
Lai Jiangshan	6184522024	net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu() The rcu callback wq_free_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(wq_free_rcu). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-05-07 22:51:10 -07:00
Anton Blanchard	228e548e60	net: Add sendmmsg socket system call This patch adds a multiple message send syscall and is the send version of the existing recvmmsg syscall. This is heavily based on the patch by Arnaldo that added recvmmsg. I wrote a microbenchmark to test the performance gains of using this new syscall: http://ozlabs.org/~anton/junkcode/sendmmsg_test.c The test was run on a ppc64 box with a 10 Gbit network card. The benchmark can send both UDP and RAW ethernet packets. 64B UDP batch pkts/sec 1 804570 2 872800 (+ 8 %) 4 916556 (+14 %) 8 939712 (+17 %) 16 952688 (+18 %) 32 956448 (+19 %) 64 964800 (+20 %) 64B raw socket batch pkts/sec 1 1201449 2 1350028 (+12 %) 4 1461416 (+22 %) 8 1513080 (+26 %) 16 1541216 (+28 %) 32 1553440 (+29 %) 64 1557888 (+30 %) We see a 20% improvement in throughput on UDP send and 30% on raw socket send. [ Add sparc syscall entries. -DaveM ] Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-05 11:10:14 -07:00
David S. Miller	1c01a80cfe	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/smsc911x.c	2011-04-11 13:44:25 -07:00
Alexander Duyck	127fe533ae	v3 ethtool: add ntuple flow specifier data to network flow classifier This change is meant to add an ntuple data extensions to the rx network flow classification specifiers. The idea is to allow ntuple to be displayed via the network flow classification interface. The first patch had some left over stuff from the original flow extension flags I had added. That bit is removed in this patch. The second had some left over comments that stated we ignored bits in the masks when we actually match them. This work is based on input from Ben Hutchings. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-11 13:20:49 -07:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Ben Hutchings	3a7da39d16	ethtool: Compat handling for struct ethtool_rxnfc This structure was accidentally defined such that its layout can differ between 32-bit and 64-bit processes. Add compat structure definitions and an ioctl wrapper function. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Cc: stable@kernel.org [2.6.30+] Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-18 15:13:11 -07:00
stephen hemminger	c3f52ae6a3	socket: suppress sparse warnings Use __force to quiet sparse warnings for cases where the code is simulating user space pointers. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-23 14:11:30 -08:00
Eric Dumazet	eaefd1105b	net: add __rcu annotations to sk_wq and wq Add proper RCU annotations/verbs to sk_wq and wq members Fix __sctp_write_space() sk_sleep() abuse (and sock->wq access) Fix sunrpc sk_sleep() abuse too Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-22 10:19:31 -08:00

1 2 3 4 5

232 Commits