android_kernel_google_msm

mirror of https://github.com/followmsi/android_kernel_google_msm.git synced 2024-11-06 23:17:41 +00:00

Author	SHA1	Message	Date
Pablo Neira Ayuso	7b8d805651	netfilter updates for net-next (batch 3) On Tue, Jun 19, 2012 at 05:16:25AM +0200, pablo@netfilter.org wrote: [...] > You can pull these changes from: > > git://1984.lsi.us.es/nf-next master Please, also take the small patch attached after this 4 patch series. It fixes one linking issue. Sorry, I'll put more care next time testing compilation options more extensively. >From af6b248c22759fb7448668bbe495f1cbe0a9109d Mon Sep 17 00:00:00 2001 From: Pablo Neira Ayuso <pablo@netfilter.org> Date: Tue, 19 Jun 2012 05:25:46 +0200 Subject: [PATCH] netfilter: fix missing symbols if CONFIG_NETFILTER_NETLINK_QUEUE_CT unset ERROR: "nfqnl_ct_parse" [net/netfilter/nfnetlink_queue.ko] undefined! ERROR: "nfqnl_ct_seq_adjust" [net/netfilter/nfnetlink_queue.ko] undefined! ERROR: "nfqnl_ct_put" [net/netfilter/nfnetlink_queue.ko] undefined! ERROR: "nfqnl_ct_get" [net/netfilter/nfnetlink_queue.ko] undefined! We have to use CONFIG_NETFILTER_NETLINK_QUEUE_CT in include/net/netfilter/nfnetlink_queue.h, not CONFIG_NF_CONNTRACK. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-01-03 13:59:01 +01:00
Florian Westphal	6b989d3879	netfilter: add connlabel conntrack extension similar to connmarks, except labels are bit-based; i.e. all labels may be attached to a flow at the same time. Up to 128 labels are supported. Supporting more labels is possible, but requires increasing the ct offset delta from u8 to u16 type due to increased extension sizes. Mapping of bit-identifier to label name is done in userspace. The extension is enabled at run-time once "-m connlabel" netfilter rules are added. Change-Id: I98cfb16533e7e4bde1d73f2aa30e1be425f0b67e Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-12-07 22:04:24 +04:00
Pablo Neira Ayuso	e224886882	netfilter: xt_CT: recover NOTRACK target support Florian Westphal reported that the removal of the NOTRACK target (`9655050` netfilter: remove xt_NOTRACK) is breaking some existing setups. That removal was scheduled for removal since long time ago as described in Documentation/feature-removal-schedule.txt What: xt_NOTRACK Files: net/netfilter/xt_NOTRACK.c When: April 2011 Why: Superseded by xt_CT Still, people may have not notice / may have decided to stick to an old iptables version. I agree with him in that some more conservative approach by spotting some printk to warn users for some time is less agressive. Current iptables 1.4.16.3 already contains the aliasing support that makes it point to the CT target, so upgrading would fix it. Still, the policy so far has been to avoid pushing our users to upgrade. As a solution, this patch recovers the NOTRACK target inside the CT target and it now spots a warning. Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: I93292ad4f6e377df739e0366e36ab72786c3750d	2018-12-07 22:04:24 +04:00
Jozsef Kadlecsik	2e46149f8c	netfilter: nf_nat: Handle routing changes in MASQUERADE target When the route changes (backup default route, VPNs) which affect a masqueraded target, the packets were sent out with the outdated source address. The patch addresses the issue by comparing the outgoing interface directly with the masqueraded interface in the nat table. Events are inefficient in this case, because it'd require adding route events to the network core and then scanning the whole conntrack table and re-checking the route for all entry. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: I148103e8a4b257d004b8f0f03455243a28ff0eec	2018-12-07 22:02:09 +04:00
Patrick McHardy	dd6490947e	netfilter: ip6tables: add MASQUERADE target Signed-off-by: Patrick McHardy <kaber@trash.net> Change-Id: I2296776746567afe4f8067628308c88ba2233e5c	2018-12-07 22:02:09 +04:00
Patrick McHardy	104f2e20f9	netfilter: ipv6: add IPv6 NAT support Signed-off-by: Patrick McHardy <kaber@trash.net> Change-Id: I843d848e57d95e8fc932d95ca939a8c94fc8cfd8	2018-12-07 22:02:09 +04:00
Patrick McHardy	b06bea8f8f	net: core: add function for incremental IPv6 pseudo header checksum updates Add inet_proto_csum_replace16 for incrementally updating IPv6 pseudo header checksums for IPv6 NAT. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: David S. Miller <davem@davemloft.net> Change-Id: I858e211dfe3e1f4dc9daa42db3bd6fc535887324	2018-12-07 22:02:09 +04:00
Patrick McHardy	d1e8c3336e	ipv4: fix path MTU discovery with connection tracking IPv4 conntrack defragments incoming packet at the PRE_ROUTING hook and (in case of forwarded packets) refragments them at POST_ROUTING independent of the IP_DF flag. Refragmentation uses the dst_mtu() of the local route without caring about the original fragment sizes, thereby breaking PMTUD. This patch fixes this by keeping track of the largest received fragment with IP_DF set and generates an ICMP fragmentation required error during refragmentation if that size exceeds the MTU. Change-Id: Ibac77b728baba05841286ea5a8a2089d56e6ad65 Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: David S. Miller <davem@davemloft.net>	2018-12-07 22:02:09 +04:00
Patrick McHardy	7ce958755b	netfilter: add protocol independent NAT core Convert the IPv4 NAT implementation to a protocol independent core and address family specific modules. Change-Id: I926b42af53b37c96fb654021e7f568450e8c63c0 Signed-off-by: Patrick McHardy <kaber@trash.net>	2018-12-07 22:02:09 +04:00
Patrick McHardy	d7349a49b5	netfilter: nf_nat: add protoff argument to packet mangling functions For mangling IPv6 packets the protocol header offset needs to be known by the NAT packet mangling functions. Add a so far unused protoff argument and convert the conntrack and NAT helpers to use it in preparation of IPv6 NAT. Signed-off-by: Patrick McHardy <kaber@trash.net> Change-Id: I34f8597c072b0e9d99d19ae958c1788e3d8d51ee	2018-12-07 22:02:09 +04:00
Pablo Neira Ayuso	a506f5d363	netfilter: nfnetlink_queue: fix compilation with NF_CONNTRACK disabled In "9cb0176 netfilter: add glue code to integrate nfnetlink_queue and ctnetlink" the compilation with NF_CONNTRACK disabled is broken. This patch fixes this issue. I have moved the conntrack part into nfnetlink_queue_ct.c to avoid peppering the entire nfnetlink_queue.c code with ifdefs. I also needed to rename nfnetlink_queue.c to nfnetlink_queue_pkt.c to update the net/netfilter/Makefile to support conditional compilation of the conntrack integration. This patch also adds CONFIG_NETFILTER_QUEUE_CT in case you want to explicitly disable the integration between nf_conntrack and nfnetlink_queue. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: I211402286b8e154f13af067cfbf470ab6a635282	2018-12-07 22:02:09 +04:00
Pablo Neira Ayuso	f3f602eba6	netfilter: nf_ct_ext: support variable length extensions We can now define conntrack extensions of variable size. This patch is useful to get rid of these unions: union nf_conntrack_help union nf_conntrack_proto union nf_conntrack_nat_help Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: Icc8a18ec32f495a1e18afebc748bfdcff6b9c658	2018-12-07 22:02:09 +04:00
Pablo Neira Ayuso	d9bd97f7e3	netfilter: ctnetlink: add CTA_HELP_INFO attribute This attribute can be used to modify and to dump the internal protocol information. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: Ifa704afd118b8f9b860ebf8ffba843ea804f815d	2018-12-07 22:02:09 +04:00
Pablo Neira Ayuso	678d79d6bc	netfilter: nf_ct_helper: implement variable length helper private data This patch uses the new variable length conntrack extensions. Instead of using union nf_conntrack_help that contain all the helper private data information, we allocate variable length area to store the private helper data. This patch includes the modification of all existing helpers. It also includes a couple of include header to avoid compilation warnings. Change-Id: I2b855f3687c16ac0996053006d0543ad05411acd Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-12-07 22:02:09 +04:00
Pablo Neira Ayuso	b258b03178	netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names This patch modifies the struct nf_conntrack_helper to allocate the room for the helper name. The maximum length is 16 bytes (this was already introduced in 2.6.24). For the maximum length for expectation policy names, I have also selected 16 bytes. This patch is required by the follow-up patch to support user-space connection tracking helpers. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: I9ed449622e5c101c87519604b1c6ea37c0bd91cd	2018-12-07 22:02:09 +04:00
Pablo Neira Ayuso	26948d946b	netfilter: nfnetlink_queue: add NAT TCP sequence adjustment if packet mangled User-space programs that receive traffic via NFQUEUE may mangle packets. If NAT is enabled, this usually puzzles sequence tracking, leading to traffic disruptions. With this patch, nfnl_queue will make the corresponding NAT TCP sequence adjustment if: 1) The packet has been mangled, 2) the NFQA_CFG_F_CONNTRACK flag has been set, and 3) NAT is detected. There are some records on the Internet complaning about this issue: http://stackoverflow.com/questions/260757/packet-mangling-utilities-besides-iptables By now, we only support TCP since we have no helpers for DCCP or SCTP. Better to add this if we ever have some helper over those layer 4 protocols. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: Id74b2db6fa3264a322d682b64a3104ce6ff42eaf	2018-12-07 22:02:09 +04:00
Pablo Neira Ayuso	191c3003e8	netfilter: add user-space connection tracking helper infrastructure There are good reasons to supports helpers in user-space instead: * Rapid connection tracking helper development, as developing code in user-space is usually faster. * Reliability: A buggy helper does not crash the kernel. Moreover, we can monitor the helper process and restart it in case of problems. * Security: Avoid complex string matching and mangling in kernel-space running in privileged mode. Going further, we can even think about running user-space helpers as a non-root process. * Extensibility: It allows the development of very specific helpers (most likely non-standard proprietary protocols) that are very likely not to be accepted for mainline inclusion in the form of kernel-space connection tracking helpers. This patch adds the infrastructure to allow the implementation of user-space conntrack helpers by means of the new nfnetlink subsystem `nfnetlink_cthelper' and the existing queueing infrastructure (nfnetlink_queue). I had to add the new hook NF_IP6_PRI_CONNTRACK_HELPER to register ipv[4\|6]_helper which results from splitting ipv[4\|6]_confirm into two pieces. This change is required not to break NAT sequence adjustment and conntrack confirmation for traffic that is enqueued to our user-space conntrack helpers. Basic operation, in a few steps: 1) Register user-space helper by means of `nfct': nfct helper add ftp inet tcp [ It must be a valid existing helper supported by conntrack-tools ] 2) Add rules to enable the FTP user-space helper which is used to track traffic going to TCP port 21. For locally generated packets: iptables -I OUTPUT -t raw -p tcp --dport 21 -j CT --helper ftp For non-locally generated packets: iptables -I PREROUTING -t raw -p tcp --dport 21 -j CT --helper ftp 3) Run the test conntrackd in helper mode (see example files under doc/helper/conntrackd.conf conntrackd 4) Generate FTP traffic going, if everything is OK, then conntrackd should create expectations (you can check that with `conntrack': conntrack -E expect [NEW] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp [DESTROY] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp This confirms that our test helper is receiving packets including the conntrack information, and adding expectations in kernel-space. The user-space helper can also store its private tracking information in the conntrack structure in the kernel via the CTA_HELP_INFO. The kernel will consider this a binary blob whose layout is unknown. This information will be included in the information that is transfered to user-space via glue code that integrates nfnetlink_queue and ctnetlink. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: Ifad12a30d86a6eb0b72d20f079336a24348d711b	2018-12-07 22:02:09 +04:00
Eric Leblond	01cb0ca0ff	netfilter: nf_ct_helper: allow to disable automatic helper assignment This patch allows you to disable automatic conntrack helper lookup based on TCP/UDP ports, eg. echo 0 > /proc/sys/net/netfilter/nf_conntrack_helper [ Note: flows that already got a helper will keep using it even if automatic helper assignment has been disabled ] Once this behaviour has been disabled, you have to explicitly use the iptables CT target to attach helper to flows. There are good reasons to stop supporting automatic helper assignment, for further information, please read: http://www.netfilter.org/news.html#2012-04-03 This patch also adds one message to inform that automatic helper assignment is deprecated and it will be removed soon (this is spotted only once, with the first flow that gets a helper attached to make it as less annoying as possible). Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: Ie1356fa575971063d579cf943ac0354d1513cfc3	2018-12-07 21:59:38 +04:00
David S. Miller	e712257bdf	netlink: Add nla_put_be{16,32,64}() helpers. Signed-off-by: David S. Miller <davem@davemloft.net> Change-Id: I38d685dea0c7d87ef7378ec64b15c8ddde6a24fc	2018-12-07 21:59:38 +04:00
Cong Wang	d806df2b3a	ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif As suggested by Julian: Simply, flowi4_iif must not contain 0, it does not look logical to ignore all ip rules with specified iif. because in fib_rule_match() we do: if (rule->iifindex && (rule->iifindex != fl->flowi_iif)) goto out; flowi4_iif should be LOOPBACK_IFINDEX by default. We need to move LOOPBACK_IFINDEX to include/net/flow.h: 1) It is mostly used by flowi_iif 2) Fix the following compile error if we use it in flow.h by the patches latter: In file included from include/linux/netfilter.h:277:0, from include/net/netns/netfilter.h:5, from include/net/net_namespace.h:21, from include/linux/netdevice.h:43, from include/linux/icmpv6.h:12, from include/linux/ipv6.h:61, from include/net/ipv6.h:16, from include/linux/sunrpc/clnt.h:27, from include/linux/nfs_fs.h:30, from init/do_mounts.c:32: include/net/flow.h: In function ‘flowi4_init_output’: include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function) [Backport of net-next `6a662719c9`] Change-Id: Ib7a0a08d78c03800488afa1b2c170cb70e34cfd9 Cc: Eric Biederman <ebiederm@xmission.com> Cc: Julian Anastasov <ja@ssi.bg> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Lorenzo Colitti <lorenzo@google.com>	2018-08-27 14:52:49 +00:00
Pavel Emelyanov	36640b6239	net: Loopback ifindex is constant now As pointed out, there are places, that access net->loopback_dev->ifindex and after ifindex generation is made per-net this value becomes constant equals 1. So go ahead and introduce the LOOPBACK_IFINDEX constant and use it where appropriate. Change-Id: I29fd08fa01a9522240ab654d436b02a577bb610c Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-27 14:52:48 +00:00
David Herrmann	1528eb776d	Bluetooth: hidp: verify l2cap sockets commit `b3916db32c` upstream. We need to verify that the given sockets actually are l2cap sockets. If they aren't, we are not supposed to access bt_sk(sock) and we shouldn't start the session if the offsets turn out to be valid local BT addresses. That is, if someone passes a TCP socket to HIDCONNADD, then we access some random offset in the TCP socket (which isn't even guaranteed to be valid). Fix this by checking that the socket is an l2cap socket. Change-Id: I401bca741588b34876a1c835d8d4567852b4ec75 Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>	2018-01-13 17:14:28 +03:00
Artem Borisov	d7992e6feb	Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 All bluetooth-related changes were omitted because of our ancient incompatible bt stack. Change-Id: I96440b7be9342a9c1adc9476066272b827776e64	2017-12-27 17:13:15 +03:00
Lorenzo Colitti	18c446bb98	net: diag: Support destroying TCP sockets. This implements SOCK_DESTROY for TCP sockets. It causes all blocking calls on the socket to fail fast with ECONNABORTED and causes a protocol close of the socket. It informs the other end of the connection by sending a RST, i.e., initiating a TCP ABORT as per RFC 793. ECONNABORTED was chosen for consistency with FreeBSD. [Backport of net-next c1e64e298b8cad309091b95d8436a0255c84f54a] Change-Id: Ic5410d3a2f39db28a322c30f4b3b2bffd35ec2de Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-15 22:54:58 +03:00
Eric Dumazet	e1e6232909	ipv6: do not clear pinet6 field [ Upstream commit `f77d602124` ] We have seen multiple NULL dereferences in __inet6_lookup_established() After analysis, I found that inet6_sk() could be NULL while the check for sk_family == AF_INET6 was true. Bug was added in linux-2.6.29 when RCU lookups were introduced in UDP and TCP stacks. Once an IPv6 socket, using SLAB_DESTROY_BY_RCU is inserted in a hash table, we no longer can clear pinet6 field. This patch extends logic used in commit `fcbdf09d96` ("net: fix nulls list corruptions in sk_prot_alloc") TCP/UDP/UDPLite IPv6 protocols provide their own .clear_sk() method to make sure we do not clear pinet6 field. At socket clone phase, we do not really care, as cloning the parent (non NULL) pinet6 is not adding a fatal race. Change-Id: Idd03ce3344a5490a896b8a763c6f81bc6ceba173 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-15 22:54:52 +03:00
Lorenzo Colitti	2f68f939e1	net: diag: Add the ability to destroy a socket. This patch adds a SOCK_DESTROY operation, a destroy function pointer to sock_diag_handler, and a diag_destroy function pointer. It does not include any implementation code. [Backport of net-next 64be0aed59ad519d6f2160868734f7e278290ac1] Change-Id: I1d998e1c5f836b2f5638c0f79244c372c8d2d9d9 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-15 16:50:17 +03:00
Ben Seri	bcd33ae876	Bluetooth: Properly check L2CAP config option output buffer length upstream a8149a65c1db9c3980873a32e4a96331b7a61f5b from https://github.com/LineageOS/android_kernel_samsung_msm8974.git Validate the output buffer length for L2CAP config requests and responses to avoid overflowing the stack buffer used for building the option blocks. Change-Id: I7a0ff0b9dd0156c0e6383214a9c86e4ec4c0d236 Cc: stable@vger.kernel.org Signed-off-by: Ben Seri <ben@armis.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> CVE-2017-1000251 Signed-off-by: Kevin F. Haggerty <haggertk@lineageos.org>	2017-11-13 07:15:19 +00:00
David S. Miller	9d814faae9	net: Do delayed neigh confirmation. When a dst_confirm() happens, mark the confirmation as pending in the dst. Then on the next packet out, when we have the neigh in-hand, do the update. This removes the dependency in dst_confirm() of dst's having an attached neigh. While we're here, remove the explicit 'dst' NULL check, all except 2 or 3 call sites ensure it's not NULL. So just fix those cases up. Change-Id: I75acf86d11c9125c8ce2c2144e4a0fb717b29f03 Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 13:38:10 +03:00
Eldad Zack	0c612305c0	include/net/dst.h: neaten asterisk placement Fix code style - place the asterisk where it belongs. Change-Id: Icf8c4e16bfa4fa4a10843202e2ad842dff08f94c Signed-off-by: Eldad Zack <eldad@fogrefinery.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 13:38:10 +03:00
Vladimir Kondratiev	45b0684fd1	BACKPORT: {nl,cfg}80211: support high bitrates Until now, a u16 value was used to represent bitrate value. With VHT bitrates this becomes too small. Introduce a new 32-bit bitrate attribute. nl80211 will report both the new and the old attribute, unless the bitrate doesn't fit into the old u16 attribute in which case only the new one will be reported. User space tools encouraged to prefer the 32-bit attribute, if available (since it won't be available on older kernels.) Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> [reword commit message and comments a bit] Signed-off-by: Johannes Berg <johannes.berg@intel.com> [AdrianDC] Includes if () improvements from commit "nl80211/cfg80211: add VHT MCS support" by Johannes Berg, Fri, 9 Nov 2012 Change-Id: Ib5823475ba368954e51ac4129b400a58fd74e28d Signed-off-by: Adrian DC <radian.dc@gmail.com>	2017-09-01 13:38:08 +03:00
Lorenzo Colitti	966db7310d	net: core: add UID to flows, rules, and routes - Define a new FIB rule attributes, FRA_UID_RANGE, to describe a range of UIDs. - Define a RTA_UID attribute for per-UID route lookups and dumps. - Support passing these attributes to and from userspace via rtnetlink. The value INVALID_UID indicates no UID was specified. - Add a UID field to the flow structures. [Backport of net-next 622ec2c9d52405973c9f1ca5116eb1c393adfc7d] Bug: 16355602 Change-Id: Iea98e6fedd0fd4435a1f4efa3deb3629505619ab Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 13:38:07 +03:00
Eric W. Biederman	d621e5dc9c	userns: make each net (net_ns) belong to a user_ns The user namespace which creates a new network namespace owns that namespace and all resources created in it. This way we can target capability checks for privileged operations against network resources to the user_ns which created the network namespace in which the resource lives. Privilege to the user namespace which owns the network namespace, or any parent user namespace thereof, provides the same privilege to the network resource. This patch is reworked from a version originally by Serge E. Hallyn <serge.hallyn@canonical.com> Change-Id: Ifa426537c47cce669099cc96e80b17e1d814457b Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2017-09-01 13:38:07 +03:00
Simon Shields	f33f0fc3ed	Revert "net: core: Support UID-based routing." This reverts commit dbadd302523011ab746840885b93e8bd0b2ff499. Change-Id: I8d996d5a84a3203443987c844d10534cd0dfc2c1	2017-08-27 19:09:20 +03:00
Hannes Frederic Sowa	47029c9e50	unix: correctly track in-flight fds in sending process user_struct commit 415e3d3e90ce9e18727e8843ae343eda5a58fad6 upstream. The commit referenced in the Fixes tag incorrectly accounted the number of in-flight fds over a unix domain socket to the original opener of the file-descriptor. This allows another process to arbitrary deplete the original file-openers resource limit for the maximum of open files. Instead the sending processes and its struct cred should be credited. To do so, we add a reference counted struct user_struct pointer to the scm_fp_list and use it to account for the number of inflight unix fds. Fixes: 712f4aad406bb1 ("unix: properly account for FDs passed over unix sockets") Change-Id: Ie3c77ac9cbfff505ad74334a420c5c0030ff9cc7 Reported-by: David Herrmann <dh.herrmann@gmail.com> Cc: David Herrmann <dh.herrmann@gmail.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>	2017-06-26 16:09:55 +03:00
Eric Dumazet	1c6a02e572	tcp: fix use after free in tcp_xmit_retransmit_queue() When tcp_sendmsg() allocates a fresh and empty skb, it puts it at the tail of the write queue using tcp_add_write_queue_tail() Then it attempts to copy user data into this fresh skb. If the copy fails, we undo the work and remove the fresh skb. Unfortunately, this undo lacks the change done to tp->highest_sack and we can leave a dangling pointer (to a freed skb) Later, tcp_xmit_retransmit_queue() can dereference this pointer and access freed memory. For regular kernels where memory is not unmapped, this might cause SACK bugs because tcp_highest_sack_seq() is buggy, returning garbage instead of tp->snd_nxt, but with various debug features like CONFIG_DEBUG_PAGEALLOC, this can crash the kernel. This bug was found by Marco Grassi thanks to syzkaller. Change-Id: I264f97d30d0a623011d9ee811c63fa0e0c2149a2 Fixes: `6859d49475` ("[TCP]: Abstract tp->highest_sack accessing & point to next skb") Reported-by: Marco Grassi <marco.gra@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 23:36:25 +11:00
Sabrina Dubroca	9664667610	ipv6: clean up anycast when an interface is destroyed If we try to rmmod the driver for an interface while sockets with setsockopt(JOIN_ANYCAST) are alive, some refcounts aren't cleaned up and we get stuck on: unregister_netdevice: waiting for ens3 to become free. Usage count = 1 If we LEAVE_ANYCAST/close everything before rmmod'ing, there is no problem. We need to perform a cleanup similar to the one for multicast in addrconf_ifdown(how == 1). BUG: 18902601 Change-Id: I6d51aed5755eb5738fcba91950e7773a1c985d2e Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> [Backported from android-3.10 86a47ad60de5221c3869821d3552dcd1c89199f5] Signed-off-by: Lorenzo Colitti <lorenzo@google.com>	2016-10-29 23:12:33 +08:00
Ravi Joshi	886ccee9f7	WLAN subsystem: Sysctl support for key TCP/IP parameters It has been observed that default values for some of key tcp/ip parameters are affecting the tput/performance of the system. Hence extending configuration capabilities to TCP/Ip stack through sysctl interface Change-Id: Ia92fc229c0a4a8c3f7e4e02cf7e3c8849719ddff CRs-Fixed: 507581 Signed-off-by: Kiran Kumar Lokere <klokere@codeaurora.org>	2016-10-29 23:12:27 +08:00
Hannes Frederic Sowa	dcf5c5397a	net: add validation for the socket syscall protocol argument 郭永刚 reported that one could simply crash the kernel as root by using a simple program: int socket_fd; struct sockaddr_in addr; addr.sin_port = 0; addr.sin_addr.s_addr = INADDR_ANY; addr.sin_family = 10; socket_fd = socket(10,3,0x40000000); connect(socket_fd , &addr,16); AF_INET, AF_INET6 sockets actually only support 8-bit protocol identifiers. inet_sock's skc_protocol field thus is sized accordingly, thus larger protocol identifiers simply cut off the higher bits and store a zero in the protocol fields. This could lead to e.g. NULL function pointer because as a result of the cut off inet_num is zero and we call down to inet_autobind, which is NULL for raw sockets. kernel: Call Trace: kernel: [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70 kernel: [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80 kernel: [<ffffffff81645069>] SYSC_connect+0xd9/0x110 kernel: [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80 kernel: [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200 kernel: [<ffffffff81645e0e>] SyS_connect+0xe/0x10 kernel: [<ffffffff81779515>] tracesys_phase2+0x84/0x89 I found no particular commit which introduced this problem. Change-Id: I653fad90da54908144cc8916c2dccb1fa6f14eed CVE: CVE-2015-8543 Cc: Cong Wang <cwang@twopensource.com> Reported-by: 郭永刚 <guoyonggang@360.cn> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 23:12:11 +08:00
Andrey Vagin	be41f71342	netfilter: nf_conntrack: reserve two bytes for nf_ct_ext->len "len" contains sizeof(nf_ct_ext) and size of extensions. In a worst case it can contain all extensions. Bellow you can find sizes for all types of extensions. Their sum is definitely bigger than 256. nf_ct_ext_types[0]->len = 24 nf_ct_ext_types[1]->len = 32 nf_ct_ext_types[2]->len = 24 nf_ct_ext_types[3]->len = 32 nf_ct_ext_types[4]->len = 152 nf_ct_ext_types[5]->len = 2 nf_ct_ext_types[6]->len = 16 nf_ct_ext_types[7]->len = 8 I have seen "len" up to 280 and my host has crashes w/o this patch. The right way to fix this problem is reducing the size of the ecache extension (4) and Florian is going to do this, but these changes will be quite large to be appropriate for a stable tree. Change-Id: If9efaf2b103cf304bbfa583e354cfad3faa77ac2 Fixes: `5b423f6a40` (netfilter: nf_conntrack: fix racy timer handling with reliable) Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-29 23:12:10 +08:00
Hannes Frederic Sowa	ef06d64826	ipv4: try to cache dst_entries which would cause a redirect Not caching dst_entries which cause redirects could be exploited by hosts on the same subnet, causing a severe DoS attack. This effect aggravated since commit `f886497212` ("ipv4: fix dst race in sk_dst_get()"). Lookups causing redirects will be allocated with DST_NOCACHE set which will force dst_release to free them via RCU. Unfortunately waiting for RCU grace period just takes too long, we can end up with >1M dst_entries waiting to be released and the system will run OOM. rcuos threads cannot catch up under high softirq load. Attaching the flag to emit a redirect later on to the specific skb allows us to cache those dst_entries thus reducing the pressure on allocation and deallocation. This issue was discovered by Marcelo Leitner. Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: Marcelo Leitner <mleitner@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net> Change-Id: I53e4b500a4db2f5fece937a42a3bd810b2640c44	2016-10-29 23:12:10 +08:00
Nicolas Dichtel	af706acbb5	ipv6: fix handling of blackhole and prohibit routes commit `ef2c7d7b59` upstream. When adding a blackhole or a prohibit route, they were handling like classic routes. Moreover, it was only possible to add this kind of routes by specifying an interface. Bug already reported here: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=498498 Before the patch: $ ip route add blackhole 2001::1/128 RTNETLINK answers: No such device $ ip route add blackhole 2001::1/128 dev eth0 $ ip -6 route \| grep 2001 2001::1 dev eth0 metric 1024 After: $ ip route add blackhole 2001::1/128 $ ip -6 route \| grep 2001 blackhole 2001::1 dev lo metric 1024 error -22 v2: wrong patch v3: add a field fc_type in struct fib6_config to store RTN_* type Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Zefan Li <lizefan@huawei.com>	2016-10-26 23:15:43 +08:00
Michal Kubeček	a4ea6252cc	ipv6: don't call fib6_run_gc() until routing is ready commit `2c861cc65e` upstream. When loading the ipv6 module, ndisc_init() is called before ip6_route_init(). As the former registers a handler calling fib6_run_gc(), this opens a window to run the garbage collector before necessary data structures are initialized. If a network device is initialized in this window, adding MAC address to it triggers a NETDEV_CHANGEADDR event, leading to a crash in fib6_clean_all(). Take the event handler registration out of ndisc_init() into a separate function ndisc_late_init() and move it after ip6_route_init(). Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Zefan Li <lizefan@huawei.com>	2016-10-26 23:15:43 +08:00
Eric Dumazet	5781d89c54	af_unix: fix a fatal race with bit fields commit `60bc851ae5` upstream. Using bit fields is dangerous on ppc64/sparc64, as the compiler [1] uses 64bit instructions to manipulate them. If the 64bit word includes any atomic_t or spinlock_t, we can lose critical concurrent changes. This is happening in af_unix, where unix_sk(sk)->gc_candidate/ gc_maybe_cycle/lock share the same 64bit word. This leads to fatal deadlock, as one/several cpus spin forever on a spinlock that will never be available again. A safer way would be to use a long to store flags. This way we are sure compiler/arch wont do bad things. As we own unix_gc_lock spinlock when clearing or setting bits, we can use the non atomic __set_bit()/__clear_bit(). recursion_level can share the same 64bit location with the spinlock, as it is set only with this spinlock held. [1] bug fixed in gcc-4.8.0 : http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080 Reported-by: Ambrose Feinstein <ambrose@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: hejianet <hejianet@gmail.com> Signed-off-by: Zefan Li <lizefan@huawei.com>	2016-10-26 23:15:42 +08:00
lucien	8a26248a33	sctp: start t5 timer only when peer rwnd is 0 and local state is SHUTDOWN_PENDING commit 8a0d19c5ed417c78d03f4e0fa7215e58c40896d8 upstream. when A sends a data to B, then A close() and enter into SHUTDOWN_PENDING state, if B neither claim his rwnd is 0 nor send SACK for this data, A will keep retransmitting this data until t5 timeout, Max.Retrans times can't work anymore, which is bad. if B's rwnd is not 0, it should send abort after Max.Retrans times, only when B's rwnd == 0 and A's retransmitting beyonds Max.Retrans times, A will start t5 timer, which is also commit `f8d9605243` ("sctp: Enforce retransmission limit during shutdown") means, but it lacks the condition peer rwnd == 0. so fix it by adding a bit (zero_window_announced) in peer to record if the last rwnd is 0. If it was, zero_window_announced will be set. and use this bit to decide if start t5 timer when local.state is SHUTDOWN_PENDING. Fixes: commit `f8d9605243` ("sctp: Enforce retransmission limit during shutdown") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.2: change sack_needed to bitfield as done earlier upstream] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Zefan Li <lizefan@huawei.com>	2016-10-26 23:15:35 +08:00
Eric Dumazet	d5127daf88	ipv6: add complete rcu protection around np->opt [ Upstream commit 45f6fad84cc305103b28d73482b344d7f5b76f39 ] This patch addresses multiple problems : UDP/RAW sendmsg() need to get a stable struct ipv6_txoptions while socket is not locked : Other threads can change np->opt concurrently. Dmitry posted a syzkaller (http://github.com/google/syzkaller) program desmonstrating use-after-free. Starting with TCP/DCCP lockless listeners, tcp_v6_syn_recv_sock() and dccp_v6_request_recv_sock() also need to use RCU protection to dereference np->opt once (before calling ipv6_dup_options()) This patch adds full RCU protection to np->opt BUG: 28746669 Change-Id: I207da29ac48bb6dd7c40d65f9e27c4e3ff508da0 Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Pierre Imai <imaipi@google.com>	2016-06-17 02:54:32 +00:00
Hannes Frederic Sowa	ca7d623e1e	net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration commit 7bbadd2d1009575dad675afc16650ebb5aa10612 upstream. Docbook does not like the definition of macros inside a field declaration and adds a warning. Move the definition out. Fixes: 79462ad02e86180 ("net: add validation for the socket syscall protocol argument") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>	2016-03-21 09:17:57 +08:00
Michal Kubeček	5317d9af12	ipv6: prevent fib6_run_gc() contention commit `2ac3ac8f86` upstream. On a high-traffic router with many processors and many IPv6 dst entries, soft lockup in fib6_run_gc() can occur when number of entries reaches gc_thresh. This happens because fib6_run_gc() uses fib6_gc_lock to allow only one thread to run the garbage collector but ip6_dst_gc() doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc() returns. On a system with many entries, this can take some time so that in the meantime, other threads pass the tests in ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for the lock. They then have to run the garbage collector one after another which blocks them for quite long. Resolve this by replacing special value ~0UL of expire parameter to fib6_run_gc() by explicit "force" parameter to choose between spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with force=false if gc_thresh is reached but not max_size. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>	2016-03-21 09:17:57 +08:00
Hannes Frederic Sowa	39f79797d2	net: add validation for the socket syscall protocol argument commit 79462ad02e861803b3840cc782248c7359451cd9 upstream. 郭永刚 reported that one could simply crash the kernel as root by using a simple program: int socket_fd; struct sockaddr_in addr; addr.sin_port = 0; addr.sin_addr.s_addr = INADDR_ANY; addr.sin_family = 10; socket_fd = socket(10,3,0x40000000); connect(socket_fd , &addr,16); AF_INET, AF_INET6 sockets actually only support 8-bit protocol identifiers. inet_sock's skc_protocol field thus is sized accordingly, thus larger protocol identifiers simply cut off the higher bits and store a zero in the protocol fields. This could lead to e.g. NULL function pointer because as a result of the cut off inet_num is zero and we call down to inet_autobind, which is NULL for raw sockets. kernel: Call Trace: kernel: [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70 kernel: [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80 kernel: [<ffffffff81645069>] SYSC_connect+0xd9/0x110 kernel: [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80 kernel: [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200 kernel: [<ffffffff81645e0e>] SyS_connect+0xe/0x10 kernel: [<ffffffff81779515>] tracesys_phase2+0x84/0x89 I found no particular commit which introduced this problem. CVE: CVE-2015-8543 Cc: Cong Wang <cwang@twopensource.com> Reported-by: 郭永刚 <guoyonggang@360.cn> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> [lizf: Backported to 3.4: open-code U8_MAX] Signed-off-by: Zefan Li <lizefan@huawei.com>	2016-03-21 09:17:53 +08:00
Rainer Weikusat	ec54d5ae9d	unix: avoid use-after-free in ep_remove_wait_queue commit 7d267278a9ece963d77eefec61630223fce08c6c upstream. Rainer Weikusat <rweikusat@mobileactivedefense.com> writes: An AF_UNIX datagram socket being the client in an n:1 association with some server socket is only allowed to send messages to the server if the receive queue of this socket contains at most sk_max_ack_backlog datagrams. This implies that prospective writers might be forced to go to sleep despite none of the message presently enqueued on the server receive queue were sent by them. In order to ensure that these will be woken up once space becomes again available, the present unix_dgram_poll routine does a second sock_poll_wait call with the peer_wait wait queue of the server socket as queue argument (unix_dgram_recvmsg does a wake up on this queue after a datagram was received). This is inherently problematic because the server socket is only guaranteed to remain alive for as long as the client still holds a reference to it. In case the connection is dissolved via connect or by the dead peer detection logic in unix_dgram_sendmsg, the server socket may be freed despite "the polling mechanism" (in particular, epoll) still has a pointer to the corresponding peer_wait queue. There's no way to forcibly deregister a wait queue with epoll. Based on an idea by Jason Baron, the patch below changes the code such that a wait_queue_t belonging to the client socket is enqueued on the peer_wait queue of the server whenever the peer receive queue full condition is detected by either a sendmsg or a poll. A wake up on the peer queue is then relayed to the ordinary wait queue of the client socket via wake function. The connection to the peer wait queue is again dissolved if either a wake up is about to be relayed or the client socket reconnects or a dead peer is detected or the client socket is itself closed. This enables removing the second sock_poll_wait from unix_dgram_poll, thus avoiding the use-after-free, while still ensuring that no blocked writer sleeps forever. Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Fixes: `ec0d215f94` ("af_unix: fix 'poll for write'/connected DGRAM sockets") Reviewed-by: Jason Baron <jbaron@akamai.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Zefan Li <lizefan@huawei.com>	2016-03-21 09:17:53 +08:00
Marcelo Ricardo Leitner	493d6a2da3	sctp: fix ASCONF list handling commit 2d45a02d0166caf2627fe91897c6ffc3b19514c4 upstream. ->auto_asconf_splist is per namespace and mangled by functions like sctp_setsockopt_auto_asconf() which doesn't guarantee any serialization. Also, the call to inet_sk_copy_descendant() was backuping ->auto_asconf_list through the copy but was not honoring ->do_auto_asconf, which could lead to list corruption if it was different between both sockets. This commit thus fixes the list handling by using ->addr_wq_lock spinlock to protect the list. A special handling is done upon socket creation and destruction for that. Error handlig on sctp_init_sock() will never return an error after having initialized asconf, so sctp_destroy_sock() can be called without addrq_wq_lock. The lock now will be take on sctp_close_sock(), before locking the socket, so we don't do it in inverse order compared to sctp_addr_wq_timeout_handler(). Instead of taking the lock on sctp_sock_migrate() for copying and restoring the list values, it's preferred to avoid rewritting it by implementing sctp_copy_descendant(). Issue was found with a test application that kept flipping sysctl default_auto_asconf on and off, but one could trigger it by issuing simultaneous setsockopt() calls on multiple sockets or by creating/destroying sockets fast enough. This is only triggerable locally. Fixes: `9f7d653b67` ("sctp: Add Auto-ASCONF support (core).") Reported-by: Ji Jianwen <jiji@redhat.com> Suggested-by: Neil Horman <nhorman@tuxdriver.com> Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> [lizf: Backported to 3.4: - use global spinlock instead of per-namespace lock] Signed-off-by: Zefan Li <lizefan@huawei.com>	2015-10-22 09:20:04 +08:00

1 2 3 4 5 ...

5193 commits