android_kernel_google_msm

mirror of https://github.com/followmsi/android_kernel_google_msm.git synced 2024-11-06 23:17:41 +00:00

Author	SHA1	Message	Date
Xin Long	103a20d70a	ipv6: check sk sk_type and protocol early in ip_mroute_set/getsockopt [ Upstream commit 99253eb750fda6a644d5188fb26c43bad8d5a745 ] Commit `5e1859fbcc` ("ipv4: ipmr: various fixes and cleanups") fixed the issue for ipv4 ipmr: ip_mroute_setsockopt() & ip_mroute_getsockopt() should not access/set raw_sk(sk)->ipmr_table before making sure the socket is a raw socket, and protocol is IGMP The same fix should be done for ipv6 ipmr as well. This patch can fix the panic caused by overwriting the same offset as ipmr_table as in raw_sk(sk) when accessing other type's socket by ip_mroute_setsockopt(). Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I8d48f4611a2f2d0cb7ad5146036f571f12ecb1fc CVE-2017-18509 Signed-off-by: Kevin F. Haggerty <haggertk@lineageos.org>	2023-02-18 18:38:56 +01:00
Lorenzo Colitti	afd1d2b38a	net: inet: Support UID-based routing in IP protocols. - Use the UID in routing lookups made by protocol connect() and sendmsg() functions. - Make sure that routing lookups triggered by incoming packets (e.g., Path MTU discovery) take the UID of the socket into account. - For packets not associated with a userspace socket, (e.g., ping replies) use UID 0 inside the user namespace corresponding to the network namespace the socket belongs to. This allows all namespaces to apply routing and iptables rules to kernel-originated traffic in that namespaces by matching UID 0. This is better than using the UID of the kernel socket that is sending the traffic, because the UID of kernel sockets created at namespace creation time (e.g., the per-processor ICMP and TCP sockets) is the UID of the user that created the socket, which might not be mapped in the namespace. Bug: 16355602 Change-Id: I910504b508948057912bc188fd1e8aca28294de3 Tested: compiles allnoconfig, allyesconfig, allmodconfig Tested: https://android-review.googlesource.com/253302 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kevin F. Haggerty <haggertk@lineageos.org>	2023-02-18 18:38:56 +01:00
Daniel Borkmann	e9ff904465	BACKPORT: net: sock: make sock_tx_timestamp void Currently, sock_tx_timestamp() always returns 0. The comment that describes the sock_tx_timestamp() function wrongly says that it returns an error when an invalid argument is passed (from commit `20d4947353`, ``net: socket infrastructure for SO_TIMESTAMPING''). Make the function void, so that we can also remove all the unneeded if conditions that check for such a _non-existant_ error case in the output path. Change-Id: Ibdfd5071737190371d4abec5ae76046b5aa8de23 Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kevin F. Haggerty <haggertk@lineageos.org>	2023-02-18 18:32:19 +01:00
Roger Hu	7c56badc9a	kernel: Revert "tcp: do not lock listener to process SYN packets" This commit belongs to the patch set (https://lwn.net/Articles/659199/) that attempts to remove the use of locks on the socket table by relocating the SYN table to a separate hash table and adding a spin lock to protect the SYN request queue. Adding only this commit introduces a race condition for LineageOS kernels for TCP listens, since the TCP SYN data structures can be corrupted. A TCP curl bomb on a TCP listen port will corrupt the SYN accept backlog: for i in $(seq 1 400); do curl -x localhost:443 https://myhost.com -L --connect-timeout 30 -o /dev/null -sS & done Run `ss -nltp` and usually the RecVQ column does not drain to 0. This reverts commit 7d9f104f9cabe1d72a50c4816a48f64fc1da7a64. This really needs to be reverted across all LineageOS forks: https://gitlab.com/LineageOS/issues/android/-/issues/3916#note_669493796 Change-Id: Ia7969aeedae411677b307a8e094f9a4cc02b801d	2022-07-05 01:10:45 -04:00
Eric Dumazet	e336f6ab76	tcp: fix more NULL deref after prequeue changes When I cooked commit `c3658e8d0f` ("tcp: fix possible NULL dereference in tcp_vX_send_reset()") I missed other spots we could deref a NULL skb_dst(skb) Again, if a socket is provided, we do not need skb_dst() to get a pointer to network namespace : sock_net(sk) is good enough. [Backport of net-next 0f85feae6b710ced3abad5b2b47d31dfcb956b62] Bug: 16355602 Change-Id: Ibe1def7979625ee7902bff2f33ec8945b9945948 Reported-by: Dann Frazier <dann.frazier@canonical.com> Bisected-by: Dann Frazier <dann.frazier@canonical.com> Tested-by: Dann Frazier <dann.frazier@canonical.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `ca777eff51` ("tcp: remove dst refcount false sharing for prequeue mode") Signed-off-by: David S. Miller <davem@davemloft.net>	2021-01-23 17:17:18 +03:00
David S. Miller	e80824b3dd	ipv6: Fix types of ip6_update_pmtu(). The mtu should be a __be32, not the mark. Reported-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Change-Id: Ie321dcc3652921f8f28491d39c8262268aeb22bc	2021-01-23 17:17:04 +03:00
David S. Miller	9ce6cd5eab	ipv6: Handle PMTU in ICMP error handlers. One tricky issue on the ipv6 side vs. ipv4 is that the ICMP callouts to handle the error pass the 32-bit info cookie in network byte order whereas ipv4 passes it around in host byte order. Like the ipv4 side, we have two helper functions. One for when we have a socket context and one for when we do not. ip6ip6 tunnels are not handled here, because they handle PMTU events by essentially relaying another ICMP packet-too-big message back to the original sender. This patch allows us to get rid of rt6_do_pmtu_disc(). It handles all kinds of situations that simply cannot happen when we do the PMTU update directly using a fully resolved route. In fact, the "plen == 128" check in ip6_rt_update_pmtu() can very likely be removed or changed into a BUG_ON() check. We should never have a prefixed ipv6 route when we get there. Another piece of strange history here is that TCP and DCCP, unlike in ipv4, never invoke the update_pmtu() method from their ICMP error handlers. This is incredibly astonishing since this is the context where we have the most accurate context in which to make a PMTU update, namely we have a fully connected socket and associated cached socket route. Signed-off-by: David S. Miller <davem@davemloft.net> Change-Id: Ibb5cae4316256108c5130459b07288c9fc7380c3	2021-01-23 17:16:44 +03:00
Nicolas Dichtel	1b5c151075	xfrm: allow to avoid copying DSCP during encapsulation By default, DSCP is copying during encapsulation. Copying the DSCP in IPsec tunneling may be a bit dangerous because packets with different DSCP may get reordered relative to each other in the network and then dropped by the remote IPsec GW if the reordering becomes too big compared to the replay window. It is possible to avoid this copy with netfilter rules, but it's very convenient to be able to configure it for each SA directly. This patch adds a toogle for this purpose. By default, it's not set to maintain backward compatibility. Field flags in struct xfrm_usersa_info is full, hence I add a new attribute. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Change-Id: I885117f02790536e2c5002232b3b33be651a568d	2020-11-30 19:39:33 +03:00
David S. Miller	c275431b28	net: Document dst->obsolete better. Add a big comment explaining how the field works, and use defines instead of magic constants for the values assigned to it. Suggested by Joe Perches. Signed-off-by: David S. Miller <davem@davemloft.net> Change-Id: I2c8ca84d38cd49ccc1207db588108d78e6f9403a	2020-11-30 19:39:24 +03:00
Eric Dumazet	e7e3467ab1	tcp: TCP Small Queues This introduce TSQ (TCP Small Queues) TSQ goal is to reduce number of TCP packets in xmit queues (qdisc & device queues), to reduce RTT and cwnd bias, part of the bufferbloat problem. sk->sk_wmem_alloc not allowed to grow above a given limit, allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a given time. TSO packets are sized/capped to half the limit, so that we have two TSO packets in flight, allowing better bandwidth use. As a side effect, setting the limit to 40000 automatically reduces the standard gso max limit (65536) to 40000/2 : It can help to reduce latencies of high prio packets, having smaller TSO packets. This means we divert sock_wfree() to a tcp_wfree() handler, to queue/send following frames when skb_orphan() [2] is called for the already queued skbs. Results on my dev machines (tg3/ixgbe nics) are really impressive, using standard pfifo_fast, and with or without TSO/GSO. Without reduction of nominal bandwidth, we have reduction of buffering per bulk sender : < 1ms on Gbit (instead of 50ms with TSO) < 8ms on 100Mbit (instead of 132 ms) I no longer have 4 MBytes backlogged in qdisc by a single netperf session, and both side socket autotuning no longer use 4 Mbytes. As skb destructor cannot restart xmit itself ( as qdisc lock might be taken at this point ), we delegate the work to a tasklet. We use one tasklest per cpu for performance reasons. If tasklet finds a socket owned by the user, it sets TSQ_OWNED flag. This flag is tested in a new protocol method called from release_sock(), to eventually send new segments. [1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable [2] skb_orphan() is usually called at TX completion time, but some drivers call it in their start_xmit() handler. These drivers should at least use BQL, or else a single TCP session can still fill the whole NIC TX ring, since TSQ will have no effect. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dave Taht <dave.taht@bufferbloat.net> Cc: Tom Herbert <therbert@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Change-Id: I37d5e4d7c9ced1846385b6a04ae3ad134763a949	2020-11-30 19:35:00 +03:00
Marcelo Ricardo Leitner	461a9cb3c4	tcp: add tcpi_segs_in and tcpi_segs_out to tcp_info This patch tracks the total number of inbound and outbound segments on a TCP socket. One may use this number to have an idea on connection quality when compared against the retransmissions. RFC4898 named these : tcpEStatsPerfSegsIn and tcpEStatsPerfSegsOut These are a 32bit field each and can be fetched both from TCP_INFO getsockopt() if one has a handle on a TCP socket, or from inet_diag netlink facility (iproute2/ss patch will follow) Note that tp->segs_out was placed near tp->snd_nxt for good data locality and minimal performance impact, while tp->segs_in was placed near tp->bytes_received for the same reason. Join work with Eric Dumazet. Note that received SYN are accounted on the listener, but sent SYNACK are not accounted. Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Change-Id: Ic3e5daea102e13fb24c5fb1ce6913d5ece13c521	2020-11-30 19:31:48 +03:00
Eric W. Biederman	ec0a45cd4a	net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq Replace the bh safe variant with the hard irq safe variant. We need a hard irq safe variant to deal with netpoll transmitting packets from hard irq context, and we need it in most if not all of the places using the bh safe variant. Except on 32bit uni-processor the code is exactly the same so don't bother with a bh variant, just have a hard irq safe variant that everyone can use. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> [javelinanddart]: Merge conflicts were resolved for drivers that exist in 3.4, and additionally a treewide find and replace was run for these functions. Signed-off-by: Paul Keith <javelinanddart@gmail.com> Change-Id: Ib74db793de5e546414a0599f23095f82f0e20c86	2020-11-30 19:26:49 +03:00
Li RongQing	a559d1f125	ipv6: fix the use of pcpu_tstats in ip6_tunnel when read/write the 64bit data, the correct lock should be hold. Fixes: `87b6d218f3` ("tunnel: implement 64 bits statistics") Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Keith <javelinanddart@gmail.com> Change-Id: Ib90ecb495739eafb1563d5cfd3d2dc275e4944ea	2020-11-30 19:26:46 +03:00
Li RongQing	2e641339f3	ipv6: fix the use of pcpu_tstats in sit when read/write the 64bit data, the correct lock should be hold. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Keith <javelinanddart@gmail.com> Change-Id: Ie9c73149b86eb028e1646d1001bdb4310a72fe14	2020-11-30 19:26:43 +03:00
John Stultz	ad74346428	net: Explicitly initialize u64_stats_sync structures for lockdep In order to enable lockdep on seqcount/seqlock structures, we must explicitly initialize any locks. The u64_stats_sync structure, uses a seqcount, and thus we need to introduce a u64_stats_init() function and use it to initialize the structure. This unfortunately adds a lot of fairly trivial initialization code to a number of drivers. But the benefit of ensuring correctness makes this worth while. Because these changes are required for lockdep to be enabled, and the changes are quite trivial, I've not yet split this patch out into 30-some separate patches, as I figured it would be better to get the various maintainers thoughts on how to best merge this change along with the seqcount lockdep enablement. Feedback would be appreciated! Signed-off-by: John Stultz <john.stultz@linaro.org> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: James Morris <jmorris@namei.org> Cc: Jesse Gross <jesse@nicira.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Mirko Lindner <mlindner@marvell.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Roger Luethi <rl@hellgate.ch> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Simon Horman <horms@verge.net.au> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Wensong Zhang <wensong@linux-vs.org> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Change-Id: Ieda06b95f0ec302dbef8576ef4c8fb4bd28ffb2f	2020-11-30 19:26:40 +03:00
stephen hemminger	a9d180c942	tunnel: implement 64 bits statistics Convert the per-cpu statistics kept for GRE, IPIP, and SIT tunnels to use 64 bit statistics. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Keith <javelinanddart@gmail.com> Change-Id: I317ad78a7f3a31326a1bc6f3069712f348b38a6c	2020-11-30 19:26:33 +03:00
Florian Westphal	0edd378e5e	netfilter: xt_rpfilter: depend on raw or mangle table rpfilter is only valid in raw/mangle PREROUTING, i.e. RPFILTER=y\|m is useless without raw or mangle table support. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: I26118146ace6e802b5700472222198ef31f9d562	2018-12-07 22:04:24 +04:00
Arne Coucheron	70e12c60f4	net: Add missing LOOPBACK_IFINDEX change in ipv6/route.c Missed in "net: Loopback ifindex is constant now" backport. Change-Id: I7259ac064e9ece4520ff355d55ffd04646d384ce	2018-12-07 22:04:24 +04:00
Florent Fourcot	f499215262	netfilter: nf_conntrack_ipv6: fix comment for packets without data Remove ambiguity of double negation. Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Acked-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: Ife5632986d43a9e6e719415c8a234f100e8de958	2018-12-07 22:02:09 +04:00
Andrew Collins	d4739bc9f5	netfilter: nf_nat: Also handle non-ESTABLISHED routing changes in MASQUERADE Since (`a0ecb85` netfilter: nf_nat: Handle routing changes in MASQUERADE target), the MASQUERADE target handles routing changes which affect the output interface of a connection, but only for ESTABLISHED connections. It is also possible for NEW connections which already have a conntrack entry to be affected by routing changes. This adds a check to drop entries in the NEW+conntrack state when the oif has changed. Signed-off-by: Andrew Collins <bsderandrew@gmail.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: I6ae5c391d502f44cebc0563f51f4a65667efd0ae	2018-12-07 22:02:09 +04:00
Mukund Jampala	1b6cc2ac53	netfilter: ip[6]t_REJECT: fix wrong transport header pointer in TCP reset The problem occurs when iptables constructs the tcp reset packet. It doesn't initialize the pointer to the tcp header within the skb. When the skb is passed to the ixgbe driver for transmit, the ixgbe driver attempts to access the tcp header and crashes. Currently, other drivers (such as our 1G e1000e or igb drivers) don't access the tcp header on transmit unless the TSO option is turned on. <1>BUG: unable to handle kernel NULL pointer dereference at 0000000d <1>IP: [<d081621c>] ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe] <4>pdpt = 0000000085e5d001 pde = 0000000000000000 <0>Oops: 0000 [#1] SMP [...] <4>Pid: 0, comm: swapper Tainted: P 2.6.35.12 #1 Greencity/Thurley <4>EIP: 0060:[<d081621c>] EFLAGS: 00010246 CPU: 16 <4>EIP is at ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe] <4>EAX: c7628820 EBX: 00000007 ECX: 00000000 EDX: 00000000 <4>ESI: 00000008 EDI: c6882180 EBP: dfc6b000 ESP: ced95c48 <4> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 <0>Process swapper (pid: 0, ti=ced94000 task=ced73bd0 task.ti=ced94000) <0>Stack: <4> cbec7418 c779e0d8 c77cc888 c77cc8a8 0903010a 00000000 c77c0008 00000002 <4><0> cd4997c0 00000010 dfc6b000 00000000 d0d176c9 c77cc8d8 c6882180 cbec7318 <4><0> 00000004 00000004 cbec7230 cbec7110 00000000 cbec70c0 c779e000 00000002 <0>Call Trace: <4> [<d0d176c9>] ? 0xd0d176c9 <4> [<d0d18a4d>] ? 0xd0d18a4d <4> [<411e243e>] ? dev_hard_start_xmit+0x218/0x2d7 <4> [<411f03d7>] ? sch_direct_xmit+0x4b/0x114 <4> [<411f056a>] ? __qdisc_run+0xca/0xe0 <4> [<411e28b0>] ? dev_queue_xmit+0x2d1/0x3d0 <4> [<411e8120>] ? neigh_resolve_output+0x1c5/0x20f <4> [<411e94a1>] ? neigh_update+0x29c/0x330 <4> [<4121cf29>] ? arp_process+0x49c/0x4cd <4> [<411f80c9>] ? nf_hook_slow+0x3f/0xac <4> [<4121ca8d>] ? arp_process+0x0/0x4cd <4> [<4121ca8d>] ? arp_process+0x0/0x4cd <4> [<4121c6d5>] ? T.901+0x38/0x3b <4> [<4121c918>] ? arp_rcv+0xa3/0xb4 <4> [<4121ca8d>] ? arp_process+0x0/0x4cd <4> [<411e1173>] ? __netif_receive_skb+0x32b/0x346 <4> [<411e19e1>] ? netif_receive_skb+0x5a/0x5f <4> [<411e1ea9>] ? napi_skb_finish+0x1b/0x30 <4> [<d0816eb4>] ? ixgbe_xmit_frame_ring+0x1564/0x2260 [ixgbe] <4> [<41013468>] ? lapic_next_event+0x13/0x16 <4> [<410429b2>] ? clockevents_program_event+0xd2/0xe4 <4> [<411e1b03>] ? net_rx_action+0x55/0x127 <4> [<4102da1a>] ? __do_softirq+0x77/0xeb <4> [<4102dab1>] ? do_softirq+0x23/0x27 <4> [<41003a67>] ? do_IRQ+0x7d/0x8e <4> [<41002a69>] ? common_interrupt+0x29/0x30 <4> [<41007bcf>] ? mwait_idle+0x48/0x4d <4> [<4100193b>] ? cpu_idle+0x37/0x4c <0>Code: df 09 d7 0f 94 c2 0f b6 d2 e9 e7 fb ff ff 31 db 31 c0 e9 38 ff ff ff 80 78 06 06 0f 85 3e fb ff ff 8b 7c 24 38 8b 8f b8 00 00 00 <0f> b6 51 0d f6 c2 01 0f 85 27 fb ff ff 80 e2 02 75 0d 8b 6c 24 <0>EIP: [<d081621c>] ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe] SS:ESP Signed-off-by: Mukund Jampala <jbmukund@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: I4866944d4992f703e55afcb20e9746a416d3d498	2018-12-07 22:02:09 +04:00
Jozsef Kadlecsik	2e46149f8c	netfilter: nf_nat: Handle routing changes in MASQUERADE target When the route changes (backup default route, VPNs) which affect a masqueraded target, the packets were sent out with the outdated source address. The patch addresses the issue by comparing the outgoing interface directly with the masqueraded interface in the nat table. Events are inefficient in this case, because it'd require adding route events to the network core and then scanning the whole conntrack table and re-checking the route for all entry. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: I148103e8a4b257d004b8f0f03455243a28ff0eec	2018-12-07 22:02:09 +04:00
Patrick McHardy	5a296c2724	netfilter: ip6tables: add stateless IPv6-to-IPv6 Network Prefix Translation target Signed-off-by: Patrick McHardy <kaber@trash.net> Change-Id: I0508ce6f05db08fa455814ef3b86e189687d3087	2018-12-07 22:02:09 +04:00
Patrick McHardy	e5a2b99b4d	netfilter: ip6tables: add NETMAP target Signed-off-by: Patrick McHardy <kaber@trash.net> Change-Id: Ifc6a92823b7a29d3500e998ab16436b01dd440cb	2018-12-07 22:02:09 +04:00
Patrick McHardy	1e28beb0aa	netfilter: ip6tables: add REDIRECT target Signed-off-by: Patrick McHardy <kaber@trash.net> Change-Id: I9e6ba8908914bd52a771da6b73f458036c2b30eb	2018-12-07 22:02:09 +04:00
Patrick McHardy	dd6490947e	netfilter: ip6tables: add MASQUERADE target Signed-off-by: Patrick McHardy <kaber@trash.net> Change-Id: I2296776746567afe4f8067628308c88ba2233e5c	2018-12-07 22:02:09 +04:00
Patrick McHardy	104f2e20f9	netfilter: ipv6: add IPv6 NAT support Signed-off-by: Patrick McHardy <kaber@trash.net> Change-Id: I843d848e57d95e8fc932d95ca939a8c94fc8cfd8	2018-12-07 22:02:09 +04:00
Patrick McHardy	5e85f6149f	netfilter: nf_conntrack_ipv6: fix tracking of ICMPv6 error messages containing fragments ICMPv6 error messages are tracked by extracting the conntrack tuple of the inner packet and looking up the corresponding conntrack entry. Tuple extraction uses the ->get_l4proto() callback, which in case of fragments returns NEXTHDR_FRAGMENT instead of the upper protocol, even for the first fragment when the entire next header is present, resulting in a failure to find the correct connection tracking entry. This patch changes ipv6_get_l4proto() to use ipv6_skip_exthdr() instead of nf_ct_ipv6_skip_exthdr() in order to skip fragment headers when the fragment offset is zero. Signed-off-by: Patrick McHardy <kaber@trash.net> Change-Id: Ic8a86d7038676b044c89700226225ffa20e01079	2018-12-07 22:02:09 +04:00
Patrick McHardy	69adc8757e	netfilter: nf_conntrack_ipv6: improve fragmentation handling The IPv6 conntrack fragmentation currently has a couple of shortcomings. Fragmentes are collected in PREROUTING/OUTPUT, are defragmented, the defragmented packet is then passed to conntrack, the resulting conntrack information is attached to each original fragment and the fragments then continue their way through the stack. Helper invocation occurs in the POSTROUTING hook, at which point only the original fragments are available. The result of this is that fragmented packets are never passed to helpers. This patch improves the situation in the following way: - If a reassembled packet belongs to a connection that has a helper assigned, the reassembled packet is passed through the stack instead of the original fragments. - During defragmentation, the largest received fragment size is stored. On output, the packet is refragmented if required. If the largest received fragment size exceeds the outgoing MTU, a "packet too big" message is generated, thus behaving as if the original fragments were passed through the stack from an outside point of view. - The ipv6_helper() hook function can't receive fragments anymore for connections using a helper, so it is switched to use ipv6_skip_exthdr() instead of the netfilter specific nf_ct_ipv6_skip_exthdr() and the reassembled packets are passed to connection tracking helpers. The result of this is that we can properly track fragmented packets, but still generate ICMPv6 Packet too big messages if we would have before. This patch is also required as a precondition for IPv6 NAT, where NAT helpers might enlarge packets up to a point that they require fragmentation. In that case we can't generate Packet too big messages since the proper MTU can't be calculated in all cases (f.i. when changing textual representation of a variable amount of addresses), so the packet is transparently fragmented iff the original packet or fragments would have fit the outgoing MTU. IPVS parts by Jesper Dangaard Brouer <brouer@redhat.com>. Change-Id: I75d83668e7de723fb271232f475f46f4037a4a4f Signed-off-by: Patrick McHardy <kaber@trash.net>	2018-12-07 22:02:09 +04:00
Pablo Neira Ayuso	191c3003e8	netfilter: add user-space connection tracking helper infrastructure There are good reasons to supports helpers in user-space instead: * Rapid connection tracking helper development, as developing code in user-space is usually faster. * Reliability: A buggy helper does not crash the kernel. Moreover, we can monitor the helper process and restart it in case of problems. * Security: Avoid complex string matching and mangling in kernel-space running in privileged mode. Going further, we can even think about running user-space helpers as a non-root process. * Extensibility: It allows the development of very specific helpers (most likely non-standard proprietary protocols) that are very likely not to be accepted for mainline inclusion in the form of kernel-space connection tracking helpers. This patch adds the infrastructure to allow the implementation of user-space conntrack helpers by means of the new nfnetlink subsystem `nfnetlink_cthelper' and the existing queueing infrastructure (nfnetlink_queue). I had to add the new hook NF_IP6_PRI_CONNTRACK_HELPER to register ipv[4\|6]_helper which results from splitting ipv[4\|6]_confirm into two pieces. This change is required not to break NAT sequence adjustment and conntrack confirmation for traffic that is enqueued to our user-space conntrack helpers. Basic operation, in a few steps: 1) Register user-space helper by means of `nfct': nfct helper add ftp inet tcp [ It must be a valid existing helper supported by conntrack-tools ] 2) Add rules to enable the FTP user-space helper which is used to track traffic going to TCP port 21. For locally generated packets: iptables -I OUTPUT -t raw -p tcp --dport 21 -j CT --helper ftp For non-locally generated packets: iptables -I PREROUTING -t raw -p tcp --dport 21 -j CT --helper ftp 3) Run the test conntrackd in helper mode (see example files under doc/helper/conntrackd.conf conntrackd 4) Generate FTP traffic going, if everything is OK, then conntrackd should create expectations (you can check that with `conntrack': conntrack -E expect [NEW] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp [DESTROY] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp This confirms that our test helper is receiving packets including the conntrack information, and adding expectations in kernel-space. The user-space helper can also store its private tracking information in the conntrack structure in the kernel via the CTA_HELP_INFO. The kernel will consider this a binary blob whose layout is unknown. This information will be included in the information that is transfered to user-space via glue code that integrates nfnetlink_queue and ctnetlink. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: Ifad12a30d86a6eb0b72d20f079336a24348d711b	2018-12-07 22:02:09 +04:00
Hans Schillstrom	f1d65de19b	netfilter: ip6_tables: add flags parameter to ipv6_find_hdr() This patch adds the flags parameter to ipv6_find_hdr. This flags allows us to: * know if this is a fragment. * stop at the AH header, so the information contained in that header can be used for some specific packet handling. This patch also adds the offset parameter for inspection of one inner IPv6 header that is contained in error messages. Change-Id: I8aa13399597dcb72c73084bcd7f8ca4156326357 Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-12-07 22:02:06 +04:00
Pablo Neira Ayuso	a2688d5375	netfilter: remove ip_queue support This patch removes ip_queue support which was marked as obsolete years ago. The nfnetlink_queue modules provides more advanced user-space packet queueing mechanism. This patch also removes capability code included in SELinux that refers to ip_queue. Otherwise, we break compilation. Several warning has been sent regarding this to the mailing list in the past month without anyone rising the hand to stop this with some strong argument. Change-Id: I62ab355af31e708b3c1000f2252c8196fb8ba428 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-12-07 22:00:11 +04:00
Johannes Berg	dca89a7ac0	ipv6: add option to drop unsolicited neighbor advertisements In certain 802.11 wireless deployments, there will be NA proxies that use knowledge of the network to correctly answer requests. To prevent unsolicitd advertisements on the shared medium from being a problem, on such deployments wireless needs to drop them. Enable this by providing an option called "drop_unsolicited_na". Change-Id: I2567a9973e72165a8e546f3638b509fbd1c95298 Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-07 21:59:38 +04:00
Johannes Berg	038345d130	ipv6: add option to drop unicast encapsulated in L2 multicast In order to solve a problem with 802.11, the so-called hole-196 attack, add an option (sysctl) called "drop_unicast_in_l2_multicast" which, if enabled, causes the stack to drop IPv6 unicast packets encapsulated in link-layer multi- or broadcast frames. Such frames can (as an attack) be created by any member of the same wireless network and transmitted as valid encrypted frames since the symmetric key for broadcast frames is shared between all stations. Change-Id: I8a0b45fbd533236fbd785e6e8aa20fb780aa1397 Reviewed-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-07 21:59:38 +04:00
Cong Wang	d806df2b3a	ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif As suggested by Julian: Simply, flowi4_iif must not contain 0, it does not look logical to ignore all ip rules with specified iif. because in fib_rule_match() we do: if (rule->iifindex && (rule->iifindex != fl->flowi_iif)) goto out; flowi4_iif should be LOOPBACK_IFINDEX by default. We need to move LOOPBACK_IFINDEX to include/net/flow.h: 1) It is mostly used by flowi_iif 2) Fix the following compile error if we use it in flow.h by the patches latter: In file included from include/linux/netfilter.h:277:0, from include/net/netns/netfilter.h:5, from include/net/net_namespace.h:21, from include/linux/netdevice.h:43, from include/linux/icmpv6.h:12, from include/linux/ipv6.h:61, from include/net/ipv6.h:16, from include/linux/sunrpc/clnt.h:27, from include/linux/nfs_fs.h:30, from init/do_mounts.c:32: include/net/flow.h: In function ‘flowi4_init_output’: include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function) [Backport of net-next `6a662719c9`] Change-Id: Ib7a0a08d78c03800488afa1b2c170cb70e34cfd9 Cc: Eric Biederman <ebiederm@xmission.com> Cc: Julian Anastasov <ja@ssi.bg> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Lorenzo Colitti <lorenzo@google.com>	2018-08-27 14:52:49 +00:00
Florian Westphal	e4a17b5b70	netfilter: xt_rpfilter: skip locally generated broadcast/multicast, too Alex Efros reported rpfilter module doesn't match following packets: IN=br.qemu SRC=192.168.2.1 DST=192.168.2.255 [ .. ] (netfilter bugzilla #814). Problem is that network stack arranges for the locally generated broadcasts to appear on the interface they were sent out, so the IFF_LOOPBACK check doesn't trigger. As -m rpfilter is restricted to PREROUTING, we can check for existing rtable instead, it catches locally-generated broad/multicast case, too. Change-Id: I2d921ac4d53e5b1ca9a5249e489c33e4fa4a4b3a Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-08-27 14:52:48 +00:00
Erik Kline	ec3c4f0012	Revert "netfilter: have ip*t REJECT set the sock err when an icmp is to be sent" This reverts commit `6f489c42a9`. Bug: 28719525 Change-Id: I77707cc93b3c5f0339e6bce36734027586c639d3	2018-08-27 14:52:47 +00:00
Artem Borisov	d7992e6feb	Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 All bluetooth-related changes were omitted because of our ancient incompatible bt stack. Change-Id: I96440b7be9342a9c1adc9476066272b827776e64	2017-12-27 17:13:15 +03:00
Lorenzo Colitti	18c446bb98	net: diag: Support destroying TCP sockets. This implements SOCK_DESTROY for TCP sockets. It causes all blocking calls on the socket to fail fast with ECONNABORTED and causes a protocol close of the socket. It informs the other end of the connection by sending a RST, i.e., initiating a TCP ABORT as per RFC 793. ECONNABORTED was chosen for consistency with FreeBSD. [Backport of net-next c1e64e298b8cad309091b95d8436a0255c84f54a] Change-Id: Ic5410d3a2f39db28a322c30f4b3b2bffd35ec2de Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-15 22:54:58 +03:00
Eric Dumazet	e1e6232909	ipv6: do not clear pinet6 field [ Upstream commit `f77d602124` ] We have seen multiple NULL dereferences in __inet6_lookup_established() After analysis, I found that inet6_sk() could be NULL while the check for sk_family == AF_INET6 was true. Bug was added in linux-2.6.29 when RCU lookups were introduced in UDP and TCP stacks. Once an IPv6 socket, using SLAB_DESTROY_BY_RCU is inserted in a hash table, we no longer can clear pinet6 field. This patch extends logic used in commit `fcbdf09d96` ("net: fix nulls list corruptions in sk_prot_alloc") TCP/UDP/UDPLite IPv6 protocols provide their own .clear_sk() method to make sure we do not clear pinet6 field. At socket clone phase, we do not really care, as cloning the parent (non NULL) pinet6 is not adding a fatal race. Change-Id: Idd03ce3344a5490a896b8a763c6f81bc6ceba173 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-15 22:54:52 +03:00
David S. Miller	9d814faae9	net: Do delayed neigh confirmation. When a dst_confirm() happens, mark the confirmation as pending in the dst. Then on the next packet out, when we have the neigh in-hand, do the update. This removes the dependency in dst_confirm() of dst's having an attached neigh. While we're here, remove the explicit 'dst' NULL check, all except 2 or 3 call sites ensure it's not NULL. So just fix those cases up. Change-Id: I75acf86d11c9125c8ce2c2144e4a0fb717b29f03 Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 13:38:10 +03:00
Eric Dumazet	4998f1cbf7	tcp: fix possible NULL dereference in tcp_vX_send_reset() After commit `ca777eff51` ("tcp: remove dst refcount false sharing for prequeue mode") we have to relax check against skb dst in tcp_v[46]_send_reset() if prequeue dropped the dst. If a socket is provided, a full lookup was done to find this socket, so the dst test can be skipped. Change-Id: I1de6666d0c0503076cb9ca28498b36ea7fb3305b Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=88191 Reported-by: Jaša Bartelj <jasa.bartelj@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Daniel Borkmann <dborkman@redhat.com> Fixes: `ca777eff51` ("tcp: remove dst refcount false sharing for prequeue mode") Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 13:38:09 +03:00
Lorenzo Colitti	966db7310d	net: core: add UID to flows, rules, and routes - Define a new FIB rule attributes, FRA_UID_RANGE, to describe a range of UIDs. - Define a RTA_UID attribute for per-UID route lookups and dumps. - Support passing these attributes to and from userspace via rtnetlink. The value INVALID_UID indicates no UID was specified. - Add a UID field to the flow structures. [Backport of net-next 622ec2c9d52405973c9f1ca5116eb1c393adfc7d] Bug: 16355602 Change-Id: Iea98e6fedd0fd4435a1f4efa3deb3629505619ab Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 13:38:07 +03:00
Simon Shields	f33f0fc3ed	Revert "net: core: Support UID-based routing." This reverts commit dbadd302523011ab746840885b93e8bd0b2ff499. Change-Id: I8d996d5a84a3203443987c844d10534cd0dfc2c1	2017-08-27 19:09:20 +03:00
Eric Dumazet	5ff7bc82fb	tcp: do not lock listener to process SYN packets Everything should now be ready to finally allow SYN packets processing without holding listener lock. Tested: 3.5 Mpps SYNFLOOD. Plenty of cpu cycles available. Next bottleneck is the refcount taken on listener, that could be avoided if we remove SLAB_DESTROY_BY_RCU strict semantic for listeners, and use regular RCU. 13.18% [kernel] [k] __inet_lookup_listener 9.61% [kernel] [k] tcp_conn_request 8.16% [kernel] [k] sha_transform 5.30% [kernel] [k] inet_reqsk_alloc 4.22% [kernel] [k] sock_put 3.74% [kernel] [k] tcp_make_synack 2.88% [kernel] [k] ipt_do_table 2.56% [kernel] [k] memcpy_erms 2.53% [kernel] [k] sock_wfree 2.40% [kernel] [k] tcp_v4_rcv 2.08% [kernel] [k] fib_table_lookup 1.84% [kernel] [k] tcp_openreq_init_rwin Change-Id: I113630b943ccf27a7c57920b03635ea5a986fbaf Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-02 13:03:27 +03:00
David S. Miller	06f8886176	ipv6: Check ip6_find_1stfragopt() return value properly. commit 7dd7eb9513bd02184d45f000ab69d78cb1fa1531 upstream. Do not use unsigned variables to see if it returns a negative error or not. Fixes: 2423496af35d ("ipv6: Prevent overrun when parsing v6 header options") Change-Id: I0e9602acb9965d222904b91a94f289f9317cbe7f Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.2: adjust filenames, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>	2017-06-26 21:06:24 +03:00
Craig Gallek	060af5e881	ipv6: Prevent overrun when parsing v6 header options commit 2423496af35d94a87156b063ea5cedffc10a70a1 upstream. The KASAN warning repoted below was discovered with a syzkaller program. The reproducer is basically: int s = socket(AF_INET6, SOCK_RAW, NEXTHDR_HOP); send(s, &one_byte_of_data, 1, MSG_MORE); send(s, &more_than_mtu_bytes_data, 2000, 0); The socket() call sets the nexthdr field of the v6 header to NEXTHDR_HOP, the first send call primes the payload with a non zero byte of data, and the second send call triggers the fragmentation path. The fragmentation code tries to parse the header options in order to figure out where to insert the fragment option. Since nexthdr points to an invalid option, the calculation of the size of the network header can made to be much larger than the linear section of the skb and data is read outside of it. This fix makes ip6_find_1stfrag return an error if it detects running out-of-bounds. [ 42.361487] ================================================================== [ 42.364412] BUG: KASAN: slab-out-of-bounds in ip6_fragment+0x11c8/0x3730 [ 42.365471] Read of size 840 at addr ffff88000969e798 by task ip6_fragment-oo/3789 [ 42.366469] [ 42.366696] CPU: 1 PID: 3789 Comm: ip6_fragment-oo Not tainted 4.11.0+ #41 [ 42.367628] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014 [ 42.368824] Call Trace: [ 42.369183] dump_stack+0xb3/0x10b [ 42.369664] print_address_description+0x73/0x290 [ 42.370325] kasan_report+0x252/0x370 [ 42.370839] ? ip6_fragment+0x11c8/0x3730 [ 42.371396] check_memory_region+0x13c/0x1a0 [ 42.371978] memcpy+0x23/0x50 [ 42.372395] ip6_fragment+0x11c8/0x3730 [ 42.372920] ? nf_ct_expect_unregister_notifier+0x110/0x110 [ 42.373681] ? ip6_copy_metadata+0x7f0/0x7f0 [ 42.374263] ? ip6_forward+0x2e30/0x2e30 [ 42.374803] ip6_finish_output+0x584/0x990 [ 42.375350] ip6_output+0x1b7/0x690 [ 42.375836] ? ip6_finish_output+0x990/0x990 [ 42.376411] ? ip6_fragment+0x3730/0x3730 [ 42.376968] ip6_local_out+0x95/0x160 [ 42.377471] ip6_send_skb+0xa1/0x330 [ 42.377969] ip6_push_pending_frames+0xb3/0xe0 [ 42.378589] rawv6_sendmsg+0x2051/0x2db0 [ 42.379129] ? rawv6_bind+0x8b0/0x8b0 [ 42.379633] ? _copy_from_user+0x84/0xe0 [ 42.380193] ? debug_check_no_locks_freed+0x290/0x290 [ 42.380878] ? ___sys_sendmsg+0x162/0x930 [ 42.381427] ? rcu_read_lock_sched_held+0xa3/0x120 [ 42.382074] ? sock_has_perm+0x1f6/0x290 [ 42.382614] ? ___sys_sendmsg+0x167/0x930 [ 42.383173] ? lock_downgrade+0x660/0x660 [ 42.383727] inet_sendmsg+0x123/0x500 [ 42.384226] ? inet_sendmsg+0x123/0x500 [ 42.384748] ? inet_recvmsg+0x540/0x540 [ 42.385263] sock_sendmsg+0xca/0x110 [ 42.385758] SYSC_sendto+0x217/0x380 [ 42.386249] ? SYSC_connect+0x310/0x310 [ 42.386783] ? __might_fault+0x110/0x1d0 [ 42.387324] ? lock_downgrade+0x660/0x660 [ 42.387880] ? __fget_light+0xa1/0x1f0 [ 42.388403] ? __fdget+0x18/0x20 [ 42.388851] ? sock_common_setsockopt+0x95/0xd0 [ 42.389472] ? SyS_setsockopt+0x17f/0x260 [ 42.390021] ? entry_SYSCALL_64_fastpath+0x5/0xbe [ 42.390650] SyS_sendto+0x40/0x50 [ 42.391103] entry_SYSCALL_64_fastpath+0x1f/0xbe [ 42.391731] RIP: 0033:0x7fbbb711e383 [ 42.392217] RSP: 002b:00007ffff4d34f28 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 42.393235] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbbb711e383 [ 42.394195] RDX: 0000000000001000 RSI: 00007ffff4d34f60 RDI: 0000000000000003 [ 42.395145] RBP: 0000000000000046 R08: 00007ffff4d34f40 R09: 0000000000000018 [ 42.396056] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000400aad [ 42.396598] R13: 0000000000000066 R14: 00007ffff4d34ee0 R15: 00007fbbb717af00 [ 42.397257] [ 42.397411] Allocated by task 3789: [ 42.397702] save_stack_trace+0x16/0x20 [ 42.398005] save_stack+0x46/0xd0 [ 42.398267] kasan_kmalloc+0xad/0xe0 [ 42.398548] kasan_slab_alloc+0x12/0x20 [ 42.398848] __kmalloc_node_track_caller+0xcb/0x380 [ 42.399224] __kmalloc_reserve.isra.32+0x41/0xe0 [ 42.399654] __alloc_skb+0xf8/0x580 [ 42.400003] sock_wmalloc+0xab/0xf0 [ 42.400346] __ip6_append_data.isra.41+0x2472/0x33d0 [ 42.400813] ip6_append_data+0x1a8/0x2f0 [ 42.401122] rawv6_sendmsg+0x11ee/0x2db0 [ 42.401505] inet_sendmsg+0x123/0x500 [ 42.401860] sock_sendmsg+0xca/0x110 [ 42.402209] ___sys_sendmsg+0x7cb/0x930 [ 42.402582] __sys_sendmsg+0xd9/0x190 [ 42.402941] SyS_sendmsg+0x2d/0x50 [ 42.403273] entry_SYSCALL_64_fastpath+0x1f/0xbe [ 42.403718] [ 42.403871] Freed by task 1794: [ 42.404146] save_stack_trace+0x16/0x20 [ 42.404515] save_stack+0x46/0xd0 [ 42.404827] kasan_slab_free+0x72/0xc0 [ 42.405167] kfree+0xe8/0x2b0 [ 42.405462] skb_free_head+0x74/0xb0 [ 42.405806] skb_release_data+0x30e/0x3a0 [ 42.406198] skb_release_all+0x4a/0x60 [ 42.406563] consume_skb+0x113/0x2e0 [ 42.406910] skb_free_datagram+0x1a/0xe0 [ 42.407288] netlink_recvmsg+0x60d/0xe40 [ 42.407667] sock_recvmsg+0xd7/0x110 [ 42.408022] ___sys_recvmsg+0x25c/0x580 [ 42.408395] __sys_recvmsg+0xd6/0x190 [ 42.408753] SyS_recvmsg+0x2d/0x50 [ 42.409086] entry_SYSCALL_64_fastpath+0x1f/0xbe [ 42.409513] [ 42.409665] The buggy address belongs to the object at ffff88000969e780 [ 42.409665] which belongs to the cache kmalloc-512 of size 512 [ 42.410846] The buggy address is located 24 bytes inside of [ 42.410846] 512-byte region [ffff88000969e780, ffff88000969e980) [ 42.411941] The buggy address belongs to the page: [ 42.412405] page:ffffea000025a780 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 [ 42.413298] flags: 0x100000000008100(slab\|head) [ 42.413729] raw: 0100000000008100 0000000000000000 0000000000000000 00000001800c000c [ 42.414387] raw: ffffea00002a9500 0000000900000007 ffff88000c401280 0000000000000000 [ 42.415074] page dumped because: kasan: bad access detected [ 42.415604] [ 42.415757] Memory state around the buggy address: [ 42.416222] ffff88000969e880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 42.416904] ffff88000969e900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 42.417591] >ffff88000969e980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 42.418273] ^ [ 42.418588] ffff88000969ea00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 42.419273] ffff88000969ea80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 42.419882] ================================================================== Change-Id: I82861005bd93b193037ec73c475c91a735f8c599 Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.2: adjust filenames, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>	2017-06-26 21:06:21 +03:00
WANG Cong	92e4d26280	ipv6/dccp: do not inherit ipv6_mc_list from parent Like commit 657831ffc38e ("dccp/tcp: do not inherit mc_list from parent") we should clear ipv6_mc_list etc. for IPv6 sockets too. Change-Id: I06fe2a15f8f7c75d3cee7abe09b81b8764f6046f Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-07 12:42:53 -06:00
Eric Dumazet	c0138cdce3	ipv6: fix out of bound writes in __ip6_append_data() Andrey Konovalov and idaifish@gmail.com reported crashes caused by one skb shared_info being overwritten from __ip6_append_data() Andrey program lead to following state : copy -4200 datalen 2000 fraglen 2040 maxfraglen 2040 alloclen 2048 transhdrlen 0 offset 0 fraggap 6200 The skb_copy_and_csum_bits(skb_prev, maxfraglen, data + transhdrlen, fraggap, 0); is overwriting skb->head and skb_shared_info Since we apparently detect this rare condition too late, move the code earlier to even avoid allocating skb and risking crashes. Once again, many thanks to Andrey and syzkaller team. Change-Id: Id1fced04f082e49100d0454779dcd90da4017e20 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: <idaifish@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-07 12:30:05 -06:00
Eric Dumazet	8711b467e9	udp: properly support MSG_PEEK with truncated buffers Backport of this upstream commit into stable kernels : 89c22d8c3b27 ("net: Fix skb csum races when peeking") exposed a bug in udp stack vs MSG_PEEK support, when user provides a buffer smaller than skb payload. In this case, skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr), msg->msg_iov); returns -EFAULT. This bug does not happen in upstream kernels since Al Viro did a great job to replace this into : skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg); This variant is safe vs short buffers. For the time being, instead reverting Herbert Xu patch and add back skb->ip_summed invalid changes, simply store the result of udp_lib_checksum_complete() so that we avoid computing the checksum a second time, and avoid the problematic skb_copy_and_csum_datagram_iovec() call. This patch can be applied on recent kernels as it avoids a double checksumming, then backported to stable kernels as a bug fix. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> CVE-2016-10229 Change-Id: I2c372dd5340b004da21394f6fb54d35f94a23b79 (cherry picked from commit 197c949e7798fbf28cfadc69d9ca0c2abbf93191)	2017-04-03 16:12:35 -06:00

1 2 3 4 5 ...

2943 commits