android_kernel_google_msm

mirror of https://github.com/followmsi/android_kernel_google_msm.git synced 2024-11-06 23:17:41 +00:00

Author	SHA1	Message	Date
Liu Yu	3fd6294100	tcp_cubic: fix the range of delayed_ack [ Upstream commit `0cda345d1b` ] commit `b9f47a3aae` (tcp_cubic: limit delayed_ack ratio to prevent divide error) try to prevent divide error, but there is still a little chance that delayed_ack can reach zero. In case the param cnt get negative value, then ratio+cnt would overflow and may happen to be zero. As a result, min(ratio, ACK_RATIO_LIMIT) will calculate to be zero. In some old kernels, such as 2.6.32, there is a bug that would pass negative param, which then ultimately leads to this divide error. commit `5b35e1e6e9` (tcp: fix tcp_trim_head() to adjust segment count with skb MSS) fixed the negative param issue. However, it's safe that we fix the range of delayed_ack as well, to make sure we do not hit a divide by zero. Change-Id: I3897a284a654ea5c0519a81e190ccc130f29d1c3 CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Liu Yu <allanyuliu@tencent.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-10-29 23:12:13 +08:00
Marcelo Leitner	84a5789882	ipv6: addrconf: validate new MTU before applying it Currently we don't check if the new MTU is valid or not and this allows one to configure a smaller than minimum allowed by RFCs or even bigger than interface own MTU, which is a problem as it may lead to packet drops. If you have a daemon like NetworkManager running, this may be exploited by remote attackers by forging RA packets with an invalid MTU, possibly leading to a DoS. (NetworkManager currently only validates for values too small, but not for too big ones.) The fix is just to make sure the new value is valid. That is, between IPV6_MIN_MTU and interface's MTU. Note that similar check is already performed at ndisc_router_discovery(), for when kernel itself parses the RA. Change-Id: Id2c8fd3cb68ae157dc31d663e5439ddecc109c0c Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 23:12:12 +08:00
Hannes Frederic Sowa	dcf5c5397a	net: add validation for the socket syscall protocol argument 郭永刚 reported that one could simply crash the kernel as root by using a simple program: int socket_fd; struct sockaddr_in addr; addr.sin_port = 0; addr.sin_addr.s_addr = INADDR_ANY; addr.sin_family = 10; socket_fd = socket(10,3,0x40000000); connect(socket_fd , &addr,16); AF_INET, AF_INET6 sockets actually only support 8-bit protocol identifiers. inet_sock's skc_protocol field thus is sized accordingly, thus larger protocol identifiers simply cut off the higher bits and store a zero in the protocol fields. This could lead to e.g. NULL function pointer because as a result of the cut off inet_num is zero and we call down to inet_autobind, which is NULL for raw sockets. kernel: Call Trace: kernel: [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70 kernel: [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80 kernel: [<ffffffff81645069>] SYSC_connect+0xd9/0x110 kernel: [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80 kernel: [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200 kernel: [<ffffffff81645e0e>] SyS_connect+0xe/0x10 kernel: [<ffffffff81779515>] tracesys_phase2+0x84/0x89 I found no particular commit which introduced this problem. Change-Id: I653fad90da54908144cc8916c2dccb1fa6f14eed CVE: CVE-2015-8543 Cc: Cong Wang <cwang@twopensource.com> Reported-by: 郭永刚 <guoyonggang@360.cn> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 23:12:11 +08:00
David S. Miller	29b4575124	bluetooth: Validate socket address length in sco_sock_bind(). Change-Id: I890640975f1af64f71947b6a1820249e08f6375b Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 23:12:11 +08:00
Eric Dumazet	504d01b884	net: guard tcp_set_keepalive() to tcp sockets Its possible to use RAW sockets to get a crash in tcp_set_keepalive() / sk_reset_timer() Fix is to make sure socket is a SOCK_STREAM one. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Change-Id: Ieeb498a3e623cfcb54e1c865a3c0229e4acf1e87	2016-10-29 23:12:11 +08:00
Daniel Borkmann	86a0da23c8	netfilter: nf_conntrack_dccp: fix skb_header_pointer API usages Some occurences in the netfilter tree use skb_header_pointer() in the following way ... struct dccp_hdr _dh, *dh; ... skb_header_pointer(skb, dataoff, sizeof(_dh), &dh); ... where dh itself is a pointer that is being passed as the copy buffer. Instead, we need to use &_dh as the forth argument so that we're copying the data into an actual buffer that sits on the stack. Currently, we probably could overwrite memory on the stack (e.g. with a possibly mal-formed DCCP packet), but unintentionally, as we only want the buffer to be placed into _dh variable. Change-Id: Ief6a82b5a58e1dd88d43313eb8356c52cf89b214 Fixes: `2bc780499a` ("[NETFILTER]: nf_conntrack: add DCCP protocol support") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-29 23:12:11 +08:00
D.S. Ljungmark	ba0d2ae061	ipv6: Don't reduce hop limit for an interface A local route may have a lower hop_limit set than global routes do. RFC 3756, Section 4.2.7, "Parameter Spoofing" > 1. The attacker includes a Current Hop Limit of one or another small > number which the attacker knows will cause legitimate packets to > be dropped before they reach their destination. > As an example, one possible approach to mitigate this threat is to > ignore very small hop limits. The nodes could implement a > configurable minimum hop limit, and ignore attempts to set it below > said limit. Change-Id: I51ee1778e3d2d5fa1aefbdf1ad8869e4e8dc28b2 Signed-off-by: D.S. Ljungmark <ljungmark@modio.se> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 23:12:10 +08:00
Sasha Levin	1fea326d57	net: llc: use correct size for sysctl timeout entries The timeout entries are sizeof(int) rather than sizeof(long), which means that when they were getting read we'd also leak kernel memory to userspace along with the timeout values. Change-Id: I328d1186720a6f70f555eeeb62c83ee69814868d Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 23:12:10 +08:00
Eric Dumazet	cc354dd458	udp: fix behavior of wrong checksums We have two problems in UDP stack related to bogus checksums : 1) We return -EAGAIN to application even if receive queue is not empty. This breaks applications using edge trigger epoll() 2) Under UDP flood, we can loop forever without yielding to other processes, potentially hanging the host, especially on non SMP. This patch is an attempt to make things better. We might in the future add extra support for rt applications wanting to better control time spent doing a recv() in a hostile environment. For example we could validate checksums before queuing packets in socket receive queue. Change-Id: I9355321ac7ee564d56c342fa7738b918052bf308 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 23:12:10 +08:00
Florian Westphal	91c6941897	netfilter: conntrack: disable generic tracking for known protocols Given following iptables ruleset: -P FORWARD DROP -A FORWARD -m sctp --dport 9 -j ACCEPT -A FORWARD -p tcp --dport 80 -j ACCEPT -A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT One would assume that this allows SCTP on port 9 and TCP on port 80. Unfortunately, if the SCTP conntrack module is not loaded, this allows all SCTP communication, to pass though, i.e. -p sctp -j ACCEPT, which we think is a security issue. This is because on the first SCTP packet on port 9, we create a dummy "generic l4" conntrack entry without any port information (since conntrack doesn't know how to extract this information). All subsequent packets that are unknown will then be in established state since they will fallback to proto_generic and will match the 'generic' entry. Our originally proposed version [1] completely disabled generic protocol tracking, but Jozsef suggests to not track protocols for which a more suitable helper is available, hence we now mitigate the issue for in tree known ct protocol helpers only, so that at least NAT and direction information will still be preserved for others. [1] http://www.spinics.net/lists/netfilter-devel/msg33430.html Joint work with Daniel Borkmann. Change-Id: I7fff74303d98876efd3e7834555cbf95d0319359 Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-29 23:12:10 +08:00
Hannes Frederic Sowa	ef06d64826	ipv4: try to cache dst_entries which would cause a redirect Not caching dst_entries which cause redirects could be exploited by hosts on the same subnet, causing a severe DoS attack. This effect aggravated since commit `f886497212` ("ipv4: fix dst race in sk_dst_get()"). Lookups causing redirects will be allocated with DST_NOCACHE set which will force dst_release to free them via RCU. Unfortunately waiting for RCU grace period just takes too long, we can end up with >1M dst_entries waiting to be released and the system will run OOM. rcuos threads cannot catch up under high softirq load. Attaching the flag to emit a redirect later on to the specific skb allows us to cache those dst_entries thus reducing the pressure on allocation and deallocation. This issue was discovered by Marcelo Leitner. Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: Marcelo Leitner <mleitner@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net> Change-Id: I53e4b500a4db2f5fece937a42a3bd810b2640c44	2016-10-29 23:12:10 +08:00
Zhao Wei Liew	e8e41c8550	sunrpc: Fix possibly uninitialized variable warnings These warnings are encountered when building with GCC 4.9. Change-Id: I58b0c4f8c2d1724e42bb8037104ef93337c46f3d Signed-off-by: Zhao Wei Liew <zhaoweiliew@gmail.com>	2016-10-29 23:12:09 +08:00
Eric Dumazet	d5127daf88	ipv6: add complete rcu protection around np->opt [ Upstream commit 45f6fad84cc305103b28d73482b344d7f5b76f39 ] This patch addresses multiple problems : UDP/RAW sendmsg() need to get a stable struct ipv6_txoptions while socket is not locked : Other threads can change np->opt concurrently. Dmitry posted a syzkaller (http://github.com/google/syzkaller) program desmonstrating use-after-free. Starting with TCP/DCCP lockless listeners, tcp_v6_syn_recv_sock() and dccp_v6_request_recv_sock() also need to use RCU protection to dereference np->opt once (before calling ipv6_dup_options()) This patch adds full RCU protection to np->opt BUG: 28746669 Change-Id: I207da29ac48bb6dd7c40d65f9e27c4e3ff508da0 Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Pierre Imai <imaipi@google.com>	2016-06-17 02:54:32 +00:00
Al Viro	b1d288e247	net: validate the range we feed to iov_iter_init() in sys_sendto/sys_recvfrom Bug: 28759139 Change-Id: I561a14b514d714838ef539a94275b117d7f475f4 Cc: stable@vger.kernel.org # v3.19 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-15 06:22:38 +00:00
kangjie	beeda1afc1	fix infoleak in rtnetlink the stack object “map” has a total size of 32 bytes. Its last 4 bytes are padding generated by compiler. These padding bytes are not initialized and sent out via “nla_put” Bug: 28620102 Change-Id: I13da380c6fe8abca49e3cf9f05293c02b44d2e5e Signed-off-by: kangjie <kangjielu@gmail.com>	2016-06-15 06:22:23 +00:00
Patrick Tjin	ea1be87b37	Merge branch 'android-msm-flo-3.4-mnc-mr1-security-next' into android-msm-flo-3.4-mnc-mr1 Merge security-next into mnc-mr1 @ `75dfdc8` for August 2016.1	2016-06-09 14:50:21 -07:00
Mohamad Ayyash	1fc765b172	Don't show empty tag stats for unprivileged uids BUG: 27577101 BUG: 27532522 Change-Id: I890831a72e5ad4485fdf30e51a146712b18052ed Signed-off-by: Mohamad Ayyash <mkayyash@google.com Signed-off-by: Patrick Tjin <pattjin@google.com>	2016-06-08 11:29:32 -07:00
Avijit Kanti Das	4afcb8361c	net: Zeroing the structure ethtool_wolinfo in ethtool_get_wol() memset() the structure ethtool_wolinfo that has padded bytes but the padded bytes have not been zeroed out. Bug: 28803952 Change-Id: If3fd2d872a1b1ab9521d937b86a29fc468a8bbfe Signed-off-by: Avijit Kanti Das <avijitnsec@codeaurora.org>	2016-06-02 11:27:42 -07:00
Mohamad Ayyash	066b75616c	Replace %p with %pK to prevent leaking kernel address BUG: 27532522 Change-Id: Ic0710a9a8cfc682acd88ecf3bbfeece2d798c4a4 Signed-off-by: Mohamad Ayyash <mkayyash@google.com>	2016-05-24 19:55:22 +00:00
Erik Kline	ab98584aef	ipv6: sysctl to restrict candidate source addresses Per RFC 6724, section 4, "Candidate Source Addresses": It is RECOMMENDED that the candidate source addresses be the set of unicast addresses assigned to the interface that will be used to send to the destination (the "outgoing" interface). Add a sysctl to enable this behaviour. Signed-off-by: Erik Kline <ek@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> [Simplified back-port of net-next 3985e8a3611a93bb36789f65db862e5700aab65e] Bug: 19470192 Bug: 21832279 Bug: 22464419 Change-Id: Icd96382f814a6f3ea53f05beb98c266b1929c5a3	2015-07-29 13:59:11 +09:00
Eric W. Biederman	60e6a983e5	proc: Usable inode numbers for the namespace file descriptors. Assign a unique proc inode to each namespace, and use that inode number to ensure we only allocate at most one proc inode for every namespace in proc. A single proc inode per namespace allows userspace to test to see if two processes are in the same namespace. This has been a long requested feature and only blocked because a naive implementation would put the id in a global space and would ultimately require having a namespace for the names of namespaces, making migration and certain virtualization tricks impossible. We still don't have per superblock inode numbers for proc, which appears necessary for application unaware checkpoint/restart and migrations (if the application is using namespace file descriptors) but that is now allowd by the design if it becomes important. I have preallocated the ipc and uts initial proc inode numbers so their structures can be statically initialized. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> (cherry picked from commit `98f842e675`)	2015-07-13 11:18:01 -07:00
Erik Kline	80c68458dc	neigh: Better handling of transition to NUD_PROBE state [1] When entering NUD_PROBE state via neigh_update(), perhaps received from userspace, correctly (re)initialize the probes count to zero. This is useful for forcing revalidation of a neighbor (for example if the host is attempting to do DNA [IPv4 4436, IPv6 6059]). [2] Notify listeners when a neighbor goes into NUD_PROBE state. By sending notifications on entry to NUD_PROBE state listeners get more timely warnings of imminent connectivity issues. The current notifications on entry to NUD_STALE have somewhat limited usefulness: NUD_STALE is a perfectly normal state, as is NUD_DELAY, whereas notifications on entry to NUD_FAILURE come after a neighbor reachability problem has been confirmed (typically after three probes). Change-Id: I5b27b806736bad50723d6c48262da82bef760b71 Signed-off-by: Erik Kline <ek@google.com> Acked-By: Lorenzo Colitti <lorenzo@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-22 18:16:42 +09:00
Lorenzo Colitti	86e3b85a50	Set the iif for IPv6 packets as well. This was fixed upstream in a more comprehensive way by `1fb9489bf`, but this one-line fix is consistent with the rest of the 3.4 networking code. Change-Id: I342ca90c6d0213a7aeac9a2b18283530998f81d7 Signed-off-by: Lorenzo Colitti <lorenzo@google.com>	2015-05-20 15:36:18 +09:00
Lorenzo Colitti	73c45b6b8f	net: ping: Return EAFNOSUPPORT when appropriate. 1. For an IPv4 ping socket, ping_check_bind_addr does not check the family of the socket address that's passed in. Instead, make it behave like inet_bind, which enforces either that the address family is AF_INET, or that the family is AF_UNSPEC and the address is 0.0.0.0. 2. For an IPv6 ping socket, ping_check_bind_addr returns EINVAL if the socket family is not AF_INET6. Return EAFNOSUPPORT instead, for consistency with inet6_bind. 3. Make ping_v4_sendmsg and ping_v6_sendmsg return EAFNOSUPPORT instead of EINVAL if an incorrect socket address structure is passed in. 4. Make IPv6 ping sockets be IPv6-only. The code does not support IPv4, and it cannot easily be made to support IPv4 because the protocol numbers for ICMP and ICMPv6 are different. This makes connect(::ffff:192.0.2.1) fail with EAFNOSUPPORT instead of making the socket unusable. Among other things, this fixes an oops that can be triggered by: int s = socket(AF_INET, SOCK_DGRAM, IPPROTO_ICMP); struct sockaddr_in6 sin6 = { .sin6_family = AF_INET6, .sin6_addr = in6addr_any, }; bind(s, (struct sockaddr *) &sin6, sizeof(sin6)); [backport of net 9145736d4862145684009d6a72a6e61324a9439e] Change-Id: If06ca86d9f1e4593c0d6df174caca3487c57a241 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-20 15:24:04 +09:00
Alexey Kuznetsov	fd60bfcac9	tcp: resets are misrouted [ Upstream commit `4c67525849` ] After commit `e2446eaa` ("tcp_v4_send_reset: binding oif to iif in no sock case").. tcp resets are always lost, when routing is asymmetric. Yes, backing out that patch will result in misrouting of resets for dead connections which used interface binding when were alive, but we actually cannot do anything here. What's died that's died and correct handling normal unbound connections is obviously a priority. Comment to comment: > This has few benefits: > 1. tcp_v6_send_reset already did that. It was done to route resets for IPv6 link local addresses. It was a mistake to do so for global addresses. The patch fixes this as well. Actually, the problem appears to be even more serious than guaranteed loss of resets. As reported by Sergey Soloviev <sol@eqv.ru>, those misrouted resets create a lot of arp traffic and huge amount of unresolved arp entires putting down to knees NAT firewalls which use asymmetric routing. Change-Id: Id55da644c73623ca3037ce07c1d1c9bdc8cd9013 Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-20 15:22:49 +09:00
Lorenzo Colitti	bd1fd73d3c	net: ipv4: tcp: Get tcpi_count via file_count() not direct access Commit `0a1544a1d`, which implements a Qualcomm feature called Smart Wireless Interface Manager, added a tcpi_count member to struct tcp_info, and populates it using: atomic_read(&filep->f_count); This causes compiler warnings on 64-bit architectures (e.g., 64-bit ARCH_UM, used by net_test) because f_count is actually an atomic_long_t, and on 64-bit architectures atomic_long_t is a 64-bit number. The difference doesn't matter in practice because the value is cast to a __u8 anyway, but it causes build breaks because we build with -Werror. Instead of using atomic_long_t directly, use the the file_count macro which exists for this purpose. Change-Id: Ie09a0b4e7a5cf128b21eff10c1b34faf5c995356 Signed-off-by: Lorenzo Colitti <lorenzo@google.com>	2015-05-20 15:22:07 +09:00
David S. Miller	31911794b7	ipv4: Missing sk_nulls_node_init() in ping_unhash(). If we don't do that, then the poison value is left in the ->pprev backlink. This can cause crashes if we do a disconnect, followed by a connect(). Tested-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Wen Xu <hotdog3645@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Bug: 20770158 Change-Id: I944eb20fddea190892c2da681d934801d268096a	2015-05-04 10:18:36 -07:00
Erik Kline	1e46eaf4f2	net: ipv6: allow choosing optimistic addresses with use_optimistic The use_optimistic sysctl makes optimistic IPv6 addresses equivalent to preferred addresses for source address selection (e.g., when calling connect()), but it does not allow an application to bind to optimistic addresses. This behaviour is inconsistent - for example, it doesn't make sense for bind() to an optimistic address fail with EADDRNOTAVAIL, but connect() to choose that address outgoing address on the same socket. Bug: 17769720 Bug: 18609055 Change-Id: I9de0d6c92ac45e29d28e318ac626c71806666f13 Signed-off-by: Erik Kline <ek@google.com> Signed-off-by: Lorenzo Colitti <lorenzo@google.com>	2014-12-10 09:58:22 +09:00
Jane Zhou	3cc8cc4884	net/ping: handle protocol mismatching scenario ping_lookup() may return a wrong sock if sk_buff's and sock's protocols dont' match. For example, sk_buff's protocol is ETH_P_IPV6, but sock's sk_family is AF_INET, in that case, if sk->sk_bound_dev_if is zero, a wrong sock will be returned. the fix is to "continue" the searching, if no matching, return NULL. [cherry-pick of net `91a0b60346`] Bug: 18512516 Change-Id: I520223ce53c0d4e155c37d6b65a03489cc7fd494 Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Jane Zhou <a17711@motorola.com> Signed-off-by: Yiwei Zhao <gbjc64@motorola.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Lorenzo Colitti <lorenzo@google.com> (cherry picked from commit 08a90f2f7ecbe6e03e43acfd7aaf26bc68c73354)	2014-12-01 16:45:58 -08:00
Erik Kline	efe8261b88	net: ipv6: Add a sysctl to make optimistic addresses useful candidates Add a sysctl that causes an interface's optimistic addresses to be considered equivalent to other non-deprecated addresses for source address selection purposes. Preferred addresses will still take precedence over optimistic addresses, subject to other ranking in the source address selection algorithm. This is useful where different interfaces are connected to different networks from different ISPs (e.g., a cell network and a home wifi network). The current behaviour complies with RFC 3484/6724, and it makes sense if the host has only one interface, or has multiple interfaces on the same network (same or cooperating administrative domain(s), but not in the multiple distinct networks case. For example, if a mobile device has an IPv6 address on an LTE network and then connects to IPv6-enabled wifi, while the wifi IPv6 address is undergoing DAD, IPv6 connections will try use the wifi default route with the LTE IPv6 address, and will get stuck until they time out. Also, because optimistic nodes can receive frames, issue an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMSTIC flag appropriately set). A second RTM_NEWADDR is sent if DAD completes (the address flags have changed), otherwise an RTM_DELADDR is sent. Also: add an entry in ip-sysctl.txt for optimistic_dad. [backport of net-next 7fd2561e4ebdd070ebba6d3326c4c5b13942323f] Signed-off-by: Erik Kline <ek@google.com> Acked-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Bug: 17769720 Bug: 18180674 Change-Id: I440a9b8c788db6767d191bbebfd2dff481aa9e0d	2014-12-01 19:37:26 +00:00
Will Drewry	06b878fc11	net/compat.c,linux/filter.h: share compat_sock_fprog Any other users of bpf_*_filter that take a struct sock_fprog from userspace will need to be able to also accept a compat_sock_fprog if the arch supports compat calls. This change allows the existing compat_sock_fprog be shared. Signed-off-by: Will Drewry <wad@chromium.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Eric Paris <eparis@redhat.com> v18: tasered by the apostrophe police v14: rebase/nochanges v13: rebase on to `88ebdda615` v12: rebase on to linux-next v11: introduction	2014-10-31 19:46:10 -07:00
Will Drewry	5171cc5332	sk_run_filter: add BPF_S_ANC_SECCOMP_LD_W Introduces a new BPF ancillary instruction that all LD calls will be mapped through when skb_run_filter() is being used for seccomp BPF. The rewriting will be done using a secondary chk_filter function that is run after skb_chk_filter. The code change is guarded by CONFIG_SECCOMP_FILTER which is added, along with the seccomp_bpf_load() function later in this series. This is based on http://lkml.org/lkml/2012/3/2/141 Suggested-by: Indan Zupancic <indan@nul.nu> Signed-off-by: Will Drewry <wad@chromium.org> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Eric Paris <eparis@redhat.com> v18: rebase ... v15: include seccomp.h explicitly for when seccomp_bpf_load exists. v14: First cut using a single additional instruction ... v13: made bpf functions generic.	2014-10-31 19:46:10 -07:00
Sasha Levin	f7820fdf25	net/l2tp: don't fall back on UDP [get\|set]sockopt The l2tp [get\|set]sockopt() code has fallen back to the UDP functions for socket option levels != SOL_PPPOL2TP since day one, but that has never actually worked, since the l2tp socket isn't an inet socket. As David Miller points out: "If we wanted this to work, it'd have to look up the tunnel and then use tunnel->sk, but I wonder how useful that would be" Since this can never have worked so nobody could possibly have depended on that functionality, just remove the broken code and return -EINVAL. Reported-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: James Chapman <jchapman@katalix.com> Acked-by: David Miller <davem@davemloft.net> Cc: Phil Turnbull <phil.turnbull@oracle.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Willy Tarreau <w@1wt.eu> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ed Tam <etam@google.com>	2014-07-24 15:36:39 -07:00
Lorenzo Colitti	553313899d	net: core: Support UID-based routing. This contains the following commits: 1. 0149763 net: core: Add a UID range to fib rules. 2. 1650474 net: core: Use the socket UID in routing lookups. 3. 0b16771 net: ipv4: Add the UID to the route cache. 4. ee058f1 net: core: Add a RTA_UID attribute to routes. This is so that userspace can do per-UID route lookups. Bug: 15413527 Change-Id: I1285474c6734614d3bda6f61d88dfe89a4af7892 Signed-off-by: Lorenzo Colitti <lorenzo@google.com>	2014-07-08 16:55:52 -07:00
Tarun Karra	cc62b3e282	msm: kgsl: Restructure IOMMU clock management Restructure IOMMU clock management code to enable/disable clocks based on iommu unit instead of context. In hardware IOMMU clocks are associated with each IOMMU unit. There is no granularity of clock control for each iommu context. Turning on the clocks for one context in a IOMMU unit automatically turns on the clocks for all the contexts in IOMMU unit. Reduce the number of clock enable/disable calls and structure the software design as per hardware design by doing clock management per IOMMU unit. Change-Id: Ib54dbb7a31d6b30b65c07ef130bee149e3e587d8 Signed-off-by: Tarun Karra <tkarra@codeaurora.org>	2014-06-24 10:22:15 -06:00
Andreas Henriksson	e21dad10c1	net: Fix "ip rule delete table 256" [ Upstream commit `13eb2ab2d3` ] When trying to delete a table >= 256 using iproute2 the local table will be deleted. The table id is specified as a netlink attribute when it needs more then 8 bits and iproute2 then sets the table field to RT_TABLE_UNSPEC (0). Preconditions to matching the table id in the rule delete code doesn't seem to take the "table id in netlink attribute" into condition so the frh_get_table helper function never gets to do its job when matching against current rule. Use the helper function twice instead of peaking at the table value directly. Originally reported at: http://bugs.debian.org/724783 Reported-by: Nicolas HICHER <nhicher@avencall.com> Signed-off-by: Andreas Henriksson <andreas@fatal.se> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-19 16:59:39 -07:00
Lorenzo Colitti	0a62a7f918	net: support marking accepting TCP sockets When using mark-based routing, sockets returned from accept() may need to be marked differently depending on the incoming connection request. This is the case, for example, if different socket marks identify different networks: a listening socket may want to accept connections from all networks, but each connection should be marked with the network that the request came in on, so that subsequent packets are sent on the correct network. This patch adds a sysctl to mark TCP sockets based on the fwmark of the incoming SYN packet. If enabled, and an unmarked socket receives a SYN, then the SYN packet's fwmark is written to the connection's inet_request_sock, and later written back to the accepted socket when the connection is established. If the socket already has a nonzero mark, then the behaviour is the same as it is today, i.e., the listening socket's fwmark is used. Black-box tested using user-mode linux: - IPv4/IPv6 SYN+ACK, FIN, etc. packets are routed based on the mark of the incoming SYN packet. - The socket returned by accept() is marked with the mark of the incoming SYN packet. - Tested with syncookies=1 and syncookies=2. Change-Id: I5e8c9b989762a93f3eb5a0c1b4df44f62d57f3cb Signed-off-by: Lorenzo Colitti <lorenzo@google.com>	2014-05-12 22:43:02 -07:00
Lorenzo Colitti	2f9890617f	net: add a sysctl to reflect the fwmark on replies Kernel-originated IP packets that have no user socket associated with them (e.g., ICMP errors and echo replies, TCP RSTs, etc.) are emitted with a mark of zero. Add a sysctl to make them have the same mark as the packet they are replying to. This allows an administrator that wishes to do so to use mark-based routing, firewalling, etc. for these replies by marking the original packets inbound. Tested using user-mode linux: - ICMP/ICMPv6 echo replies and errors. - TCP RST packets (IPv4 and IPv6). Change-Id: I95d896647b278d092ef331d1377b959da1deb042 Signed-off-by: Lorenzo Colitti <lorenzo@google.com>	2014-05-12 22:39:57 -07:00
Lorenzo Colitti	35b2ab13ba	net: ipv6: autoconf routes into per-device tables Currently, IPv6 router discovery always puts routes into RT6_TABLE_MAIN. This causes problems for connection managers that want to support multiple simultaneous network connections and want control over which one is used by default (e.g., wifi and wired). To work around this connection managers typically take the routes they prefer and copy them to static routes with low metrics in the main table. This puts the burden on the connection manager to watch netlink to see if the routes have changed, delete the routes when their lifetime expires, etc. Instead, this patch adds a per-interface sysctl to have the kernel put autoconf routes into different tables. This allows each interface to have its own autoconf table, and choosing the default interface (or using different interfaces at the same time for different types of traffic) can be done using appropriate ip rules. The sysctl behaves as follows: - = 0: default. Put routes into RT6_TABLE_MAIN as before. - > 0: manual. Put routes into the specified table. - < 0: automatic. Add the absolute value of the sysctl to the device's ifindex, and use that table. The automatic mode is most useful in conjunction with net.ipv6.conf.default.accept_ra_rt_table. A connection manager or distribution could set it to, say, -100 on boot, and thereafter just use IP rules. Change-Id: I093d39fb06ec413905dc0d0d5792c1bc5d5c73a9 Signed-off-by: Lorenzo Colitti <lorenzo@google.com>	2014-05-12 22:28:20 -07:00
Lorenzo Colitti	efa89f2974	net: ipv6: ping: Use socket mark in routing lookup [net-next commit `bf439b3154`] Change-Id: I8356e9132088c75d4510021c6e4c2641d772087a Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-12 22:28:15 -07:00
JP Abgrall	b656f8346b	net: ipv4: current group_info should be put after using. Plug a group_info refcount leak in ping_init. group_info is only needed during initialization and the code failed to release the reference on exit. While here move grabbing the reference to a place where it is actually needed. Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com> Signed-off-by: Zhang Dongxing <dongxing.zhang@intel.com> Signed-off-by: xiaoming wang <xiaoming.wang@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Conflicts: net/ipv4/ping.c Change-Id: I51931e439cce7f19b4179f5828d7355bb3b1dcfa	2014-04-28 11:44:02 -07:00
Ruchi Kandoi	3680c1c730	nf: Remove compilation error caused by e254d2c28c880da28626af6d53b7add5f7d6afee Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>	2014-04-24 14:32:29 -07:00
Ruchi Kandoi	e00c4d08c5	nf: IDLETIMER: time-stamp and suspend/resume handling. Message notifications contains an additional timestamp field in nano seconds. The expiry time for the timers are modified during suspend/resume. If timer was supposed to expire while the system is suspended then a notification is sent when it resumes with the timestamp of the scheduled expiry. Removes the race condition for multiple work scheduled. Bug: 13247811 Change-Id: I752c5b00225fe7085482819f975cc0eb5af89bff Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>	2014-04-24 13:27:45 -07:00
Mihir Shete	67acf1f09b	cfg80211: add flags to define country IE processing rules 802.11 cards may have different country IE parsing behavioural preferences and vendors may want to support these. These preferences were managed by the WIPHY_FLAG_CUSTOM_REGULATORY and the WIPHY_FLAG_STRICT_REGULATORY flags and their combination. Instead of using this existing notation, split out the country IE behavioural preferences to a new flag. This will allow us to add more customizations easily and make the code more maintainable. Also add a new flag to disable country IE hints issued by the CORE as the first customization. Change-Id: I66ba4a92ac0f029a115eea0a274b02db11279787 CRs-Fixed: 542802 Signed-off-by: Mihir Shete <smihir@codeaurora.org>	2014-02-10 15:57:17 -08:00
JP Abgrall	36eb1e171d	tcp: add a sysctl to config the tcp_default_init_rwnd The default initial rwnd is hardcoded to 10. Now we allow it to be controlled via /proc/sys/net/ipv4/tcp_default_init_rwnd which limits the values from 3 to 100 This is somewhat needed because ipv6 routes are autoconfigured by the kernel. See "An Argument for Increasing TCP's Initial Congestion Window" in https://developers.google.com/speed/articles/tcp_initcwnd_paper.pdf Change-Id: I386b2a9d62de0ebe05c1ebe1b4bd91b314af5c54 Signed-off-by: JP Abgrall <jpa@google.com> Conflicts: include/net/tcp.h	2014-02-07 15:45:23 -08:00
JP Abgrall	cfe519425c	nf: xt_qtaguid: fix handling for cases where tunnels are used. * fix skb->dev vs par->in/out When there is some forwarding going on, it introduces extra state around devs associated with xt_action_param->in/out and sk_buff->dev. E.g. par->in and par->out are both set, or skb->dev and par->out are both set (and different) This would lead qtaguid to make the wrong assumption about the direction and update the wrong device stats. Now we rely more on par->in/out. * Fix handling when qtaguid is used as "owner" When qtaguid is used as an owner module, and sk_socket->file is not there (happens when tunnels are involved), it would incorrectly do a tag stats update. * Correct debug messages. Bug: 11687690 Change-Id: I2b1ff8bd7131969ce9e25f8291d83a6280b3ba7f Signed-off-by: JP Abgrall <jpa@google.com>	2014-02-03 14:30:48 -08:00
Hannes Frederic Sowa	a485048325	ping: prevent NULL pointer dereference on write to msg_name A plain read() on a socket does set msg->msg_name to NULL. So check for NULL pointer first. [Backport of net-next `cf970c002d`] Bug: 12780426 Change-Id: Ida66892d359d2dfbb32803d8b0872bf8c0bb3865 Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Lorenzo Colitti <lorenzo@google.com>	2014-02-03 14:23:50 -08:00
Eric Dumazet	db8bac7934	net-fixes: flow_dissector: prevent an infinite loop (CVE-2013-4348) Jason Wang found that a malicious packet could make skb_flow_dissect() loop forever. We must check that IP header has a valid ihl to avoid this loop. It involves IPIP encapsulation and ihl = 0 to trigger. Given this bug is critical, I cooked a patch before having a fix in upstream kernel. Tested: Compiled/booted Ran some tests on bnx2x and explicitely disabled hardware provided rxhash ethtool -K eth1 rxhash off ethtool -K eth2 rxhash off Google-Bug-Id: 11465355 Effort: net-fixes Change-Id: I813e4dc48cecb05f8edfa218304e1f13fd764323 Signed-off-by: Ed Tam <etam@google.com>	2013-11-14 20:55:00 -08:00
Mathias Krause	17a3bd594f	sock_diag: Fix out-of-bounds access to sock_diag_handlers[] Userland can send a netlink message requesting SOCK_DIAG_BY_FAMILY with a family greater or equal then AF_MAX -- the array size of sock_diag_handlers[]. The current code does not test for this condition therefore is vulnerable to an out-of-bound access opening doors for a privilege escalation. Signed-off-by: Mathias Krause <minipli <at> googlemail.com>	2013-09-25 17:01:53 +00:00
Eric Dumazet	eb5636bdf3	net: defer net_secret[] initialization Instead of feeding net_secret[] at boot time, defer the init at the point first socket is created. This permits some platforms to use better entropy sources than the ones available at boot time. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-25 17:01:47 +00:00

1 2 3 4 5 ...

22983 commits