android_kernel_google_msm

mirror of https://github.com/followmsi/android_kernel_google_msm.git synced 2024-11-06 23:17:41 +00:00

Author	SHA1	Message	Date
Artem Borisov	d7992e6feb	Merge remote-tracking branch 'stable/linux-3.4.y' into lineage-15.1 All bluetooth-related changes were omitted because of our ancient incompatible bt stack. Change-Id: I96440b7be9342a9c1adc9476066272b827776e64	2017-12-27 17:13:15 +03:00
Jason Wang	7c8a60a9e3	act_mirred: do not drop packets when fails to mirror it [ Upstream commit `16c0b164bd` ] We drop packet unconditionally when we fail to mirror it. This is not intended in some cases. Consider for kvm guest, we may mirror the traffic of the bridge to a tap device used by a VM. When kernel fails to mirror the packet in conditions such as when qemu crashes or stops polling the tap, it's hard for the management software to detect such condition and clean the mirroring before. This would lead all packets to the bridge to be dropped and break the network of other virtual machines. To solve the issue, the patch does not drop packets when kernel fails to mirror it, and only drops the redirected packets. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-07 16:02:00 -07:00
stephen hemminger	8da9d4fa43	htb: fix sign extension bug [ Upstream commit `cbd375567f` ] When userspace passes a large priority value the assignment of the unsigned value hopt->prio to signed int cl->prio causes cl->prio to become negative and the comparison is with TC_HTB_NUMPRIO is always false. The result is that HTB crashes by referencing outside the array when processing packets. With this patch the large value wraps around like other values outside the normal range. See: https://bugzilla.kernel.org/show_bug.cgi?id=60669 Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-14 06:02:08 -07:00
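The sign-extension failure described in the htb fix above can be reproduced in a small userspace sketch. The names `demo_prio_signed` and `demo_prio_unsigned` and the value `DEMO_NUMPRIO = 8` are assumptions made for illustration and are not taken from the kernel source. Clamping to the highest valid priority is likewise an assumption about how out-of-range values are handled. Converting an out-of-range unsigned value to `int` is implementation-defined in C; the sketch assumes the usual two's-complement result.

```c
#include <stdint.h>

/* Hypothetical stand-in for TC_HTB_NUMPRIO; the value is an assumption. */
#define DEMO_NUMPRIO 8

/* Pre-fix shape: the priority lives in a signed int, so a huge unsigned
 * input turns negative and slips past the upper-bound check. */
static int demo_prio_signed(uint32_t user_prio)
{
    int prio = (int)user_prio;      /* 0xFFFFFFFF becomes -1 on common ABIs */
    if (prio >= DEMO_NUMPRIO)
        prio = DEMO_NUMPRIO - 1;
    return prio;                    /* can be negative: an invalid array index */
}

/* Post-fix shape: keeping the value unsigned lets the bound check catch it. */
static uint32_t demo_prio_unsigned(uint32_t user_prio)
{
    uint32_t prio = user_prio;
    if (prio >= DEMO_NUMPRIO)
        prio = DEMO_NUMPRIO - 1;
    return prio;
}
```

With the signed variant a caller would index a per-priority array with a negative value, which matches the out-of-bounds reference the message describes.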
Dan Carpenter	d5c50d2b4a	net_sched: info leak in atm_tc_dump_class() [ Upstream commit `8cb3b9c364` ] The "pvc" struct has a hole after pvc.sap_family which is not cleared. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-08-11 15:38:45 -07:00
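The padding-hole leak described above is usually closed by zeroing the whole object before filling its fields. The following is a minimal userspace sketch of that pattern; `struct demo_pvc` and `demo_fill_pvc` are hypothetical stand-ins, not the kernel's `pvc` struct or its dump routine, and the presence of a hole after the 16-bit field depends on the ABI.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: on typical ABIs there is a padding hole between
 * the 16-bit field and the int that follows it. */
struct demo_pvc {
    unsigned short family;
    int itf;
};

/* Zero the whole object first so padding bytes never carry stale stack
 * contents, then fill in the named fields. */
static void demo_fill_pvc(struct demo_pvc *out, unsigned short family, int itf)
{
    memset(out, 0, sizeof(*out));
    out->family = family;
    out->itf = itf;
}
```

Without the `memset`, whatever bytes happened to sit in the hole would be copied out along with the named fields.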
David S. Miller	565144e976	net_sched: Fix stack info leak in cbq_dump_wrr(). [ Upstream commit `a0db856a95` ] Make sure the reserved fields, and padding (if any), are fully initialized. Based upon a patch by Dan Carpenter and feedback from Joe Perches. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-08-11 15:38:44 -07:00
Jamal Hadi Salim	ddb85c714d	net_sched: act_ipt forward compat with xtables [ Upstream commit `0dcffd0964` ] Deal with changes in newer xtables while maintaining backward compatibility. Thanks to Jan Engelhardt for suggestions. Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-05-19 10:54:45 -07:00
Vasily Averin	815b101862	cbq: incorrect processing of high limits [ Upstream commit `f0f6ee1f70` ] currently cbq works incorrectly for limits > 10% real link bandwidth, and practically does not work for limits > 50% real link bandwidth. Below are results of experiments taken on 1 Gbit link In shaper | Actual Result -----------+--------------- 100M | 108 Mbps 200M | 244 Mbps 300M | 412 Mbps 500M | 893 Mbps This happens because q->now changes incorrectly in cbq_dequeue(): when it is called before real end of packet transmitting, L2T is greater than real time delay, q->now gets an extra boost but never compensates for it. To fix this problem we prevent change of q->now until its synchronization with real time. Signed-off-by: Vasily Averin <vvs@openvz.org> Reviewed-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-05-01 09:41:06 -07:00
Tianyi Gou	53261a559d	net: sched: export an api to enable/disable flow on sch Export a function from sch_api.c that will look up desired qdisc and call its registered change function to enable/disable flow. Change-Id: I5b6dc7a6fd2b09b796c92b3770ba83423d19c864 CRs-Fixed: 355156 Acked-by: Jimi Shah <jimis@qualcomm.com> Signed-off-by: Tianyi Gou <tgou@codeaurora.org> (cherry picked from commit b8419fe690053b76658d49565c57ac654faf2eaa) (cherry picked from commit 3a30e7aa4487f56a74f12c12f11cece6ce1f2100)	2013-03-07 15:20:04 -08:00
Tianyi Gou	9a3bb8a6d6	net: sched: Schedule PRIO qdisc when flow control released The PRIO qdisc supports flow control, such that packet dequeue can be disabled based on boolean flag 'enable_flow'. When flow is re-enabled, the latency for new packets arriving at network driver is high. To reduce the delay in scheduling packets, the qdisc will now invoke __netif_schedule() to expedite dequeue. This significantly reduces the latency of packets arriving at network driver. Change-Id: Ic5fe3faf86f177300d3018b9f60974ba3811641c CRs-Fixed: 355156 Acked-by: Jimi Shah <jimis@qualcomm.com> Signed-off-by: Tianyi Gou <tgou@codeaurora.org>	2013-02-27 18:18:56 -08:00
Tianyi Gou	6d473f734c	net_sched: Add flow control support to prio qdisc Add enable_flow flag to the prio qdisc. Packet flow is enabled by default, but can be disabled from userspace (e.g. IPROUTE2 tc tool). This allows for suspending packet dequeue on a per-qdisc basis, which is needed to support Quality of Service (QOS) when using WWAN modem. Change-Id: I932f296be946f1acc3b00c7d8569bbb733d33622 Acked-by: Andrew Richardson <randrew@qualcomm.com> CRs-Fixed: 283471 Signed-off-by: Tianyi Gou <tgou@codeaurora.org>	2013-02-25 11:37:01 -08:00
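The `enable_flow` behaviour described above (dequeue suspended while the flag is cleared, packets kept in the queue) can be sketched in userspace as follows. Only the flag name comes from the commit message; `struct demo_prio`, the ring buffer, and the function names are hypothetical and much simpler than a real qdisc.

```c
#include <stddef.h>

#define DEMO_QLEN 4

/* Hypothetical single-band queue with a flow-control flag. */
struct demo_prio {
    int enable_flow;            /* 1 by default; cleared to suspend dequeue */
    int pkts[DEMO_QLEN];
    size_t head, count;
};

static void demo_prio_init(struct demo_prio *q)
{
    q->enable_flow = 1;         /* packet flow is enabled by default */
    q->head = 0;
    q->count = 0;
}

/* Returns 0 on success, -1 if the queue is full. */
static int demo_prio_enqueue(struct demo_prio *q, int pkt)
{
    if (q->count == DEMO_QLEN)
        return -1;
    q->pkts[(q->head + q->count) % DEMO_QLEN] = pkt;
    q->count++;
    return 0;
}

/* Returns 0 and stores a packet in *pkt, or -1 if nothing may be dequeued
 * (queue empty, or flow currently disabled). */
static int demo_prio_dequeue(struct demo_prio *q, int *pkt)
{
    if (!q->enable_flow || q->count == 0)
        return -1;
    *pkt = q->pkts[q->head];
    q->head = (q->head + 1) % DEMO_QLEN;
    q->count--;
    return 0;
}
```

Enqueue is deliberately unaffected by the flag, so packets accumulate while flow is disabled and drain once it is re-enabled.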
Stefan Hasko	c7078c2c5d	net: sched: integer overflow fix [ Upstream commit `d2fe85da52` ] Fixed integer overflow in function htb_dequeue Signed-off-by: Stefan Hasko <hasko.stevo@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-11 09:07:14 -08:00
Paolo Valente	5ee708f19b	pkt_sched: fix virtual-start-time update in QFQ [ Upstream commit `7126195697` ] If the old timestamps of a class, say cl, are stale when the class becomes active, then QFQ may assign to cl a much higher start time than the maximum value allowed. This may happen when QFQ assigns to the start time of cl the finish time of a group whose classes are characterized by a higher value of the ratio max_class_pkt/weight_of_the_class with respect to that of cl. Inserting a class with a too high start time into the bucket list corrupts the data structure and may eventually lead to crashes. This patch limits the maximum start time assigned to a class. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-10-13 05:38:42 +09:00
Eric Dumazet	52ee75479f	net-sched: sch_cbq: avoid infinite loop [ Upstream commit `bdfc87f7d1` ] It's possible to set up a bad cbq configuration leading to an infinite loop in cbq_classify() DEV_OUT=eth0 ICMP="match ip protocol 1 0xff" U32="protocol ip u32" DST="match ip dst" tc qdisc add dev $DEV_OUT root handle 1: cbq avpkt 1000 \ bandwidth 100mbit tc class add dev $DEV_OUT parent 1: classid 1:1 cbq \ rate 512kbit allot 1500 prio 5 bounded isolated tc filter add dev $DEV_OUT parent 1: prio 3 $U32 \ $ICMP $DST 192.168.3.234 flowid 1: Reported-by: Denys Fedoryschenko <denys@visp.net.lb> Tested-by: Denys Fedoryschenko <denys@visp.net.lb> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-10-13 05:38:42 +09:00
Hiroaki SHIMODA	7e0c71a9a5	net_sched: gact: Fix potential panic in tcf_gact(). [ Upstream commit `696ecdc106` ] gact_rand array is accessed by gact->tcfg_ptype whose value is assumed to be less than MAX_RAND, but no range check is performed. So add a check in tcf_gact_init(). And in tcf_gact(), we can reduce a branch. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-10-02 10:29:34 -07:00
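The idea in the gact fix above, validating the index once at init time so the hot path can use it unchecked, might look like the following userspace sketch. `DEMO_MAX_RAND`, the handler table, and all function names are hypothetical; only the pattern (range check in init, unconditional table lookup afterwards) is taken from the message.

```c
#include <errno.h>

#define DEMO_MAX_RAND 3     /* hypothetical; stands in for MAX_RAND */

typedef int (*demo_rand_fn)(void);

static int demo_none(void)    { return 0; }
static int demo_netrand(void) { return 1; }
static int demo_determ(void)  { return 2; }

/* Table indexed by the user-supplied probability type. */
static const demo_rand_fn demo_rand_table[DEMO_MAX_RAND] = {
    demo_none, demo_netrand, demo_determ,
};

struct demo_gact {
    unsigned int ptype;
};

/* Reject out-of-range types up front instead of trusting user input. */
static int demo_gact_init(struct demo_gact *g, unsigned int ptype)
{
    if (ptype >= DEMO_MAX_RAND)
        return -EINVAL;
    g->ptype = ptype;
    return 0;
}

/* Fast path: the index was validated in init, so no branch is needed. */
static int demo_gact_act(const struct demo_gact *g)
{
    return demo_rand_table[g->ptype]();
}
```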
Alan Cox	d5eeca5f5c	sch_sfb: Fix missing NULL check [ Upstream commit `7ac2908e4b` ] Resolves-bug: https://bugzilla.kernel.org/show_bug.cgi?id=44461 Signed-off-by: Alan Cox <alan@linux.intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-08-09 08:31:42 -07:00
Eric Dumazet	f7a47ee346	netem: add limitation to reordered packets [ Upstream commit `960fb66e52` ] Fix two netem bugs : 1) When a frame was dropped by tfifo_enqueue(), drop counter was incremented twice. 2) When reordering is triggered, we enqueue a packet without checking queue limit. This can OOM pretty fast when this is repeated enough, since skbs are orphaned, no socket limit can help in this situation. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mark Gordon <msg@google.com> Cc: Andreas Terzis <aterzis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-08-09 08:31:41 -07:00
Eric Dumazet	116a0fc31c	netem: fix possible skb leak skb_checksum_help(skb) can return an error, we must free skb in this case. qdisc_drop(skb, sch) can also be fed with a NULL skb (if skb_unshare() failed), so let's use this generic helper. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 13:40:48 -04:00
David Ward	244b65dbfe	net_sched: gred: Fix oops in gred_dump() in WRED mode A parameter set exists for WRED mode, called wred_set, to hold the same values for qavg and qidlestart across all VQs. The WRED mode values had been previously held in the VQ for the default DP. After these values were moved to wred_set, the VQ for the default DP was no longer created automatically (so that it could be omitted on purpose, to have packets in the default DP enqueued directly to the device without using RED). However, gred_dump() was overlooked during that change; in WRED mode it still reads qavg/qidlestart from the VQ for the default DP, which might not even exist. As a result, this command sequence will cause an oops: tc qdisc add dev $DEV handle $HANDLE parent $PARENT gred setup \ DPs 3 default 2 grio tc qdisc change dev $DEV handle $HANDLE gred DP 0 prio 8 $RED_OPTIONS tc qdisc change dev $DEV handle $HANDLE gred DP 1 prio 8 $RED_OPTIONS This fixes gred_dump() in WRED mode to use the values held in wred_set. Signed-off-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-16 23:51:07 -04:00
Linus Torvalds	3b59bf0816	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking merge from David Miller: "1) Move ixgbe driver over to purely page based buffering on receive. From Alexander Duyck. 2) Add receive packet steering support to e1000e, from Bruce Allan. 3) Convert TCP MD5 support over to RCU, from Eric Dumazet. 4) Reduce cpu usage in handling out-of-order TCP packets on modern systems, also from Eric Dumazet. 5) Support the IP{,V6}_UNICAST_IF socket options, making the wine folks happy, from Erich Hoover. 6) Support VLAN trunking from guests in hyperv driver, from Haiyang Zhang. 7) Support byte-queue-limtis in r8169, from Igor Maravic. 8) Outline code intended for IP_RECVTOS in IP_PKTOPTIONS existed but was never properly implemented, Jiri Benc fixed that. 9) 64-bit statistics support in r8169 and 8139too, from Junchang Wang. 10) Support kernel side dump filtering by ctmark in netfilter ctnetlink, from Pablo Neira Ayuso. 11) Support byte-queue-limits in gianfar driver, from Paul Gortmaker. 12) Add new peek socket options to assist with socket migration, from Pavel Emelyanov. 13) Add sch_plug packet scheduler whose queue is controlled by userland daemons using explicit freeze and release commands. From Shriram Rajagopalan. 14) Fix FCOE checksum offload handling on transmit, from Yi Zou." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1846 commits) Fix pppol2tp getsockname() Remove printk from rds_sendmsg ipv6: fix incorrent ipv6 ipsec packet fragment cpsw: Hook up default ndo_change_mtu. 
net: qmi_wwan: fix build error due to cdc-wdm dependecy netdev: driver: ethernet: Add TI CPSW driver netdev: driver: ethernet: add cpsw address lookup engine support phy: add am79c874 PHY support mlx4_core: fix race on comm channel bonding: send igmp report for its master fs_enet: Add MPC5125 FEC support and PHY interface selection net: bpf_jit: fix BPF_S_LDX_B_MSH compilation net: update the usage of CHECKSUM_UNNECESSARY fcoe: use CHECKSUM_UNNECESSARY instead of CHECKSUM_PARTIAL on tx net: do not do gso for CHECKSUM_UNNECESSARY in netif_needs_gso ixgbe: Fix issues with SR-IOV loopback when flow control is disabled net/hyperv: Fix the code handling tx busy ixgbe: fix namespace issues when FCoE/DCB is not enabled rtlwifi: Remove unused ETH_ADDR_LEN defines igbvf: Use ETH_ALEN ... Fix up fairly trivial conflicts in drivers/isdn/gigaset/interface.c and drivers/net/usb/{Kconfig,qmi_wwan.c} as per David.	2012-03-20 21:04:47 -07:00
Linus Torvalds	0d9cabdcce	Merge branch 'for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup changes from Tejun Heo: "Out of the 8 commits, one fixes a long-standing locking issue around tasklist walking and others are cleanups." * 'for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Walk task list under tasklist_lock in cgroup_enable_task_cg_list cgroup: Remove wrong comment on cgroup_enable_task_cg_list() cgroup: remove cgroup_subsys argument from callbacks cgroup: remove extra calls to find_existing_css_set cgroup: replace tasklist_lock with rcu_read_lock cgroup: simplify double-check locking in cgroup_attach_proc cgroup: move struct cgroup_pidlist out from the header file cgroup: remove cgroup_attach_task_current_cg()	2012-03-20 18:11:21 -07:00
David S. Miller	4da0bd7365	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2012-03-18 23:29:41 -04:00
Eric Dumazet	cc34eb672e	sch_sfq: revert dont put new flow at the end of flows This reverts commit `d47a0ac7b6` (sch_sfq: dont put new flow at the end of flows) As Jesper found out, patch sounded great but has bad side effects. In stress situation, pushing new flows in front of the queue can prevent old flows doing any progress. Packets can stay in SFQ queue for unlimited amount of time. It's possible to add heuristics to limit this problem, but this would add complexity outside of SFQ scope. A more sensible answer to Dave Taht concerns (who reported the issue I tried to solve in original commit) is probably to use a qdisc hierarchy so that high prio packets dont enter a potentially crowded SFQ qdisc. Reported-by: Jesper Dangaard Brouer <jdb@comx.dk> Cc: Dave Taht <dave.taht@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-16 01:55:25 -07:00
David S. Miller	ff4783ce78	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/sfc/rx.c Overlapping changes in drivers/net/ethernet/sfc/rx.c, one to change the rx_buf->is_page boolean into a set of u16 flags, and another to adjust how ->ip_summed is initialized. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-26 21:55:51 -05:00
Eric Dumazet	cd961c2ca9	netem: fix dequeue commit `50612537e9` (netem: fix classful handling) added two errors in netem_dequeue() 1) After checking skb at the head of tfifo queue for time constraints, it dequeues tail skb, thus adding unwanted reordering. 2) qdisc stats are updated twice per packet (one when packet dequeued from tfifo, once when delivered) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-19 18:57:50 -05:00
Eric Dumazet	2132cf6437	net_sched: sch_plug: plug_qdisc_ops is static net/sched/sch_plug.c:211:18: warning: symbol 'plug_qdisc_ops' was not declared. Should it be static? Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-13 16:04:40 -05:00
David S. Miller	16bda13d90	net: Make qdisc_skb_cb upper size bound explicit. Just like skb->cb[], so that qdisc_skb_cb can be encapsulated inside of other data structures. This is intended to be used by IPoIB so that it can remember addressing information stored at hard_header_ops->create() time that it can fetch when the packet gets to the transmit routine. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-09 13:50:34 -05:00
Shriram Rajagopalan	c3059be16c	net/sched: sch_plug - Queue traffic until an explicit release command The qdisc supports two operations - plug and unplug. When the qdisc receives a plug command via netlink request, packets arriving henceforth are buffered until a corresponding unplug command is received. Depending on the type of unplug command, the queue can be unplugged indefinitely or selectively. This qdisc can be used to implement output buffering, an essential functionality required for consistent recovery in checkpoint based fault-tolerance systems. Output buffering enables speculative execution by allowing generated network traffic to be rolled back. It is used to provide network protection for Xen Guests in the Remus high availability project, available as part of Xen. This module is generic enough to be used by any other system that wishes to add speculative execution and output buffering to its applications. This module was originally available in the linux 2.6.32 PV-OPS tree, used as dom0 for Xen. For more information, please refer to http://nss.cs.ubc.ca/remus/ and http://wiki.xensource.com/xenwiki/Remus Changes in V3: * Removed debug output (printk) on queue overflow * Added TCQ_PLUG_RELEASE_INDEFINITE - that allows the user to use this qdisc, for simple plug/unplug operations. * Use of packet counts instead of pointers to keep track of the buffers in the queue. Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> Signed-off-by: Brendan Cully <brendan@cs.ubc.ca> [author of the code in the linux 2.6.32 pvops tree] Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-07 12:54:56 -05:00
David S. Miller	a0417fa3a1	net: Make qdisc_skb_cb upper size bound explicit. Just like skb->cb[], so that qdisc_skb_cb can be encapsulated inside of other data structures. This is intended to be used by IPoIB so that it can remember addressing information stored at hard_header_ops->create() time that it can fetch when the packet gets to the transmit routine. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-06 15:14:37 -05:00
Li Zefan	761b3ef50e	cgroup: remove cgroup_subsys argument from callbacks The argument is not used at all, and it's not necessary, because a specific callback handler of course knows which subsys it belongs to. Now only ->populate() takes this argument, because the handlers of this callback always call cgroup_add_file()/cgroup_add_files(). So we reduce a few lines of code, though the shrinking of object size is minimal. 16 files changed, 113 insertions(+), 162 deletions(-) text data bss dec hex filename 5486240 656987 7039960 13183187 c928d3 vmlinux.o.orig 5486170 656987 7039960 13183117 c9288d vmlinux.o Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2012-02-02 09:20:22 -08:00
Vijay Subramanian	a42b4799c6	netem: Fix off-by-one bug in reordering With netem reordering, a gap of N is supposed to reorder every Nth packet with given reorder probability. However, the code currently skips N packets and reorders every (N+1)th packet. Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-22 15:08:44 -05:00
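The off-by-one described above can be seen in a small counting sketch. It assumes a reorder probability of 100% and `gap >= 1`, and the function name and the `fixed` switch are hypothetical; the comparison thresholds (`gap` before the fix, `gap - 1` after) are an assumption about the shape of the change, inferred from the message.

```c
/* Returns the 1-based index of the first packet chosen for reordering,
 * assuming reordering always fires once the gap counter allows it.
 * fixed == 0 models the old comparison (counter < gap),
 * fixed != 0 models the corrected one (counter < gap - 1). */
static unsigned demo_first_reordered(unsigned gap, int fixed)
{
    unsigned threshold = fixed ? gap - 1 : gap;
    unsigned counter = 0;
    unsigned pkt = 1;

    /* Each loop iteration is one packet that passes through in order. */
    while (counter < threshold) {
        counter++;
        pkt++;
    }
    return pkt;
}
```

For a gap of 5, the old comparison lets five packets through and reorders the sixth, while the corrected comparison reorders the fifth.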
Eric Dumazet	ddecf0f4db	net_sched: sfq: add optional RED on top of SFQ Adds an optional Random Early Detection on each SFQ flow queue. Traditional SFQ limits count of packets, while RED permits to also control number of bytes per flow, and adds ECN capability as well. 1) We dont handle the idle time management in this RED implementation, since each 'new flow' begins with a null qavg. We really want to address backlogged flows. 2) if headdrop is selected, we try to ecn mark first packet instead of currently enqueued packet. This gives faster feedback for tcp flows compared to traditional RED [ marking the last packet in queue ] Example of use : tc qdisc add dev $DEV parent 1:1 handle 10: est 1sec 4sec sfq \ limit 3000 headdrop flows 512 divisor 16384 \ redflowlimit 100000 min 8000 max 60000 probability 0.20 ecn qdisc sfq 10: parent 1:1 limit 3000p quantum 1514b depth 127 headdrop flows 512/16384 divisor 16384 ewma 6 min 8000b max 60000b probability 0.2 ecn prob_mark 0 prob_mark_head 4876 prob_drop 6131 forced_mark 0 forced_mark_head 0 forced_drop 0 Sent 1175211782 bytes 777537 pkt (dropped 6131, overlimits 11007 requeues 0) rate 99483Kbit 8219pps backlog 689392b 456p requeues 0 In this test, with 64 netperf TCP_STREAM sessions, 50% using ECN enabled flows, we can see number of packets CE marked is smaller than number of drops (for non ECN flows) If same test is run, without RED, we can check backlog is much bigger. qdisc sfq 10: parent 1:1 limit 3000p quantum 1514b depth 127 headdrop flows 512/16384 divisor 16384 Sent 1148683617 bytes 795006 pkt (dropped 0, overlimits 0 requeues 0) rate 98429Kbit 8521pps backlog 1221290b 841p requeues 0 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Dave Taht <dave.taht@gmail.com> Tested-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-12 20:05:28 -08:00
Eric Dumazet	eeca6688d6	net_sched: red: split red_parms into parms and vars This patch splits the red_parms structure into two components. One holding the RED 'constant' parameters, and one containing the variables. This permits a size reduction of GRED qdisc, and is a preliminary step to add an optional RED unit to SFQ. SFQRED will have a single red_parms structure shared by all flows, and a private red_vars per flow. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Dave Taht <dave.taht@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-05 14:01:21 -05:00
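The size argument behind the split above can be illustrated with two hypothetical structs: one holding the constant parameters, shared by all flows, and one holding the per-flow variables. The field names and sizes below are assumptions for illustration and do not reproduce the kernel's `red_parms` or `red_vars` definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "constant" RED parameters, set once at configuration time. */
struct demo_red_parms {
    uint32_t qth_min;
    uint32_t qth_max;
    uint8_t  wlog;
    uint8_t  plog;
    uint8_t  stab[256];     /* lookup table dominates the size */
};

/* Hypothetical per-queue state that changes as packets arrive. */
struct demo_red_vars {
    int           qcount;
    unsigned long qavg;
    long          qidlestart;
};

/* Memory needed if every flow carries its own copy of the parameters. */
static size_t demo_bytes_combined(size_t flows)
{
    return flows * (sizeof(struct demo_red_parms) + sizeof(struct demo_red_vars));
}

/* Memory needed with one shared parameter block and private variables. */
static size_t demo_bytes_split(size_t flows)
{
    return sizeof(struct demo_red_parms) + flows * sizeof(struct demo_red_vars);
}
```

The saving grows linearly with the number of flows, which is why the split matters for a qdisc that keeps RED state per flow.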
Eric Dumazet	18cb809850	net_sched: sfq: extend limits SFQ as implemented in Linux is very limited, with at most 127 flows and limit of 127 packets. [ So if 127 flows are active, we have one packet per flow ] This patch brings to SFQ following features to cope with modern needs. - Ability to specify a smaller per flow limit of inflight packets. (default value being at 127 packets) - Ability to have up to 65408 active flows (instead of 127) - Ability to have head drops instead of tail drops (to drop old packets from a flow) Example of use : No more than 20 packets per flow, max 8000 flows, max 20000 packets in SFQ qdisc, hash table of 65536 slots. tc qdisc add ... sfq \ flows 8000 \ depth 20 \ headdrop \ limit 20000 \ divisor 65536 Ram usage : 2 bytes per hash table entry (instead of previous 1 byte/entry) 32 bytes per flow on 64bit arches, instead of 384 for QFQ, so much better cache hit ratio. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Dave Taht <dave.taht@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-05 14:01:21 -05:00
Hagen Paul Pfeifer	eb10192447	net_sched: Bug in netem reordering Not now, but it looks you are correct. q->qdisc is NULL until another additional qdisc is attached (beside tfifo). See `50612537e9`. The following patch should work. From: Hagen Paul Pfeifer <hagen@jauu.net> netem: catch NULL pointer by updating the real qdisc statistic Reported-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-05 13:27:39 -05:00
David S. Miller	117ff42fd4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2012-01-04 21:35:43 -05:00
Eric Dumazet	02a9098ede	net_sched: sfq: always randomize hash perturbation SFQ q->perturbation is used in sfq_hash() as an input to Jenkins hash. We currently randomize this 32bit value only if a perturbation timer is setup. Its much better to always initialize it to defeat attackers, or else they can predict very well what kind of packets they have to forge to hit a particular flow. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-04 14:12:48 -05:00
Eric Dumazet	bd16a6cce2	net_sched: sfq: fix mem alloc error recovery Since commit `817fb15dfd` (net_sched: sfq: allow divisor to be a parameter), we can leave perturbation timer armed if a memory allocation error aborts sfq_init(). Memory containing active struct timer_list is freed and kernel can crash. Call sfq_destroy() from sfq_init() to properly dismantle qdisc. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-04 14:12:48 -05:00
Eric Dumazet	fa0f5aa743	net_sched: qdisc_alloc_handle() can be too slow When trying to allocate ~32768 qdiscs using autohandle mechanism, we can fill the space managed by kernel (handles in [8000-FFFF]:0000 range) But O(N^2) qdisc_alloc_handle() loops 0x10000 times instead of 0x8000 times tc add qdisc add dev eth0 parent 10:7fff pfifo limit 10 RTNETLINK answers: Cannot allocate memory real 1m54.826s user 0m0.000s sys 0m0.004s INFO: rcu_sched_state detected stall on CPU 0 (t=60000 jiffies) Half number of loops, and add a cond_resched() call. We hold rtnl at this point. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Dave Taht <dave.taht@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-03 13:03:20 -05:00
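The bounded search described above can be sketched in userspace with a simple "in use" table. This is a simplified model under stated assumptions: the major range 0x8000 to 0xFFFF comes from the message, but the table, the probe counter, and the function name are hypothetical, and a plain array lookup replaces the kernel's qdisc lookup. The point where the kernel would yield with `cond_resched()` is only marked with a comment.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_FIRST 0x8000u  /* first auto-allocated major number */
#define DEMO_COUNT 0x8000u  /* size of the auto-allocated range */

static bool     demo_used[DEMO_COUNT];  /* hypothetical "handle in use" table */
static uint32_t demo_next = DEMO_FIRST; /* last major that was tried */
static unsigned demo_probes;            /* counts lookups, for illustration */

/* Returns a free handle (major << 16), or 0 when the range is exhausted.
 * At most DEMO_COUNT candidates are probed, one per possible major. */
static uint32_t demo_alloc_handle(void)
{
    for (unsigned i = 0; i < DEMO_COUNT; i++) {
        demo_next++;
        if (demo_next > 0xFFFFu)
            demo_next = DEMO_FIRST;     /* wrap back to the start of the range */

        demo_probes++;
        if (!demo_used[demo_next - DEMO_FIRST]) {
            demo_used[demo_next - DEMO_FIRST] = true;
            return demo_next << 16;
        }
        /* The kernel version yields the CPU here with cond_resched(). */
    }
    return 0;
}
```

Bounding the loop by the size of the range means a full table is detected after one pass rather than two.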
Eric Dumazet	d32ae76f2b	sch_qfq: accurate wsum handling We can underestimate q->wsum in case of "tc class replace ... qfq" and/or qdisc_create_dflt() error. wsum is not really used in fast path, only at qfq qdisc/class setup, to catch user error. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-03 13:02:19 -05:00
Eric Dumazet	6bafcac323	sch_qfq: fix overflow in qfq_update_start() grp->slot_shift is between 22 and 41, so using 32bit wide variables is probably a typo. This could explain QFQ hangs Dave reported to me, after 2^23 packets ? (23 = 64 - 41) Reported-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Dave Taht <dave.taht@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-03 12:58:23 -05:00
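The overflow described above comes from holding values shifted by 22 to 41 bits in 32-bit variables. The sketch below models that in userspace; `demo_round_down`, `demo_limit_narrow`, and `demo_limit_wide` are hypothetical names, and the exact expression is an assumption about the shape of the computation rather than a copy of `qfq_update_start()`.

```c
#include <stdint.h>

/* Clear the low `shift` bits of a 64-bit timestamp. */
static uint64_t demo_round_down(uint64_t ts, unsigned shift)
{
    return ts & ~((1ULL << shift) - 1);
}

/* Pre-fix shape: the 64-bit result is narrowed into a 32-bit variable,
 * silently discarding every bit above bit 31. */
static uint64_t demo_limit_narrow(uint64_t v, unsigned shift)
{
    uint32_t limit = (uint32_t)(demo_round_down(v, shift) + (1ULL << shift));
    return limit;
}

/* Post-fix shape: all arithmetic and storage stay 64 bits wide. */
static uint64_t demo_limit_wide(uint64_t v, unsigned shift)
{
    uint64_t limit = demo_round_down(v, shift) + (1ULL << shift);
    return limit;
}
```

Once the virtual time passes 2^32, the narrow variant yields a limit far in the past, which is consistent with the hangs the message speculates about.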
Eric Dumazet	d47a0ac7b6	sch_sfq: dont put new flow at the end of flows SFQ enqueue algo puts a new flow _behind_ all pre-existing flows in the circular list. In fact this is probably an old SFQ implementation bug. 100 Mbits = ~8333 full frames per second, or ~8 frames per ms. With 50 flows, it means your "new flow" will have to wait 50 packets being sent before its own packet. Thats the ~6ms. We certainly can change SFQ to give a priority advantage to new flows, so that next dequeued packet is taken from a new flow, not an old one. Reported-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-03 12:52:09 -05:00
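The latency estimate in the message above (about 8333 full frames per second at 100 Mbit/s, so roughly 6 ms behind 50 flows) can be checked with integer arithmetic. The 1500-byte frame size is an assumption; the message only says "full frames". Function names are hypothetical.

```c
/* Full-size frames per second on a link, using integer arithmetic. */
static unsigned long demo_frames_per_sec(unsigned long link_bps,
                                         unsigned long frame_bytes)
{
    return link_bps / (frame_bytes * 8UL);
}

/* Milliseconds a new flow waits if one full frame from each of `flows`
 * existing flows is sent before its own packet. */
static unsigned long demo_wait_ms(unsigned long flows,
                                  unsigned long frames_per_sec)
{
    return flows * 1000UL / frames_per_sec;
}
```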
Eric Dumazet	50612537e9	netem: fix classful handling Commit `10f6dfcfde` (Revert "sch_netem: Remove classful functionality") reintroduced classful functionality to netem, but broke basic netem behavior : netem uses an t(ime)fifo queue, and store timestamps in skb->cb[] If qdisc is changed, time constraints are not respected and other qdisc can destroy skb->cb[] and block netem at dequeue time. Fix this by always using internal tfifo, and optionally attach a child qdisc to netem (or a tree of qdiscs) Example of use : DEV=eth3 tc qdisc del dev $DEV root tc qdisc add dev $DEV root handle 30: est 1sec 8sec netem delay 20ms 10ms tc qdisc add dev $DEV handle 40:0 parent 30:0 tbf \ burst 20480 limit 20480 mtu 1514 rate 32000bps qdisc netem 30: root refcnt 18 limit 1000 delay 20.0ms 10.0ms Sent 190792 bytes 413 pkt (dropped 0, overlimits 0 requeues 0) rate 18416bit 3pps backlog 0b 0p requeues 0 qdisc tbf 40: parent 30: rate 256000bit burst 20Kb/8 mpu 0b lat 0us Sent 190792 bytes 413 pkt (dropped 6, overlimits 10 requeues 0) backlog 0b 5p requeues 0 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-30 17:12:23 -05:00
David S. Miller	7f8e3234c5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2011-12-30 13:04:14 -05:00
Eric Dumazet	b0460e4484	sch_tbf: report backlog information Provide child qdisc backlog (byte count) information so that "tc -s qdisc" can report it to user. qdisc netem 30: root refcnt 18 limit 1000 delay 20.0ms 10.0ms Sent 948517 bytes 898 pkt (dropped 0, overlimits 0 requeues 1) rate 175056bit 16pps backlog 114b 1p requeues 1 qdisc tbf 40: parent 30: rate 256000bit burst 20Kb/8 mpu 0b lat 0us Sent 948517 bytes 898 pkt (dropped 15, overlimits 611 requeues 0) backlog 18168b 12p requeues 0 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-29 15:07:21 -05:00
Eric Dumazet	bb52c7acf8	netem: dont call vfree() under spinlock and BH disabled commit `6373a9a286` (netem: use vmalloc for distribution table) added a regression, since vfree() is called while holding a spinlock and BH being disabled. Fix this by doing the pointers swap in critical section, and freeing after spinlock release. Also add __GFP_NOWARN to the kmalloc() try, since we fallback to vmalloc(). Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-24 16:08:50 -05:00
David S. Miller	abb434cb05	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/bluetooth/l2cap_core.c Just two overlapping changes, one added an initialization of a local variable, and another change added a new local variable. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-23 17:13:56 -05:00
stephen hemminger	2494654d48	netem: loss model API sizes The new netem loss model is configured with nested netlink messages. This code is being overly strict about sizes, and is easily confused by padding (or possible future expansion). Also message for gemodel is incorrect. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-23 16:51:18 -05:00
Eric Dumazet	f5a59b7332	sch_hfsc: report backlog information Add backlog (byte count) information in hfsc classes and qdisc, so that "tc -s" can report it to user, instead of 0 values : qdisc hfsc 1: root refcnt 6 default 20 Sent 45141660 bytes 30545 pkt (dropped 0, overlimits 91751 requeues 0) rate 1492Kbit 126pps backlog 103226b 74p requeues 0 ... class hfsc 1:20 parent 1:1 leaf 1201: rt m1 0bit d 0us m2 400000bit ls m1 0bit d 0us m2 200000bit Sent 49534912 bytes 33519 pkt (dropped 0, overlimits 0 requeues 0) backlog 81822b 56p requeues 0 period 23 work 49451576 bytes rtwork 13277552 bytes level 0 ... Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: John A. Sullivan III <jsullivan@opensourcedevel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-23 16:51:18 -05:00
Thomas Graf	7838f2ce36	mqprio: Avoid panic if no options are provided Userspace may not provide TCA_OPTIONS, in fact tc currently does not do so if no arguments are specified on the command line. Return EINVAL instead of panicking. Signed-off-by: Thomas Graf <tgraf@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-22 22:34:56 -05:00
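The guard described above amounts to checking the optional attribute pointer before dereferencing it. A minimal userspace sketch, with a hypothetical `struct demo_opt` and `demo_mqprio_init` standing in for the real netlink attribute and init routine:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the options attribute passed from userspace. */
struct demo_opt {
    int num_tc;
};

/* Userspace may omit the options entirely, so a NULL pointer is a
 * normal input that must be rejected rather than dereferenced. */
static int demo_mqprio_init(const struct demo_opt *opt, int *num_tc_out)
{
    if (opt == NULL)
        return -EINVAL;

    *num_tc_out = opt->num_tc;
    return 0;
}
```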
Eric Dumazet	225d9b89c9	sch_sfq: rehash queues in perturb timer A known Out Of Order (OOO) problem hurts SFQ when timer changes perturbation value, since all new packets delivered to SFQ enqueue might end on different slots than previous in-flight packets. With round robin delivery, we can thus deliver packets in a different order. Since SFQ is limited to small amount of in-flight packets, we can rehash packets so that this OOO problem is fixed. This rehashing is performed only if internal flow classifier is in use. We now store in skb->cb[] the "struct flow_keys" so that we dont call skb_flow_dissect() again while rehashing. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-21 15:44:34 -05:00

896 commits