Allocation for the optnames is similar to the DCCP options, with a
range for rx and tx half connection CCIDs.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Moving the TFRC sender and receiver variables to separate structs, so
that we can copy these structs to userspace thru getsockopt,
dccp_diag, etc.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Isolating it, that will be used when we introduce a CCID2 (TCP-Like)
implementation.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As discussed in the dccp@vger mailing list:
Now applications have to use setsockopt(DCCP_SOCKOPT_SERVICE, service[s]),
prior to calling listen() and connect().
An array of unsigned ints can be passed meaning that the listening sock accepts
connection requests for several services.
With this we can ditch struct sockaddr_dccp and use only sockaddr_in (and
sockaddr_in6 in the future).
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Moving the setting of DCCP_SKB_CB(skb)->dccpd_reset_code to the places
where events happen that trigger sending a RESET packet.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliciting a SYNCACK in response, we were handling SYNC packets
only in the DCCP_OPEN state, in dccp_rcv_established.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
It is possible to receive more than one CLOSEREQ packet if the
CLOSE packet sent in response is somehow lost, change the state
to DCCP_CLOSING only on the first CLOSEREQ packet received.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Also use some BUG_ON where appropriate and use LIMIT_NETDEBUG for the unlikely
cases where we, at this stage, want to know about, that in my tests hasn't
appeared in the radar.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
To match more closely what is described in RFC 3448.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: Ian McDonald <iam4@cs.waikato.ac.nz>
To start the timestamps with 0.0ms, easing the integer maths in the CCIDs, this
probably will be reworked to use the to be introduced struct timeval_offset
infrastructure out of skb_get_timestamp, etc.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
The initialization of ccid3hcrx_rtt to 5ms is just a bandaid, I'll continue
auditing the CCID3 HC rx codebase to fix this properly, probably I'll add a
feedback timer as suggested in the CCID3 draft.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
We can get this value in an TIMESTAMP_ECHO and/or in an ELAPSED_TIME option, if
receiving both give precendence to the biggest one.
In my tests they are very close if not equal at all times, so we may well think
about removing the code in CCID3 that inserts this option and leaving this to
the core, and perhaps even use just TIMESTAMP_ECHO including the elapsed time.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This makes the send rate calculations behave way more closely to what
is specified, with the jitter previously seen on x and x_recv
disappearing completely on non lossy setups.
This resembles the tcp_data_snd_check code, that possibly we'll end up
using in DCCP as well, perhaps moving this code to
inet_connection_sock.
For now I'm doing the simplest implementation tho.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So that applications can set dccp_sock->dccps_pkt_size, that in turn
is used in the CCID3 half connection init routines to set
ccid3hc[tr]x_s and use it in its rate calculations.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Renaming it to dccp_rx_hist_detect_loss.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Renaming it to dccp_rx_hist_add_packet.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I'll now take a look at the other proposed TFRC DCCP CCIDs to find
more code that is now in ccid3.c and move to this module, the loss
event rate, calc_X, etc most probably will be moved there.
The main goal of these changes is to pave the way for the
implementation of more TFRC based DCCP CCIDs and to shrink ccid3.c,
reducing its complexity and helping in getting it rock solid.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And put this into net/dccp/ccids/lib/, where packet_history.[ch] will also be
moved and then we'll have a tfrc_lib.ko module that will be used by
dccp_ccid3.ko and other CCIDs that are variations of TFRC (RFC 3448).
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To avoid open coding this all over the place.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introducing functions to add to or subtract from a timeval variable
and renaming now_delta to timeval_new_delta that calls do_gettimeofday
and then timeval_delta, that should be used when there are several
deltas made relative to the current time or setting variables to it,
so as to avoid calling do_gettimeofday excessively.
I'm leaving these "timeval_" prefixed funcions internal to DCCP for a
while till we're sure there are no subtle bugs in it.
It also is more correct as it checks if the number of usecs added to
or subtracted from a tv_usec field is more than 2 seconds.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is not quite what I think we should have long term but improves
performance for now, so lets use it till we get CCID3 working well,
then we can think about using sk_write_queue, perhaps using some ideas
from Juwen Lai's old stack for 2.4.20.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch puts mostly read only data in the right section
(read_mostly), to help sharing of these data between CPUS without
memory ping pongs.
On one of my production machine, tcp_statistics was sitting in a
heavily modified cache line, so *every* SNMP update had to force a
reload.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tested with a patched netcat, no horror stories so far 8)
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And also hc_tx and hc_rx get_info functions for the CCIDs to fill in
information that is specific to them.
For now reusing struct tcp_info, later I'll try to figure out a better
solution, for now its really nice to get this kind of info:
[root@qemu ~]# ./ss -danemi
State Recv-Q Send-Q Local Addr:Port Peer Addr:Port
LISTEN 0 0 *:5001 *:* ino:628 sk:c1340040
mem:(r0,w0,f0,t0) cwnd:0 ssthresh:0
ESTAB 0 0 172.20.0.2:5001 172.20.0.1:32785 ino:629 sk:c13409a0
mem:(r0,w0,f0,t0) ts rto:1000 rtt:0.004/0 cwnd:0 ssthresh:0 rcv_rtt:61.377
This, for instance, shows that we're not congestion controlling ACKs,
as the above output is in the ttcp receiving host, and ttcp is a one
way app, i.e. the received never calls sendmsg, so
ccid_hc_tx_send_packet is never called, so the TX half connection
stays in TFRC_SSTATE_NO_SENT state and hctx_rtt is never calculated,
stays with the value set in ccid3_hc_tx_init, 4us, as show above in
milliseconds (0.004ms), upcoming patches will fix this.
rcv_rtt seems sane tho, matching ping results :-)
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using TIMESTAMP_ECHO and ELAPSED_TIME options received.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And don't insert a TIMESTAMP option in all packets, leave the decision
to the CCIDs.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just like kfree, etc it will just not call the CCID exit
routines when the private data area is set to NULL.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So that we retransmit CLOSE/CLOSEREQ packets till they elicit an
answer or we hit a timeout.
Most of the machinery uses TCP approaches, this code has to be
polished & audited, but this is better than we had before.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is from a first audit, more eyeballs are more than welcome.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>