In our last post, TCP Talk Series - II, we studied the Transmission Control Block, the TCP Finite State Machine (FSM) and Nagle's algorithm. Today, we are going to cover the topics below:
- Selective Acknowledgement
- Sliding Window
Why does TCP retransmit only after three duplicate ACKs?
Since TCP does not know whether a duplicate ACK is caused by a lost segment or just a reordering of segments, it waits for a small number of duplicate ACKs to be received.
It is assumed that if there is just a reordering of the segments, there will be only one or two duplicate ACKs before the reordered segment is processed, which will then generate a new ACK. If three or more duplicate ACKs are received in a row, it is a strong indication that a segment has been lost.
The key here is that Reno wants to avoid slow start. It assumes that three duplicate ACKs received without any intervening packets mean the receiver wants that single segment resent (fast retransmit). Only if a timeout occurs does Reno trigger slow start.
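The duplicate ACK rule can be sketched in a few lines of Python. This is a minimal illustration, not a real TCP stack: the function name `fast_retransmit_points` is made up for this post, and it only compares ACK numbers, whereas a real implementation also checks that the ACK carries no data and does not change the window.

```python
DUP_ACK_THRESHOLD = 3  # duplicate ACKs needed before fast retransmit fires


def fast_retransmit_points(acks):
    """Return the ACK numbers for which fast retransmit would be triggered.

    `acks` is the sequence of cumulative ACK numbers seen by the sender.
    (Hypothetical helper, for illustration only.)
    """
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            # Fire exactly once, when the third duplicate arrives.
            if dup_count == DUP_ACK_THRESHOLD:
                retransmits.append(ack)
        else:
            # A new ACK number means progress: reset the counter.
            last_ack = ack
            dup_count = 0
    return retransmits


# Two duplicates (typical of reordering): no retransmission.
print(fast_retransmit_points([1000, 2000, 2000, 2000, 3000]))
# Three duplicates in a row: the segment starting at 2000 is resent.
print(fast_retransmit_points([1000, 2000, 2000, 2000, 2000, 5000]))
```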
Why TCP acknowledges several segments together
To optimise the use of high-latency links, receivers avoid ACKing every single packet. If each packet had to be acknowledged before the next one could be sent, a link with a 300 ms round trip would carry only about 3 packets per second. With 1500 bytes being the usual MTU of an Ethernet link, that is roughly 4.5 KB per second, so you can see how this does not scale.
Delayed ACK :
TCP will piggyback the ACK on its own outgoing data. But if the peer does not have any data to send at the moment, the ACK should not be delayed too long, so a 200 ms timer is used. Every 200 ms, TCP checks whether any ACK is pending and, if so, sends it as an individual packet.
The following topics are all related to acknowledgement:
- Retransmission Queue
- RTO (Retransmission Timeout)
- SRTT (Smoothed Round Trip Time)
- Timer Backoff
At the beginning of a connection, a default Retransmission Timeout (RTO) value is determined. If no ACK arrives for a segment before the RTO expires, TCP retransmits the segment.
How the RTO (Retransmission Timeout) is defined :
TCP sets a timer when it sends data, and if the data is not acknowledged when the timer expires, a timeout (or timer-based retransmission) of the data occurs. The timeout occurs after an interval called the RTO. TCP also has another way of initiating a retransmission, called FAST RETRANSMISSION or FAST RETRANSMIT.
The initial value is operating system dependent. Microsoft's default RTO is 3 seconds when a connection starts.
Basically, the stack keeps measuring the RTT as segments are acknowledged, from the very first packet to the most recent one, and maintains something close to an average of those values.
So the RTO is based on the RTT, i.e. the round trip between the two endpoints, for example from R1 to R2 and back.
As per RFC 2988, to compute the current RTO, a TCP sender maintains two state variables, SRTT (Smoothed Round Trip Time) and RTTVAR (Round-Trip Time Variation). In addition, we assume a clock granularity of G seconds.
The rules governing the computation of SRTT, RTTVAR, and RTO are as follows :
Until an RTT measurement has been made for a segment sent between the sender and receiver, the sender should set RTO <- 3 seconds, though the “backing off” on repeated retransmission still applies.
Note that some implementations may use a “heartbeat” timer that in fact yields a value between 2.5 seconds and 3 seconds. Accordingly, a lower bound of 2.5 seconds is also acceptable, provided that the timer never expires faster than 2.5 seconds.
When the first RTT measurement R is made, the host must set
- SRTT = R
- RTTVAR = R/2
- RTO = SRTT + max(G, K*RTTVAR), where K = 4
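The rules above cover only the first RTT sample. RFC 2988 also defines how later samples are folded in: RTTVAR is updated with a gain of 1/4, SRTT with a gain of 1/8, and an RTO below 1 second is rounded up to 1 second. The sketch below puts the pieces together. The class name `RtoEstimator` and the 0.1 second clock granularity are assumptions made for this example, not values from any particular operating system.

```python
ALPHA = 1 / 8   # SRTT gain (RFC 2988)
BETA = 1 / 4    # RTTVAR gain (RFC 2988)
K = 4           # RTTVAR multiplier (RFC 2988)
MIN_RTO = 1.0   # RFC 2988: an RTO under 1 second is rounded up to 1 second


class RtoEstimator:
    """Illustrative RTO calculator; not taken from a real TCP stack."""

    def __init__(self, granularity=0.1):
        self.g = granularity   # clock granularity G in seconds (assumed value)
        self.srtt = None
        self.rttvar = None
        self.rto = 3.0         # initial RTO before any RTT measurement

    def on_rtt_sample(self, r):
        if self.srtt is None:
            # First measurement: SRTT = R, RTTVAR = R/2.
            self.srtt = r
            self.rttvar = r / 2
        else:
            # Later measurements: RTTVAR is updated first, using the old SRTT.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - r)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * r
        self.rto = max(MIN_RTO, self.srtt + max(self.g, K * self.rttvar))
        return self.rto


est = RtoEstimator()
for sample in (0.5, 0.5, 0.9):
    print(f"RTT sample {sample:.3f}s gives RTO {est.on_rtt_sample(sample):.3f}s")
```

Note how a stable RTT slowly pulls the RTO down, while a single larger sample pushes it back up through RTTVAR.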
The RTO increases each time a TCP retransmission happens, as in the example below :
- 1st TCP Retransmission Packet RTO is 1.004722 Seconds
- 2nd TCP Retransmission Packet RTO is 2.011 Seconds
- 3rd TCP Retransmission Packet RTO is 3.018 seconds and so on.
NewReno and SACK are called advanced loss recovery techniques to distinguish them from the older approach.
Gaps between the ACK number and other in-window data cached at the receiver are called holes. Data with sequence numbers beyond the holes is called out-of-sequence data because it is not contiguous with the data already acknowledged.
When the SACK option is used, an ACK can be augmented with up to three or four SACK blocks that contain information about out-of-sequence data at the receiver.
Each SACK block contains two 32-bit sequence numbers: the first sequence number, and the last sequence number plus 1, of a contiguous block of out-of-sequence data being held at the receiver.
SACK is usually used in conjunction with the Timestamps option (TSOPT), which takes an additional 10 bytes, meaning that SACK is typically able to include only three blocks per ACK.
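The "three or four blocks" limit follows from simple arithmetic on the TCP option space, which is at most 40 bytes. A SACK option costs 2 bytes of kind and length plus 8 bytes per block. The sketch below works it out; the function name is made up for this post, and the 12-byte figure assumes the common practice of padding the 10-byte Timestamps option with two NOPs.

```python
MAX_TCP_OPTIONS = 40    # maximum bytes of options in a TCP header
SACK_HEADER = 2         # option kind + option length
SACK_BLOCK = 8          # two 32-bit sequence numbers per block
TIMESTAMPS = 10         # Timestamps option (TSOPT)
TIMESTAMPS_PADDED = 12  # assumption: TSOPT padded with two NOP bytes


def max_sack_blocks(other_option_bytes=0):
    """How many SACK blocks fit next to `other_option_bytes` of other options."""
    available = MAX_TCP_OPTIONS - other_option_bytes - SACK_HEADER
    return max(0, available // SACK_BLOCK)


print(max_sack_blocks())                    # no other options
print(max_sack_blocks(TIMESTAMPS))          # with Timestamps
print(max_sack_blocks(TIMESTAMPS_PADDED))   # with padded Timestamps
```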
Generally speaking, a receiver generates SACKs whenever there is any out-of-order data in its buffer. This can happen either because data was lost in transit or because it has been reordered and newer data has arrived at the receiver before older data.
Unfortunately, regular ACKs and SACKs can be lost, and TCP does not retransmit them unless they carry data (or have the SYN or FIN control bit turned on).
Sender Behavior :
A SACK-capable sender must be used that treats the SACK blocks appropriately and performs selective retransmission by sending only those segments missing at the receiver, a process called selective repeat.
When a SACK-capable sender has the opportunity to perform a retransmission, usually because it has received a SACK or seen multiple duplicate ACKs, it has the choice of whether to send new data or retransmit old data.
The SACK information provides the sequence number ranges present at the receiver, so the sender can infer which segments likely need to be retransmitted to fill the receiver's holes.
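How a sender might turn SACK information into a list of holes can be sketched as follows. This is a simplified illustration: the function name `find_holes` is invented for this post, sequence number wraparound is ignored, and the sample numbers are the ones from the trace discussed in this post.

```python
def find_holes(cumulative_ack, sack_blocks):
    """Return the (start, end) ranges missing at the receiver, end exclusive.

    cumulative_ack: the ACK number (next byte the receiver expects)
    sack_blocks:    list of (left_edge, right_edge) pairs from the SACK option
    """
    holes = []
    next_expected = cumulative_ack
    # SACK blocks are not sent in sequence order, so sort them first.
    for left, right in sorted(sack_blocks):
        if left > next_expected:
            holes.append((next_expected, left))
        next_expected = max(next_expected, right)
    return holes


# ACK 23801 with SACK blocks [28001, 29401] and [25201, 26601]
for start, end in find_holes(23801, [(28001, 29401), (25201, 26601)]):
    print(f"missing bytes {start}-{end - 1} ({end - start} bytes)")
```

The output lists two holes, starting at 23801 and 26601, which are the two segments the sender eventually has to retransmit.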
To understand how the use of SACK alters sender and receiver behavior, we repeat the preceding fast retransmit experiment with the same setup (dropping sequence numbers 23801 and 26601), but this time the sender and receiver are using SACK.
The first SACK is received. Wireshark shows SACK information as the left edge and right edge of each SACK range. Here we see that the ACK for 23801 contains a SACK block of 25201-26601, indicating there is a hole at the receiver. The receiver is missing the sequence number range 23801-25200, which corresponds to the single 1400-byte packet starting with sequence number 23801. Note that this SACK is a window update and is not counted as a duplicate ACK.
The SACK arriving at time 0.967 contains two SACK blocks: [28001, 29401] and [25201, 26601]. Recall that the first SACK blocks from previous SACKs are repeated in later positions in subsequent SACKs for robustness against ACK loss.
This SACK is a duplicate ACK for sequence number 23801 and suggests that the receiver now requires two full-size segments, starting with sequence numbers 23801 and 26601.
The sender reacts immediately by initiating fast retransmit, but because of congestion control procedures it sends only one retransmission, for segment 23801.
A TCP SACK sender uses the recovery point idea introduced with NEWRENO.
The benefits of SACK are more pronounced when the RTT is large and packet loss is severe. Under such circumstances, being able to fill more than one hole per RTT is likely to be more significant.
RETRANSMISSION QUEUE :
Every segment transmitted by TCP is copied and placed on a “Retransmission Queue”.
All the bytes that TCP sends are placed in the retransmission queue, so that if any segment is lost, that particular segment can be retransmitted.
If an ACK is not received by the time the RTO expires, the segment is retransmitted.
A segment stays in the retransmission queue until it is acknowledged. In the meantime it is resent from the queue whenever the RTO expires or TCP crosses the duplicate ACK threshold.
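A retransmission queue can be pictured as an ordered map from sequence number to segment payload. The class below is only a sketch to make the behaviour concrete: the name `RetransmissionQueue` and its methods are invented for this post, and timers, partial ACKs and sequence number wraparound are left out.

```python
from collections import OrderedDict


class RetransmissionQueue:
    """Toy retransmission queue keyed by starting sequence number."""

    def __init__(self):
        self._segments = OrderedDict()  # seq -> payload bytes, oldest first

    def on_send(self, seq, payload):
        # Keep a copy of everything that goes on the wire.
        self._segments[seq] = payload

    def on_ack(self, ack):
        """Drop every segment fully covered by the cumulative ACK number."""
        for seq in list(self._segments):
            if seq + len(self._segments[seq]) <= ack:
                del self._segments[seq]

    def oldest_unacked(self):
        """Segment to resend on RTO expiry or fast retransmit, or None."""
        if not self._segments:
            return None
        seq = next(iter(self._segments))
        return seq, self._segments[seq]

    def __len__(self):
        return len(self._segments)


queue = RetransmissionQueue()
queue.on_send(1, b"abcd")
queue.on_send(5, b"efgh")
queue.on_ack(5)                 # bytes 1-4 acknowledged
print(queue.oldest_unacked())   # the segment starting at 5 is still queued
```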
Question : How many retransmissions occur before TCP gives up ?
According to Microsoft, the RTO keeps increasing up to a total of 240 seconds and stays steady after that.
After a pre-determined quantity of failed retransmissions, TCP will give up.
Quantity of retransmission is OS Dependent.
Microsoft Windows uses a registry parameter called “TCPMaxDataRetransmissions”.
Default Value = 5.
We can change this value with the Registry Editor (regedit) as required.
Subsequent RTO values are determined by the SRTT (SMOOTHED ROUND TRIP TIME); the algorithm smooths out variations between individual RTT samples.
HOW PACKETS FLOW IN PRACTICE :
NOTE : Once the TCP SYN/ACK acknowledging Segment #1 is received, Segment #1 is removed from the RETRANSMISSION QUEUE.
TAKE AWAY FROM THIS :
- After the initial RTO is determined (based on RTT), it is dynamically modified based on a moving average of the RTTs measured for subsequent ACKs.
- If the RTO expires before an ACK is received :
  - The RTO value is doubled for that connection. (This is called Timer Backoff.)
  - The frame is retransmitted.
  - A copy of the frame is retained in the retransmission queue.
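Timer backoff is easy to see with numbers. The sketch below doubles the RTO after every expiry and caps it. The 3 second starting value, the limit of 5 retransmissions and the 240 second ceiling are the Microsoft figures quoted in this post; the function name is made up, and other operating systems use different values.

```python
def backoff_schedule(initial_rto, retransmissions, max_rto=240.0):
    """Return the RTO (in seconds) in effect before each retransmission."""
    schedule = []
    rto = initial_rto
    for _ in range(retransmissions):
        schedule.append(rto)
        # Timer backoff: double the RTO, but never beyond the ceiling.
        rto = min(rto * 2, max_rto)
    return schedule


print(backoff_schedule(3.0, 5))
# Total wait before giving up, if every retransmission also goes unanswered:
print(sum(backoff_schedule(3.0, 5)) + min(3.0 * 2 ** 5, 240.0))
```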
SELECTIVE ACK :
NOTE : The SACK is a window update and is not counted as a Duplicate ACK.
Legacy TCP Transmission : Without SELECTIVE ACK
WITH SELECTIVE ACK :
Multiple packet losses from a window of data can have a catastrophic effect on TCP throughput. TCP uses a cumulative ACK scheme in which received segments that are not at the left edge of the receive window are not acknowledged.
SACK is a strategy which corrects this behaviour in the face of multiple dropped segments. With Selective ACK, the data receiver can inform the sender about all segments that have arrived successfully, so the sender need retransmit only the segments that have actually been lost.
LEFT EDGE OF BLOCK : This is the first sequence number of this block
RIGHT EDGE OF BLOCK : This is sequence number immediately following the last sequence number of this block.
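To make the two edges concrete, here is a rough sketch of how a receiver could derive SACK blocks from the out-of-order segments it is holding. The function name `build_sack_blocks` is invented for this post. It returns blocks in sequence order, whereas a real receiver reports the most recently changed block first, and it ignores sequence number wraparound.

```python
def build_sack_blocks(rcv_nxt, received_segments):
    """Merge buffered out-of-order segments into (left_edge, right_edge) blocks.

    rcv_nxt:           next in-order byte the receiver expects (the ACK number)
    received_segments: list of (seq, length) pairs held beyond rcv_nxt
    """
    blocks = []
    for seq, length in sorted(received_segments):
        left, right = seq, seq + length   # right edge = last byte + 1
        if right <= rcv_nxt:
            continue                      # already covered by the cumulative ACK
        if blocks and left <= blocks[-1][1]:
            # Touches or overlaps the previous block: extend it.
            blocks[-1] = (blocks[-1][0], max(blocks[-1][1], right))
        else:
            blocks.append((left, right))
    return blocks


# Receiver expects 17377 but is holding one 1448-byte segment starting at 18825.
print(build_sack_blocks(17377, [(18825, 1448)]))
```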
Selective ACK is carried in the TCP Options field.
Here we see that the ACK for 17377 contains a block of [18825-20273], indicating there is a hole at the receiver. The receiver is missing the sequence number range [17377-18824], which corresponds to the single 1448-byte packet starting with sequence number 17377.
The sender reacts immediately by initiating fast retransmit. Because of the congestion control procedure, the sender sends only one retransmission, for segment 17377. With the arrival of two additional ACKs, the sender is permitted to send its second retransmission, for segment 20273.
SLIDING WINDOW :
Once the connection is established, data transfer can take place.
At any given time, TCP endpoints may decide to change their initially-advertised WIN Size.
TCP places all bytes it has received from the Upper Layer protocol stream into one of four categories.
- Bytes sent and ACKed (removed from the retransmission queue; these bytes are present for a split second and then gone)
- Bytes Sent but not yet ACK
- Bytes Not yet sent for which recipient is Ready
- Bytes Not yet sent for which recipient is not ready
The TCP sliding window mechanism dictates the size of each of these four categories. Basically, it controls how much data can be on the wire, sent but not yet ACKed, at any one time.
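The four categories can be expressed as a small classifier over byte positions. This is a sketch for illustration: `classify_byte` is a made-up helper, and `snd_una` and `snd_nxt` are borrowed from the SND.UNA and SND.NXT variables of the TCP specification. The example uses the numbers from this post: an 8-byte window, with bytes 1 to 4 sent and nothing ACKed yet.

```python
def classify_byte(byte_no, snd_una, snd_nxt, send_window):
    """Return which of the four categories a byte of the stream falls into.

    snd_una:     first byte sent but not yet ACKed (left edge of the window)
    snd_nxt:     next byte to be sent
    send_window: size of the send window in bytes
    """
    right_edge = snd_una + send_window   # first byte beyond the window
    if byte_no < snd_una:
        return "sent and ACKed"
    if byte_no < snd_nxt:
        return "sent but not yet ACKed"
    if byte_no < right_edge:
        return "not yet sent, recipient ready"
    return "not yet sent, recipient not ready"


for byte_no in (3, 6, 9):
    print(byte_no, classify_byte(byte_no, snd_una=1, snd_nxt=5, send_window=8))
```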
That window is made up of :
Bytes sent but not yet acknowledged
Bytes not yet sent for which the recipient is ready
The client said it will take 8 bytes at a time.
The server got 16 bytes of data to process from the upper layer.
The receiver could potentially get 8 bytes of data. This is the usable window, but TCP starts slow and builds up.
The “SEND WINDOW” is the maximum number of unACKed bytes I can have outstanding.
As the server is ramping up slowly, it will send 4 bytes of data.
LEFT SIDE WINDOW : The first byte sent but not yet ACKed
RIGHT SIDE WINDOW : The last byte not yet sent for which the recipient is ready
The receiving application sent an ACK for the 4 bytes received.
With the receipt of the ACK several things happen at once :
- The first four bytes change their category from “Sent but not ACKed” (which is part of the SEND WINDOW) to “Sent and ACKed”.
- They are now essentially gone, as they are removed from the retransmission queue.
- The Send Window (which was composed of the two sub-categories “Sent but not ACKed” and “Usable Window”) once again shrinks to only a single category of traffic.
- The USABLE WINDOW shrinks because the window size advertised in the received ACK shrinks (from 4 bytes previously down to 3 bytes now).
- Now bytes 5 and 6 are sent. As a result, the usable window shrinks to 1 byte.
ACK SENT FOR THREE BYTES
SLIDING WINDOW SLIDES TILL RIGHT SIDE AS PER AVAILABLE WINDOW SIZE
NOTE : USABLE WINDOW :- Bytes not yet sent for which the recipient is ready
Why this is called as Sliding Window ?
- As bytes are sent and ACKed, the Send Window moves to the right.
- At any given time the Send Window is composed of either one kind of traffic or two, never more than that.
- Once the packets are sent, we don’t have any usable window. There are no more unsent bytes for which the recipient is ready. At this point the window holds only “Bytes sent and not yet ACKed”.
- This state is also called WINDOW Already Sent.
- If I get an ACK for 5, 6 and 7, the send window will shift towards the right.
- When the window shifts towards the right, the newly covered bytes form the Usable Window: they are “Bytes not yet sent for which the recipient is ready”.
TAKE AWAY :
The Send Window i.e Sliding Window governs the max quantity of bytes that can be outstanding (i.e. UnACK) at any point in time.
The Send Window (at any point in time) can be composed of one or both of the following categories of traffic :
- Usable Window : bytes that could be sent right now
- Sent but not yet ACKed
The maximum size of the Send Window is the lesser of
- The receive Window of the remote TCP host or ..
- The congestion Window of the Sender
- If an ACK is missed, the sender will reduce the SEND WINDOW, because the congestion window becomes smaller.
- The SEND WINDOW indicates how many bytes at most can be transmitted at one time.
- As the WIN SIZE is now 3, the SEND WINDOW shrinks down to 3 bytes. This is the maximum quantity of bytes that can be unACKed.
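The take-away can be condensed into two small functions. They are a sketch with invented names, counting in bytes as this post does; the second example reuses the numbers from the walkthrough, where the window is 3 bytes and bytes 5 and 6 are in flight.

```python
def send_window(receive_window, congestion_window):
    """The send window is the lesser of the peer's receive window and our cwnd."""
    return min(receive_window, congestion_window)


def usable_window(receive_window, congestion_window, bytes_in_flight):
    """Bytes that may still be sent right now without waiting for an ACK."""
    window = send_window(receive_window, congestion_window)
    return max(0, window - bytes_in_flight)


# Receiver advertises 8 bytes, but the sender is still ramping up with cwnd = 4.
print(send_window(8, 4))
# Window of 3 bytes with bytes 5 and 6 sent but not yet ACKed.
print(usable_window(3, 16, 2))
```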
To be continued…
TCP Talk Series:
- TCP Talk Series – I
- TCP Talk Series- II
- TCP Talk Series – III
- TCP Talk Series -IV
- TCP Talk Series- V