Transport layer
The transport layer builds on the network layer to
provide data transport from a process on a source
machine to a process on a destination machine with a
desired level of reliability that is independent of the
physical networks currently in use.
Services provided to the Upper Layers
Just as there are two types of network service, connection-oriented
and connectionless, there are also two types of transport service. The
connection-oriented transport service is similar to the connection-
oriented network service in many ways. In both cases, connections
have three phases: establishment, data transfer, and release.
Addressing and flow control are also similar in both layers.
Furthermore, the connectionless transport service is also very similar
to the connectionless network service.
Transport Service Primitives
The nesting of TPDUs, packets, and frames.
Berkeley Sockets
Socket
Socket Programming
Socket programming is a way of connecting two nodes on a network so that they can
communicate with each other. One socket (node) listens on a particular port at an IP
address, while the other socket reaches out to it to form a connection. The server forms
the listener socket, while the client reaches out to the server.
Socket creation: int sockfd = socket(domain, type, protocol)
sockfd: socket descriptor, an integer (like a file handle)
domain: integer, communication domain, e.g., AF_INET (IPv4 protocol), AF_INET6 (IPv6 protocol)
type: communication type
SOCK_STREAM: TCP (reliable, connection-oriented)
SOCK_DGRAM: UDP (unreliable, connectionless)
protocol: protocol number. A value of 0 selects the default protocol for the given domain and type.
A nonzero value is the same number that appears in the protocol field of the IP header of a packet
(e.g., 6 for TCP, 17 for UDP).
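The same call can be tried from Python, whose socket module wraps the Berkeley API. This is a minimal sketch, not part of the original slides: the helper name open_socket is hypothetical, and fileno() is used to show the integer descriptor that the C call returns as sockfd.

```python
import socket

def open_socket(sock_type, protocol=0):
    """Create an IPv4 socket; protocol 0 lets the kernel pick the default."""
    return socket.socket(socket.AF_INET, sock_type, protocol)

# SOCK_STREAM gives a TCP socket (reliable, connection-oriented).
with open_socket(socket.SOCK_STREAM) as tcp_sock:
    tcp_fd = tcp_sock.fileno()      # the integer descriptor, like sockfd in C
    tcp_type = tcp_sock.type

# SOCK_DGRAM gives a UDP socket (unreliable, connectionless).
with open_socket(socket.SOCK_DGRAM) as udp_sock:
    udp_fd = udp_sock.fileno()
    udp_type = udp_sock.type

print("TCP descriptor:", tcp_fd, "type:", tcp_type)
print("UDP descriptor:", udp_fd, "type:", udp_type)
```

The descriptor values depend on the process, so no particular numbers should be expected.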
Elements of Transport Protocols
1. Addressing
2. Connection Establishment
3. Connection Release
4. Flow Control and Buffering
5. Multiplexing
6. Crash Recovery
Addressing
When an application (e.g., a user) process wishes to set up a connection to a remote
application process, it must specify which one to connect to. (Connectionless
transport has the same problem: to whom should each message be sent?) The
method normally used is to define transport addresses to which processes can listen
for connection requests. In the Internet, these endpoints are called ports. We will
use the generic term TSAP (Transport Service Access Point) to mean a specific
endpoint in the transport layer. The analogous endpoints in the network layer (i.e.,
network layer addresses) are, not surprisingly, called NSAPs (Network Service
Access Points). IP addresses are examples of NSAPs.
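While stable TSAP addresses work for a small number of key services that never
change (e.g., the Web server), user processes often want to talk to other processes
whose TSAP addresses are not known in advance. One scheme for this uses a special
process called a portmapper, which listens on a well-known TSAP. The user sets up a
connection to the portmapper and sends a message specifying the service name, and
the portmapper sends back the TSAP address. Then the user releases the connection
with the portmapper and establishes a new one with the desired service.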
Connection Establishment
Three protocol scenarios for establishing a connection using a three-way
handshake. CR denotes CONNECTION REQUEST. The case shown is normal operation.
Connection Release
a. Normal case of three-way handshake
We see the normal case in which one of the users sends a DR
(DISCONNECTION REQUEST) segment to initiate the connection release.
When it arrives, the recipient sends back a DR segment and starts a timer, just
in case its DR is lost. When this DR arrives, the original sender sends back an
ACK segment and releases the connection. Finally, when the ACK segment
arrives, the receiver also releases the connection.
b. Ack Lost
If the final ACK segment is lost, as shown in case (b), the
situation is saved by the timer. When the timer expires, the
connection is released anyway.
Fig. 6-14. (c) Response lost. (d) Response lost and subsequent DRs lost.
Flow Control and Buffering
Dynamic buffer allocation. Buffer allocation info travels in separate TPDUs.
The arrows show the direction of transmission. ‘…’ indicates a lost TPDU.
Potential deadlock if control TPDUs are not sequenced or timed out
Multiplexing
(a) Upward multiplexing.
(b) Downward multiplexing. Used to increase the bandwidth, e.g., two
ISDN connections of 64 kbps each yield 128 kbps bandwidth.
Crash Recovery
Three events are possible at the server: sending an
acknowledgement (A), writing to the output process (W), and
crashing (C). Each client can be in one of two states: one segment
outstanding, S1, or no segments outstanding, S0. Based on only this
state information, the client must decide whether to retransmit the
most recent segment.
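The difficulty can be made concrete by enumerating the cases. The sketch below is an illustration under stated assumptions, not the slides' own material: the server either acknowledges then writes ("AW") or writes then acknowledges ("WA"), a crash can fall before, between, or after the two events, an acknowledgement that was sent is assumed to reach the client, and a retransmitted segment is written once after the server recovers. The strategy names and helper functions are hypothetical.

```python
# Server event orders: acknowledge then write, or write then acknowledge.
SERVER_ORDERS = {"ack first": "AW", "write first": "WA"}

# Client strategies: retransmit or not, given the client state after the crash.
CLIENT_STRATEGIES = {
    "always retransmit": lambda state: True,
    "never retransmit": lambda state: False,
    "retransmit in S0": lambda state: state == "S0",
    "retransmit in S1": lambda state: state == "S1",
}

def outcome(order, crash_pos, strategy):
    """Result when `crash_pos` events of `order` complete before the crash (C)."""
    done = order[:crash_pos]
    # If the ACK went out, the client has no segment outstanding (S0).
    state = "S0" if "A" in done else "S1"
    retransmit = CLIENT_STRATEGIES[strategy](state)
    # One write if W happened before the crash, plus one if the client resends.
    writes = int("W" in done) + int(retransmit)
    return {0: "LOST", 1: "OK", 2: "DUP"}[writes]

def failures(strategy):
    """All (server order, crash position, result) cases that do not end in OK."""
    return [(name, pos, outcome(order, pos, strategy))
            for name, order in SERVER_ORDERS.items()
            for pos in range(3)
            if outcome(order, pos, strategy) != "OK"]

for strategy in CLIENT_STRATEGIES:
    print(strategy, "->", failures(strategy))
```

Every client strategy has at least one failing case, which is the point of the analysis: with only S0/S1 to go on, the client cannot always decide correctly.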
The Internet Transport Protocols: UDP
• Introduction to UDP
• Remote Procedure Call (RPC)
• The Real-Time Transport Protocol (RTP)
UDP
UDP (User Datagram Protocol) is a connectionless transport protocol. UDP transmits
segments consisting of an 8-byte header followed by the payload.
The UDP header
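The 8-byte header holds four 16-bit fields: source port, destination port, length (header plus payload), and checksum. A minimal sketch of packing and parsing it follows; the function names are hypothetical, and the checksum is left at 0 rather than computed.

```python
import struct

# Four unsigned 16-bit fields in network byte order:
# source port, destination port, UDP length, checksum.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_segment(src_port, dst_port, payload, checksum=0):
    """Prefix the payload with a UDP header (checksum not computed here)."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

def parse_udp_segment(segment):
    """Split a UDP segment into its header fields and payload."""
    src, dst, length, checksum = UDP_HEADER.unpack(segment[:UDP_HEADER.size])
    return {"src_port": src, "dst_port": dst, "length": length,
            "checksum": checksum, "payload": segment[UDP_HEADER.size:length]}

segment = build_udp_segment(5000, 53, b"hello")
print(len(segment), parse_udp_segment(segment))
```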
Remote Procedure Call
When a process on machine 1 calls a procedure on machine 2, the calling process
on 1 is suspended and execution of the called procedure takes place on 2.
Information can be transported from the caller to the callee in the parameters and
can come back in the procedure result. No message passing is visible to the
programmer. This technique is known as RPC (Remote Procedure Call).
Packing the parameters is called marshaling. Client-server RPC is one area in
which UDP is widely used.
Steps in making a remote procedure call. The stubs are shaded.
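The role of the stubs can be sketched in a few lines. This is an illustrative toy, not a real RPC library: the procedure number 1 for "add", the message layout, and the function names are all assumptions, and the transport is passed in as a plain function so that the example runs without a network.

```python
import struct

ADD_PROC = 1  # hypothetical procedure number for "add"

def marshal_add_request(a, b):
    """Client stub: pack the procedure number and two 32-bit ints into a message."""
    return struct.pack("!Bii", ADD_PROC, a, b)

def server_stub(message):
    """Server stub: unpack the parameters, call the procedure, pack the result."""
    proc, a, b = struct.unpack("!Bii", message)
    if proc != ADD_PROC:
        raise ValueError("unknown procedure")
    return struct.pack("!i", a + b)

def remote_add(a, b, transport):
    """Looks like a local call; `transport` carries the message to the server."""
    reply = transport(marshal_add_request(a, b))
    (result,) = struct.unpack("!i", reply)
    return result

# Here the "network" is a direct function call to the server stub.
print(remote_add(2, 3, transport=server_stub))
```

In a real system the transport would send the marshaled bytes in a UDP segment and wait for the reply.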
The Real-Time Transport Protocol
RTP is used by real-time multimedia applications such as Internet radio, Internet
telephony, music on demand, video conferencing, and video on demand.
(a) The position of RTP in the protocol stack. (b) Packet nesting.
Why does DNS use UDP and not TCP?
The following facts about TCP and UDP at the transport layer explain
the choice.
1) UDP has lower latency for short exchanges. TCP needs a three-way
handshake before any data can be sent. The load on DNS servers is
also an important factor: because they use UDP, DNS servers do not
have to keep connection state.
2) DNS requests are generally very small and fit well within UDP
segments.
3) UDP is not reliable, but reliability can be added at the application
layer. An application can use UDP and still be reliable by using
a timeout and resend at the application layer.
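Point 3 can be sketched as a small retry loop. This is a minimal illustration, not a DNS client: the function name, the default timeout, and the retry count are assumptions, and the 512-byte receive size reflects the classic limit for DNS messages over UDP.

```python
import socket

def request_with_retry(sock, data, addr, timeout=1.0, retries=3):
    """Send `data` to `addr` over UDP and wait for a reply, resending on timeout."""
    sock.settimeout(timeout)
    for _ in range(retries):
        sock.sendto(data, addr)
        try:
            reply, _sender = sock.recvfrom(512)
            return reply
        except socket.timeout:
            continue  # request or reply was lost: send the request again
    raise TimeoutError(f"no reply after {retries} attempts")
```

It would be called with a datagram socket, for example socket.socket(socket.AF_INET, socket.SOCK_DGRAM), and the server's address. The application, not the transport layer, decides how long to wait and how often to retry.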
The Internet Transport Protocols: TCP
• Introduction to TCP
• The TCP Service Model
• The TCP Protocol
• The TCP Segment Header
• TCP Connection Establishment
• TCP Connection Release
• TCP Connection Management Modeling
• TCP Transmission Policy
• TCP Congestion Control
• TCP Timer Management
The TCP Service Model
(a) Four 512-byte segments sent as separate IP datagrams.
(b) The 2048 bytes of data delivered to the application in a single
READ CALL.
The TCP Segment Header
Every segment begins with a fixed-format, 20-byte header. The fixed header may be
followed by header options. After the options, if any, up to 65,535 - 20 - 20 = 65,495
data bytes may follow, where the first 20 refer to the IP header and the second to the
TCP header.
The TCP header length tells how many 32-bit words are contained in the TCP
header. CWR and ECE are used to signal congestion when ECN (Explicit
Congestion Notification) is used. ECE is set to signal an ECN-Echo to a TCP sender
to tell it to slow down when the TCP receiver gets a congestion indication from the
network. CWR is set to signal Congestion Window Reduced from the TCP sender to
the TCP receiver so that it knows the sender has slowed down and can stop
sending the ECN-Echo.
URG is set to 1 if the Urgent pointer is in use. The ACK bit is set to 1 to indicate that
the Acknowledgement number is valid. If ACK is 0, the segment does not contain an
acknowledgement, so the Acknowledgement number field is ignored. The PSH bit
indicates PUSHed data. The receiver is hereby kindly requested to deliver the data to
the application upon arrival and not buffer it until a full buffer has been received (which
it might otherwise do for efficiency). The RST bit is used to abruptly reset a
connection. In general, if you get a segment with the RST bit on, you have a problem
on your hands. The SYN bit is used to establish connections. The connection request
has SYN = 1 and ACK = 0 to indicate that the piggyback acknowledgement field is not
in use. The connection reply does bear an acknowledgement, however, so it has SYN
= 1 and ACK = 1. In essence, the SYN bit is used to denote both CONNECTION
REQUEST and CONNECTION ACCEPTED, with the ACK bit used to distinguish
between those two possibilities. The FIN bit is used to release a connection. It
specifies that the sender has no more data to transmit. However, after closing a
connection, the closing process may continue to receive data indefinitely. Both SYN
and FIN segments have sequence numbers and are thus guaranteed to be processed
in the correct order.
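The fixed 20-byte header and its flag bits can be decoded with a short sketch. The function name and the returned dictionary layout are illustrative choices; options, if present, are not parsed.

```python
import struct

# Flag bits in the 14th header byte, lowest bit first.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08,
             "ACK": 0x10, "URG": 0x20, "ECE": 0x40, "CWR": 0x80}

def parse_tcp_header(segment):
    """Decode the fixed 20-byte TCP header at the start of `segment`."""
    (src, dst, seq, ack, offset_byte, flag_byte,
     window, checksum, urgent) = struct.unpack("!HHIIBBHHH", segment[:20])
    header_len = (offset_byte >> 4) * 4   # header length field counts 32-bit words
    flags = {name for name, bit in FLAG_BITS.items() if flag_byte & bit}
    return {"src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
            "header_len": header_len, "flags": flags, "window": window,
            "checksum": checksum, "urgent": urgent}

# A connection request: SYN = 1, ACK = 0, header length 5 words (no options).
request = struct.pack("!HHIIBBHHH", 12345, 80, 1000, 0, 5 << 4, 0x02, 65535, 0, 0)
# The reply carries an acknowledgement: SYN = 1, ACK = 1.
reply = struct.pack("!HHIIBBHHH", 80, 12345, 5000, 1001, 5 << 4, 0x12, 65535, 0, 0)

print(parse_tcp_header(request)["flags"])
print(parse_tcp_header(reply)["flags"])
```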
TCP Connection Establishment/Release
Fig. 6-31. (a) TCP connection establishment in the normal case. (b) Call collision.
TCP connection release.
TCP Transmission Policy
Window management in TCP.
What Nagle suggested is simple: when data come into the sender in small pieces, just
send the first piece and buffer all the rest until the first piece is acknowledged. Then
send all the buffered data in one TCP segment (up to the maximum segment size) and
start buffering again until the next segment is acknowledged.
Silly window syndrome.
Clark's solution: the receiver should not send a window update until it can handle the
maximum segment size it advertised when the connection was established or until its
buffer is half empty, whichever is smaller.
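Both rules are small enough to sketch. The code below is a simplified illustration, not a TCP implementation: the class and function names are hypothetical, the sender follows only the rule as stated above (one unacknowledged segment at a time), and the receiver rule is Clark's threshold of the smaller of one maximum segment and half the buffer.

```python
class NagleSender:
    """Sender side: send the first piece, buffer the rest until it is acknowledged."""

    def __init__(self, mss):
        self.mss = mss
        self.buffer = b""
        self.waiting_for_ack = False
        self.sent = []          # segments handed to the network, in order

    def write(self, data):
        self.buffer += data
        self._maybe_send()

    def ack(self):
        self.waiting_for_ack = False
        self._maybe_send()

    def _maybe_send(self):
        if self.buffer and not self.waiting_for_ack:
            segment, self.buffer = self.buffer[:self.mss], self.buffer[self.mss:]
            self.sent.append(segment)
            self.waiting_for_ack = True


def should_send_window_update(free_bytes, buffer_size, mss):
    """Receiver side (Clark): update only once a useful amount of space is free."""
    return free_bytes >= min(mss, buffer_size // 2)


sender = NagleSender(mss=1460)
for key in (b"a", b"b", b"c"):   # three one-byte writes, e.g. keystrokes
    sender.write(key)
sender.ack()                     # ACK for the first piece arrives
print(sender.sent)               # first piece alone, then the buffered pieces together
print(should_send_window_update(free_bytes=1, buffer_size=4096, mss=1460))
```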
TCP Timers
Timers: retransmission timer (RTO, Retransmission TimeOut), persistence timer,
keepalive timer, and TIME WAIT timer.
RTO
The solution is to use a dynamic algorithm that constantly adapts the timeout
interval, based on continuous measurements of network performance. The
algorithm generally used by TCP is due to Jacobson (1988) and works as follows.
For each connection, TCP maintains a variable, SRTT (Smoothed Round-Trip
Time), that is the best current estimate of the round-trip time to the destination in
question. When a segment is sent, a timer is started, both to see how long the
acknowledgement takes and also to trigger a retransmission if it takes too long. If
the acknowledgement gets back before the timer expires, TCP measures how long
the acknowledgement took, say, R. It then updates SRTT according to the formula
SRTT = α SRTT + (1 − α) R
where α is a smoothing factor that determines how quickly the old values are
forgotten. Typically, α = 7/8. This kind of formula is an EWMA (Exponentially
Weighted Moving Average) or low-pass filter that discards noise in the samples.
Initial implementations of TCP used RTO = 2 × SRTT, but experience showed that a
constant value was too inflexible because it failed to respond when the variance went up.
To fix this problem, Jacobson proposed making the timeout value sensitive to the
variance in round-trip times as well as the smoothed round-trip time. This
change requires keeping track of another smoothed variable,
RTTVAR (Round-Trip Time VARiation), that is updated using
the formula
RTTVAR = β RTTVAR + (1 − β) |SRTT − R |
This is an EWMA as before, and typically β = 3/4. The retransmission timeout,
RTO, is set to be
RTO = SRTT + 4 × RTTVAR
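A minimal sketch of one update step, using the three formulas above with α = 7/8 and β = 3/4. The function name and the sample values are illustrative; RTTVAR is updated first so that |SRTT − R| uses the estimate from before the new sample, which is an ordering assumption the formulas themselves do not spell out.

```python
def update_rto(srtt, rttvar, r, alpha=7/8, beta=3/4):
    """Fold one round-trip measurement `r` into SRTT and RTTVAR and return the new RTO."""
    # Variation first, using the SRTT from before this sample (assumed ordering).
    rttvar = beta * rttvar + (1 - beta) * abs(srtt - r)
    srtt = alpha * srtt + (1 - alpha) * r
    rto = srtt + 4 * rttvar
    return srtt, rttvar, rto

# Illustrative values in msec: SRTT = 100, RTTVAR = 10, new measurement R = 120.
srtt, rttvar, rto = update_rto(100.0, 10.0, 120.0)
print(srtt, rttvar, rto)   # 102.5 12.5 152.5
```

A measurement that deviates from SRTT raises RTTVAR and therefore widens the timeout, which is what the fixed multiplier could not do.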
TCP Timer Management
A second timer is the persistence timer. It is designed to prevent the
following deadlock. The receiver sends an acknowledgement with a
window size of 0, telling the sender to wait. Later, the receiver updates the
window, but the packet with the update is lost. Now the sender and the
receiver are each waiting for the other to do something. When the
persistence timer goes off, the sender transmits a probe to the receiver. The
response to the probe gives the window size. If it is still 0, the persistence
timer is set again and the cycle repeats. If it is nonzero, data can now be
sent.
The third timer is the keepalive timer. When a connection has been idle for a
long time, the keepalive timer may go off to cause one side to check
whether the other side is still there. If it fails to respond, the connection is
terminated.
The last timer used on each TCP connection is the one used in the TIME
WAIT state while closing. It runs for twice the maximum packet lifetime to
make sure that when a connection is closed, all packets created by it have
died off.
TCP Congestion Control
(a) A fast network feeding a
low capacity receiver.
(b) A slow network feeding a
high-capacity receiver.
Two windows are maintained in parallel:
• flow control window
• congestion window
The effective window is the smaller of the two.
Example: the receiver says "send 64 KB", but the sender knows that bursts of more
than 32 KB can cause congestion, so the sender sends only 32 KB.
One key takeaway was that a transport protocol using an AIMD (Additive Increase
Multiplicative Decrease) control law in response to binary congestion signals from the
network would converge to a fair and efficient bandwidth allocation.
When the load offered to any network is more than it can handle,
congestion builds up. The Internet is no exception. The network layer
detects congestion when queues grow large at routers and tries to manage
it, if only by dropping packets. It is up to the transport layer to receive
congestion feedback from the network layer and slow down the rate of
traffic that it is sending into the network. In the Internet, TCP plays the main
role in controlling congestion.
Consider what happens when a sender on a fast network (the 1-Gbps link) sends a small
burst of four packets to a receiver on a slow network (the 1-Mbps link) that is the
bottleneck or slowest part of the path. The acknowledgements return to the sender at
about the rate at which packets can cross the slowest link. This timing is known as an
ack clock. It is an essential part of TCP. By using an ack clock, TCP smooths out
traffic and avoids unnecessary queues at routers.
Jacobson's Solution
This algorithm is called slow start, but it is not slow at all (it is
exponential growth), except in comparison to the previous algorithm that let
an entire flow control window be sent all at once. Slow start is shown in the figure.
In the first round-trip time, the sender injects one packet into the network
(and the receiver receives one packet). Two packets are sent in the next
round-trip time, then four packets in the third round-trip time.
TCP Tahoe- Congestion Control
The maximum segment size here is 1 KB. Initially, the congestion window was 64
KB, but a timeout occurred, so the threshold is set to 32 KB and the congestion
window to 1 KB for transmission 0. The congestion window grows exponentially
until it hits the threshold (32 KB). The window is increased every time a new
acknowledgement arrives rather than continuously, which leads to the discrete
staircase pattern. After the threshold is passed, the window grows linearly. It is
increased by one segment every RTT.
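The growth pattern described here can be reproduced with a short simulation. It is a sketch under simplifying assumptions: the window is tracked once per round-trip time rather than per acknowledgement, doubling is capped at the threshold, and no further losses occur. The function name is hypothetical.

```python
def tahoe_trace(threshold_kb, mss_kb, rounds):
    """Congestion window (KB) at the start of each transmission round after a timeout."""
    cwnd = mss_kb                   # restart from one segment
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd < threshold_kb:
            # Slow start: double every RTT, but do not overshoot the threshold.
            cwnd = min(cwnd * 2, threshold_kb)
        else:
            # Congestion avoidance: one more segment every RTT.
            cwnd += mss_kb
    return trace

# Threshold 32 KB and 1-KB segments, as in the text.
print(tahoe_trace(threshold_kb=32, mss_kb=1, rounds=9))
# [1, 2, 4, 8, 16, 32, 33, 34, 35]
```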
TCP Reno- Congestion Control
After an initial slow start, the congestion window climbs linearly until a packet loss is
detected by duplicate acknowledgements. The lost packet is retransmitted and fast
recovery is used to keep the ack clock running until the retransmission is
acknowledged. At that time, the congestion window is resumed from the new slow
start threshold, rather than from 1. This behavior continues indefinitely, and the
connection spends most of the time with its congestion window close to the
optimum value of the bandwidth-delay product.
SACK (Selective ACKnowledgements)
By the time packet 6 is received, two SACK byte ranges are used to
indicate that packet 6 and packets 3 to 4 have been received, in addition to
all packets up to packet 1. From the information in each SACK option that it
receives, the sender can decide which packets to retransmit. In this case,
retransmitting packets 2 and 5 would be a good idea. SACK is strictly
advisory information. The actual detection of loss using duplicate
acknowledgements and adjustments to the congestion window proceed
just as before.
A cumulative ACK does not tell the sender which segments were lost.
Fix: use the selective ACK (SACK) option.
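How a sender might read the SACK information in the example can be sketched as follows. For simplicity the sketch works with packet numbers instead of the byte ranges that real SACK options carry, and the function name is hypothetical.

```python
def packets_to_retransmit(cumulative_ack, sack_ranges):
    """Packets above the cumulative ACK that no SACK range reports as received."""
    received = set()
    for first, last in sack_ranges:
        received.update(range(first, last + 1))
    highest = max(received, default=cumulative_ack)
    # Anything between the cumulative ACK and the highest SACKed packet
    # that was not reported is a candidate for retransmission.
    return [p for p in range(cumulative_ack + 1, highest) if p not in received]

# All packets up to 1 received, plus SACK ranges for packet 6 and packets 3 to 4.
print(packets_to_retransmit(1, [(6, 6), (3, 4)]))   # [2, 5]
```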
Example
Consider a link with a bandwidth of 50 kbps, a one-way transit
time of 240 msec, and a segment size of 1000 bits.
Consider the event of a segment transmission and
the corresponding ACK reception. Find the maximum
number of segments that can be outstanding during
this interval.
Solution
Bandwidth = 50 kbps, one-way transit time (delay) = 240 msec
The bandwidth-delay product (BDP) is a term used primarily with
TCP for the amount of data needed to fill a TCP "path", i.e., the
maximum number of bits that can be in transit between the
transmitter and the receiver at once.
BDP = 50 kbps × 240 msec = 12 kbit
Segment size = 1000 bits
Number of segments in one direction = 12 kbit / 1000 bits = 12 segments
• A segment transmission and the corresponding ACK reception
take one round-trip time (RTT), which is twice the one-way latency.
• Maximum number of segments that can be outstanding
during this interval = (12 × 2) + 1 = 25 segments
(the extra segment arises because the ACK is sent only when the first
segment has been received).
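The arithmetic can be checked with a few lines. The variable names are illustrative, and integer units (bits per second, milliseconds) are used to avoid rounding.

```python
bandwidth_bps = 50_000       # 50 kbps
one_way_delay_ms = 240       # one-way transit time
segment_bits = 1000

# Bits in flight in one direction: bandwidth x one-way delay.
bdp_bits = bandwidth_bps * one_way_delay_ms // 1000
segments_one_way = bdp_bits // segment_bits

# A full round trip holds twice as many, plus the segment whose arrival triggers the ACK.
max_outstanding = 2 * segments_one_way + 1

print(bdp_bits, segments_one_way, max_outstanding)   # 12000 12 25
```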
Example
If the TCP round-trip time, RTT, is currently 30 msec and the following
acknowledgements come in after 26, 32, and 24 msec, respectively, what is
the new RTT estimate? Use α = 0.9.
Solution
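Applying the smoothing formula SRTT = α SRTT + (1 − α) R once per acknowledgement gives 29.6, then 29.84, then about 29.256 msec. The sketch below, with a hypothetical function name, carries out the three steps.

```python
def rtt_estimates(rtt, samples, alpha):
    """Apply RTT = alpha * RTT + (1 - alpha) * R for each sample, keeping each step."""
    steps = []
    for r in samples:
        rtt = alpha * rtt + (1 - alpha) * r
        steps.append(rtt)
    return steps

steps = rtt_estimates(30.0, [26, 32, 24], alpha=0.9)
print([round(s, 3) for s in steps])   # [29.6, 29.84, 29.256]
```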
Example
Consider the effect of using slow start on a line with a 10-msec round-trip time
and no congestion. The receive window is 24 KB and the maximum segment
size is 2 KB. How long does it take before the first full window can be sent?
Solution
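With slow start, the bursts are 2 KB, 4 KB, 8 KB, and 16 KB, and the next burst is limited to the 24-KB receive window, so the first full window goes out after 4 round trips, i.e., 40 msec. A minimal sketch of the count, assuming the window doubles once per round trip and the function name is hypothetical:

```python
def time_to_full_window(rtt_ms, window_kb, mss_kb):
    """Time until a burst as large as the receive window can be sent under slow start."""
    cwnd = mss_kb        # the first burst is one maximum-size segment
    elapsed_ms = 0
    while cwnd < window_kb:
        cwnd = min(2 * cwnd, window_kb)   # double per RTT, limited by the receiver
        elapsed_ms += rtt_ms
    return elapsed_ms

print(time_to_full_window(rtt_ms=10, window_kb=24, mss_kb=2))   # 40
```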
Example
Suppose that the TCP congestion window is set to 18 KB and a timeout occurs.
Assume that the maximum segment size is 1 KB.
a. How big will the window be if the next four transmission bursts are all successful?
b. After another four transmission bursts, how big will the window be?
Solution
After a timeout, TCP sets the threshold to one-half of the current window size, so the
threshold becomes 18 KB / 2 = 9 KB, and the congestion window restarts at one
segment (1 KB).
If the next four transmission bursts are all successful, then
• 1st transmission: 1 segment, 1 KB
• 2nd transmission: 2 segments, 2 KB
• 3rd transmission: 4 segments, 4 KB
• 4th transmission: 8 segments, 8 KB
After these four successful transmissions, slow start alone would raise the
window to 16 KB. However, since the threshold is 9 KB, the window size can
only be 9 KB.
TCP Vs UDP
Acronym
  TCP: Transmission Control Protocol
  UDP: User Datagram Protocol
Connection
  TCP: connection-oriented protocol
  UDP: connectionless protocol
Usage
  TCP: suited for applications that require high reliability and for which transmission time is relatively less critical.
  UDP: suitable for applications that need fast, efficient transmission, such as games.
Header size
  TCP: 20 bytes
  UDP: 8 bytes
Weight
  TCP: heavyweight. TCP requires three packets to set up a socket connection before any user data can be sent. TCP handles reliability and congestion control.
  UDP: lightweight. There is no ordering of messages, no tracking of connections, etc. It is a small transport layer designed on top of IP.
Applications-TCP and UDP