Lecture 11 - Transport Layer Protocols (TCP/UDP)


Transport Layer Services and Principles

Residing between the application and network layers, the transport layer is a central piece of the layered network architecture. It has the critical role of providing communication services directly to the application processes running on different hosts. We'll examine the possible services provided by a transport-layer protocol and the principles underlying various approaches toward providing these services. We'll also look at how these services are implemented and instantiated in existing protocols; as usual, particular emphasis will be given to the Internet protocols, namely, the TCP and UDP transport-layer protocols.

A transport-layer protocol provides for logical communication between application processes running on different hosts. By logical communication, we mean that although the communicating application processes are not physically connected to each other (indeed, they may be on different sides of the planet, connected via numerous routers and a wide range of link types), from the applications' viewpoint, it is as if they were physically connected. Application processes use the logical communication provided by the transport layer to send messages to each other, free from the worry of the details of the physical infrastructure used to carry these messages. The figure below illustrates the notion of logical communication.


Figure taken from the Kurose-Ross online book

 

As shown in the figure above, transport-layer protocols are implemented in the end systems but not in network routers. Network routers act only on the network-layer fields of the network-layer PDU (protocol data unit, or packet); they do not act on the transport-layer fields.

On the sending side, the transport layer converts the messages it receives from a sending application process into Transport PDUs (that is, transport-layer protocol data units). This is done by (possibly) breaking the application messages into smaller chunks and adding a transport-layer header to each chunk to create Transport PDUs. The transport layer then passes the Transport PDUs to the network layer, where each Transport PDU is encapsulated into a Network PDU.
On the receiving side, the transport layer receives the Transport PDUs from the network layer, removes the transport header from the Transport PDUs, reassembles the messages, and passes them to a receiving application process.
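
The sending-side and receiving-side steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 4-byte header (chunk index and chunk count) and the function names are hypothetical and do not correspond to the real TCP or UDP header formats.

```python
import struct

# Hypothetical 4-byte header: chunk index and total chunk count.
# This is NOT the real TCP or UDP header layout.
HEADER = struct.Struct("!HH")


def to_transport_pdus(message, max_payload):
    """Sending side: break the message into chunks and prepend a header to each."""
    chunks = [message[i:i + max_payload]
              for i in range(0, len(message), max_payload)] or [b""]
    return [HEADER.pack(index, len(chunks)) + chunk
            for index, chunk in enumerate(chunks)]


def from_transport_pdus(pdus):
    """Receiving side: strip each header and reassemble the original message."""
    parts = {}
    total = 0
    for pdu in pdus:
        index, total = HEADER.unpack(pdu[:HEADER.size])
        parts[index] = pdu[HEADER.size:]
    return b"".join(parts[i] for i in range(total))


pdus = to_transport_pdus(b"hello transport layer", 8)
message = from_transport_pdus(pdus)
```

Because every chunk carries its index, the receiving side in this sketch can rebuild the message even if the Transport PDUs are handed up in a different order.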

A computer network can make more than one transport-layer protocol available to network applications. For example, the Internet has two protocols--TCP and UDP. Each of these protocols provides a different set of transport-layer services to the invoking application.

All transport-layer protocols provide an application multiplexing/demultiplexing service. This service will be described in detail in the next section. In addition to a multiplexing/demultiplexing service, a transport protocol can possibly provide other services to invoking applications, including reliable data transfer, bandwidth guarantees, and delay guarantees.


Relationship between Transport and Network Layers

The transport layer lies just above the network layer in the protocol stack. Whereas a transport-layer protocol provides logical communication between processes running on different hosts, a network-layer protocol provides logical communication between hosts. This distinction is subtle but important. Let's examine this distinction with the aid of a household analogy.

Consider two houses, one on the East Coast and the other on the West Coast, with each house being home to a dozen kids. The kids in the East Coast household are cousins of the kids in the West Coast household. The kids in the two households love to write to each other--each kid writes each cousin every week, with each letter delivered by the traditional postal service in a separate envelope. Thus, each household sends 144 letters to the other household every week. (These kids would save a lot of money if they had e-mail!) In each of the households there is one kid--Ann in the West Coast house and Bill in the East Coast house--responsible for mail collection and mail distribution. Each week Ann visits all her brothers and sisters, collects the mail, and gives the mail to a postal-service mail person who makes daily visits to the house. When letters arrive at the West Coast house, Ann also has the job of distributing the mail to her brothers and sisters. Bill has a similar job on the East Coast.

In this example, the postal service provides logical communication between the two houses--the postal service moves mail from house to house, not from person to person. On the other hand, Ann and Bill provide logical communication among the cousins--Ann and Bill pick up mail from and deliver mail to their brothers and sisters. Note that from the cousins' perspective, Ann and Bill are the mail service, even though Ann and Bill are only a part (the end system part) of the end-to-end delivery process. This household example serves as a nice analogy for explaining how the transport layer relates to the network layer:

hosts (also called end systems) = houses
processes = cousins
application messages = letters in envelopes
network-layer protocol = postal service (including mail persons)
transport-layer protocol = Ann and Bill

Continuing with this analogy, observe that Ann and Bill do all their work within their respective homes; they are not involved, for example, in sorting mail in any intermediate mail center or in moving mail from one mail center to another. Similarly, transport-layer protocols live in the end systems. Within an end system, a transport protocol moves messages from application processes to the network edge (that is, the network layer) and vice versa; but it doesn't have any say about how the messages are moved within the network core. In fact, as illustrated in the figure above, intermediate routers neither act on, nor recognize, any information that the transport layer may have appended to the application messages.

Continuing with our family saga, suppose now that when Ann and Bill go on vacation, another cousin pair--say, Susan and Harvey--substitute for them and provide the household-internal collection and delivery of mail. Unfortunately for the two families, Susan and Harvey do not do the collection and delivery in exactly the same way as Ann and Bill. Being younger kids, Susan and Harvey pick up and drop off the mail less frequently and occasionally lose letters (which are sometimes chewed up by the family dog). Thus, the cousin-pair Susan and Harvey do not provide the same set of services (that is, the same service model) as Ann and Bill. In an analogous manner, a computer network may make available multiple transport protocols, with each protocol offering a different service model to applications.

The possible services that Ann and Bill can provide are clearly constrained by the possible services that the postal service provides. For example, if the postal service doesn't provide a maximum bound on how long it can take to deliver mail between the two houses (for example, three days), then there is no way that Ann and Bill can guarantee a maximum delay for mail delivery between any of the cousin pairs. In a similar manner, the services that a transport protocol can provide are often constrained by the service model of the underlying network-layer protocol. If the network-layer protocol cannot provide delay or bandwidth guarantees for Transport PDUs sent between hosts, then the transport-layer protocol cannot provide delay or bandwidth guarantees for messages sent between processes.

Nevertheless, certain services can be offered by a transport protocol even when the underlying network protocol doesn't offer the corresponding service at the network layer. For example, as we'll see, a transport protocol can offer reliable data transfer service to an application even when the underlying network protocol is unreliable, that is, even when the network protocol loses, garbles, and duplicates packets. As another example, a transport protocol can use encryption to guarantee that application messages are not read by intruders, even when the network layer cannot guarantee the secrecy of Transport PDUs.

Overview of the Transport Layer in the Internet

Recall that the Internet, and more generally a TCP/IP network, makes available two distinct transport-layer protocols to the application layer. One of these protocols is UDP (User Datagram Protocol), which provides an unreliable, connectionless service to the invoking application. The second of these protocols is TCP (Transmission Control Protocol), which provides a reliable, connection-oriented service to the invoking application. When designing a network application, the application developer must specify one of these two transport protocols. The application developer selects between UDP and TCP when creating sockets.
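
As a minimal sketch of that choice, assuming Python's standard socket module and IPv4, the socket type passed at creation time is what selects the transport protocol. The variable names are illustrative only.

```python
import socket

# SOCK_STREAM asks for TCP; SOCK_DGRAM asks for UDP (both over IPv4 here).
tcp_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
udp_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Record the choice made at creation time before releasing the sockets.
tcp_is_stream = tcp_socket.type == socket.SOCK_STREAM
udp_is_datagram = udp_socket.type == socket.SOCK_DGRAM

tcp_socket.close()
udp_socket.close()
```

Nothing is sent in this sketch; it only shows that the decision between the two service models is made once, when the socket is created.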

To simplify terminology, when in an Internet context, we refer to the Transport PDU as a segment. We mention, however, that the Internet literature (for example, the RFCs) also refers to the PDU for TCP as a segment but often refers to the PDU for UDP as a datagram. But this same Internet literature also uses the terminology datagram for the network-layer PDU! For an introductory note on computer networking such as this, we believe that it is less confusing to refer to both TCP and UDP PDUs as segments, and reserve the terminology datagram for the network-layer PDU.

Before proceeding with our brief introduction of UDP and TCP, it is useful to say a few words about the Internet's network layer. The Internet's network-layer protocol has a name: IP, for Internet Protocol. IP provides logical communication between hosts. The IP service model is a best-effort delivery service. This means that IP makes its "best effort" to deliver segments between communicating hosts, but it makes no guarantees. In particular, it does not guarantee segment delivery, it does not guarantee orderly delivery of segments, and it does not guarantee the integrity of the data in the segments. For these reasons, IP is said to be an unreliable service. We also need to keep in mind that each host has a unique IP address.

Having taken a glimpse at the IP service model, let's now summarize the service model of UDP and TCP. The most fundamental responsibility of UDP and TCP is to extend IP's delivery service between two end systems to a delivery service between two processes running on the end systems. Extending host-to-host delivery to process-to-process delivery is called application multiplexing and demultiplexing. UDP and TCP also provide integrity checking by including error-detection fields in their headers. These two minimal transport-layer services--process-to-process data delivery and error checking--are the only two services that UDP provides! In particular, like IP, UDP is an unreliable service--it does not guarantee that data sent by one process will arrive intact to the destination process.
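
To make the error-detection field concrete, here is a sketch of the 16-bit ones' complement checksum used by the Internet protocols. It is simplified: the checksum that UDP and TCP actually carry is also computed over a pseudo-header and the transport header, which this sketch leaves out. The function names are our own.

```python
def internet_checksum(data):
    """Return the 16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"                               # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]         # add the next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)      # fold any carry back into the sum
    return ~total & 0xFFFF


def is_intact(data, checksum):
    """Receiver-side check: recompute the checksum and compare with the one received."""
    return internet_checksum(data) == checksum


sent_checksum = internet_checksum(b"hello")
arrived_ok = is_intact(b"hello", sent_checksum)        # data unchanged in transit
arrived_damaged = is_intact(b"hellp", sent_checksum)   # one byte garbled in transit
```

A failed check only tells the receiver that the data was damaged. UDP stops at detection, whereas TCP goes on to recover from the error.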

TCP, on the other hand, offers several additional services to applications. First and foremost, it provides reliable data transfer. Using flow control, sequence numbers, acknowledgments, and timers (techniques we'll explore in detail in graduate courses), TCP ensures that data is delivered from sending process to receiving process, correctly and in order. When TCP breaks the message from the application into TCP segments, it assigns a sequence number to each segment according to its position in the original message at the source machine. These segments are shipped inside IP packets, and each one may take a different path to reach the destination machine. At the destination, the segments may thus arrive out of order. TCP uses the sequence numbers to rearrange the segments in the correct order and then joins them together to obtain the original message that was sent by the source machine. Whenever TCP at the destination machine receives a segment, it sends an acknowledgment message (ACK packet) to the source. In this way the source keeps track of which segments have successfully reached the destination. If a segment gets dropped somewhere in the network, the sender will never get an ACK packet for it. The sender waits for a timeout interval for the ACK packet. If no ACK packet arrives within that time, it retransmits the original segment. In this way, even if a segment is lost, it can be sent again to the destination. TCP thus converts IP's unreliable service between end systems into a reliable data transport service between processes. A protocol that provides reliable data transfer is necessarily complex.
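
The acknowledge-and-retransmit idea and the reordering by sequence number can be illustrated with a small simulation. This is a deliberately simplified sketch, not how TCP is implemented: it sends one segment at a time and waits for its ACK, and the lossy network is modelled by a hypothetical set of (sequence number, attempt) pairs that are dropped.

```python
def send_reliably(segments, drop_attempts, max_tries=5):
    """Simulate send, wait-for-ACK, and retransmit-on-timeout over a lossy channel.

    segments: payloads in order; the list index serves as the sequence number.
    drop_attempts: set of (seq, attempt) pairs that the simulated network loses.
    Returns (received, transmissions), where received maps seq -> payload.
    """
    received = {}                      # receiver's buffer, keyed by sequence number
    transmissions = 0
    for seq, payload in enumerate(segments):
        for attempt in range(1, max_tries + 1):
            transmissions += 1
            if (seq, attempt) in drop_attempts:
                continue               # segment lost: no ACK, sender times out and retries
            received[seq] = payload    # receiver stores the segment and sends an ACK
            break                      # ACK received: move on to the next segment
        else:
            raise RuntimeError(f"segment {seq} was never acknowledged")
    return received, transmissions


def reassemble(received):
    """Sort by sequence number to restore the original order, then join."""
    return b"".join(received[seq] for seq in sorted(received))


segments = [b"re", b"li", b"ab", b"le"]
received, sent = send_reliably(segments, {(1, 1), (3, 1), (3, 2)})
message = reassemble(received)
```

In this run, segment 1 is lost once and segment 3 twice, so seven transmissions are needed to deliver four segments, yet the receiver still reconstructs the full message.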

One might wonder why we need something like UDP at all. UDP is a lightweight protocol, i.e., it does not carry the overhead of TCP. TCP has to manage sequence numbers, send and receive acknowledgments, retransmit packets, perform congestion control, and provide many other functions. Thus there is a lot of overhead associated with TCP that is not present with UDP. Applications that are simple and require little communication, usually of the query-response type, therefore use UDP. An example is DNS, which just sends a query and expects a response; it does not require the advanced reliability and other features of TCP. Another set of applications that use UDP are those that require fast response and low overhead and can tolerate the loss of a few packets, such as streaming audio and video. On the other hand, applications that require reliable data transfer, especially bulk transfer of data, and that cannot tolerate the loss of even a single packet use TCP; FTP is an example. In FTP, if even one packet is lost, the destination cannot reconstruct the original file that was transmitted. Thus, the choice of transport-layer protocol depends on the type of application.
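
The query-response pattern maps naturally onto UDP, as the following sketch suggests. It runs both ends on the loopback interface in a single process. The name table and the plain-bytes message format are hypothetical and are not the DNS wire format; the point is only that no connection is set up before the query is sent.

```python
import socket

# Hypothetical name table; real DNS uses a binary message format instead.
RECORDS = {b"host-a.example": b"192.0.2.10"}


def udp_query_response(name):
    """Send one query datagram and return the one response datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
        server.settimeout(2.0)
        client.settimeout(2.0)

        # Client side: no connection setup, just send the query.
        client.sendto(name, server.getsockname())

        # Server side: read the query and reply to whoever sent it.
        query, client_addr = server.recvfrom(512)
        server.sendto(RECORDS.get(query, b"not found"), client_addr)

        # Client side: read the single response datagram.
        answer, _ = client.recvfrom(512)
        return answer


answer = udp_query_response(b"host-a.example")
```

The timeouts matter in a setting like this: because UDP does not retransmit, an application that needs an answer has to decide for itself how long to wait and whether to ask again.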

Extras

There is a lot more to study about transport-layer protocols. In the case of TCP, the congestion control mechanism and the analysis of delay and throughput are important but beyond the scope of our discussion.

Comparison between TCP and UDP

TCP                                         UDP
Connection-oriented                         Connectionless
Reliable packet delivery                    Unreliable packet delivery
Has flow control mechanism                  No flow control mechanism
Not preferred for real-time applications    Preferred for real-time applications
Has considerable overhead                   Lightweight protocol with less overhead
Applications: FTP, HTTP, etc.               Applications: DNS, streaming audio and video