Computer Networking: A Top-Down Approach - Kurose (2016)
Chapter 1: Computer Networks and the Internet
1.1 What is the Internet?
1.1.1 A Nuts-and-Bolts Description
Devices connected to the Internet are called hosts or end systems
End systems are connected by a network of communication links and packet switches
Different links can transmit data at different rates
Transmission rate of a link measured in bits/second
The sending end system segments data and adds header bytes to each segment; the resulting packages of data are called packets
Two most prominent types of packet switches are routers and link-layer switches
Forward packets to their ultimate destinations
End systems access the Internet through Internet service providers (ISPs)
End systems, packet switches, and other pieces of the Internet run protocols that control the sending and receiving of info
Transmission Control Protocol (TCP) and Internet Protocol (IP) are the most important
Internet standards are developed by Internet Engineering Task Force (IETF)
Standards documents are called requests for comments (RFCs)
1.1.2 A Services Description
Distributed applications involve multiple end systems that exchange data
Internet apps run on end systems, not in packet switches
End systems attached to the internet provide an API that specifies how a program running on one end system asks the Internet infrastructure to deliver data to a specific destination program running on another end system
1.1.3 What is a Protocol?
A protocol defines the format and the order of messages exchanged between two or more communicating entities, as well as the actions taken on the transmission and/or receipt of a message or other event
1.2 The Network Edge
Hosts are divided into two categories: clients and servers
Most servers are housed in large data centers
1.2.1 Access Networks
Two most prevalent types of broadband residential access are digital subscriber line (DSL) and cable
DSL provided by telco
Cable provided by television company
1.3 The Network Core
1.3.1 Packet Switching
End systems exchange messages with each other
To accomplish this, the source breaks messages into packets
Packets travel through links and packet switches
Two types of packet switches: routers and link-layer switches
Most packet switches use store-and-forward transmission at the inputs to links
Packet switch must receive the entire packet before it can begin to transmit the packet onto the outbound link
Routers must buffer packet's bits until all are received
End-to-end delay: d_end-to-end = N(L/R)
N is the number of links in the path
R is the transmission rate of each link
L is the number of bits in the packet
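A quick numeric sketch of the formula (all values hypothetical):

```python
# Store-and-forward end-to-end delay for one packet: d = N * (L / R)
# Hypothetical values: 3 links, one 1,500-byte packet, 10 Mbps links
N = 3                 # links between source and destination
L = 1500 * 8          # packet length in bits
R = 10 * 10**6        # transmission rate of each link, bits/sec

d_end_to_end = N * (L / R)
print(f"{d_end_to_end * 1000:.1f} ms")  # 3.6 ms
```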
Queuing Delays and Packet Loss
Each packet switch has multiple links attached; for each attached link, the packet switch has an output buffer (AKA output queue) which stores packets that the router is about to send into that link
If an arriving packet needs to be transmitted but the link is busy with the transmission of another packet, the arriving packet needs to wait in the output buffer
This is called queuing delay
Since buffer space is finite, it is possible that an arriving packet finds the buffer full with other packets
Packet loss occurs in this situation; either the arriving packet or one of the already-queued packets will be dropped
Forwarding Tables and Routing Protocols
Each router has a forwarding table that maps destination addresses to that router's outbound links
1.3.2 Circuit Switching
There are two approaches to moving data through a network: circuit switching and packet switching
In circuit-switched networks, the resources needed along a path (buffers, link transmission rate) are reserved for the duration of communication between the end systems
Ex. telephone networks. You establish a circuit with the receiver of your call; resources are reserved for the two of you
Multiplexing in Circuit-Switched Networks
A circuit in a link is implemented with either frequency-division multiplexing (FDM) or time-division multiplexing (TDM)
In FDM: the frequency spectrum of a link is divided amongst the connections established across the link
A frequency band is dedicated to each connection for the duration of the connection
The width of this band is called the bandwidth
In TDM: time is divided into frames of fixed duration, each frame is divided into a fixed number of time slots
Packet Switching Versus Circuit Switching
Packet switching is not suitable for real-time services (e.g., telephone and video conference calls)
Packet switching offers better sharing of transmission capacity and is simpler + more efficient to implement
1.3.3 A Network of Networks
PoP (point of presence): group of one or more routers at the same location in the provider's network where customer ISPs can connect into the provider ISP
1.4 Delay, Loss, and Throughput in Packet-Switched Networks
1.4.1 Overview of Delay in Packet-Switched Networks
Types of delay: nodal processing delay, queuing delay, transmission delay, and propagation delay; together, these give a total nodal delay
d_nodal = d_proc + d_queue + d_trans + d_prop
Processing Delay
Time required to examine the packet's header and determine where to direct the packet
Queuing Delay
Time waited in queue before packet transmitted onto link
Unlike other delays, can vary from packet to packet
First packet does not experience queuing delay
Let a denote the average rate at which packets arrive at the queue (packets/sec)
Average rate at which bits arrive in the queue is La bits/sec
Assume the queue can hold infinite bits. The ratio La/R is called traffic intensity
If La/R > 1, the average rate at which bits arrive at the queue exceeds the rate at which bits can be transmitted from the queue. Hence, queuing delay will approach infinity. Hence, systems must never be designed with a traffic intensity greater than 1.
If La/R < 1, the nature of the arriving traffic impacts queuing delay. If N packets arrive simultaneously every (L/R)N seconds, the first packet has no queuing delay, the second packet has queuing delay L/R, and the nth packet has queuing delay (n-1)L/R
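A sketch of both observations with made-up numbers (packet size, link rate, and arrival rate are assumptions):

```python
# Traffic intensity La/R and queuing delay under periodic burst arrivals
L = 1500 * 8      # bits per packet
R = 10 * 10**6    # link transmission rate, bits/sec
a = 500           # average arrival rate, packets/sec

print(f"traffic intensity La/R = {L * a / R:.2f}")  # 0.60, so stable

# N packets arriving together every (L/R)N seconds: the nth packet
# waits behind n-1 packets, each taking L/R seconds to transmit
N = 5
for n in range(1, N + 1):
    print(f"packet {n}: queuing delay = {(n - 1) * L / R * 1000:.1f} ms")
```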
Transmission Delay
d_trans = L/R, where L is the number of bits and R is the transmission rate of the link
Propagation Delay
After a bit is pushed into the link, the time required to propagate from the beginning of the link to router B
Distance between two routers divided by the propagation speed: d_prop = d/s, where d is the distance between router A and router B and s is the propagation speed of the link
1.4.3 End-to-End Delay
So far, we've focused on nodal delay (delay at single router)
Now, consider total delay from source to destination. Suppose there are N-1 routers between source and dest
d_end-end = N(d_proc + d_trans + d_prop)
Where d_trans = L/R; L is packet size
1.4.4 Throughput in Computer Networks
Instantaneous throughput at any instant of time is the rate in bits/sec at which host B is receiving a file
Average throughput is F/T where F is the total number of bits in the file and T is the time it took to receive
1.5 Protocol Layers and Their Service Models
1.5.1 Layered Architecture
Protocol Layering
Transport-layer and application-layer protocols are almost always implemented in software in the end systems
Five-layer internet protocol stack:
Application
Transport
Network
Link
Physical
Application Layer
Includes protocols such as HTTP, SMTP, and FTP
Protocols are distributed over multiple end systems
App layer packets are messages
Transport Layer
Transports app-layer messages between app endpoints. There are two transport protocols:
TCP
UDP
TCP provides a connection-oriented service to its applications, guaranteeing delivery of messages and flow control
Breaks long messages into shorter segments and provides a congestion-control mechanism, so a source throttles its transmission rate when the network is congested
UDP provides a connectionless service: no reliability, no flow control, and no congestion control
Transport Layer packets are segments
Network Layer
Moves network-layer packets (datagrams) from one host to another
The transport-layer protocol gives a segment and destination address to the network layer, just as you give the postal service a letter with an address
The network layer delivers the segment to the transport layer in the destination host
Includes the IP protocol: defines the fields in the datagram as well as how end systems and routers act on these fields
Link Layer
To move a datagram along its route between source and destination, the network layer relies on the services of the link layer
At each node, the network layer passes the datagram down to the link layer, which delivers the datagram to the next node along the route. At this next node, the link layer passes the datagram up to the network layer
Examples of link-layer protocols: Ethernet, WiFi
Link-layer packets are frames
Physical Layer
Moves individual bits within a frame from one node to the next
1.5.2 Encapsulation
At the sending host, an app-layer message is passed to the transport layer. In the simplest case, the transport layer takes the message and appends additional info that will be used by the receiver-side transport layer
The application-layer message and the transport-layer header info together constitute the transport-layer segment
The transport layer thus encapsulates the app-layer message, including info allowing the receiver-side transport layer to deliver the message up to the appropriate application and error-detection bits to allow the receiver to determine whether bits in the message have been changed in route
The transport layer then passes the segment to the network layer, which adds network-layer header information such as source and destination end system addresses, creating a datagram
Network layer passes datagram to link layer, which adds its own link-layer header info and creates a frame
Hence, for each layer, a packet has two types of fields: header fields and a payload field. The payload is typically a packet from the layer above
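A toy sketch of encapsulation; the header strings are invented placeholders, not real protocol formats:

```python
# Each layer prepends its header to the payload handed down from above
message  = b"GET /index.html"        # application-layer message
segment  = b"<tcp-hdr>" + message    # transport-layer segment
datagram = b"<ip-hdr>"  + segment    # network-layer datagram
frame    = b"<eth-hdr>" + datagram   # link-layer frame
print(frame)  # each header field wraps the payload from the layer above
```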
1.6 Networks Under Attack
Compromised host may be enrolled in a network of similarly compromised devices, known as a botnet
Malware can spread in the form of a virus or worm
Virus: require some form of user interaction to infect a device (ex. open an email attachment containing a malicious .exe)
Worms: enter a device without explicit user interaction (ex. a user may be running a vulnerable network app to which an attacker can send malware. It is possible that the application may accept the malware and run it, creating a worm on the device. The newly infected device then scans the Internet, searching for other hosts running the same vulnerable network application to send a copy of the worm)
Denial-of-service (DoS) attacks render a network, host, or other piece of infrastructure unusable. Three categories:
Vulnerability attack: Send messages to a vulnerable application or OS running on a targeted host. If the right sequence of packets is sent to a vulnerable application or OS, the service can stop or the host can crash
Bandwidth flooding: send a deluge of packets to the targeted host to clog the target's access link, preventing legitimate packets from reaching the server
Connection flooding: attacker establishes a large number of half- or fully-open TCP connections at the target host. The host is full of bogus connections, so it stops accepting legitimate connections
Distributed denial-of-service (DDoS): attacker controls multiple sources and each source blasts traffic at target
Easy to create a packet with an arbitrary source address, packet content, and destination address. This is called IP spoofing
To combat this, implement end-point authentication
Chapter 2: Application Layer
2.1 Principles of Network Applications
Application architecture is designed by the app developer. There are two choices:
Client-server architecture
Always-on host, called the server, services requests from many other hosts, called clients
Peer-to-peer (P2P) architecture
Minimal (or no) reliance on dedicated servers; the application exploits direct communication between pairs of intermittently connected hosts called peers
2.1.2 Processes Communicating
Client and Server Processes
The process that initiates the communication is the client; the process that waits to be contacted is the server
The Interface Between the Process and the Computer Network
A socket is the interface between the app layer and the transport layer within a host. Also called the API between the application and the network
Addressing Processes
A host is identified by its IP address
The sending process identifies the receiving process with a port number
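A minimal sketch of addressing a destination process from Python (example.com and port 80 are stand-ins):

```python
import socket

# A destination process is identified by (IP address, port number)
host, port = "example.com", 80   # hypothetical web server

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # TCP socket
s.connect((host, port))  # hostname resolves to an IP; the port selects the process
print("local:", s.getsockname(), "remote:", s.getpeername())
s.close()
```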
2.1.3 Transport Services Available to Applications
Reliable Data Transfer
Not necessary for loss-tolerant applications, such as video conference software
Throughput
Applications that have throughput requirements are said to be bandwidth-sensitive applications. Many current multimedia applications are bandwidth sensitive
Elastic applications, on the other hand, can make use of as much, or as little, throughput as happens to be available. Electronic mail, file transfer, and Web transfers are all elastic applications
Timing
Can guarantee, for example, that data arrives at the receiving socket within 100 msec
Security
For example, in the sending host, a transport protocol can encrypt all data transmitted by the sending process, and in the receiving host, the transport-layer protocol can decrypt the data before delivering the data to the receiving process
2.1.4 Transport Services Provided by the Internet
TCP Services
Connection-oriented service
Client and server exchange transport-layer control information before the application-level messages begin to flow
This process is called handshaking
Allows client and server to prepare for packets
After handshaking, a TCP connection is said to exist between the sockets of the two processes
Connection is full-duplex; two processes can send messages to each other over the connection at the same time
Reliable data transfer
TCP delivers all data without error and in the proper order
Congestion control
Throttles a sending process when the network is congested between sender and receiver
UDP Services
Lightweight, no-frills
Connectionless; no handshaking
Internet telephony applications prefer UDP for its speed, though reliable data transfer is not guaranteed
2.1.5 Application Layer Protocols
Defines how an application's processes, running on different end systems, pass messages to each other
Defines the types of messages exchanged (ex. request messages and response messages)
The syntax of various message types (Ex. the fields of the message and how they are delineated)
The semantics of the fields
Rules for determining when and how a process sends messages and responds to messages
2.2 The Web and HTTP
2.2.1 Overview of HTTP
HyperText Transfer Protocol (HTTP) is the Web's application layer protocol
Implemented in two programs: a client program and a server program
Each execute on different end systems, talk to each other by exchanging HTTP messages
HTTP defines the structure of these messages
Web browsers implement the client side of HTTP, so we can use "client" and "browser" interchangeably in this context
Web servers implement the server side of HTTP, house Web objects, each addressable by a URL
When a user requests a Web page, the browser sends HTTP request messages for the objects in the page to the server. The server receives the requests and responds with HTTP response messages that contain objects
HTTP uses TCP as its underlying transport protocol
HTTP client first initiates a TCP connection with the server. Once the connection is established, the browser and the server processes access TCP through their socket interfaces
2.2.2 Non-Persistent and Persistent Connections
App developer must choose
Non-persistent: each request/response pair is sent over a separate TCP connection
Persistent: all the requests and their corresponding responses are sent over the same TCP connection
HTTP with Non-Persistent Connections
Assume we try to access http://www.someSchool.edu/someDepartment/home.index
HTTP client initiates a TCP connection to the server www.someSchool.edu on port 80. There will be a socket at the client and a socket on the server
The HTTP client sends an HTTP request message to the server via its socket. The request message includes the path name /someDepartment/home.index
The HTTP server process receives the request message via its socket, retrieves the object /someDepartment/home.index from its storage (RAM or disk), encapsulates the object in an HTTP response message, and sends the response message to the client via its socket
The HTTP server process tells TCP to close the TCP connection
The HTTP client receives the response message. The TCP connection terminates. The message indicates that the encapsulated object is an HTML file. The client extracts the file from the response message, examines the HTML file, and finds references to other objects on the page (ex. JPEGs or Java applets)
The first four steps are repeated for each referenced object on the page
Round-trip time (RTT) is the time it takes for a small packet to travel from client to server and then back to the client
HTTP with Persistent Connections
Non-persistent connections have shortcomings:
A brand-new connection must be established and maintained for each requested object. For each of these connections, TCP buffers must be allocated and TCP variables must be kept in both the client and server
Place a significant burden on the Web server, which may be serving requests from hundreds of clients simultaneously
Each object suffers a delivery delay of two RTTs; one to establish the TCP connection and one to request and receive an object
With persistent connections, the server leaves the TCP connection open after sending a response
Subsequent requests and responses between the same client and server can be sent over the same connection
An entire web page can be sent over a single persistent TCP connection
Multiple web pages residing on the same server can be sent from the server to the same client over a single persistent TCP connection
Requests for objects can be made back-to-back, without waiting for replies to pending requests (pipelining)
The default mode of HTTP uses persistent connections with pipelining
2.2.3 HTTP Message Format
HTTP Request Message
GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
Connection: close
User-agent: Mozilla/5.0
Accept-language: fr
First line is called the request line; subsequent lines are called header lines
The request line has three fields
The method field (GET, POST, HEAD, PUT, or DELETE)
The URL field
The HTTP version field
Connection: close means we don't want a persistent connection; close the connection after sending the requested object
Accept-language: fr means we prefer French, but return the default language if not available
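A sketch of sending this exact request over a raw TCP socket; the hostname is the book's example and won't resolve to a real server:

```python
import socket

request = (
    "GET /somedir/page.html HTTP/1.1\r\n"
    "Host: www.someschool.edu\r\n"
    "Connection: close\r\n"
    "User-agent: Mozilla/5.0\r\n"
    "Accept-language: fr\r\n"
    "\r\n"                     # blank line terminates the header lines
)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("www.someschool.edu", 80))   # the book's example host, port 80
s.sendall(request.encode("ascii"))
response = b""
while chunk := s.recv(4096):            # Connection: close => read until EOF
    response += chunk
s.close()
print(response.split(b"\r\n")[0].decode())  # status line of the reply
```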
HTTP Response Message
HTTP/1.1 200 OK
Connection: close
Date: Tue, 09 Aug 2011 15:44:04 GMT
Server: Apache/2.2.3 (CentOS)
Last-Modified: Tue, 09 Aug 2011 15:11:03 GMT
Content-Length: 6821
Content-Type: text/html
(data data data data data ...)
Three sections: an initial status line, six header lines, then the entity body
Entity body is the requested object itself
Status line has three fields: the protocol version field, a status code, and a corresponding status message
2.2.4 User-Server Interaction: Cookies
HTTP servers are stateless; however, it is often desirable for a website to identify users
Components of a cookie
A cookie header line in HTTP response message
Cookie header line in HTTP request message
Cookie file kept on the user's end system and managed by the user's browser
A back-end database at the website
The first time a user visits a site, the user can provide a user identification (possibly his or her name). During the subsequent sessions, the browser passes a cookie header to the server, thereby identifying the user to the server. Cookies can thus be used to create a user session layer on top of stateless HTTP
2.2.5 Web Caching
A Web cache, also called a proxy server, is a network entity that satisfies HTTP requests on the behalf of an origin Web server
A user’s browser can be configured so that all of the user’s HTTP requests are first directed to the Web cache
Suppose a browser requests the object at http://www.someschool.edu/campus.gif:
Browser establishes a TCP connection to the Web Cache and sends an HTTP request for the object to the Web cache
The Web cache checks to see if it has a copy of the object stored locally. If it does, the Web cache returns the object within an HTTP response message to the client browser
If the Web cache does not have the object, the Web cache opens a TCP connection to the origin server, that is, to www.someschool.edu. The Web cache then sends an HTTP request for the object into the cache-to-server TCP connection. After receiving this request, the origin server sends the object within an HTTP response to the Web cache
When the Web cache receives the object, it stores a copy in its local storage and sends a copy, within an HTTP response message, to the client browser (over the existing TCP connection between the client browser and the Web cache)
A cache is both a server and a client at the same time. When it receives requests from and sends responses to a browser, it is a server. When it sends requests to and receives responses from an origin server, it is a client
Typically a Web cache is purchased and installed by an ISP. For example, a university might install a cache on its campus network and configure all of the campus browsers to point to the cache. Or a major residential ISP (such as AOL) might install one or more caches in its network and preconfigure its shipped browsers to point to the installed caches.
2.2.6 The Conditional GET
HTTP has the conditional GET mechanism that allows a cache to verify its objects are up to date
A request message is a so-called conditional GET message if (1) the request message uses the GET method and (2) the request message includes an If-Modified-Since: header line
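A sketch of the exchange (hostname and dates illustrative). The cache revalidates its copy with:
GET /fruit/kiwi.gif HTTP/1.1
Host: www.exotiquecuisine.com
If-Modified-Since: Wed, 9 Sep 2015 09:23:24 GMT
If the object is unchanged, the origin server answers with an empty entity body:
HTTP/1.1 304 Not Modified
Date: Sat, 10 Oct 2015 15:39:29 GMT
Server: Apache/1.3.0 (Unix)
(empty entity body)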
2.3 File Transfer: FTP
The user interacts with FTP through an FTP user agent
The user first provides the hostname of the remote host, causing the FTP client process in the local host to establish a TCP connection with the FTP server process in the remote host
The user then provides the user identification and password, which are sent over the TCP connection as part of FTP commands
Once the server has authorized the user, the user copies one or more files stored in the local file system into the remote file system (or vice versa)
HTTP and FTP are both file transfer protocols that run on TCP. However, the two application-layer protocols have important differences:
FTP uses two parallel TCP connections to transfer a file, a control connection and a data connection
Control connection is for sending information such as user identification, password, commands to change remote directory, and commands to "put" and "get" files.
The data connection is used to actually send a file
Because FTP uses a separate control connection, FTP is said to send its control information out-of-band
HTTP uses only one TCP connection; hence it is said to send its control information in-band
FTP maintains state about the user
In particular, the server must associate the control connection with a specific user account, and the server must keep track of the user’s current directory as the user wanders about the remote directory tree
2.4 Electronic Mail in the Internet
Three major components:
User agents
Mail servers
Simple Mail Transfer Protocol (SMTP)
Mail servers form the core of the e-mail infrastructure. Each recipient, such as Bob, has a mailbox located in one of the mail servers. Bob’s mailbox manages and maintains the messages that have been sent to him
A typical message starts its journey in the sender’s user agent, travels to the sender’s mail server, and travels to the recipient’s mail server, where it is deposited in the recipient’s mailbox
2.4.1 SMTP
Older than HTTP; legacy technology. Restricts body of mail messages to 7-bit ASCII
Example: Alice sends Bob an ASCII message
Alice invokes her user agent for e-mail, provides Bob’s e-mail address (for example, [email protected]), composes a message, and instructs the user agent to send the message
Alice's user agent sends the message to her mail server, where it is placed in a message queue
The client side of SMTP, running on Alice's mail server, sees the message in the message queue. It opens a TCP connection to an SMTP server, running on Bob's mail server
After some initial SMTP handshaking, the SMTP client sends Alice's message into the TCP connection
At Bob's mail server, the server side of SMTP receives the message. Bob's mail server then places the message in Bob's mailbox
2.4.2 Comparison with HTTP
SMTP and HTTP are both protocols used to transfer files: HTTP transfers files (also called objects) from a Web server to a Web client (typically a browser); SMTP transfers files (that is, e-mail messages) from one mail server to another mail server
HTTP and SMTP both use persistent connections
HTTP is mainly a pull protocol: someone loads information on a Web server and users use HTTP to pull the information from the server at their convenience
SMTP is primarily a push protocol: the sending mail server pushes the file to the receiving mail server
SMTP restricts messages to 7-bit ASCII (ex. French characters with accents not supported). HTTP does not impose this restriction
2.4.4 Mail Access Protocols
How does a recipient obtain messages which are sitting in a server within their ISP?
User agent can't use SMTP because obtaining messages is a pull operation, whereas SMTP is a push operation
Hence, user must select a mail access protocol
POP3
The simplest mail access protocol; short and readable
Begins when the user agent opens a TCP connection to the mail server
POP3 progresses through three phases
Authorization: user agent sends a username and password to authenticate the user
Transaction: user agent retrieves messages; the user agent can also mark messages for deletion, remove deletion marks, and obtain mail statistics
Update: occurs after the client has issued the quit command, ending the POP3 session. The mail server deletes the messages that were marked for deletion
IMAP
Allows user to create remote folders and assign messages to folders
IMAP associates each message with a folder; when a message arrives at the server, it is associated with the recipient's inbox folder
The recipient can move the message into a new, user-created folder, read the message, delete the message, etc.
Unlike POP3, an IMAP server maintains user state info - for example, the names of folders and which messages are associated with which folders
Web-Based E-Mail
The user agent is a web browser, and the user communicates with its remote mailbox via HTTP
When a recipient wants to access a message in his mailbox, the message is sent from his mail server to his browser using HTTP rather than POP3 or IMAP
When a sender wants to send an e-mail, the message is sent from his browser to his mail server over HTTP rather than SMTP
2.5 DNS - The Internet's Directory Service
Hosts must have identifiers
One way to identify a host is a hostname, such as yahoo.com or google.com
Hosts are also identified by IP addresses
2.5.1 Services Provided by DNS (Domain Name System)
Primary goal is to translate hostnames to IP addresses
Distributed database implemented in a hierarchy of DNS servers
Application-layer protocol that allows hosts to query the distributed database
DNS is commonly employed by other application-layer protocols - including HTTP, SMTP, and FTP - to translate user-supplied hostnames to IP addresses
For example, to access naimayat.me, the user host must first obtain the IP address
The user machine runs the client side of the DNS application
The browser extracts the hostname naimayat.me from the URL
The DNS client sends a query containing the hostname to a DNS server
The DNS client receives a reply which includes the IP address for the hostname
Once the browser receives the IP address from DNS, it can initiate a TCP connection to the HTTP server process
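A minimal sketch of invoking the client side of DNS from an application (naimayat.me is the hostname used above):

```python
import socket

# Hostname-to-IP translation, as an application would invoke it
print(socket.gethostbyname("naimayat.me"))      # one IPv4 address

# getaddrinfo is the more general interface (IPv4/IPv6, service names)
for family, _, _, _, sockaddr in socket.getaddrinfo("naimayat.me", 80):
    print(family.name, sockaddr)
```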
DNS also provides other important services in addition to hostname translation:
Host aliasing: A host with a complicated hostname can have one or more alias names. For example, a hostname such as relay1.west-coast.enterprise.com could have, say, two aliases such as enterprise.com and www.enterprise.com. In this case, the hostname relay1.west-coast.enterprise.com is said to be a canonical hostname. Alias hostnames, when present, are typically more mnemonic than canonical hostnames. DNS can be invoked by an application to obtain the canonical hostname for a supplied alias hostname as well as the IP address of the host
Mail server aliasing: For obvious reasons, it is highly desirable that e-mail addresses be mnemonic. For example, if Bob has an account with Hotmail, an e-mail address might be as simple as [email protected]. However, the hostname of the Hotmail mail server is more complicated and much less mnemonic than simply hotmail.com (for example, the canonical hostname might be something like relay1.west-coast.hotmail.com). DNS can be invoked by a mail application to obtain the canonical hostname for a supplied alias hostname as well as the IP address of the host.
Load distribution: DNS can distribute load among a website's replicated servers by associating a set of IP addresses with one canonical hostname
2.5.2 Overview of How DNS Works
There are a large number of DNS servers distributed around the globe
Problems with a single DNS server
A single point of failure: if the one DNS server crashes, so does the entire Internet
Traffic volume: the single server would have to handle all the DNS queries for millions of hosts
Distant centralized database: a single DNS server cannot be "close to" all the querying clients
Maintenance: a single DNS server would have to keep records for all Internet hosts
A Distributed, Hierarchical Database
To deal with scale, DNS uses many servers organized in a hierarchical fashion
Mappings are distributed across three types of servers
Root DNS servers. There are 13 in the world.
Top-level domain (TLD) DNS servers. Responsible for top-level domains such as com, org, net, edu.
Authoritative DNS servers. Every organization with publicly accessible hosts (such as Web servers and mail servers) on the Internet must provide publicly accessible DNS records that map the names of those hosts to IP addresses.
There is another important type of DNS server called the local DNS server. A local DNS server does not strictly belong to the hierarchy of servers but is nevertheless central to the DNS architecture. Each ISP—such as a university, an academic department, an employee’s company, or a residential ISP—has a local DNS server (also called a default name server). When a host connects to an ISP, the ISP provides the host with the IP addresses of one or more of its local DNS servers
DNS Caching
In a query chain, when a DNS server receives a DNS reply (containing, for example, a mapping from a hostname to an IP address), it can cache the mapping in its local memory
Since mappings between hostnames and IP addresses are not permanent, DNS servers discard cached information after a period of time (often set to two days)
2.5.3 DNS Records and Messages
The DNS servers that together implement the DNS distributed database store resource records (RRs), including RRs that provide hostname-to-IP address mappings
A resource record is a four-tuple that contains the following fields: (Name, Value, Type, TTL)
TTL is the time to live; it determines when a resource should be removed from a cache
The meaning of name and value depend on type:
If type=A, then name is a hostname and value is the IP address for the hostname
If type=NS, then name is a domain and value is the hostname of an authoritative DNS server that knows how to obtain the IP addresses for hosts in the domain
If type=CNAME, then Value is a canonical hostname for the alias hostname name. This record can provide querying hosts the canonical name for a hostname
If type=MX, then value is the canonical name of a mail server that has an alias hostname name
DNS Messages
First 12 bytes is the header section
16-bit number that identifies the query
A 1-bit query/reply flag indicates when the message is a query or reply
A 1-bit authoritative flag is set in a reply message when a DNS server is an authoritative server for a queried name
A 1-bit recursion-desired flag is set when a client (host or DNS server) desires that the DNS server perform recursion when it doesn't have the record
A 1-bit recursion-available field is set in a reply if the DNS server supports recursion
There are also four number-of fields. These fields indicate the number of occurrences of the four types of data sections that follow the header
Question section contains info about the query
Name field that contains the name that is being queried
Type field that indicates the type of question being asked
Answer section contains the resource records for the name that was originally queried
Authority section contains records of other authoritative servers
Additional section contains other helpful records. For example, the answer field in a reply to an MX query contains a resource record providing the canonical hostname of a mail server. The additional section contains a Type A record providing the IP address for the canonical hostname of the mail server
2.6 Peer-to-Peer Applications
So far, everything covered employs client-server architecture
2.6.1 P2P File Distribution
Scalability of P2P Architectures
Distribution time is the time it takes to get a copy of the file to all N peers
Distribution time for client-server architecture: D_cs ≥ max{NF/u_s, F/d_min}
Where N = number of peers, F = file size in bits, u_s = server upload rate, d_min = download rate of the peer with the lowest download rate
In P2P observe:
At the beginning of the distribution, only the server has the file. To get this file into the community of peers, the server must send each bit of the file at least once into its access link. Thus, the minimum distribution time is at least F/us. (Unlike the client-server scheme, a bit sent once by the server may not have to be sent by the server again, as the peers may redistribute the bit among themselves.)
As with the client-server architecture, the peer with the lowest download rate cannot obtain all F bits of the file in less than F/dmin seconds. Thus the minimum distribution time is at least F/dmin.
Finally, observe that the total upload capacity of the system as a whole is equal to the upload rate of the server plus the upload rates of each of the individual peers, that is, u_total = u_s + u_1 + ... + u_N. The system must deliver (upload) F bits to each of the N peers, thus delivering a total of NF bits. This cannot be done at a rate faster than u_total. Thus, the minimum distribution time is also at least NF/(u_s + u_1 + ... + u_N).
Thus, the minimum distribution time for client-server architectures increases linearly with the number of peers, whereas P2P's is logarithmic; the sketch below compares the two bounds
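A sketch comparing the two lower bounds as N grows; the file size and rates are made-up values:

```python
# Minimum distribution time: client-server vs. P2P (fluid model)
F = 8 * 10**9        # file size in bits (1 GB), hypothetical
u_s = 10 * 10**6     # server upload rate, bits/sec
d_min = 1 * 10**6    # slowest peer's download rate, bits/sec
u = 500 * 10**3      # each peer's upload rate, bits/sec

for N in (10, 100, 1000):
    d_cs = max(N * F / u_s, F / d_min)
    d_p2p = max(F / u_s, F / d_min, N * F / (u_s + N * u))
    print(f"N={N:4d}  client-server {d_cs/3600:7.1f} h   P2P {d_p2p/3600:5.1f} h")
```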
BitTorrent
Collection of all peers participating in the distribution of a file is called a torrent
Peers in a torrent download equal-size chunks of the file from one another
When a peer first joins a torrent, it has no chunks
When a peer accumulates chunks, it also uploads them to other peers
Once a peer has acquired the entire file, it may (selfishly) leave the torrent or remain and continue to upload chunks to other peers
A peer may leave the torrent at any time with only a subset of chunks, and later rejoin the torrent
Each torrent has an infrastructure node called a tracker
When a peer joins a torrent, it registers itself with the tracker and periodically informs the tracker that it is still in the torrent
When a new peer joins the torrent, the tracker randomly selects a subset of peers from the set of participants and sends their addresses to the new peer. The new peer can then establish concurrent TCP connections with all the peers on this list, called "neighboring peers"
As time passes, some of these neighbors may leave and some may attempt to join
At any given time, each peer will have a subset of chunks from the file
Periodically, a peer will ask each of its neighbors for the list of the chunks they have. If it has L neighbors, it will obtain L lists of chunks
With this knowledge, the peer uses a technique called rarest first to request the rarest chunks for download, aiming to equalize the numbers of copies of each chunk in the torrent
A peer gives priority to the four neighbors that are supplying data at the highest rates
The peer reciprocates and sends chunks to these same four peers. Every 10 seconds, the peer recalculates the rates and possibly modifies the set of four peers
These four peers are said to be unchoked
Every 30 seconds, the peer picks another neighbor at random and sends it chunks. This randomly selected peer is said to be optimistically unchoked in hopes that the random peer will start sending data back
The optimistically unchoked peer could become one of our top four uploaders, and hence we start trading data
This is a tit-for-tat system allowing peers of similar upload capacity to share with each other
Chapter 3: Transport Layer
3.1 Introduction and Transport-Layer Services
Transport layer protocol provides for logical communication between application processes running on different hosts
Transport-layer packets are called segments
3.1.1 Relationship Between Transport and Network Layers
A transport-layer protocol provides logical communication between processes running on different hosts, a network-layer protocol provides logical communication between hosts
3.1.2 Overview of the Transport Layer in the Internet
The Internet makes two transport-layer protocols available to the application layer: UDP (User Datagram Protocol) and TCP (Transmission Control Protocol)
When designing an application, the developer must specify one of these two protocols
IP is a best-effort delivery service
The responsibility of UDP and TCP is to extend IP's delivery service between two end systems to a delivery service between two processes running on the end systems
Extending host-to-host delivery to process-to-process delivery is called transport layer multiplexing and demultiplexing
TCP provides reliable data transfer using flow control, sequence numbers, acknowledgments, and timers. Also provides congestion control for the general good of the internet
3.2 Multiplexing and Demultiplexing
Gathering data chunks at the source host from different sockets, encapsulating each data chunk with header information (that will later be used in demultiplexing) to create segments, and passing the segments to the network layer is called multiplexing
Job of delivering data in a transport layer segment to the correct socket is called demultiplexing
Transport layer multiplexing requires
That sockets have unique identifiers
That each segment have special fields that indicate the socket to which the segment is to be delivered
These special fields are the source port number field and the destination port number field
Each port number is a 16-bit number (range from 0 to 65535)
The port numbers ranging from 0 to 1023 are called well-known port numbers and are restricted, meaning they are reserved for use by well-known application protocols such as HTTP (port 80) and FTP (port 21)
Connectionless (UDP) Multiplexing and Demultiplexing
When a UDP socket is created, the transport layer automatically assigns a port number in the range 1024 to 65535
If we use the bind() method, we can specify our own port number
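A short sketch of the difference; port 12000 is an arbitrary choice:

```python
import socket

# Without bind(): the transport layer picks an ephemeral port (1024-65535)
s1 = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s1.sendto(b"hello", ("127.0.0.1", 9999))   # first send triggers the assignment
print("OS-assigned source port:", s1.getsockname()[1])

# With bind(): we choose the port ourselves, as a server with a known port must
s2 = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s2.bind(("", 12000))                       # hypothetical app-chosen port
print("bound source port:", s2.getsockname()[1])
```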
Connection-Oriented (TCP) Multiplexing and Demultiplexing
A TCP socket is identified by a four-tuple as opposed to UDP's two-tuple: (source IP, source port, destination IP, destination port)
When a TCP segment arrives from the network to a host, the host uses all four values to direct (demultiplex) the segment to the appropriate socket
3.3 Connectionless Transport: UDP
UDP takes messages from the application process, attaches source and destination port number fields for the multiplexing/demultiplexing service, adds two other small fields, and passes the resulting segment to the network layer
The network layer encapsulates the transport layer segment into an IP datagram then makes a best-effort attempt to deliver the segment to the receiving host
If the segment arrives at the receiving host, UDP uses the destination port number to deliver the segment's data to the correct application process
Note that there is no handshaking in UDP; it is therefore connectionless
DNS is an application-layer protocol that typically uses UDP
When is UDP preferable?
We want finer application-level control over what data is sent, and when. Under UDP, as soon as an application process passes data to UDP, UDP will package the data inside a UDP segment and immediately pass the segment to the network layer. TCP, on the other hand, has a congestion-control mechanism that throttles the transport-layer TCP sender when one or more links between the source and destination hosts become excessively congested.
We want no connection establishment, since the handshake would introduce delay before data could flow. This is why DNS runs over UDP rather than TCP; it would be much slower otherwise
We want no connection state. A server devoted to a particular app can typically support many more active clients when the application runs over UDP
We want smaller packet overhead. The TCP header adds 20 bytes to every segment, while UDP adds only 8
3.3.2 UDP Segment Structure
The UDP header has four fields of two bytes each: source port, destination port, length, and checksum
3.3.3 UDP Checksum
Checksum is used to determine whether bits within a UDP segment have been altered as the segment moved from source to destination
UDP at the sender side performs the 1s complement of the sum of all the 16-bit words in the segment
Given that neither link-by-link reliability nor in-memory error detection is guaranteed, UDP must provide error detection at the transport layer, on an end-end basis, if the end-end data transfer service is to provide error detection
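A sketch of the 16-bit 1s-complement arithmetic; real UDP also sums a pseudoheader with the IP addresses, omitted here:

```python
def udp_style_checksum(data: bytes) -> int:
    """1s complement of the 1s-complement sum of all 16-bit words."""
    if len(data) % 2:                # pad odd-length data with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # add the next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)  # wrap the carry around
    return ~total & 0xFFFF                        # 1s complement of the sum

seg = bytes.fromhex("0662066300120000") + b"hello"  # toy header + payload
print(hex(udp_style_checksum(seg)))
# Receiver check: the 1s-complement sum of the same words plus the
# checksum comes out to 0xFFFF when no bit errors were introduced
```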
3.4 Principles of Reliable Data Transfer
TCP is a reliable data transfer protocol that is implemented on top of an unreliable (IP) end-to-end network layer
3.4.1 Building a Reliable Data Transfer Protocol
Reliable Data Transfer over a Perfectly Reliable Channel: rdt1.0
The finite state machine (FSM) definitions for the rdt1.0 sender and receiver assume the underlying channel is completely reliable
Reliable Data Transfer over a Channel with Bit Errors: rdt2.0
A more realistic model of the underlying channel is one in which bits in a packet may be corrupted
The message-dictation protocol uses both positive acknowledgements and negative acknowledgements
Reliable data transfer protocols that retransmit on error are called ARQ (Automatic Repeat reQuest) protocols
Three qualities required of an ARQ protocol to handle bit errors:
Error detection. The receiver needs a way to detect bit errors; this requires sending extra bits (beyond the original data) from the sender to the receiver, gathered into a checksum field
Receiver feedback. Since the sender and receiver execute on different end systems (typically), the only way for the sender to learn of the receiver's worldview is for the receiver to provide explicit feedback to the sender. The positive (ACK) and negative (NAK) acknowledgement replies in the message-dictation scenario are examples of such feedback
Retransmission. A packet that is received in error at the receiver will be retransmitted by the sender
Sender will not send a new piece of data until it is sure that the receiver has received the current packet. Such protocols are called stop-and-wait protocols
We need to add a new field to the data packet and have the sender number its data packets by putting a sequence number into this field. The receiver then need only check this sequence number to determine whether or not the received packet is a retransmission
Reliable Data Transfer over a Lossy Channel with Bit Errors: rdt3.0
Suppose now that the underlying channel can lose packets as well as corrupt bits
How do we detect packet loss, and how do we combat it?
If sender waits long enough that it is certain that a packet is lost, it can simply retransmit the packet
How long is long enough?
The sender must wait at least as long as RTT between the sender and receiver plus whatever time is needed to process a packet at the receiver. Moreover, the protocol should ideally recover from packet loss as soon as possible; waiting for a worst-case delay could mean a long wait until error recovery is initiated
The adopted approach, therefore, is for the sender to judiciously choose a wait time such that packet loss is likely (though not guaranteed) to have happened. If an ACK is not received in this time, the packet is retransmitted
This introduces the possibility of duplicate data packets
Implementing a time-based retransmission mechanism requires a countdown timer that can interrupt the sender. The sender will thus need to be able to:
Start the timer each time a packet is sent
Respond to a timer interrupt
Stop the timer
Because packet sequence numbers alternate between 0 and 1, protocol rdt3.0 is sometimes known as the alternating-bit protocol
3.4.2 Pipelined Reliable Data Transfer Protocols
Alternating-bit protocol is slow; it is unlikely that anyone would be happy with its performance today
Rather than operating in a stop-and-wait manner, we should allow the sender to send multiple packets without waiting for ACKs
This is called pipelining
Pipelining has the following consequences:
The range of sequence numbers must be increased, since each in-transit packet must have a unique sequence number and there may be multiple, in-transit, unacknowledged packets
The sender and receiver sides of the protocols may have to buffer more than one packet
The range of sequence numbers needed and the buffering requirements will depend on the manner in which a data transfer protocol responds to lost, corrupted, and overly delayed packets. Two approaches toward pipelined error recovery can be identified: Go-Back-N and selective repeat
3.4.3 Go-Back-N (GBN)
The sender is allowed to transmit multiple packets without waiting for an ACK, but is constrained to have no more than some max allowable number N of unacknowledged packets in the pipeline
The range of permissible sequence numbers for transmitted but not yet acknowledged packets can be viewed as a window of size N over the range of sequence numbers
As the protocol operates, this window slides forward over the sequence number space
Hence, N is often referred to as the window size, and GBN itself is called a sliding-window protocol
The GBN sender must respond to three types of events:
Invocation from above. When rdt_send() is called from above, the sender first checks to see if the window is full; that is, whether there are N outstanding unacknowledged packets. If not full, a packet is sent and variables are updated. If full, the sender returns the data back to the upper layer. The upper layer tries again later
Receipt of an ACK. An ACK for a packet with sequence number n will be taken to be a cumulative acknowledgement, indicating that all packets with a sequence number up to and including n have been correctly received
A timeout event. If a timeout occurs, the sender resends all packets that have not yet been acknowledged. If an ACK is received but there are still additional transmitted but not yet ACK-ed packets, the timer is restarted. If there are no outstanding unACK-ed packets, the timer is stopped.
GBN does not buffer out-of-order packets; it simply discards them
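A sketch of the three sender events in Python; make_pkt, udt_send, and the timer calls are stand-ins for the book's primitives, not a real implementation:

```python
def make_pkt(seq, data):   # stand-in: build a packet (would carry a checksum)
    return (seq, data)

def udt_send(pkt):         # stand-in: hand the packet to the unreliable channel
    print("send", pkt)

def start_timer(): pass    # stand-in timer controls
def stop_timer(): pass

class GBNSender:
    def __init__(self, N):
        self.N = N          # window size
        self.base = 0       # oldest unacknowledged sequence number
        self.nextseq = 0    # next sequence number to use
        self.sent = {}      # copies kept for possible retransmission

    def rdt_send(self, data):       # event 1: invocation from above
        if self.nextseq >= self.base + self.N:
            return False            # window full; upper layer retries later
        pkt = self.sent[self.nextseq] = make_pkt(self.nextseq, data)
        udt_send(pkt)
        if self.base == self.nextseq:
            start_timer()
        self.nextseq += 1
        return True

    def on_ack(self, n):            # event 2: cumulative ACK for packet n
        self.base = n + 1
        if self.base == self.nextseq:
            stop_timer()            # no outstanding packets remain
        else:
            start_timer()           # restart for the still-unACKed packets

    def on_timeout(self):           # event 3: resend every unACKed packet
        start_timer()
        for seq in range(self.base, self.nextseq):
            udt_send(self.sent[seq])
```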
3.4.4 Selective Repeat (SR)
A disadvantage of GBN: when the window size and bandwidth-delay product are both large, many packets can be in the pipeline. A single packet error can thus cause GBN to retransmit a large number of packets, many unnecessarily. As the probability of channel errors increases, the pipeline can become filled with these unnecessary retransmissions
Selective repeat avoids unnecessary retransmission by having the sender retransmit only those packets that it suspects were lost or corrupted at the receiver
Requires that the receiver individually acknowledge correctly received packets
Again, a window size of N limits the number of outstanding, unacknowledged packets in the pipeline
Unlike GBN, the sender will have already received ACKs for some of the packets in the window
SR receiver will ACK a correctly received packet regardless of whether in order; out-of-order packets are buffered until any missing packets are received, at which point a batch of packets can be delivered in order to the upper layer
Window size must be less than or equal to half the size of the sequence number space for SR protocols.
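To see why, a sketch: with sequence numbers 0-3 and window size 3, the receiver correctly receives and ACKs packets 0, 1, and 2, advancing its window to {3, 0, 1}; if all three ACKs are lost, the sender times out and retransmits packet 0, and the receiver cannot distinguish this old copy from a brand-new packet 0 that now falls inside its window. A window size of at most 2 (half of the four sequence numbers) removes the ambiguity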
3.5 Connection-Oriented Transport: TCP
3.5.1 The TCP Connection
A TCP connection provides a full-duplex service: If there is a TCP connection between Process A on one host and Process B on another host, then application layer data can flow from Process A to Process B at the same time as application layer data flows from Process B to Process A
TCP connection is always point-to-point, between a single sender and a single receiver
Three-way handshake: the client first sends a special TCP segment; the server responds with a second special TCP segment; and finally the client responds again with a third special segment
First two segments carry no payload, the third may carry a payload
Once data passes through the client socket, TCP directs the data to the connection's send buffer, which is one of the buffers set aside by the initial three-way handshake
The maximum amount of data that can be grabbed and placed in a segment is limited by the maximum segment size (MSS)
MSS usually set by determining the length of the largest link-layer frame that can be sent by the local sending host (the Maximum Transmission Unit, MTU) and setting the MSS to ensure that a TCP segment will fit into a single link-layer frame
TCP connection consists of buffers, variables, and a socket connection to a process in one host, and another set of buffers, variables, and a socket connection to a process in another host. No buffers or variables are allocated to the connection in the network elements (routers, switches, and repeaters) between the hosts.
3.5.2 The TCP Segment Structure
TCP segment consists of header fields and a data field
Data field contains chunk of app data
MSS limits the size of a segment's data field
When TCP sends a large file, such as an image, it typically breaks the file into chunks of size MSS
As with UDP, the header contains source and destination port numbers, which are used for multiplexing/demultiplexing data from/to upper-layer apps
Also, as with UDP, header contains a checksum field
TCP segment header also contains:
32-bit sequence number field and 32-bit ACK number field
16-bit receive window for flow control; used to indicate the number of bytes that a receiver is willing to accept
4-bit header length field specifies the length of the TCP header in 32-bit words
Options field is used when a sender and receiver negotiate the MSS, or for a window scaling factor for use in high-speed networks
The flag field contains 6 bits:
ACK bit indicates whether the value carried in the ACK field is valid
RST
SYN
FIN: along with above two, used for connection setup and teardown
PSH bit indicates that the receiver should pass the data to the upper layer immediately
URG bit indicates that there is data in this segment that the sending-side upper-layer has marked urgent. The location of the last byte of this urgent data is indicated by the 16-bit urgent data pointer field.
Sequence Numbers and ACK Numbers
The sequence number for a segment is the byte-stream number of the first byte in the segment
Because TCP only acknowledges bytes up to the first missing byte in the stream, TCP is said to provide cumulative acknowledgments
3.5.3 Round-Trip Time Estimation and Timeout
TCP uses a timeout/retransmit mechanism to recover from lost segments
Estimating the Round-Trip Time
SampleRTT for a segment is the amount of time between when the segment is sent and when an ACK is received
Instead of measuring a SampleRTT for every transmitted segment, most TCP implementations take only one SampleRTT measurement at a time. That is, at any point in time, the SampleRTT is being estimated for only one of the transmitted but currently unacknowledged segments, leading to a new value of SampleRTT approximately once every RTT
Also, TCP never computes a SampleRTT for a segment that has been retransmitted; it only measures SampleRTT for segments that have been transmitted once
EstimatedRTT: a weighted average of the SampleRTT values. Upon obtaining a new SampleRTT, TCP updates EstimatedRTT according to the following formula: EstimatedRTT = (1 – α) • EstimatedRTT + α • SampleRTT, where the recommended value of α is 0.125
It is also valuable to have a measure of the variability of RTT. DevRTT is an estimate of how much SampleRTT deviates from EstimatedRTT: DevRTT = (1 – β) • DevRTT + β • | SampleRTT – EstimatedRTT |
Where β = 0.25
Setting and Managing the Retransmission Timeout Interval
What value should be used for TCP's timeout interval?
TimeoutInterval = EstimatedRTT + 4 • DevRTT
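A sketch of the estimator with the book-recommended gains and a few made-up samples:

```python
# EWMA estimates of the RTT and its deviation (alpha, beta from the book)
alpha, beta = 0.125, 0.25
estimated_rtt, dev_rtt = None, 0.0

def on_sample(sample_rtt):
    """Fold one SampleRTT into EstimatedRTT/DevRTT; return TimeoutInterval."""
    global estimated_rtt, dev_rtt
    if estimated_rtt is None:
        estimated_rtt = sample_rtt        # first sample seeds the average
    else:
        dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - estimated_rtt)
        estimated_rtt = (1 - alpha) * estimated_rtt + alpha * sample_rtt
    return estimated_rtt + 4 * dev_rtt    # TimeoutInterval

for s in (0.100, 0.120, 0.090, 0.300):    # hypothetical samples, in seconds
    print(f"TimeoutInterval = {on_sample(s) * 1000:6.1f} ms")
```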
3.5.4 Reliable Data Transfer
TCP creates a reliable data transfer service on top of IP's unreliable best-effort service
Doubling the Timeout Interval
A modification to TCP in which each time TCP retransmits, it sets the next timeout interval to twice the previous value, rather than deriving it from the last EstimatedRTT and DevRTT
The timer expiration is most likely caused by congestion in the network, that is, too many packets arriving at one (or more) router queues in the path between the source and destination, causing packets to be dropped and/or long queuing delays. In times of congestion, if the sources continue to retransmit packets persistently, the congestion may get worse. Instead, TCP acts more politely, with each sender retransmitting after longer and longer intervals.
Fast Retransmit
One of the problems with timeout-triggered retransmissions is that the timeout period can be relatively long
The sender can often detect packet loss well before the timeout event occurs by noting so-called duplicate ACKs. A duplicate ACK is an ACK that reacknowledges a segment for which the sender has already received an earlier acknowledgment
Since TCP does not use negative acknowledgments, the receiver cannot send an explicit negative acknowledgment back to the sender. Instead, it simply reacknowledges (that is, generates a duplicate ACK for) the last in-order byte of data it has received
3.5.5 Flow Control
TCP provides flow-control service to its applications to eliminate the possibility of the sender overflowing the receiver's buffer
TCP provides flow control by having the sender maintain a variable called the receive window
Define the following variables:
LastByteRead: the number of the last byte in the data stream read from the buffer by the application process in B
LastByteRcvd: the number of the last byte in the data stream that has arrived from the network and has been placed in the receive buffer at B
LastByteRcvd – LastByteRead ≤ RcvBuffer
Receive window is set to the amount of spare room in the buffer: rwnd = RcvBuffer – [LastByteRcvd – LastByteRead]
3.5.6 TCP Connection Management
The client-side TCP first sends a special TCP segment to the server. The special segment contains no app-layer data. One of the flag bits in the header, the SYN bit, is set to 1. This special segment is referred to as a SYN segment. In addition, the client randomly chooses an initial sequence number (client_isn) and puts this number in the sequence number field of the initial TCP SYN segment. This segment is encapsulated within an IP datagram and sent to the server
Once the IP datagram containing the TCP SYN segment arrives at the server host, the server extracts the TCP SYN segment from the datagram, allocates the TCP buffers and variables to the connection, and sends a connection-granted segment to the client TCP. This connection-granted segment also contains no application-layer data. However, it does contain three important pieces of information in the segment header. First, the SYN bit is set to 1. Second, the acknowledgment field of the TCP segment header is set to client_isn+1. Finally, the server chooses its own initial sequence number (server_isn) and puts this value in the sequence number field of the TCP segment header. This connection-granted segment is saying, in effect, “I received your SYN packet to start a connection with your initial sequence number, client_isn. I agree to establish this connection. My own initial sequence number is server_isn.” The connection granted segment is referred to as a SYNACK segment.
Upon receiving the SYNACK segment, the client also allocates buffers and variables to the connection. The client host then sends the server yet another segment; this last segment acknowledges the server’s connection-granted segment. The SYN bit is set to zero, since the connection is established. This third stage of the three-way handshake may carry client-to-server data in the segment payload.
3.6 Principles of Congestion Control
3.6.2 Approaches to Congestion Control
End-to-end congestion control approach: network layer provides no explicit support to the transport layer for congestion control purposes. Even the presence of congestion in the network must be inferred by the end systems based only on observed network behavior. TCP segment loss (as indicated by a timeout or a triple duplicate acknowledgment) is taken as an indication of network congestion and TCP decreases its window size accordingly
Network-assisted congestion control approach: network-layer components (that is, routers) provide explicit feedback to the sender regarding the congestion state in the network. This feedback may be as simple as a single bit indicating congestion at a link
3.7 TCP Congestion Control
The TCP congestion-control mechanism operating at the sender keeps track of an additional variable, the congestion window. The congestion window, denoted cwnd, imposes a constraint on the rate at which a TCP sender can send traffic into the network
The amount of unACKed data at a sender may not exceed the minimum of cwnd and rwnd: LastByteSent – LastByteAcked ≤ min{cwnd, rwnd}
The sender’s send rate is roughly cwnd/RTT bytes/sec. By adjusting the value of cwnd, the sender can therefore adjust the rate at which it sends data into its connection
Because TCP uses acknowledgments to trigger (or clock) its increase in congestion window size, TCP is said to be self-clocking
TCP Congestion Control operates under the following principles:
A lost segment implies congestion, and hence, the TCP sender's rate should be decreased when a segment is lost
An acknowledged segment indicates that the network is delivering the sender's segments to the receiver, and hence, the sender's rate can be increased when an ACK arrives for a previously unacknowledged segment.
Bandwidth probing. Given ACKs indicating a congestion-free source-to-destination path and loss events indicating a congested path, TCP's strategy for adjusting its transmission rate is to increase its rate in response to arriving ACKs until a loss event occurs, at which point the transmission rate is decreased. The TCP sender thus increases its transmission rate to probe for the rate at which congestion onset begins, backs off from that rate, and then begins probing again to see if the congestion onset rate has changed
When TCP connection begins, the value of cwnd is typically initialized to a small value of 1 MSS
In the slow-start state, the value of cwnd begins at 1 MSS and increases by 1 MSS every time a transmitted segment is first acknowledged
Congestion Avoidance
Initially, cwnd = i (in units of MSS)
After 1 RTT, cwnd = i+1
After 2 RTTs, cwnd = i+2
After 3 RTTs, cwnd = i+3
On entry to the congestion-avoidance state, the value of cwnd is approximately half its value when congestion was last encountered. Congestion could be just around the corner. Thus, rather than doubling the value of cwnd every RTT, TCP adopts a more conservative approach and increases the value of cwnd by just a single MSS every RTT. This is often accomplished by increasing cwnd by MSS·(MSS/cwnd) bytes whenever a new acknowledgment arrives.
Fast Recovery
if retransmission due to timeout
ssthresh = half current window size
set cwnd = 1
start with slow start phase again
if retransmission due to three duplicate ACKs
ssthresh = half current window size
cwnd = ssthresh
start with congestion avoidance phase
The value of cwnd is increased by 1 MSS for every duplicate ACK received for the missing segment that caused TCP to enter the fast-recovery state. Eventually, when an ACK arrives for the missing segment, TCP enters the congestion-avoidance state after deflating cwnd. If a timeout event occurs, fast recovery transitions to the slow-start state after performing the same actions as in slow start and congestion avoidance: The value of cwnd is set to 1 MSS, and the value of ssthresh is set to half the value of cwnd when the loss event occurred.
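One simplified reading of these rules as code (cwnd and ssthresh in units of MSS; a sketch, not the full FSM):

```python
cwnd, ssthresh, state = 1.0, 64.0, "slow_start"

def on_new_ack():                   # a previously unACKed segment is ACKed
    global cwnd, state
    if state == "slow_start":
        cwnd += 1                   # +1 MSS per ACK doubles cwnd each RTT
        if cwnd >= ssthresh:
            state = "congestion_avoidance"
    elif state == "congestion_avoidance":
        cwnd += 1.0 / cwnd          # roughly +1 MSS per RTT in total
    elif state == "fast_recovery":
        cwnd = ssthresh             # deflate cwnd; recovery is over
        state = "congestion_avoidance"

def on_dup_ack():                   # duplicate ACK during fast recovery
    global cwnd
    if state == "fast_recovery":
        cwnd += 1                   # inflate per duplicate ACK

def on_triple_dup_ack():            # loss inferred from 3 duplicate ACKs
    global cwnd, ssthresh, state
    ssthresh = cwnd / 2
    cwnd = ssthresh                 # per the notes above (Reno adds 3 MSS)
    state = "fast_recovery"

def on_timeout():                   # loss inferred from a timeout
    global cwnd, ssthresh, state
    ssthresh = cwnd / 2
    cwnd = 1.0
    state = "slow_start"
```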
Macroscopic Description of TCP Throughput
Average throughput of a connection = (0.75 * W)/RTT where W is the window size when a loss event occurs
TCP Over High-Bandwidth Paths
Average throughput of a connection = (1.22 * MSS) / (RTT * sqrt(L)) where L is the loss rate
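Plugging hypothetical numbers into the two formulas:

```python
from math import sqrt

MSS = 1500 * 8     # bits, hypothetical
RTT = 0.100        # seconds, hypothetical
W = 100 * MSS      # window size (in bits) when a loss event occurs

print(f"macroscopic: {0.75 * W / RTT / 1e6:.1f} Mbps")   # 9.0 Mbps

L_loss = 1e-5      # loss rate, hypothetical
print(f"high-bandwidth: {1.22 * MSS / (RTT * sqrt(L_loss)) / 1e6:.1f} Mbps")
```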