One component of TCP/IP is the Internet Protocol (IP) which is responsible for ensuring that data is transferred between two addresses without being corrupted.
For manageability, the data is usually split into multiple pieces or packets each with its own error detection bytes in the control section or header of the packet. The remote computer then receives the packets and reassembles the data and checks for errors. It then passes the data to the program that expects to receive it.
How does the computer know what program needs the data? Each IP packet also contains a piece of information in its header called the type field. This informs the computer receiving the data about the type of layer 4 transportation mechanism being used.
The two most popular transportation mechanisms used on the Internet are Transmission Control Protocol (TCP) and User Datagram Protocol (UDP).
When the type of transport protocol has been determined, the TCP/UDP header is then inspected for the "port" value, which is used to determine which network application on the computer should process the data. This is explained in more detail later.
TCP Is a Connection-Oriented Protocol
TCP opens up a virtual connection between the client and server programs running on separate computers so that multiple and/or sporadic streams of data can be sent over an indefinite period of time between them. TCP keeps track of the packets sent by giving each one a sequence number with the remote server sending back acknowledgment packets confirming correct delivery. Programs that use TCP therefore have a means of detecting connection failures and requesting the retransmission of missing packets. TCP is a good example of a connection-oriented protocol.
How TCP Establishes A Connection
Any form of communication requires some form of acknowledgement for it to become meaningful. Someone knocks on the door to a house, the person inside asks "Who is it?", to which the visitor replies, "It's me!" Then the door opens. Both persons knew who was on the other side of the door before it opened and now a conversation can now begin.
TCP acts in a similar way. The server initiating the connection sends a segment with the SYN bit set in TCP header. The target replies with a segment with the SYN and ACK bits set, to which the originating server replies with a segment with the ACK bit set. This SYN, SYN-ACK, ACK mechanism is often called the "three-way handshake".
The communication then continues with a series of segment exchanges, each with the ACK bit set. When one of the servers needs to end the communication, it sends a segment to the other with the FIN and ACK bits set, to which the other server also replies with a FIN-ACK segment also. The communication terminates with a final ACK from the server that wanted to end the session.
This is the equivalent of ending a conversation by saying "I really have to go now, I have to go for lunch", to which the reply is "I think I'm finished here too, see you tomorrow..." The conversation ends with a final "bye" from the hungry person.
Here is a modified packet trace obtained from the tethereal program. You can clearly see the three way handshake to connect and disconnect the session.
hostA -> hostB TCP 1443 > http [SYN] Seq=9766 Ack=0 Win=5840 Len=0
hostB -> hostA TCP http > 1443 [SYN, ACK] Seq=8404 Ack=9767 Win=5792 Len=0
hostA -> hostB TCP 1443 > http [ACK] Seq=9767 Ack=8405 Win=5840 Len=0
hostA -> hostB HTTP HEAD/HTTP/1.1
hostB -> hostA TCP http > 1443 [ACK] Seq=8405 Ack=9985 Win=54 Len=0
hostB -> hostA HTTP HTTP/1.1 200 OK
hostA -> hostB TCP 1443 > http [ACK] Seq=9985 Ack=8672 Win=6432 Len=0
hostB -> hostA TCP http > 1443 [FIN, ACK] Seq=8672 Ack=9985 Win=54 Len=0
hostA -> hostB TCP 1443 > http [FIN, ACK] Seq=9985 Ack=8673 Win=6432 Len=0
hostB -> hostA TCP http > 1443 [ACK] Seq=8673 Ack=9986 Win=54
In this trace, the sequence number represents the serial number of the first byte of data in the segment. So in the first line, a random value of 9766 was assigned to the first byte and all subsequent bytes for the connection from this host will be sequentially tracked. This makes the second byte in the segment number 9767, the third number 9768 etc. The acknowledgment number or Ack, not to be confused with the ACK bit, is the byte serial number of the next segment it expects to receive from the other end, and the total number of bytes cannot exceed the Win or window value that follows it. If data isn't received correctly, the receiver will re-send the requesting segment asking for the information to be sent again. The TCP code keeps track of all this along with the source and destination ports and IP addresses to ensure that each unique connection is serviced correctly.
UDP, TCP's "Connectionless" Cousin
UDP is a connectionless protocol. Data is sent on a "best effort" basis with the machine that sends the data having no means of verifying whether the data was correctly received by the remote machine. UDP is usually used for applications in which the data sent is not mission-critical. It is also used when data needs to be broadcast to all available servers on a locally attached network where the creation of dozens of TCP connections for a short burst of data is considered resource-hungry.
TCP and UDP Ports
The data portion of the IP packet contains a TCP or UDP segment sandwiched inside. Only the TCP segment header contains sequence information, but both the UDP and the TCP segment headers track the port being used. The source/destination port and the source/destination IP addresses of the client & server computers are then combined to uniquely identify each data flow.
Certain programs are assigned specific ports that are internationally recognized. For example, port 80 is reserved for HTTP Web traffic, and port 25 is reserved for SMTP e-mail. Ports below 1024 are reserved for privileged system functions, and those above 1024 are generally reserved for non-system third-party applications.
Usually when a connection is made from a client computer requesting data to the server that contains the data:
- The client selects a random previously unused "source" port greater than 1024 and queries the server on the "destination" port specific to the application. If it is an HTTP request, the client will use a source port of, say, 2049 and query the server on port 80 (HTTP)
- The server recognizes the port 80 request as an HTTP request and passes on the data to be handled by the Web server software. When the Web server software replies to the client, it tells the TCP application to respond back to port 2049 of the client using a source port of port 80.
- The client keeps track of all its requests to the server's IP address and will recognize that the reply on port 2049 isn't a request initiation for "NFS", but a response to the initial port 80 HTTP query.
No comments:
Post a Comment