TCP/IP

TCP/IP provides a stream-based, connection-oriented abstraction. It must therefore be reliable in the presence of "network problems" that do not completely disrupt communications. Network problems that cannot be masked include a downed host or a partitioned network.

In addition, TCP/IP performs congestion control. Here, the problem is twofold:

Reliability

In order to be reliable, we need to model the types of failures that can occur. In IP these are:

Corrupted packets

Let's consider the corrupted packets issue first. Assume that corrupted packets occur as the result of randomly occurring failures. Then we can extend each packet with a checksum, so that random errors can be detected. (The choice of checksum, in turn, depends upon modeling the way that corruption can occur, but that is coding theory.)
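
The checksum idea can be sketched as follows. This is a minimal illustration, not how TCP itself does it: TCP uses a 16-bit ones' complement checksum, while this sketch uses CRC-32 from Python's zlib module for brevity. The function names and the 4-byte trailer layout are assumptions made for the example.

```python
import zlib

def add_checksum(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload (trailer layout is an assumption)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def verify_checksum(packet: bytes):
    """Return the payload if the checksum matches, else None (treat as lost)."""
    if len(packet) < 4:
        return None
    payload, trailer = packet[:-4], packet[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != trailer:
        return None
    return payload

packet = add_checksum(b"hello")
# Simulate a random one-bit error in the first byte.
corrupted = bytes([packet[0] ^ 0x01]) + packet[1:]
```

Here verify_checksum(packet) returns the original payload, while verify_checksum(corrupted) returns None, so the receiver can discard the damaged packet.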

If, on the other hand, packets could be systematically corrupted by an adversary, then cryptographic techniques would be required to recognize whether packets come from a known source.
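
One such cryptographic technique is a message authentication code computed with a key shared by the two endpoints. The sketch below uses HMAC-SHA256 from Python's standard library. The names seal and open_sealed, the tag placement, and the existence of a pre-shared key are all assumptions for illustration; none of this is part of TCP itself.

```python
import hashlib
import hmac

TAG_LEN = 32  # SHA-256 digest size in bytes

def seal(key: bytes, payload: bytes) -> bytes:
    """Append an HMAC-SHA256 tag so the receiver can check the source."""
    return payload + hmac.new(key, payload, hashlib.sha256).digest()

def open_sealed(key: bytes, packet: bytes):
    """Return the payload if the tag verifies under key, else None."""
    if len(packet) < TAG_LEN:
        return None
    payload, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing.
    return payload if hmac.compare_digest(tag, expected) else None
```

Unlike a plain checksum, an adversary who alters the payload cannot recompute a valid tag without knowing the key.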

Once a corrupted packet is found, it can be thrown away, that is, treated as a lost packet.

Ensuring packets arrive, and are in sequence

The primary technique is to include in each packet a serial number, so that the i-th packet from C1 to C2 has serial number i. (Packets sent in the other direction, C2 to C1, are separately numbered.)

The recipient initializes a counter to 0. If a message arrives whose serial number equals the counter, the recipient sends an ACK packet carrying the counter, and then increments the counter. If a message arrives whose serial number is less than the counter, it is thrown away (it is a duplicate). If its serial number is greater than the counter, NACKs (negative acknowledgements) are sent for every serial number from the counter through that of the arriving packet, and the arriving packet is thrown away.
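
The recipient's rules can be sketched as a small state machine. This is an illustrative sketch, not TCP's actual receiver; the class and method names are hypothetical. On a gap it NACKs every serial number from the counter through the arriving packet's, on the assumption that the packet numbered by the counter is itself missing and the arriving packet, being discarded, must be resent too.

```python
class Receiver:
    """Hypothetical in-order receiver that answers with ACKs and NACKs."""

    def __init__(self):
        self.expected = 0      # the counter: next serial number wanted
        self.delivered = []    # payloads accepted in order

    def on_packet(self, serial, data):
        """Handle one packet; return control messages as (kind, serial) pairs."""
        if serial == self.expected:
            self.delivered.append(data)
            self.expected += 1
            return [("ACK", serial)]
        if serial < self.expected:
            return []          # duplicate: throw it away
        # Gap: request everything missing, and discard the arriving packet.
        return [("NACK", s) for s in range(self.expected, serial + 1)]
```

For example, after packet 0 is accepted, the arrival of packet 3 yields NACKs for 1, 2, and 3, and the counter stays at 1.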

The ACKs and NACKs can be lost as well. Hence the sender sets a timer for each packet, by whose expiry it expects an ACK. If the ACK does not arrive before the timeout, the packet is resent. This also protects against the last message in a session being lost, which would never be NACKed.
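
The timeout-and-resend loop can be sketched as below. To keep the example deterministic, the network and timer are replaced by a hypothetical transmit callback that returns True when an ACK arrived before the timeout; make_lossy_channel, max_tries, and both function names are assumptions for illustration only.

```python
def send_reliably(packet, transmit, max_tries=5):
    """Resend packet until transmit reports an ACK; return the attempt count."""
    for attempt in range(1, max_tries + 1):
        if transmit(packet):       # True means the ACK beat the timer
            return attempt
    raise TimeoutError("no ACK after %d tries" % max_tries)

def make_lossy_channel(drops):
    """Build a stand-in channel that loses the first `drops` exchanges."""
    remaining = drops
    def transmit(packet):
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            return False           # packet or its ACK was lost: timer expires
        return True
    return transmit
```

With a channel that loses the first two exchanges, send_reliably succeeds on the third attempt. The sender cannot tell whether the packet or the ACK was lost, which is why the recipient must be prepared to discard duplicates.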

Optimizations

Acknowledge groups of messages instead of individual messages. This works best if the probability of message loss is low. If it is high, then individual messages should be acknowledged.
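
One way to acknowledge a group is a cumulative acknowledgement: a single number meaning "everything up to here has arrived". The helper below is a hypothetical sketch of how a recipient might compute that number from the set of serial numbers it has seen; the function name and the use of -1 for "nothing yet" are assumptions.

```python
def cumulative_ack(received):
    """Return the largest n such that serials 0..n were all received, or -1."""
    n = -1
    while n + 1 in received:
        n += 1
    return n
```

For instance, if serial numbers 0, 1, 2, and 4 have arrived, a single acknowledgement for 2 covers three messages, and the gap at 3 holds it there until that message is resent.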

Sending many messages before waiting for an acknowledgement tolerates long latencies (e.g., satellite links), but errors are detected more slowly.
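
A rough calculation shows why this matters on a high-latency link. If the sender may have a fixed number of messages outstanding per round trip, and the link's bandwidth is not the limiting factor (an assumption), throughput is roughly the window size times the message size divided by the round-trip time. The function name and the example figures below are illustrative, not measurements.

```python
def throughput_bytes_per_sec(window, packet_bytes, rtt_seconds):
    """Approximate throughput when `window` packets are sent per round trip."""
    return window * packet_bytes / rtt_seconds

# Assumed figures: 1000-byte packets, 0.5 s round trip.
one_at_a_time = throughput_bytes_per_sec(1, 1000, 0.5)    # 2000.0
fifty_in_flight = throughput_bytes_per_sec(50, 1000, 0.5)  # 100000.0
```

Under these assumed figures, waiting for each acknowledgement limits the sender to 2000 bytes per second, while keeping 50 messages in flight raises that to 100000 bytes per second. The cost is that a lost message is noticed only after more data has already been sent.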