TCP/IP
TCP/IP provides a stream-based, connection-oriented abstraction. It must
therefore be reliable in the presence of "network problems" which do not
completely disrupt communications. Network problems which cannot be masked
include a downed host or a partitioned network.
In addition, TCP/IP performs congestion control. Here, the problem is
twofold:
- If the network fills up, a router can just throw away packets.
- If a router fills up, it cannot receive any packets.
Reliability
In order to be reliable, we need to model the types of failures that can
occur. In IP these are:
- lost packets
- duplicated packets
- arbitrarily delayed packets
- packets received out of order
- corrupted packets
Corrupted packets
Let's consider the corrupted packets issue first. Assume that corrupted
packets occur as the result of randomly occurring failures. Then we can
extend each packet with a checksum, so that random errors can be detected.
(The choice of checksum in turn depends on modeling the way that corruption
can occur, but that is coding theory.)
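As an illustration, here is a minimal sketch of one well-known checksum, the 16-bit ones' complement sum used in the IP and TCP headers. The function names `internet_checksum` and `is_intact` are made up for these notes, and a real implementation also covers header fields, which this sketch ignores.

```python
def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of the 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def is_intact(payload: bytes, checksum: int) -> bool:
    """True if the payload still matches the checksum sent with it."""
    return internet_checksum(payload) == checksum


payload = b"hello world"
sent_checksum = internet_checksum(payload)
print(is_intact(payload, sent_checksum))         # undamaged packet
print(is_intact(b"hellp world", sent_checksum))  # one byte changed in transit
```

A sum like this catches typical random bit errors but is weak by design: for example, swapping two 16-bit words leaves the sum unchanged.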
If, on the other hand, packets could be systematically corrupted by an
adversary, then cryptographic techniques would be required to recognize
whether packets come from a known source.
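One such cryptographic technique is a message authentication code. The sketch below uses HMAC-SHA256 from the Python standard library and assumes the two hosts already share a secret key; the helpers `seal` and `open_packet` are hypothetical names for these notes, and plain TCP does not do this itself.

```python
import hashlib
import hmac

TAG_LEN = 32  # size of a SHA-256 digest in bytes


def seal(key: bytes, payload: bytes) -> bytes:
    """Append an authentication tag computed with the shared key."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def open_packet(key: bytes, packet: bytes):
    """Return the payload if the tag verifies, otherwise None."""
    if len(packet) < TAG_LEN:
        return None
    payload, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking where the tags first differ
    return payload if hmac.compare_digest(tag, expected) else None


key = b"shared secret (assumed to be agreed in advance)"
packet = seal(key, b"data")
print(open_packet(key, packet))               # genuine packet
print(open_packet(key, b"dbta" + packet[4:])) # payload altered in transit
```

Unlike a checksum, the tag cannot be recomputed by someone who changes the payload without knowing the key.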
Once a corrupted packet is found, it can be thrown away, that is, treated
as a lost packet.
Ensuring packets arrive, and are in sequence
The primary technique is to include in each packet a serial number, so
that the ith packet from C1 to C2 has serial number i. (Packets sent in
the other direction, C2 to C1, are separately numbered.)
The recipient initializes a counter to 0. If a message arrives whose serial
number is equal to the counter, the recipient sends an ACK packet with the
counter, and then increments the counter. If a message arrives whose serial
number is less than the counter, it is thrown away (it is a duplicate). If
its serial number is greater than the counter, NACKs (negative
acknowledgements) are sent for the missing packets, from the counter through
the serial number of the arriving packet, and the arriving packet is thrown
away.
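The recipient's rules can be sketched as a small state machine. The class `Receiver` and the tuple format of the replies are assumptions made for illustration. The NACK range here starts at the counter itself, on the reading that the packet the recipient is waiting for is also missing.

```python
class Receiver:
    """Accepts packets strictly in order and answers with ACKs or NACKs."""

    def __init__(self):
        self.counter = 0      # serial number expected next
        self.delivered = []   # payloads handed to the application, in order

    def on_packet(self, serial, payload):
        """Return the list of control messages to send back to the sender."""
        if serial == self.counter:
            self.delivered.append(payload)
            reply = [("ACK", self.counter)]
            self.counter += 1
            return reply
        if serial < self.counter:
            return []  # duplicate: throw it away
        # Gap: ask for everything still missing and discard the arriving packet.
        return [("NACK", n) for n in range(self.counter, serial + 1)]


r = Receiver()
print(r.on_packet(0, "a"))  # in order
print(r.on_packet(0, "a"))  # duplicate
print(r.on_packet(3, "d"))  # packets 1 and 2 have not arrived yet
```

Throwing away the out-of-order packet keeps the recipient simple (one counter, no buffer) at the cost of sending that packet again later.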
The ACKs/NACKs can be lost as well. Hence the sender sets a timer, by
which it expects an ACK. If the ACK does not arrive by the timeout, the
packet is resent. This also protects against the last message in a session
being lost, which would not be NACKed.
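The sender side of this can be sketched with a table of unacknowledged packets and their deadlines. The class `Sender`, the logical clock passed in as `now`, and the fixed timeout are all simplifying assumptions; real TCP estimates its timeout from measured round-trip times.

```python
class Sender:
    """Keeps each packet until it is ACKed; resends it when its timer expires."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.next_serial = 0
        self.unacked = {}  # serial -> (payload, deadline)

    def send(self, payload, now):
        serial = self.next_serial
        self.next_serial += 1
        self.unacked[serial] = (payload, now + self.timeout)
        return (serial, payload)

    def on_ack(self, serial):
        self.unacked.pop(serial, None)  # a repeated ACK is harmless

    def on_tick(self, now):
        """Return the packets whose ACK is overdue and restart their timers."""
        resend = []
        for serial, (payload, deadline) in sorted(self.unacked.items()):
            if now >= deadline:
                self.unacked[serial] = (payload, now + self.timeout)
                resend.append((serial, payload))
        return resend


s = Sender(timeout=5)
s.send("a", now=0)
s.send("b", now=1)   # suppose this is the last packet and it is lost
s.on_ack(0)
print(s.on_tick(4))  # nothing overdue yet
print(s.on_tick(6))  # packet 1 timed out and is sent again
```

Because a lost final packet never triggers a NACK, the timer is the only mechanism in this sketch that recovers it.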
Optimizations
Acknowledge groups of messages instead of individual messages. This works
best if the probability of message loss is low. If it is high, then
individual messages should be acknowledged.
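One way to acknowledge groups is a cumulative ACK: a single ACK carrying serial number n stands for every packet up to and including n. The sketch below assumes a fixed group size and in-order arrival; `GroupAckReceiver` and `apply_cumulative_ack` are hypothetical names, and handling of gaps is left out.

```python
class GroupAckReceiver:
    """Sends one cumulative ACK per `group_size` in-order packets."""

    def __init__(self, group_size):
        self.group_size = group_size
        self.counter = 0  # serial number expected next

    def on_packet(self, serial):
        """Return the serial number to ACK, or None if no ACK is due yet."""
        if serial != self.counter:
            return None  # duplicates and gaps are not handled in this sketch
        self.counter += 1
        if self.counter % self.group_size == 0:
            return serial  # acknowledges every serial number <= this one
        return None


def apply_cumulative_ack(unacked, ack):
    """Sender side: forget every packet covered by the cumulative ACK."""
    return {s for s in unacked if s > ack}


r = GroupAckReceiver(group_size=3)
print([r.on_packet(i) for i in range(7)])
print(apply_cumulative_ack({0, 1, 2, 3, 4}, ack=2))
```

With a group size of 3 the recipient sends a third as many ACKs, but one lost ACK now leaves three packets unconfirmed, which is why this pays off mainly when loss is rare.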
Send many messages before waiting for an acknowledgement. This enables
toleration of long latencies (e.g. satellite links), but is slower to detect
errors.
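A rough calculation shows why this matters on a long-latency link. The model below is a simplification that ignores losses and ACK transmission time: with `window` packets allowed in flight, the sender can push at most `window` packets per cycle of one round trip plus one packet transmission time. The function name and the link figures are illustrative assumptions, not measurements.

```python
def throughput_bps(window, packet_bits, rtt_s, link_bps):
    """Upper bound on throughput with `window` unacknowledged packets allowed."""
    send_time = packet_bits / link_bps  # time to put one packet on the wire
    cycle = rtt_s + send_time           # time until the first ACK comes back
    return min(link_bps, window * packet_bits / cycle)


# Assumed figures: 8000-bit packets, 0.5 s round trip, 1 Mbit/s link.
for window in (1, 8, 64):
    rate = throughput_bps(window, 8000, 0.5, 1_000_000)
    print(f"window={window:2d}: {rate / 1000:.1f} kbit/s")
```

Under these assumptions, waiting for each ACK (window of 1) uses under 2% of the link, while a window of 64 packets is enough to keep it full.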