What isn't considered at this layer is anything more than a single-hop away, i.e. no networking issues. All communication issues are point-to-point at this layer, and the stream of bits is sequential.
One potential point of confusion when studying concepts like error and flow control: these issues arise in multiple layers, depending on the application, the medium, the design of the stack, etc.
One possible approach is to use character stuffing, where each start/stop character that appears coincidentally in the data stream is replaced by 2 such characters. Then the receiver knows to remove the redundant characters and can distinguish them from the true start/stop characters. If you avoid being "character centric", then you can apply the same sort of idea to a bit stream, special bit flags, and bit-stuffing.
An illegal signal would then be a bit period where no transition takes place (i.e. high-high, or low-low)
Why not protect the data as well? Larger checksum would be needed and time to calculate would be longer, so only the header is protected. This means that cells won't be misdirected by corrupt headers, but they may have corrupt data. Higher layers must protect their data if so desired.
The checksum used corrects single bit errors and detects most multibit errors. The design was based on the assumption that the most common medium in use would be fiber, and hence have very low error rates. The probability of a cell having an undetected header error is about 10^-20 (at 155 Mbps that's 1 cell every 90,000 years).
Cells that have their header checksum inserted are ready to go. Whether they leave immediately or not depends on the physical medium. Synchronous medium require waiting for the start of a cell boundary, also the transmission of idle cells. Asynchronous medium allow for cells to be sent at any time.
In some cases the TC sublayer also has the responsibility of putting the ATM cells into the frames of the underlying medium. For example, ATM cells can be put into SONET frames for transmission.
The TC sublayer communicates control information over special OAM (operation and maintenance) cells.
The ATM cells don't have a structure like PPP or HDLC with special fields to mark the boundaries of the cells. How can the TC recognize the cells from the bit stream? The TC hunts for valid headers by sequentially searching the bit stream.
A valid header is one in which the header checksum agrees with the other 4 bytes of header info. A 5 byte shift register is used to search for valid headers. The checksum is computed on the 40 bits, and if it matches the HEC then it is possible that the cell boundaries are now known. The cell is discarded and the 40 bits which occur 48 bytes later (what should be the start of the next cell) are examined. Repeating this you can make the probability of finding the right cell boundaries arbitrarily high, at the cost of greater synchronization time.
<figure: state machine for cell synchronization
no_synch -> no_synch : shift, compute, no match
no_synch -> possible_synch: match
possible_synch -> possible_synch: shift 1 cell, compute, match
possible_synch -> synch: N matches
possible_synch -> no_synch: mismatch
>
Probability of false synchronization = 2^(-8*N), where N is the number of correct checksums you have to find to move from the no_synch to the synch state.
Tanenbaum points out that this means of synchronization violates the
fundamental principle of a protocol stack since the TC sublayer uses its
knowledge of the cell header which is generated by the ATM layer. This
ties the ATM layer to the TC sublayer.
The basic idea is to provide a feedback channel from the receiver to the sender. This sort of feedback is known as acknowledgments, or ACKs.
Timers are used to cope with the possibility that frames are never received, or that ACKs are lost.
can detect error bursts of n bits long
if a really bad burst is present the probability of all of the column
parity bits
being accidentally correct is 1 / 2^n
Treat bit strings as representations of polynomials with coefficients of 0, 1. A k-bit frame is a coeffiient list for a polynomial with k terms, xk-1 to x0, e.g. 110001 represents x5 + x4 + x0.
Sender and receiver agree on generator polynomial G(x), both high and low order bits of G(x) must be 1. To compute a checksum for m bits corresponding to the message M(x), the frame must be longer than G(x). We append a checksum to the frame so that the polynomial represented by the checksummed frame is divisible by G(x). The odds of a corrupt frame being also evenly divisible by G(x) are very small.
Checksum algorithm
1. Let r be degree of G(x). Append r zero bits to the low order
end of the frame, so it now contains m+r bits, and corresponds to the polynomial
x^r * M(x).
2. Divide G(x) into the bit string correspondingto x^r * M(x) using
mod2 division.
3. Subtract the remainder (which is always r or fewer bits) from the
bit string corresponding to x^r * M(x) using mod2 subtraction. The
result is the checksummed frame to be transmitted. Call its polynomial
T(x).
Long division is also done modulo 2, "to go into" means having the same number of bits
frame: 1101011011
M(x): x9 + x8 + x6 + x4 + x4 + x1 + x0
G(x): x4 + x1 + x0, or the bit string: 10011
append 4 0 bits (multiply by x^4): 11010110110000
long division, modulo 2 subtraction, yields remainder of 1110
subtract
11010110110000
- 1110
-------------------------
transmit: 11010110111110
The checksum algorithm simply makes the transmitted frame be divisible
by the generator polynomial. Doing this in decimal corresponds to
897 / 10 = 7
897 - 7 = 890
890 is evenly divisible by 10
The receiver computes [T(x) + E(x) ] / G(x) = E(x) / G(x), since T(x) / G(x) is always 0 by design.
Errors E(x) which happen to be factors of G(x) will not be caught, all others are detected
Single bit errors
if G(x) contains an x0 term, then it will not have xi as a factor
(since the parenthesized term is odd, xi must be even)
so if the degree of the parenthesized term is less than the degree
of G(x), the remainder can never be 0 (all errors detected).
For bursts greater than r, it depends on the exact bit error, but assuming random errors, the chance of non-detection is (1/2)^r
CRC codes can be implemented with only XOR and shift operations, making them fast to do in hardware.
Tanenbaum has an interesting note about the likeliehood of detection using CRC codes. The assumptions used to pick the polynomials, and then to analyse the probability of detection of errors of a given length, included the assumption that the data in the messages was random, or that all possible messages were equally likely. Recent studies show this not to be the case at all, so the calculations of error detection may be quite far off, and undetected errors may be more likely than previously thought.
Most computer networking is concerned with error detection, since the amount of information added to do feedforward error correction can be large.
Total length of a transmission in bits is n = m + r (message + redundancy), where the codewords sent (distinct from the messages, remember) are n bits long.
Given two codewords, it is easy to say by how many bits they differ by (XOR and count the 1 bits). The number of bits that two codewords differ by is known as the Hamming distance. If two codewords are a Hamming distance d apart, it will require d single-bit errors to convert one to the other (corruption).
Most of the time when using m data bits we use all possible 2^m combinations. For example, the 7 bit ASCII code had 128 different symbols defined for it. It makes sense to use all possible combinations since to do otherwise decreases the inherent efficiency of the representation (i.e. you always send m bits, you might as well be able to communicate 2^m different symbols).
When you design a code that can be used to actually correct errors, and not just detect them, then you don't use all possible 2^n values in an n bit code. This "waste" represents redundancy which can be put to good use.
The Hamming distance of a code is the minimum Hamming distance between any two codewords of the code.
Detection: a code with a Hamming distance of d+1 can detect d single
bit errors.
Correction: a code with Hamming distance of 2d+1 can correct d single
bit errors.
Parity is a code with d=2, thus can detect single bit errors.
The data frames from A to B are intermixed with the acknowlegement frames from A to B, since the communication is usually full-duplex. A field in the frame header can be used to distinguish data from ACK. This way ACKs get a free ride on the next outgoing data frame (piggybacking) which results in better use of the available channel bandwidth. How about if new frames do not arrive on time to piggyback the acknowlegement? (sender will time out) Use a timer in the receiver to time out and send separate acknowlegement frames when necessary.
Total time to send one frame and receive an ACK for it (draw a picture
of this)
Ttotal = Tframe + Tproc + Tack + Tproc + 2Tprop
Then we can calculate the utilization of the channel with this protocol as follows:
U = Tframe / Ttotal
~= Tframe / (Tframe + 2Tprop)
since processing time is small and ACKs are short.
= 1 / (1 + 2Tprop / Tframe)
Propogation delay is typically 2/3 speed of light, so Tprop = C / v and v = 2/3 * 3*10^8, where C is the distance between sender/receiver, and 3*10^8 is the speed of light in meters per second.
Tframe = L / R where L = length of frame in bits, and R is the data rate. So substituting we get
but we haven't accounted for the fact that a frame isn't all data, so in fact this utilization is reduced in proportion to the amount of header bits in a frame
U = D / L * { 1 / (1 + 2Rd/vL) }
= {D / (H + D) } * { 1 / (1 + 2Rd/vL) }
if the headers are small relative to the data payload we can ignore this first term. Note however that from and end-to-end perspective you really have to count all of the headers as they are added going down the protocol stack. In this case, particularly for short messages, the D/(H+D) term might in fact be quite small, with a major impact on efficiency.
When Tprop / Tframe is less than 1 (short links, low data rates) then efficiency of send-and-wait is good.
For long links and high speeds, i.e. Tprop/Tframe is large, utilization can be very bad.
For long links, but low data rates it might be ok.
The efficiency problems of Send&Wait are because only one packet can be in transit at a time, and that frame may be much smaller than the channel length.
If an ACK is lost, or corrupted, then sender may send two copies of the same frame. To fix this we use a sequence number in packets. Receiver can simply discard duplicates that it receives. How many bits to use for sequence number? You only need 1 bit (sequence number 0 or 1) since there is only ambiguity arising between a packet and its predecessor or successor. In other words, there is only one frame at a time outstanding.
The timer must be sent to at least twice what the propogation delay is, or the sender will draw false conclusions about lost frames. Time is also wasted when the receiver gets a corrupted frame, as it must wait for the sender to timeout and resend. An improvement is possible by sending explicit acknowledgement of the receipt of bad frames: NAKs.
Efficiency is also reduced since frames with errors must be resent,
wasting some of the capacity of the channel.
P1 = probability that a data frame is lost or corrupted
P2 = probability that an ACK is lost or corrupted
Prob of successful transmission = (1 - P1) * (1 - P2) assuming frame
and ACK loss are independent events
Prob. of failed transmission = F = 1 - (1 - P1) * (1 - P2)
So your channel utilization is reduced by the probability you waste time sending frames that aren't successfully received and becomes
Essence of sliding window:
Example of sliding window (Tannenbaum, 2nd ed, p 226)
window of size 1, sequence number of 3 bits
| initial state | first frame sent | first frame received | ack receive | |
| sender | next frame is 000 | expects 000
next is 001 |
expects 000
next is 001 |
next is 001 |
| receiver | expects 000 | expects 000 | expects 001 | expects 001 |
Since Tframe = L / R, and Tprop = d / v, then channel length in frames = Tprop / Tframe
Note that this important term shows up in our efficiency for send-and-wait protocol.
In general, two modes to calculate efficiency for. The first is for when the sender is never idle waiting for an ACK. The sender can signal for a time Tframe*W, where W is the size of the sliding window. After that, the sender must wait for an ACK from the first frame, which will take Tframe + 2*Tprop to reach the sender. The sender will never be blocked if W* Tframe > Tframe + 2*Tprop, so the theoretical efficiency is 1.0, since the "pipeline" of transmission could always be full.
However, if W*Tframe < Tframe + 2*Tprop , or put another way
W < 1 + 2 * Tprop/Tframe
that is to say the windows size W is less than 2 times the length of the channel (plus 1) then the sender will send
W frames, then wait for an ACK, send 1 more frame, wait for an ACK, etc.
This makes the steady-state efficiency
U = W / (1 + 2Tprop/ Tframe)
Example with frame error (figure 4.10)
·error in frame N+1
·receiver returns ACK for all correctly received frames (...,N-1,
N, N+2, ...)
·on receipt of ACK N+2, sender figures out that N+1 not received
correctly
·sender retransmits N+1, changes position in list, goes on
Example with ack error (figure 4.11)
·error in ack N
·sender receives ack N+1, resends frame N
·receiver checks list, concludes that N is a duplicate, trashes
·receiver acks duplicate frame
Requires 2W+1 sequence numbers for a window of W frames
Consider case where full window is sent, all acks lost - requires
W buffer in the sender, W in the receiver.
If the probability of lost frames is high, and the frames can be large, then the space for buffers may be significant. That motivates the next strategy.
If an ACK is corrupted (say N and N+1), then the sender will infer from receipt of the ACK for N+2 that both N and N+1 were received correctly, and will take the N+2 ACK as sufficient for all three frames.
Since the receiver doesn't have to store frames out-of-sequence, it doesn't require much buffer space. Remember that it throws away the out-of-sequence frames it receives.
Requires W+1 sequence numbers for a window of W frames and W buffers in the sender, 1 in the receiver.
Go-Back-N handles duplicate (lost ACKs) frames easily by simply discarding them. Selective-Retransmission can only know to discard a duplicate by maintaining a history list of received frames.
Most data comm is full duplex, i.e. you have as much BW in one direction as you do in another. If you only use the reverse channel to send ACK/NAKs, then you will be wasting the BW of the reverse channel (low utilization). One idea is to piggy-back ACK/NAK traffic on the information frames that are flowing on the reverse channel.
Now you can calculate the average number of transmission attempts to
get the one frame through successfully:
N = sum of k from 1 to inf { k * (1-F)*F^(k-1) }
which is just the weighted average of all possible values for the number
of times the frame must be sent and reduces to
N = 1 / (1 - F)
and the number of re-transmission (i.e wasted channel capacity) is
N - 1 or
Nr = N -1 = F / (1 - F)
SelectiveRetransmission
For W < 1 + 2 Tprop/Tframe the efficiency drops by the same amount, so
U = (1 - F) * { W / (1 + 2Tprop/ Tframe) } = {1 / (Nr + 1) } * {W / (1 + 2Tprop/ Tframe) }
Protocol machine: always in a specific state at every instant of time. Typically the states are chosen at instants that the protocol machine is waiting for the next event to happen. The events happening cause a change of state.
The state of the complete system is the combination of all the states of the two protocol machines and the channel. The state of the channel is simplified to refer to whether a frame is on it or not, and which kind of frame is on it.
Formally, a finite state machine of a protocol is a quadruple: (S,M,I,T)
Incompleteness of specification
What can you say about the correctness of send-and-wait from the FSM? An error in send-and-wait would be for a sender to send two frames with the same number in a row (i.e. to not alternate 0, 1, 0, 1, etc). You could state this requirement as