Robustness

- The world changes, software changes, use changes, topology and hardware
change, things go wrong in lots of different ways. How well does the routing
algorithm handle all this?

- Does the algorithm find a routing table quickly (convergence)?
How does it adapt to abrupt changes in topology or state of the routers?
Is it possible to have oscillations?

- May be at odds with one another. What might be fair for a single link
may hurt throughput. Must decide on what is meant by optimality before
thinking about algorithms. For example, optimal could be for an individual
packet (least amount of time in transit) or could be for the system as
a whole (greatest throughput). Often times number of hops is chosen as
the metric to minimize as this represents both in some sense.

Optimality principle and sink trees

- Without regard to topology we can say:

If a router J is on the optimal path from router I to router K, then the optimal path from J to K also follows the same route.

Proof: if there was a better way from J to K, then you could use that with the path from I to J for a better path from I to K, so your starting point (the path from I to K was optimal) is contradicted.

If you apply the optimality principle then you can form a tree by taking the optimal path from every other router to a single router, B. The tree is rooted at B. Since it is a tree you don't have loops, so you know that each frame will be delivered in a finite number of hops. Of course finding the set of optimal trees is a lot harder in practice than in theory, but it still provides a goal for all real routing algorithms.

The shortest path algorithm just finds the least expensive path through the network, based on the cost function.

Dijkstra's algorithm is an example. You can start from source/destintation (doesn't matter in a unidirectional graph).

- Set probe node to starting node.
- Probe neighboring nodes and tentatively label them with (probe node, cummulative distance from start).
- Search all tentatively labeled nodes (and not just the nodes labeled from the current probe) for the minimum label, make this minimum node's label permanent, and make it the new probe node.
- If the probe node is the destination/source, stop, else goto 2.

- The distance part of the node labels is cummulative distance from the
starting node, not simply distance from the last probe node.

The key to discovering that you've gone down a bad (greater distance) path is that you examine all nodes with temporary labels in step 3. This means that you switch the probe node to another, shorter path if the you run into an high cost link.

If you label each node with it's predecessor on the path, and the distance to that node, then you can easily find the route you desire (albeit backwards) by starting at the destination and following the trail of predecessors backwards. You'll also know the distance from source to destination from the label on the destination.

Super simple to implement, but generates lots of redundant packets. Interesting to note that all routes are discovered, including the optimal one, so this is robust and high performance (best path is found without being known ahead of time). Good when topology changes frequently (USENET example).

Some means of controlling the expansion of packets is needed. Could try to ensure that each router only floods any given packet once.

Could try to be a little more selective about what is forwarded and where.

The key here is to be able to characterize the nature of the traffic flows over time. You might be able to do this if you know a lot about how the network is used (traffic arrival rates and packet lengths). From the known average amount of traffic and the average length of a packet you can compute the mean packet delays using queuing theory. Flow-based routing then seeks to find a routing table to minimize the average packet delay through the subnet.

The heart of this algorithm is the routing table maintained by each
host. The table has an entry for every other router in the subnet, with
two pieces of information: the link to take to get to the router, and the
estimated distance from the router. For a router A with two outgoing links
L1, L2, and a total of four routers in the network, the routing table might
look like this:

router | distance | link |

B | 5 | L1 |

C | 7 | L1 |

D | 2 | L2 |

Neighboring nodes in the subnet exchange their tables periodically to update each other on the state of the subnet (which makes this a dynamic algorithm). If a neighbor claims to have a path to a node which is shorter than your path, you start using that neighbor as the route to that node. Notice that you don't actually know the route the neighbor thinks is shorter - you trust his estimate and start sending frames that way.

When a neighbor sends you its routing table you examine it as follows and update your own routing table.

- for( i varied across all routers in the table )

if( your distance to the neighbor + neighbors distance to router i < your previous estimate to router i ){

your distance to router i = your distance to the neighbor + neighbors distance to router i

link to router i is set to link to the neighbor with the short distance to i

}

Bad news is not spread reliably. Neighbors only slowly increase their path length to a dead node, and the condition of being dead (infinite distance) is reached by counting to infinity one at a time. Various means of fixing this have been tried, but none are foolproof.

The basic outline is

- discover your neighbors
- measure delay to your neighbors
- bundle all the information about your neighbors together
- send this information to all other routers in the subnet
- compute the shortest path to every router with the information you receive

- Send an HELLO packet out. Receiving routers respond with their addresses,
which must be globally unique.

- Time the round-trip for an ECHO packet, divide by two. Question arises:
do you include time spent waiting in the router (i.e. load factor of the
router) when measuring round-trip ECHO packet time or not?

- Put information for all your neighbors together, along with your own
id, a sequence number and an age.

- Ideally, every router would get every other routers data simultaneously.
This can't happen, so in effect you have different parts of the subnet
with different ideas of the topology of the net at the same time. Changes
ripple through the system, but routers that are widely spread can be using
very different routing tables at the same time. This could result in loops,
unreachable hosts, other types of problems.

A flooding algorithm is used to get the data out as soon as possible. The seuqence number is used to control the flooding. Each router keeps track of the routing data packets it has seen by router/seq no. pair. If the pair is new, then it is forwarded on all lines but the one it arrived on, if it has been seen before it is not forwarded.

The age of a data packet is used to prevent corruption of the sequence number from causing valid data to be ignored. The age field is decremented once per second by the routers which forward the packet. When it hits zero it is discarded.

How often to exchange data?

- Using an algorithm like Dijkstra's, and with a complete set of information
packets from other routers, every router can locally compute a shortest
path to every other router. The memory to store the data is proportional
to k * n, for n routers each with k neighbors. Time to compute can also
be large.

Bad data (from routers in error, e.g.) will corrupt the computation.

The idea is to replace N different routing table entries to N different individual routers with a single entry for a cluster of N routers. You can apply many different levels of hierarchy.

<figure 5.17 example>

The price you pay is that you don't have the optimal route for each router anymore (makes sense, since you lump all routers in a single cluster together with one optimal path).

One way of doing this is by sending a packet which has the desired data along with a list of hosts to be visited. At each router the list is examined. For each outgoing line which is the best path to one of the destinations in the list, the packet is duplicated and the destinations which are best sent down this line are selected from the list for the new packet. Eventually the packets only contain a single destination, in which case the algorithm has done the distribution.