## Routing

How do we get packets from one end point to another? Here's what would be nice for a routing algorithm: correctness, simplicity, robustness, stability, fairness, optimality.

Robustness

The world changes, software changes, use changes, topology and hardware change, things go wrong in lots of different ways. How well does the routing algorithm handle all this?
Stability
Does the algorithm find a routing table quickly (convergence)? How does it adapt to abrupt changes in topology or state of the routers? Is it possible to have oscillations?
Fairness & Optimality
May be at odds with one another. What might be fair for a single link may hurt throughput. Must decide on what is meant by optimality before thinking about algorithms. For example, optimal could be for an individual packet (least amount of time in transit) or could be for the system as a whole (greatest throughput). Often times number of hops is chosen as the metric to minimize as this represents both in some sense.
Algorithms may be static, i.e. the routing decisions are made ahead of time, with information about the network topology and capacity, then loaded into the routers, or dynamically, where the routers make decisions based on information they gather, and the routes change over time, adaptively.

Optimality principle and sink trees

Without regard to topology we can say:

If a router J is on the optimal path from router I to router K, then the optimal path from J to K also follows the same route.

Proof: if there was a better way from J to K, then you could use that with the path from I to J for a better path from I to K, so your starting point (the path from I to K was optimal) is contradicted.

If you apply the optimality principle then you can form a tree by taking the optimal path from every other router to a single router, B. The tree is rooted at B. Since it is a tree you don't have loops, so you know that each frame will be delivered in a finite number of hops. Of course finding the set of optimal trees is a lot harder in practice than in theory, but it still provides a goal for all real routing algorithms.

### Shortest Path

Links between routers have a cost associated with them. In general it could be a function of distance, bandwidth, average traffic, communication cost, mean queue length, measured delay, router processing speed, etc.

The shortest path algorithm just finds the least expensive path through the network, based on the cost function.

Dijkstra's algorithm is an example. You can start from source/destintation (doesn't matter in a unidirectional graph).

1. Set probe node to starting node.
2. Probe neighboring nodes and tentatively label them with (probe node, cummulative distance from start).
3. Search all tentatively labeled nodes (and not just the nodes labeled from the current probe) for the minimum label, make this minimum node's label permanent, and make it the new probe node.
4. If the probe node is the destination/source, stop, else goto 2.
The distance part of the node labels is cummulative distance from the starting node, not simply distance from the last probe node.

The key to discovering that you've gone down a bad (greater distance) path is that you examine all nodes with temporary labels in step 3. This means that you switch the probe node to another, shorter path if the you run into an high cost link.

If you label each node with it's predecessor on the path, and the distance to that node, then you can easily find the route you desire (albeit backwards) by starting at the destination and following the trail of predecessors backwards. You'll also know the distance from source to destination from the label on the destination.

A nice Java applet for exploring Dijkstra's algorithm is http://cs.pace.edu/~carla/DijkstraApplet.html

### Flooding

Every incoming packet is sent out on every other link by every router.

Super simple to implement, but generates lots of redundant packets. Interesting to note that all routes are discovered, including the optimal one, so this is robust and high performance (best path is found without being known ahead of time). Good when topology changes frequently (USENET example).

Some means of controlling the expansion of packets is needed. Could try to ensure that each router only floods any given packet once.

Could try to be a little more selective about what is forwarded and where.

### Flow-based

Similar in spirit to minimum distance, but takes traffic flow into consideration.

The key here is to be able to characterize the nature of the traffic flows over time. You might be able to do this if you know a lot about how the network is used (traffic arrival rates and packet lengths). From the known average amount of traffic and the average length of a packet you can compute the mean packet delays using queuing theory. Flow-based routing then seeks to find a routing table to minimize the average packet delay through the subnet.

### Distance Vector

Also known as Belman-Ford or Ford-Fulkerson. Used in the original ARPANET, and in the Internet as RIP.

The heart of this algorithm is the routing table maintained by each host. The table has an entry for every other router in the subnet, with two pieces of information: the link to take to get to the router, and the estimated distance from the router. For a router A with two outgoing links L1, L2, and a total of four routers in the network, the routing table might look like this:

 router distance link B 5 L1 C 7 L1 D 2 L2

Neighboring nodes in the subnet exchange their tables periodically to update each other on the state of the subnet (which makes this a dynamic algorithm). If a neighbor claims to have a path to a node which is shorter than your path, you start using that neighbor as the route to that node. Notice that you don't actually know the route the neighbor thinks is shorter - you trust his estimate and start sending frames that way.

When a neighbor sends you its routing table you examine it as follows and update your own routing table.

for( i varied across all routers in the table )
if( your distance to the neighbor + neighbors distance to router i  <  your previous estimate to router i ){
your distance to router i = your distance to the neighbor + neighbors distance to router i
link to router i is set to link to the neighbor with the short distance to i
}
You can think of this as forming an approximation of the global state of the subnet from local information only (exchange with neighbors). Unfortunately it has problems (it's only an approximation, after all). Good news (a link comes up, a new router is available, a router or link are made faster) propogate very quickly through the whole subnet (in the worst case it takes a number of exchanges equal to the longest path for everyone to know the good news).

Bad news is not spread reliably. Neighbors only slowly increase their path length to a dead node, and the condition of being dead (infinite distance) is reached by counting to infinity one at a time. Various means of fixing this have been tried, but none are foolproof.

Widely used today, replaced Distance Vector in the ARPANET. Link State improves the convergence of Distance Vector by having everybody share their idea of the state of the net with everybody else (more information is available to nodes, so better routing tables can be constructed).

The basic outline is

2. measure delay to your neighbors
4. send this information to all other routers in the subnet
5. compute the shortest path to every router with the information you receive
Neighbor discovery
Send an HELLO packet out. Receiving routers respond with their addresses, which must be globally unique.
Measure delay
Time the round-trip for an ECHO packet, divide by two. Question arises: do you include time spent waiting in the router (i.e. load factor of the router) when measuring round-trip ECHO packet time or not?
Put information for all your neighbors together, along with your own id, a sequence number and an age.
Ideally, every router would get every other routers data simultaneously. This can't happen, so in effect you have different parts of the subnet with different ideas of the topology of the net at the same time. Changes ripple through the system, but routers that are widely spread can be using very different routing tables at the same time. This could result in loops, unreachable hosts, other types of problems.

A flooding algorithm is used to get the data out as soon as possible. The seuqence number is used to control the flooding. Each router keeps track of the routing data packets it has seen by router/seq no. pair. If the pair is new, then it is forwarded on all lines but the one it arrived on, if it has been seen before it is not forwarded.

The age of a data packet is used to prevent corruption of the sequence number from causing valid data to be ignored. The age field is decremented once per second by the routers which forward the packet. When it hits zero it is discarded.

How often to exchange data?

Compute shortest path tree
Using an algorithm like Dijkstra's, and with a complete set of information packets from other routers, every router can locally compute a shortest path to every other router. The memory to store the data is proportional to k * n, for n routers each with k neighbors. Time to compute can also be large.

Bad data (from routers in error, e.g.) will corrupt the computation.

### Hierarchical

When your subnet is large then the routing tables become unwieldy. Too much memory to store them, too much time to search them, too much time to compute them. When something is too large, people form a hieararchy to deal with it.

The idea is to replace N different routing table entries to N different individual routers with a single entry for a cluster of N routers. You can apply many different levels of hierarchy.

<figure 5.17 example>

The price you pay is that you don't have the optimal route for each router anymore (makes sense, since you lump all routers in a single cluster together with one optimal path).