Routing
How do we get packets from one end point to another? Here's what would
be nice for a routing algorithm: correctness, simplicity, robustness, stability,
fairness, optimality.
Robustness
The world changes, software changes, use changes, topology and hardware
change, things go wrong in lots of different ways. How well does the routing
algorithm handle all this?
Stability
Does the algorithm find a routing table quickly (convergence)?
How does it adapt to abrupt changes in topology or state of the routers?
Is it possible to have oscillations?
Fairness & Optimality
May be at odds with one another. What might be fair for a single link
may hurt throughput. Must decide on what is meant by optimality before
thinking about algorithms. For example, optimal could be for an individual
packet (least amount of time in transit) or could be for the system as
a whole (greatest throughput). Often times number of hops is chosen as
the metric to minimize as this represents both in some sense.
Algorithms may be static, i.e. the routing decisions are made ahead of
time, with information about the network topology and capacity, then loaded
into the routers, or dynamically, where the routers make decisions based
on information they gather, and the routes change over time, adaptively.
Optimality principle and sink trees
Without regard to topology we can say:
If a router J is on the optimal path from router I to router K, then
the optimal path from J to K also follows the same route.
Proof: if there was a better way from J to K, then you could use that
with the path from I to J for a better path from I to K, so your starting
point (the path from I to K was optimal) is contradicted.
If you apply the optimality principle then you can form a tree by taking
the optimal path from every other router to a single router, B. The tree
is rooted at B. Since it is a tree you don't have loops, so you know that
each frame will be delivered in a finite number of hops. Of course finding
the set of optimal trees is a lot harder in practice than in theory, but
it still provides a goal for all real routing algorithms.
Shortest Path
Links between routers have a cost associated with them. In general it could
be a function of distance, bandwidth, average traffic, communication cost,
mean queue length, measured delay, router processing speed, etc.
The shortest path algorithm just finds the least expensive path through
the network, based on the cost function.
Dijkstra's algorithm is an example. You can start from source/destintation
(doesn't matter in a unidirectional graph).
-
Set probe node to starting node.
-
Probe neighboring nodes and tentatively label them with (probe node, cummulative
distance from start).
-
Search all tentatively labeled nodes (and not just the nodes labeled from
the current probe) for the minimum label, make this minimum node's label
permanent, and make it the new probe node.
-
If the probe node is the destination/source, stop, else goto 2.
Comments
The distance part of the node labels is cummulative distance from the
starting node, not simply distance from the last probe node.
The key to discovering that you've gone down a bad (greater distance)
path is that you examine all nodes with temporary labels in step 3. This
means that you switch the probe node to another, shorter path if the you
run into an high cost link.
If you label each node with it's predecessor on the path, and the distance
to that node, then you can easily find the route you desire (albeit backwards)
by starting at the destination and following the trail of predecessors
backwards. You'll also know the distance from source to destination from
the label on the destination.
A nice Java applet for exploring Dijkstra's algorithm is http://cs.pace.edu/~carla/DijkstraApplet.html
Flooding
Every incoming packet is sent out on every other link by every router.
Super simple to implement, but generates lots of redundant packets.
Interesting to note that all routes are discovered, including the optimal
one, so this is robust and high performance (best path is found without
being known ahead of time). Good when topology changes frequently (USENET
example).
Some means of controlling the expansion of packets is needed. Could
try to ensure that each router only floods any given packet once.
Could try to be a little more selective about what is forwarded and
where.
Flow-based
Similar in spirit to minimum distance, but takes traffic flow into consideration.
The key here is to be able to characterize the nature of the traffic
flows over time. You might be able to do this if you know a lot about how
the network is used (traffic arrival rates and packet lengths). From the
known average amount of traffic and the average length of a packet you
can compute the mean packet delays using queuing theory. Flow-based routing
then seeks to find a routing table to minimize the average packet delay
through the subnet.
Distance Vector
Also known as Belman-Ford or Ford-Fulkerson. Used in the original ARPANET,
and in the Internet as RIP.
The heart of this algorithm is the routing table maintained by each
host. The table has an entry for every other router in the subnet, with
two pieces of information: the link to take to get to the router, and the
estimated distance from the router. For a router A with two outgoing links
L1, L2, and a total of four routers in the network, the routing table might
look like this:
| router |
distance |
link |
| B |
5 |
L1 |
| C |
7 |
L1 |
| D |
2 |
L2 |
Neighboring nodes in the subnet exchange their tables periodically to
update each other on the state of the subnet (which makes this a dynamic
algorithm). If a neighbor claims to have a path to a node which is shorter
than your path, you start using that neighbor as the route to that node.
Notice that you don't actually know the route the neighbor thinks is shorter
- you trust his estimate and start sending frames that way.
When a neighbor sends you its routing table you examine it as follows
and update your own routing table.
for( i varied across all routers in the table )
if( your distance to the neighbor + neighbors distance
to router i < your previous estimate to router i ){
your distance to router
i = your distance to the neighbor + neighbors distance to router i
link to router i is set
to link to the neighbor with the short distance to i
}
You can think of this as forming an approximation of the global state of
the subnet from local information only (exchange with neighbors). Unfortunately
it has problems (it's only an approximation, after all). Good news (a link
comes up, a new router is available, a router or link are made faster)
propogate very quickly through the whole subnet (in the worst case it takes
a number of exchanges equal to the longest path for everyone to know the
good news).
Bad news is not spread reliably. Neighbors only slowly increase their
path length to a dead node, and the condition of being dead (infinite distance)
is reached by counting to infinity one at a time. Various means of fixing
this have been tried, but none are foolproof.
Link State
Widely used today, replaced Distance Vector in the ARPANET. Link State
improves the convergence of Distance Vector by having everybody share their
idea of the state of the net with everybody else (more information is available
to nodes, so better routing tables can be constructed).
The basic outline is
-
discover your neighbors
-
measure delay to your neighbors
-
bundle all the information about your neighbors together
-
send this information to all other routers in the subnet
-
compute the shortest path to every router with the information you receive
Neighbor discovery
Send an HELLO packet out. Receiving routers respond with their addresses,
which must be globally unique.
Measure delay
Time the round-trip for an ECHO packet, divide by two. Question arises:
do you include time spent waiting in the router (i.e. load factor of the
router) when measuring round-trip ECHO packet time or not?
Bundle your info
Put information for all your neighbors together, along with your own
id, a sequence number and an age.
Distribute your info
Ideally, every router would get every other routers data simultaneously.
This can't happen, so in effect you have different parts of the subnet
with different ideas of the topology of the net at the same time. Changes
ripple through the system, but routers that are widely spread can be using
very different routing tables at the same time. This could result in loops,
unreachable hosts, other types of problems.
A flooding algorithm is used to get the data out as soon as possible.
The seuqence number is used to control the flooding. Each router keeps
track of the routing data packets it has seen by router/seq no. pair. If
the pair is new, then it is forwarded on all lines but the one it arrived
on, if it has been seen before it is not forwarded.
The age of a data packet is used to prevent corruption of the sequence
number from causing valid data to be ignored. The age field is decremented
once per second by the routers which forward the packet. When it hits zero
it is discarded.
How often to exchange data?
Compute shortest path tree
Using an algorithm like Dijkstra's, and with a complete set of information
packets from other routers, every router can locally compute a shortest
path to every other router. The memory to store the data is proportional
to k * n, for n routers each with k neighbors. Time to compute can also
be large.
Bad data (from routers in error, e.g.) will corrupt the computation.
Hierarchical
When your subnet is large then the routing tables become unwieldy. Too
much memory to store them, too much time to search them, too much time
to compute them. When something is too large, people form a hieararchy
to deal with it.
The idea is to replace N different routing table entries to N different
individual routers with a single entry for a cluster of N routers. You
can apply many different levels of hierarchy.
<figure 5.17 example>
The price you pay is that you don't have the optimal route for each
router anymore (makes sense, since you lump all routers in a single cluster
together with one optimal path).
Broadcast
Unicast routing is most general, but broadcast routing is a good match
to many applications (e.g. distributing routing data, sending common information
like weather reports).
One way of doing this is by sending a packet which has the desired data
along with a list of hosts to be visited. At each router the list is examined.
For each outgoing line which is the best path to one of the destinations
in the list, the packet is duplicated and the destinations which are best
sent down this line are selected from the list for the new packet. Eventually
the packets only contain a single destination, in which case the algorithm
has done the distribution.
Multicast
The goal is to be more efficient than unicast, without the expense (or
lack of privacy) of broadcast.