Internet Trends and Prospects

Van Jacobson, Cisco
Closing address of NORDUnet, 1999

Performance scaling

Traditional Ethernet system: worked only because the shared Ethernet ran at just 10 Mb/s, so the router could keep up with 300 hosts
Today: each host has its own cable, and the lines are now 100 or 1000 Mb/s => routers must be different to keep up with hosts

Silicon-based forwarding engines today can switch at wire rate 2-3 M packets/sec - lots of companies do this
10 G (OC192) Ethernet is ~2 years away - limits are laser optics & standards, not forwarding performance

$5000 lasers - too much per port
No forwarding issues up to 40 Gbps (100 M packets/sec), the limit of electronics. The big technical challenges (next 2-5 years) are:
flattening the net
flattening the stack
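
The packet rates quoted above are tied to line rate through packet size. The sketch below only does that arithmetic; the function names are hypothetical, and the 50-byte packet size is not stated in the talk but is what the pairing of 40 Gbps with 100 M packets/sec implies.

```python
def packets_per_second(line_rate_bps: float, packet_bytes: float) -> float:
    """Packets per second needed to fill a link with packets of one size."""
    return line_rate_bps / (packet_bytes * 8)


def implied_packet_bytes(line_rate_bps: float, pps: float) -> float:
    """Packet size (bytes) implied by a quoted line rate and packet rate."""
    return line_rate_bps / pps / 8


# 40 Gbps quoted together with 100 M packets/sec implies 50-byte packets.
size = implied_packet_bytes(40e9, 100e6)
print(size)  # 50.0

# Assumption: the same 50-byte packets on a 10 G link.
rate_10g = packets_per_second(10e9, size)
print(rate_10g)  # 25000000.0
```

At that packet size, a 10 G port would need roughly ten times the 2-3 M packets/sec quoted for today's forwarding engines.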

Flattening the net

With copper you used different transmission technologies for core (low speed) and edge (high speed). With fiber it is the same in both places, so router vendors have to put both core and edge functions in the same box. You don't want two routers, two rooms, two buildings.

Boundaries are where everything happens for the Internet: QoS, authentication, charging, security, etc. You need lots of CPU power in routers to do this.

At the core you don't need much CPU power since the actions/decisions are much simpler.
Looking 5 years ahead, the router box looks like:

1-2 fibers coming in with WDM - lots of power needed
channelized OC48 out to subscribers

Flattening the stack

Campus wiring to get a high-speed fiber Internet connection to the campus. Need to replace the TDM of the telco.
IP router - ATM - SONET - WDM
5 years ago; expensive boxes
not possible to shred IP packets into cells at OC192, no one does this at OC48 even
this is an example of the telco controlling the WAN
IP router - SONET - WDM
drop the ATM since it doesn't buy much
SONET mux is 2x as much money as IP router
SONET forced on campuses by the telco, since that's what they speak
IP router - WDM raw fiber
this is where we need to go
Cisco could make a cheap box: small chassis, 16 GBE ports, 1 incoming fiber, $30k
but how do you hook up to the WDM of the telco? There needs to be a standard for this.
DPT - Cisco's attempt to replace SONET

Reliability

Can the net be as reliable as the phone system?

Phone system

phone switches are big mainframes: they know the network topology, compute the route, and set up the connection for each call (vc and time slots)
notice that:
no smarts in the network, all in the controller. the switches are very dumb
the controller is very expensive
1 billion DS0s in OC192 - the switch controller knows them all for the network it controls
scaling is: trunk speed * number of trunks (i.e. bad)
system can't adapt: only point of control is at connection setup
large setup cost means you have to have lengthy conversations to amortize the cost
not good for simple 1 packet exchanges (e.g. web, POS)
services must be constrained (e.g. only DS0 calls) since it is NP hard to mix different services (bin packing problem)
system is very fragile - call state is spread along the path in the switches; lose one switch and you lose the whole call
Alternative is packet network
picture: cloud versus string, routers instead of switches
intelligence and work is distributed throughout the network
scaling is: trunk speed (networks can be bigger and cheaper)
architecturally agnostic: easy to incorporate strings as part of the cloud - not true in the reverse (hard to put IP clouds into ATM strings)
can adapt at any time scale consistent with the speed of light - data is self-describing, intelligence is distributed
no setup to amortize - supports single packet transactions
no constraint on services - mix and match different speeds of service
the best part: the bigger the network gets, the more reliable it is
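
The setup-cost point in the two lists above can be made concrete with a little arithmetic. This is a minimal sketch: `circuit_efficiency` is a hypothetical helper, and the one-second setup, three-minute call, and one-millisecond transaction are illustrative numbers, not figures from the talk.

```python
def circuit_efficiency(setup_s: float, transfer_s: float) -> float:
    """Fraction of total time spent moving data when a setup phase comes first."""
    return transfer_s / (setup_s + transfer_s)


# Illustrative numbers only (assumptions, not from the talk).
long_call = circuit_efficiency(setup_s=1.0, transfer_s=180.0)
one_packet = circuit_efficiency(setup_s=1.0, transfer_s=0.001)
no_setup = circuit_efficiency(setup_s=0.0, transfer_s=0.001)

print(f"long call:         {long_call:.4f}")
print(f"single packet:     {one_packet:.4f}")
print(f"packet, no setup:  {no_setup:.4f}")
```

The long conversation amortizes its setup almost completely, while the single-packet exchange spends nearly all of its time on setup unless the setup phase is removed.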

Telcos do a great marketing con job on "five nines" reliability. They need super-high component reliability to get their system to work at all.

Graph

system rel versus network diameter (hops)
IP network - gradually rises from 0.9 with diameter
telco - decreases slowly from 0.9 as the networks get bigger

This relationship forces the telcos to work very hard to get reliability - the switches must be fantastic
IP networks - routers aren't as important, the very structure gives reliability
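
One way to read the graph above is with a textbook series/parallel reliability model. This is a sketch under strong assumptions, not the model behind the speaker's graph: components fail independently, every hop has reliability 0.9 (the starting value on the graph), a "string" is one path of `hops` components in series, and a "cloud" offers `paths` disjoint routes of that length. The function names and the choice of one extra route per hop of diameter are hypothetical.

```python
def string_reliability(r: float, hops: int) -> float:
    """A single path works only if every hop on it works."""
    return r ** hops


def cloud_reliability(r: float, hops: int, paths: int) -> float:
    """Disjoint paths: the connection fails only if every path fails."""
    return 1.0 - (1.0 - string_reliability(r, hops)) ** paths


R = 0.9                    # per-hop reliability (assumption)
DIAMETERS = (1, 2, 4, 8)   # network diameters in hops (assumption)

strings = [string_reliability(R, h) for h in DIAMETERS]
# Assumption: the number of alternative routes grows with the diameter.
clouds = [cloud_reliability(R, h, paths=h) for h in DIAMETERS]

for h, s, c in zip(DIAMETERS, strings, clouds):
    print(f"{h:>2} hops  string={s:.3f}  cloud={c:.3f}")
```

For these illustrative values the string curve falls from 0.9 while the cloud curve rises from 0.9. The result depends on how quickly the number of alternative routes grows with network size, which this sketch simply assumes.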

Theory of random graphs

An arbitrarily connected graph whose degree is slightly above the minimum needed for connectivity (just epsilon above) has the property that the graph is almost certainly connected. If you go more than epsilon above, there is almost no failure that will disconnect the graph. This was the original motivation for the net, even though the math wasn't known at the time.
Basic reason
Strings fail if any single part fails
Clouds fail only when everything fails
IP routers aren't particularly reliable, they are cheap. They could certainly be better. The market determines this; cost has been more important than reliability. Now there is demand for telco reliability as voice moves to IP, so IP routers will be improved. The fact that the net works at all, given the relative unreliability of the components, hints at the underlying mathematical structural advantage of clouds versus strings.
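
The random-graph claim in this section can be checked with a small simulation. The notes do not name a specific model, so this sketch assumes the Erdős–Rényi G(n, p) model, in which each pair of nodes is linked independently with probability p and the known connectivity threshold is p = ln(n)/n. The function names, graph size, and trial count are hypothetical choices.

```python
import math
import random


def random_graph(n: int, p: float, rng: random.Random) -> list[list[int]]:
    """Adjacency lists for G(n, p): each pair is linked with probability p."""
    adj: list[list[int]] = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def is_connected(adj: list[list[int]]) -> bool:
    """Depth-first search from node 0; connected if every node is reached."""
    seen = {0}
    stack = [0]
    while stack:
        node = stack.pop()
        for neighbour in adj[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(adj)


def connected_fraction(n: int, factor: float, trials: int, seed: int = 0) -> float:
    """Share of random graphs that are connected at p = factor * ln(n) / n."""
    rng = random.Random(seed)
    p = min(1.0, factor * math.log(n) / n)
    hits = sum(is_connected(random_graph(n, p, rng)) for _ in range(trials))
    return hits / trials


below = connected_fraction(n=200, factor=0.5, trials=20)
above = connected_fraction(n=200, factor=2.0, trials=20)
print(f"below threshold: {below:.2f}  above threshold: {above:.2f}")
```

Under these assumptions, graphs generated below the threshold are rarely connected and graphs generated above it almost always are, which is the sharp transition the notes describe.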

The net as laser

All measurement data suggests that there is structure in the network behavior - rich and complex, beyond our current understanding.

Sender and receiver act like mirrors - bouncing packets/acks back and forth. The network is highly non-linear. Two mirrors at the ends of a non-linear substrate is a laser. What arises is rich temporal structure from chaos. So it isn't surprising that telcos don't see this structure: they don't have mirrors. Routers remember what is done, so there is an iteration, further promoting ordered behavior.

The only lab to examine this behavior is the net itself. Lots of data must be present to see the structure, to have the behavior emerge.

In the US: the testbeds have acceptable use policies that don't let you experiment on backbones. So no testing shows up the important behavior.

Rant on research in the US

The Internet is the right answer for the future's universal high speed communications. What we have today is largely through luck, and a few brilliant people.

We don't know why the net works as well as it does; no one knows the general principles.

US academic research is focused on producing novelty, not understanding.
US corporate research is focused on producing products and market differentiators.

Conclusion

The Internet will be the medium of the next millennium - there is no viable alternative.
The pace of evolution will increase over the next decade.
The research agenda we set today determines whether that evolution will be driven by foresight or hindsight.

Component research is what we do today. System, control, understanding is the research we need to do.

Questions from audience

Peering agreements mean that there are single points of failure in the Internet today.

VJ: True, but this is not an artifact of the technical aspects of the system; rather it is political, economic, etc. Bad policy on good technology. A surplus of bandwidth might improve this situation as people come to value richer interconnection, rather than jealously guarding bandwidth.

Don't firewalls and netboxes (the net of today) break the picture of the net you painted?

VJ: The perception is that this is the only way to do it today. Moving toward universal connectivity is important, more so than doing it with the best possible architecture. Peer-to-peer is much better, but client-server is the way growth is taking place now.

What about routing reliability?

VJ: Wavefront routing algorithms don't converge as fast, but you don't lose as much data when a link fails. The other view is that the ends are responsible for reliability, the routers just deal with their local issues. End systems should try to improve their reliability knowing that the routers are acting only locally. Family of algorithms involving multicast to do this. Driven by telephony since dropping voice packets is bad.

My experience is that the net doesn't work, and the telco net does work.

VJ: Structurally the net is capable of greater reliability. The bigger it gets, the more reliable it is. The telco net is only reliable because the pieces are very reliable. This is from the combinatoric math underlying the two types of networks (strings versus clouds). The unreliability of the net today is driven by economics, and the fact that it works at all is indicative of the underlying potential for greater reliability.