Notes from NORDUnet/Terena Gigabit Networking Conference

June 7-10, 1999

Lund, Sweden

Quality of Service and the Internet

Brian Carpenter, Differentiated Services
Juha Heinänen, Experience building/using AF service
Ben Teitelbaum, QBone project
Van Jacobson, Internet qos
Panel discussion


Brian Carpenter    IETF, IAB chair, IBM

The scale of the problem is millions of simultaneous connections through the backbone. This is too big to build in the lab, and too big to simulate. That leaves only one choice: use good design principles.

What is known about performance and the net is mostly anecdotal. The performance of the net swings dramatically over the course of a short time. Studies show that on average as much as 20% of the packets across international links are dropped. Most of this is due to congestion.

The problem is congestion; the proposed solution is to limit traffic through differentiated services (diffserv, DiffServ).

Diff services must

The basic diff serv model is


Source ---1--> Classifier ----2---> Network ------> Destination

packets are unmarked at 1
packets are marked by the classifier (2) with their DS Field

The Source and the Classifier may be one and the same.

The Classifier may do traffic shaping as well.

Forwarding

This is the job done by the routers, based on the DS Field of each packet.
This is known as Per Hop Behavior (PHB), and the overall qos result (source -> destination) is the concatenation of many PHBs along the way. That means

DS Field

6 bits, in IPv4 done in the TypeOfService field (8 bits)

Standardized in RFCs as of this week.
 
first 3 bits   second 3 bits   diffserv meaning
000            000             traditional best effort; no qos
ccc            000             class selector, 000 low, 111 high
xxx            xx1             reserved for local/experimental uses
101            110             EF (Expedited Forwarding); virtual leased line service
001            010             AF (Assured Forwarding), lowest codepoint
...            ...             4 classes of 3 drop precedences each (12 total combos)
100            110             AF, highest codepoint
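As a rough illustration of the bit layout above, here is a minimal Python sketch that pulls the 6-bit DS field out of the 8-bit TypeOfService octet and names the codepoint. The function names are made up for this example, and the AF decoding assumes the layout implied by the 001 010 to 100 110 range (class in the first three bits, drop precedence in the next two, last bit zero).

```python
def dscp_from_tos(tos_byte: int) -> int:
    """Return the 6-bit DS field: the upper six bits of the 8-bit TOS octet."""
    if not 0 <= tos_byte <= 0xFF:
        raise ValueError("TOS octet must fit in 8 bits")
    return tos_byte >> 2


def describe_dscp(dscp: int) -> str:
    """Name a 6-bit codepoint using the meanings listed in the notes."""
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DS field must fit in 6 bits")
    first, second = dscp >> 3, dscp & 0b111
    if dscp == 0b000000:
        return "best effort"
    if dscp == 0b101110:
        return "EF"
    if second == 0b000:
        return f"class selector {first}"
    # Assumed AF layout: class 1-4, then drop precedence 1-3, then a 0 bit.
    if 1 <= first <= 4 and second in (0b010, 0b100, 0b110):
        return f"AF class {first}, drop precedence {second >> 1}"
    if second & 0b001:
        return "local/experimental"
    return "unassigned"
```

For example, a TOS octet of 0xB8 carries the codepoint 101 110, which the sketch reports as EF.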

Models

  1. Use RSVP on campus, apply diffserv at the boundary router.
  2. No RSVP; sources apply diffserv info directly.

In either case, the ISP's router will check for conformance to the service level agreement (SLS).
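The talk did not say how the ISP's router checks conformance. One common technique is a token bucket, sketched below in Python purely as an assumption: the class name, the rate and burst parameters, and the choice of a token bucket itself are not from the talk.

```python
class TokenBucket:
    """Hypothetical conformance check: a classic token bucket meter."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s    # agreed long-term rate
        self.burst = burst_bytes        # largest burst the agreement allows
        self.tokens = burst_bytes       # bucket starts full
        self.last = 0.0                 # time of the previous packet, seconds

    def conforms(self, now: float, size_bytes: int) -> bool:
        """Return True if a packet of size_bytes arriving at `now` is in profile."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False
```

A packet that does not conform could be dropped or re-marked to a lower class; which of the two applies would be part of the agreement.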

SLS

The service agreement between customer and ISP. A set of rules for how to forward various diffserv classes. The rules are stored in a database and must be delivered to the routers. There are quite a few proposals for doing this (SNMP, LDAP, etc).

Routers

Treat different classes of traffic differently. Conceptually there is more than one queue; in practice it's about queue handling. This isn't the hard part. Routers can take advantage of support for priority traffic if the underlying hardware/protocol of the outgoing link supports it (e.g. ATM), but it isn't necessary (e.g. Ethernet).

You can view a "diffserv Internet" as an NxM network of queues (N routers, M classes). No one has analysed it this way. The mathematics are unknown, probably hard. Simulations are probably not feasible.



Juha Heinänen, Telia, Finland
"Building diffserv with AF PHB Group"

The basic approach is to allocate network forwarding resources (bandwidth, buffer space) using WFQ, priorities, etc, so that you can offer AF services end-to-end.

Random early drop is used. The threshold for dropping varies with the "drop precedence" for any given traffic class. He used "red", "green", "yellow" as examples.
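A minimal Python sketch of the idea, assuming a linear drop curve between a lower and an upper threshold on the average queue length. The threshold values, the maximum probability and the function names are invented for illustration; no actual numbers were given.

```python
import random

# Hypothetical (min, max) thresholds in packets, per drop precedence colour.
# "red" packets start being dropped at much shorter queues than "green" ones.
THRESHOLDS = {"green": (40, 80), "yellow": (20, 60), "red": (5, 30)}
MAX_P = 0.1  # assumed drop probability just below the upper threshold


def drop_probability(avg_queue: float, colour: str) -> float:
    """Drop probability for a packet of the given colour at this average queue."""
    min_th, max_th = THRESHOLDS[colour]
    if avg_queue < min_th:
        return 0.0                      # short queue: never drop
    if avg_queue >= max_th:
        return 1.0                      # queue too long: always drop
    # In between, the probability rises linearly from 0 to MAX_P.
    return MAX_P * (avg_queue - min_th) / (max_th - min_th)


def should_drop(avg_queue: float, colour: str, rng=random) -> bool:
    """Randomly decide whether to drop, using the probability above."""
    return rng.random() < drop_probability(avg_queue, colour)
```

With these made-up thresholds, at an average queue of 25 packets green traffic is never dropped, yellow is dropped occasionally and red is dropped most often.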

Linux has a complete implementation that supports AF diffserv. I take it he meant something older than the stuff done at Karlsruhe and reported on at IWQoS?

The services he believes are important are:

Assured Bandwidth (AB)
Assured Delay (AD)

They use a WFQ approach, assigning 80% of the resources for AD packets, 20% for AB packets. This keeps the queue lengths small so they can make promises about their AB, AD services.
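The 80/20 split can be illustrated with a fluid approximation of weighted sharing: each busy class gets link capacity in proportion to its weight, and capacity a class does not need is handed to the others. This Python sketch is not a packet-level WFQ scheduler; the function name and the example capacity and demands are assumptions.

```python
def weighted_shares(capacity, weights, demands):
    """Weighted max-min allocation of `capacity` among classes.

    weights and demands are dicts keyed by class name. A class never gets
    more than it asks for; leftover capacity goes to the still-hungry classes
    in proportion to their weights.
    """
    alloc = {c: 0.0 for c in weights}
    active = {c for c in weights if demands[c] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-12:
        total_w = sum(weights[c] for c in active)
        offers = {c: remaining * weights[c] / total_w for c in active}
        # Classes whose remaining need fits inside their offer are satisfied.
        satisfied = {c for c in active if demands[c] - alloc[c] <= offers[c]}
        if not satisfied:
            # Everyone wants more than offered: hand out the offers and stop.
            for c in active:
                alloc[c] += offers[c]
            break
        for c in satisfied:
            remaining -= demands[c] - alloc[c]
            alloc[c] = float(demands[c])
        active -= satisfied
    return alloc
```

With weights of 80 for AD and 20 for AB on a link of capacity 100, both classes saturating gives AD 80 and AB 20. If AD only offers 30, AB can use the remaining 70.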

They use this service for their international connections, and claim to have adequate intra-Finland bandwidth so that diffserv isn't necessary there.



Ben Teitelbaum
"Internet2 + Diffserv/Premium - the QBone project"

QBone is trying to break the logjam of not having apps that need QoS because the networks don't provide QoS, and the networks won't provide it until there are apps to use it.

Claims there is a lot of overprovisioning in the US Internet2. He sees this changing, and doesn't think extra capacity does away with the QoS problem. Data from an east-west coast connection across Internet2 show that congestion can still be a problem, even with extra capacity.

The QoS working group

About 20 institutions
Luleå - Stephen Pink?
Susan Hares of Merit (state of Michigan ISP) leads the "bandwidth broker" group

Membership in QBone is open. Need to have a means of implementing (connection to Internet2 network, routers) and a compelling application that requires qos.

www.internet2.edu

Someone asked: Why not ATM, since ATM does qos already? Answer: IP goes everywhere, does everything. That's where the world has gone, and the techniques for qos must follow.



Van Jacobson, Cisco (van@cisco.com)
"Congestion and Diffserv"

The problem is hard: no one really knows how to put diffserv into the Internet, and no one can predict how any given technique will work.

There are huge variations in data rate across the net: modem to OC48 backbone is more than 4 orders of magnitude (tens of kbps versus about 2.5 Gbps). This won't change soon.

Congestion happens at fast-slow connection points:

Implications of the above:

Scaling and Performance

Single conversation situation

Graph of throughput R versus offered traffic T: R increases linearly with T until the carrying capacity of the network is reached (the wires are full of bits), then it flattens out.

Graph of delay D versus offered traffic T: D is flat until the point where R flattens out, then increases linearly as buffers fill, then is flat again.
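The two curves for the single conversation case can be written down as a small fluid model. This Python sketch is a guess at what the graphs showed, with made-up parameter names: the delay is measured at the end of a fixed observation interval, and the final flat part corresponds to a full buffer.

```python
def throughput(offered: float, capacity: float) -> float:
    """Throughput R: equals the offered traffic T until the link is full."""
    return min(offered, capacity)


def delay(offered: float, capacity: float, base_delay: float,
          buffer_bits: float, duration: float) -> float:
    """Delay D after sending at rate `offered` for `duration` seconds.

    Rates are in bits/s. Below capacity there is no queue, so D is just the
    base delay. Above capacity the backlog grows with the excess rate until
    the buffer is full; after that D stops growing (excess traffic is dropped).
    """
    excess = max(0.0, offered - capacity)
    backlog = min(buffer_bits, excess * duration)
    return base_delay + backlog / capacity
```

For a link of capacity 10 with a buffer of 100 bits observed over 1 second, offering 15 adds a queueing delay of 0.5 s, and any offered rate of 110 or more gives the same maximum of 10 s.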
Multiple conversation situation

So, some control and an incentive structure must be engineered into the network so that we don't have to rely on people being "good".

1. This is very hard to do with a highly heterogeneous network.

2. The difference in path lengths is a problem.

3. Distinguishing customers problem
A single ISP might have an ecommerce company with only a few hosts, and a large campus with 1000s of hosts. Each customer might have a 45 Mbps line to the ISP. If diffserv is done by flows, then the campus would be favored. The router at the boundary of the ISP has a hard time distinguishing flows, and deciding how to treat them fairly. Move two hops in from the boundary and the distinction is completely lost.
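To put rough numbers on why per-flow treatment favors the campus, here is a small Python sketch comparing an equal split per flow with an equal split per customer on one congested link. The flow counts (10 and 1000) and the function names are hypothetical; only the 45 Mbps figure comes from the talk.

```python
def per_flow_share(capacity_mbps: float, flows: dict) -> dict:
    """Split capacity equally among flows, then total it per customer."""
    total_flows = sum(flows.values())
    return {cust: capacity_mbps * n / total_flows for cust, n in flows.items()}


def per_customer_share(capacity_mbps: float, flows: dict) -> dict:
    """Split capacity equally among customers, whatever their flow count."""
    return {cust: capacity_mbps / len(flows) for cust in flows}


# Hypothetical load on a congested 45 Mbps link.
flows = {"ecommerce": 10, "campus": 1000}
by_flow = per_flow_share(45.0, flows)          # campus gets about 44.6 Mbps
by_customer = per_customer_share(45.0, flows)  # each customer gets 22.5 Mbps
```

Under the per-flow split the ecommerce company ends up with under 0.5 Mbps, even though both customers pay for the same access line.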

Conclusion

Diffserv protects some traffic, but doesn't solve congestion, since the other classes will congest. Diffserv in fact makes things much harder because now you have congestion, plus the interaction between classes.

Trust and incentive structures will be the key to solving congestion problems via diffserv.



Internet QoS Panel
VJ, BC, JH as above

Is RSVP dead?

BC: RSVP won't be important, doesn't scale.

VJ: RSVP signalling + Int Serv might actually be possible on a large scale, but it would be incredibly expensive (economics of routers, numbers of routers). It's an economic issue. RSVP signalling is separate from Int Serv, and is a good and useful protocol for communicating what needs to be said about quality of service, network capacity, etc.

JH: The same arguments apply to the EF or "virtual leased line" that VJ is promoting.

VJ: No, they don't, because EF scales across many aggregation/de-aggregations of traffic. This is key. The information about the customer must somehow be preserved as traffic is aggregated.

RSVP has not worked for IP telephony. The problem is that RSVP requires an IP/port and that info isn't available in IP telephony until the call is set up. Once that is done, RSVP might find that there aren't adequate resources: your phone rings, then you hear a busy signal.

RSVP has been used for diffserv signalling - MS and others have used it this way. It seems to work fine.

VJ on multicast

Not inherently harder than unicast, no reason the same model shouldn't work. The question is, what do people mean by diffserv multicast? Which portion of the multicast tree? The whole thing? one branch?

VJ on economics

The model must work across many levels of traffic aggregation/de-aggregation, and must work in a de-centralized environment where there are only bi-lateral agreements. That's life. Full accounting and settlement would be incredibly expensive and cumbersome, won't ever happen.

Diffserv - what was the question?

JH: Customers don't like fixed leased lines. Too inflexible. So they won't like EF/virtual leased line.

VJ: Customers want both EF and AF services. EF doesn't have fixed endpoints like real leased lines. The major similarity is the temporal guarantee. The EF/virtual leased line should appear by all temporal performance measures to be a fixed leased line.

When you aggregate traffic you get burstiness. This is what caused early ATM switches to have too small buffers: they didn't take into account the bursty problems of coincidentally synchronized incoming lines. When you aggregate all the way up to the Internet backbone, you can no longer cope with the burstiness. It just can't be done. So something must be done to decrease burstiness, or to de-synchronize the aggregated streams. EF/VLL retimes signals to do just that. So it scales better.

BC: Confirms that there is strong customer demand for EF/VLL service for ecommerce.