This paper presents a theoretical analysis of how network protocols should respond to congestion in order to maintain good performance. If network load is graphed against throughput and latency, the goal of the work is to keep utilization close to the “knee” of the curve: the point beyond which additional load significantly increases latency while only slightly increasing throughput.
In the time-honored tradition of theory work, the authors propose a simplified system model to make their analysis tractable. Specifically:
- The network signals congestion explicitly by returning binary feedback to the clients (1 = congested, 0 = not congested)
- Clients receive feedback promptly
- All clients receive the same feedback and respond to it synchronously: every client has a chance to react to the same piece of feedback at the same time
Clients now require a scheme for how to respond to this feedback. Their basic choice is how quickly to adapt to feedback: if I sent load x at time t0 and got feedback f, what load should I send at time t0 + 1? The authors present several criteria for evaluating such a policy:
- efficiency: how closely can total network utilization approach the knee of the curve?
- fairness: do different users receive equal allocations of network capacity?
- distributedness: a practical scheme should not require that clients have global knowledge
- convergence: how quickly does the system converge to an equilibrium state (responsiveness), and how large are its oscillations around the equilibrium (smoothness)?
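The class of policies the authors study can be sketched as a linear control: the next load is an affine function of the current one, with the coefficients chosen by the feedback bit. The sketch below is my own paraphrase of that family; the parameter names (`a_i`, `b_i`, `a_d`, `b_d`) are illustrative, not the paper’s notation.

```python
def next_load(x, congested, a_i=1.0, b_i=1.0, a_d=0.0, b_d=0.5):
    """Linear control policy: x(t+1) = a + b * x(t), with (a, b) picked by feedback.

    The defaults correspond to additive increase (add a_i each step) and
    multiplicative decrease (scale by b_d when congested).
    """
    if congested:               # feedback = 1: decrease load
        return a_d + b_d * x
    else:                       # feedback = 0: increase load
        return a_i + b_i * x
```

Pure additive decrease would be `b_d=1.0` with `a_d < 0`; pure multiplicative increase would be `a_i=0.0` with `b_i > 1`. The paper’s question is which corner of this parameter space satisfies the criteria above.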
The paper describes a lovely geometric representation for efficiency and fairness, which provides a nice intuition for the remainder of the argument.
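Fairness in this setting is usually quantified with Jain’s fairness index, which (to the best of my recollection) is the metric the authors use: it is 1 when all users have equal allocations and approaches 1/n when one user holds everything. A minimal implementation:

```python
def fairness_index(loads):
    """Jain's fairness index: (sum x_i)^2 / (n * sum x_i^2).

    Returns 1.0 for a perfectly equal allocation; 1/n when a single
    user consumes all capacity.
    """
    n = len(loads)
    return sum(loads) ** 2 / (n * sum(x * x for x in loads))
```

For example, `fairness_index([1, 1, 1, 1])` is 1.0, while `fairness_index([4, 0, 0, 0])` is 0.25.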
The paper then turns to particular schemes for responding to feedback. A desirable scheme would, over time, converge toward complete utilization of the network and complete fairness between the users. The authors draw several conclusions from algebraic analysis of their system model:
- The decrease policy should be multiplicative, to ensure that fairness is either increased or remains the same after a decrease
- An additive increase policy ensures the quickest convergence to fairness
- For convergence to efficiency, increasing responsiveness (shortening the time to convergence) necessarily decreases smoothness (widens the oscillations around the equilibrium point)
- Non-linear increase and decrease policies are unattractive for practical purposes, because they are highly sensitive to their input parameters
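The intuition behind the first two conclusions is easy to check numerically. In the toy simulation below (my own construction, using the paper’s synchronized-feedback model with two clients and an assumed capacity threshold), a multiplicative decrease shrinks the gap between unequal clients at every congestion event, whereas an additive decrease preserves the gap exactly, so the system never becomes fairer:

```python
def simulate(x1, x2, capacity, steps, decrease):
    """Two synchronized clients; binary feedback = (total load > capacity).

    Increase is additive (+1 per client); `decrease` maps a client's load
    to its reduced value when congestion is signaled.
    """
    for _ in range(steps):
        if x1 + x2 > capacity:              # congested: both decrease
            x1, x2 = decrease(x1), decrease(x2)
        else:                               # underloaded: both increase
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2

# Multiplicative decrease: the gap halves at each congestion event.
aimd = simulate(1.0, 10.0, 12.0, 200, lambda x: 0.5 * x)

# Additive decrease: the gap between the clients never changes.
aiad = simulate(1.0, 10.0, 12.0, 200, lambda x: max(x - 2.0, 0.0))
```

After 200 steps the AIMD clients are nearly equal, while the additive-decrease clients still differ by their initial gap of 9, illustrating why the decrease must be multiplicative for fairness to improve.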
As a piece of theoretical work, I liked this paper—the visual representation of efficiency and fairness is particularly nice. My main concern is with their simplifying assumptions: the conclusions of the paper are hard to trust in practice, because their system model is quite simplified. For example, would their result that an additive decrease policy never converges to fairness actually hold true in a realistic network?
The fairness analysis is of limited value, because it assumes that users are not malicious and that they all follow the same policy for responding to congestion feedback. Ensuring fair use of network resources in a heterogeneous environment with malicious users seems like a much more challenging problem; I’m not sure how relevant the paper’s results would be to solving that harder challenge.
I wonder whether it is always ideal for systems to aim for the “knee” of the curve, rather than the “cliff” (the point of maximum throughput, irrespective of latency). For protocols that are not latency-sensitive (e.g. bulk data transfer), moving closer to the cliff might be more desirable than remaining at the knee.
I’m not sure this paper is worth keeping in the syllabus: it provides some intuition for the “additive increase, multiplicative decrease” policy, but I’m not sure how much I can trust the results in practice, and I didn’t see many general principles that would be useful in other domains.