This paper describes SEATTLE, a network architecture that tries to combine the simplicity and manageability of Ethernet with the scalability of IP. The basic approach is to identify all the situations in which Ethernet, ARP, and DHCP use flooding or broadcast, and to replace them with more scalable protocols based on a DHT with consistent hashing and unicast messaging.
Ethernet is simple to manage because MAC addresses are pure identifiers: they do not encode a location (unlike IP addresses, whose hierarchical structure does encode location). This makes Ethernet “plug-and-play” and simplifies network reconfiguration. The disadvantage is that Ethernet was not designed for large networks (“broadcast domains”), and hence relies on flooding and broadcasting to learn information about the network:
- Each Ethernet bridge holds a forwarding table mapping MAC addresses to outgoing ports. If the bridge sees a destination MAC address that is not in its table, it floods the network (sending the packet on all outgoing ports). Furthermore, the size of the forwarding table grows linearly with the size of the broadcast domain.
- If an Ethernet broadcast domain is composed of multiple bridges, the bridges are arranged into a spanning tree. This means that packets “routed” through the tree don’t necessarily follow the shortest path; they also can’t choose alternate paths to improve scalability or reliability.
- ARP is used to resolve IP addresses into MAC addresses. It does this by broadcasting, which scales poorly as the size of the broadcast domain increases.
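The “learn and flood” behavior in the first bullet can be sketched in a few lines (this is my own illustrative sketch, not code from the paper; the class and method names are made up). Note how the table acquires one entry per host seen, and how a miss forces a flood:

```python
class LearningBridge:
    """Toy model of a transparent Ethernet bridge (illustrative only)."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.table = {}  # MAC address -> outgoing port; grows with the broadcast domain

    def handle_frame(self, src_mac, dst_mac, in_port):
        # Learn the sender's location from the port the frame arrived on.
        self.table[src_mac] = in_port
        if dst_mac in self.table:
            return [self.table[dst_mac]]        # known destination: one port
        return sorted(self.ports - {in_port})   # unknown destination: flood


bridge = LearningBridge(ports=[1, 2, 3])
bridge.handle_frame("aa", "bb", in_port=1)  # "bb" unknown -> floods ports 2 and 3
bridge.handle_frame("bb", "aa", in_port=2)  # "aa" was learned -> forwards on port 1 only
```

SEATTLE's goal is precisely to eliminate the flood on the last line of `handle_frame` and to keep per-switch state from scaling with the whole domain.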
The poor scalability of Ethernet becomes increasingly important as datacenter networks grow to hundreds of thousands of hosts. IP solves many of these problems: for example, it uses shortest-path routing, and allows smaller routing tables (based on IP prefixes and subnetting rather than a flat namespace). However, IP is much harder to administer; for example, hosts must be arranged into hierarchical subnets, and support for mobile hosts is limited (especially if continuity of service for mobile nodes is desired).
Therefore, SEATTLE tries to eliminate the scalability problems of Ethernet while retaining its administrative advantages. In SEATTLE, information about switch topology is replicated to every switch using a link-state protocol, which also enables shortest-path routing. Replicating switch topology is sensible, because it changes much less frequently than the locations of individual end hosts. SEATTLE also defines a network-level DHT (that is, each switch in the network is also a DHT node). This DHT is used to resolve the MAC address associated with an IP address, and the physical address/location associated with a MAC address. SEATTLE modifies ARP so that an ARP request can be satisfied without broadcasting (by doing a lookup in the DHT); it also extends ARP to return both the MAC address associated with the IP and the physical address associated with that MAC address, which avoids the need for a second DHT query.
Switches aggressively cache DHT lookups; the paper argues that because this caching scheme is “reactive” and traffic-driven, and each host typically communicates with only a small set of other hosts (perhaps a debatable assumption), it requires switches to maintain much less state than a “proactive” distribution scheme. Of course, the difficulty with caching is invalidation, which SEATTLE must handle in order to support host mobility.
Rather than supporting network-wide broadcasts, SEATTLE allows administrators to define “groups”, which are essentially virtual broadcast domains. Because this scheme is layered on top of the DHT architecture, group membership is flexible. SEATTLE is evaluated with both simulations and a prototype implementation.
I really liked this paper: it simultaneously explains the existing architecture, critiques it, and sketches a clean-slate redesign while remaining coherent. I like the proposed clean-slate design, and I buy the motivation.
One potential problem is locality: if the resolver for an IP or MAC address is located far from the clients querying that data, lookups become relatively expensive, especially in a WAN environment. The paper’s proposed solution seems like a band-aid: the authors suggest using a multi-level DHT, but they don’t describe how the levels would be configured. Presumably this would be left to network administrators, which would give up much of the easy-maintainability advantage of SEATTLE in the first place. Perhaps there is DHT technology that automatically clusters data items close to the sources of queries for that data?