“DNS Performance and the Effectiveness of Caching” follows the networking tradition of collecting a great data set, and then using that data to draw some interesting and non-obvious conclusions about network behavior (see the BGP misconfiguration paper for another example of this approach).
This work is based on three separate network traces (two collected at MIT, one at KAIST in Korea). The authors recorded all outgoing DNS queries and received DNS responses, as well as all TCP connection start (SYN) and end (FIN, RST) packets for TCP flows originated inside the studied network. Accounting only for outgoing TCP flows means they missed network services that perform DNS lookups in response to client activity (e.g. spam detection that does DNS resolution), but they found that such lookups accounted for only 10% of all DNS lookups.
In addition to their trace data, the authors performed some simulations based on the traces to measure the impact of changing various parameters (changes to average TTL, and the effects of shared caches).
The authors argue that decreasing the TTL of DNS address (“A”) records wouldn’t have much impact on the overall effectiveness of caching, due to the Zipfian distribution of host name lookups and the fact that cache hits for A records are tightly clustered in time. The key observation is that if the inter-reference interval for a record exceeds the record’s TTL, caching is ineffective for that record. Popular host names are accessed frequently enough that even a small TTL is sufficient, while unpopular host names are accessed so infrequently that caching is ineffective even with a large TTL. As a result, TTL values of a few minutes are enough to obtain most of the benefits of caching. Low TTLs are useful because they allow changes to propagate quickly; perhaps more importantly, they enable new functionality, like DNS-based server selection and DNS support for mobile computing.
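This inter-reference-interval argument is easy to illustrate with a toy simulation. The sketch below is my own model, not the paper’s trace-driven methodology: lookups arrive as a Poisson process, a fixed fraction target never-repeated names (standing in for the unpopular tail, which can never produce a hit), and the rest follow a Zipf-like distribution over a small popular set. The function name and all parameters (rate, popularity skew, one-time fraction) are assumptions chosen for illustration.

```python
import random
from itertools import accumulate

def hit_rate_for_ttl(ttl, rate=5.0, num_popular=100, one_time_frac=0.5,
                     num_lookups=50000, alpha=1.0, seed=0):
    """Estimated hit rate of a TTL-based DNS cache (toy model, not the
    paper's traces): Poisson lookup arrivals at `rate` queries/sec; a
    fixed fraction hit never-repeated names, the rest a Zipf-like
    distribution over a small popular set."""
    rng = random.Random(seed)
    # Cumulative Zipf weights over the popular names, for fast sampling.
    cum = list(accumulate((i + 1) ** -alpha for i in range(num_popular)))
    popular = range(num_popular)
    expires = {}                      # name -> time its cached record expires
    t, hits, next_unique = 0.0, 0, num_popular
    for _ in range(num_lookups):
        t += rng.expovariate(rate)    # next lookup arrival time
        if rng.random() < one_time_frac:
            # A fresh, never-seen name: guaranteed cache miss.
            name, next_unique = next_unique, next_unique + 1
        else:
            name = rng.choices(popular, cum_weights=cum)[0]
        if expires.get(name, 0.0) > t:
            hits += 1                 # cached record still within its TTL
        else:
            expires[name] = t + ttl   # miss: resolve and cache
    return hits / num_lookups
```

Sweeping `ttl` over, say, 60, 300, 3600, and 86400 seconds shows the qualitative shape the paper reports: the hit rate climbs steeply up to TTLs of a few minutes, then flattens near a ceiling set by the one-time names, which no TTL can help.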
For similar reasons, the authors found that sharing a DNS cache among a large body of clients has only marginal utility once the number of clients grows beyond 10 or 20: the hit rate of a cache shared among ~1300 clients was only slightly higher than that of a cache shared among 20 clients.
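The same kind of toy model suggests why sharing saturates so quickly. In the sketch below (my own illustrative assumptions, not the paper’s data), each client issues Poisson lookups; since all clients draw from the same popularity distribution, aggregating clients simply scales the arrival rate seen by the shared cache. A fixed fraction of lookups go to never-repeated names, which cap the achievable hit rate no matter how many clients share.

```python
import random
from itertools import accumulate

def shared_hit_rate(num_clients, ttl=300.0, per_client_rate=0.05,
                    num_popular=100, one_time_frac=0.5,
                    num_lookups=50000, alpha=1.0, seed=0):
    """Hit rate of one TTL cache shared by `num_clients` clients
    (illustrative model, not the paper's traces). All parameters are
    assumptions; aggregating clients just scales the lookup rate."""
    rng = random.Random(seed)
    cum = list(accumulate((i + 1) ** -alpha for i in range(num_popular)))
    popular = range(num_popular)
    expires = {}                      # name -> cached-record expiry time
    rate = num_clients * per_client_rate
    t, hits, next_unique = 0.0, 0, num_popular
    for _ in range(num_lookups):
        t += rng.expovariate(rate)    # aggregate Poisson arrivals
        if rng.random() < one_time_frac:
            # Never-repeated name: can never hit, regardless of sharing.
            name, next_unique = next_unique, next_unique + 1
        else:
            name = rng.choices(popular, cum_weights=cum)[0]
        if expires.get(name, 0.0) > t:
            hits += 1
        else:
            expires[name] = t + ttl
    return hits / num_lookups
```

Comparing 1, 20, and ~1300 clients in this model shows most of the gain arriving by a few tens of clients: by then the popular names’ inter-reference intervals are already well inside the TTL, and the remaining misses come from the unpopular tail, which extra clients don’t fix.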
In contrast, the authors found that caching for name server (“NS”) records was important, because NS records change rarely, and effective NS caching reduces traffic on the root name servers.
The paper argues that:
“It is likely that DNS behavior is closely linked to Web traffic patterns, since most wide-area traffic is Web-related and Web connections are usually preceded by DNS lookups.”
Given streaming video and peer-to-peer file sharing services like BitTorrent, it is probably no longer true that, by bandwidth, “most wide-area traffic is Web-related.” (This may still be true of flows, however.)
I thought this paper was interesting enough, but compared to other papers we’ve covered this semester, I didn’t get that much out of it: the main conclusions about the effectiveness of TTLs could be summarized without reading the entire paper. That said, it does present a good case study of how to do research driven by empirical data.