Too Good To Be Believed

In the (excellent) Sinfonia SOSP ’07 paper, the authors compare a group communication system (GCS) built using Sinfonia with the open source Spread GCS. Although I like the Sinfonia paper a lot, I thought this evaluation was actually detrimental to the paper. The authors present several graphs comparing the performance of SinfoniaGCS with Spread, such as this one:

[Figure: Performance comparison of Spread and SinfoniaGCS]

Clearly, SinfoniaGCS vastly outperforms Spread in this configuration. At first glance, this might seem like a great experiment: the authors have demonstrated that you can use Sinfonia to build a high-performance GCS, right?

To me, including this evaluation in the paper is not helpful, because it raises more questions than it answers. There are two possibilities:

  1. Spread’s performance is truly terrible. In that case, what are we to learn from comparing the performance of SinfoniaGCS with a worst-in-class alternative?
  2. Spread is misconfigured. The paper notes that Spread wasn’t configured to use IP broadcast or multicast, and that SinfoniaGCS was allowed to batch together 128 messages at a time; either change could have a huge performance impact. Again, there is little to learn from an apples-to-oranges comparison between a carefully tuned Sinfonia system and a misconfigured Spread system.
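
To get a feel for why batching alone could skew the comparison, here is a toy cost model. It is only a sketch: the function name, the fixed per-send overhead, and the per-message cost are all invented for illustration. None of these numbers come from the Sinfonia paper or from measurements of Spread.

```python
def throughput(batch_size, per_send_overhead_us, per_message_cost_us):
    """Messages per second when batch_size messages share one send.

    Hypothetical model: every send pays a fixed overhead (system call,
    packet headers, protocol round) plus a small cost per message.
    """
    time_per_batch_us = per_send_overhead_us + batch_size * per_message_cost_us
    return batch_size * 1_000_000 / time_per_batch_us


# Assumed costs, chosen only to illustrate the shape of the effect.
PER_SEND_OVERHEAD_US = 100.0
PER_MESSAGE_COST_US = 1.0

unbatched = throughput(1, PER_SEND_OVERHEAD_US, PER_MESSAGE_COST_US)
batched = throughput(128, PER_SEND_OVERHEAD_US, PER_MESSAGE_COST_US)
speedup = batched / unbatched

print(f"unbatched: {unbatched:,.0f} msgs/s")
print(f"batch of 128: {batched:,.0f} msgs/s")
print(f"speedup from batching alone: {speedup:.1f}x")
```

Under these made-up costs, batching by itself accounts for a large multiple in throughput before any difference in system design comes into play, which is why a comparison where only one side batches is hard to interpret.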

A convincing performance study would show that by using the Sinfonia infrastructure, one can build a GCS that approaches the performance of an optimized GCS written from scratch (and perhaps that using Sinfonia leads to a smaller or simpler GCS implementation). Alternatively, if using Sinfonia really does allow dramatically better performance, the reasons for the difference should be explored. Sinfonia is not magic, and had the authors identified why traditional GCS designs perform poorly on modern datacenter networks, that would have been an interesting result.

Instead, the authors merely speculate that using IP broadcast/multicast would improve Spread performance and leave it at that. Unfortunately, the result is a performance study from which we can learn very little.
