benchmark results for distcc

The following table shows the effect of distcc on several real-world compilation tasks; in this case, compiling various open-source projects. In each case, we compiled in four modes:

  1. local_01: all compilation done locally
  2. dist_h38_j40: using distcc with 38 compilation servers (152 CPUs), and make -j40, which gives 40 parallel compilations. distcc is used in non-pump mode (local preprocessing).
  3. pump_h38_j40: like dist_h38_j40, but using distcc in pump (remote preprocessing) mode.
  4. pump_h38_j80: like pump_h38_j40, but using make -j80 instead of make -j40.

The client machine was a 2.8GHz Intel Pentium 4 machine with 2 GB of memory, running Linux 2.6.18.5 (modified Ubuntu 6.06). The server machines were all 4-CPU 2GHz AMD Dual Core Processor 270 machines with 2 GB of memory, running Linux 2.6.18.5 (modified Ubuntu 6.06).

Each test was run 5 times. We report the average time over the 5 runs, as well as the standard deviation. In order to minimize disk caching effects on walltime results, we read all the files in each project's tarball before building.
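
The measurement procedure above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the harness used for these benchmarks: the helper names (warm_file_cache, time_command, summarize) are hypothetical, and the use of the sample standard deviation is a guess, since the text does not say which estimator was used.

```python
import statistics
import subprocess
import time
from pathlib import Path


def warm_file_cache(root):
    """Read every regular file under root so later reads can come from the page cache.

    Returns the total number of bytes read. (Hypothetical helper.)
    """
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file():
            total += len(path.read_bytes())
    return total


def time_command(cmd, cwd=None):
    """Return the wall-clock seconds taken by one run of cmd (a list of arguments)."""
    start = time.perf_counter()
    subprocess.run(cmd, cwd=cwd, check=True)
    return time.perf_counter() - start


def summarize(samples):
    """Return (mean, sample standard deviation) of a list of timings.

    Assumption: the sample (n - 1) estimator; the report does not specify one.
    """
    return statistics.mean(samples), statistics.stdev(samples)
```

For example, with five made-up timings, summarize([10.0, 12.0, 11.0, 13.0, 9.0]) gives a mean of 11.0 and a standard deviation of about 1.58.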

Project        Mode          Wall time   Std dev   CPU time   CPU %

binutils-2.18  local_01         125.8s      4.0s     102.9s   81.9%
binutils-2.18  dist_h38_j40      35.4s      1.0s      25.6s   72.2%
binutils-2.18  pump_h38_j40      28.7s      1.5s      16.6s   57.7%
binutils-2.18  pump_h38_j80      29.8s      1.5s      16.6s   55.9%

glibc-2.6      local_01         946.4s      5.1s     568.7s   60.1%
glibc-2.6      dist_h38_j40     534.0s      7.0s     390.5s   74.7%
glibc-2.6      pump_h38_j40     357.8s      6.5s     241.6s   67.5%
glibc-2.6      pump_h38_j80     365.4s      4.2s     243.0s   66.6%

hello-2.1.1    local_01           0.7s      0.1s       0.4s   62.8%
hello-2.1.1    dist_h38_j40       0.9s      0.2s       0.3s   40.3%
hello-2.1.1    pump_h38_j40       1.1s      0.1s       0.4s   31.3%
hello-2.1.1    pump_h38_j80       1.1s      0.0s       0.4s   31.6%

httpd-2.0.43   local_01         139.8s      1.4s     117.2s   83.9%
httpd-2.0.43   dist_h38_j40      85.1s      2.1s      51.3s   60.4%
httpd-2.0.43   pump_h38_j40      80.1s      2.0s      33.7s   42.1%
httpd-2.0.43   pump_h38_j80      81.7s      4.1s      33.7s   41.4%

linux-2.6.25   local_01         818.3s      7.9s     543.9s   66.5%
linux-2.6.25   dist_h38_j40     203.2s      1.7s     185.3s   91.2%
linux-2.6.25   pump_h38_j40     134.9s      5.3s      96.5s   71.6%
linux-2.6.25   pump_h38_j80     135.6s      4.3s      97.7s   72.1%

samba-3.0.20   local_01         314.0s      2.4s     258.8s   82.4%
samba-3.0.20   dist_h38_j40     101.8s      0.6s      94.0s   92.3%
samba-3.0.20   pump_h38_j40      31.4s      2.4s      21.1s   67.4%
samba-3.0.20   pump_h38_j80      31.8s      1.7s      21.2s   66.9%
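
To compare modes directly, the speedup of one mode over another is the ratio of their wall times. The sketch below copies the mean wall and CPU times from the table; the names RESULTS, speedup, and cpu_percent are illustrative only. The cpu_percent helper assumes that the cpu % column is roughly CPU time divided by wall time, which the report does not state explicitly.

```python
# Mean (wall seconds, CPU seconds) per project and mode, copied from the table.
RESULTS = {
    "binutils-2.18": {"local_01": (125.8, 102.9), "dist_h38_j40": (35.4, 25.6),
                      "pump_h38_j40": (28.7, 16.6), "pump_h38_j80": (29.8, 16.6)},
    "glibc-2.6": {"local_01": (946.4, 568.7), "dist_h38_j40": (534.0, 390.5),
                  "pump_h38_j40": (357.8, 241.6), "pump_h38_j80": (365.4, 243.0)},
    "hello-2.1.1": {"local_01": (0.7, 0.4), "dist_h38_j40": (0.9, 0.3),
                    "pump_h38_j40": (1.1, 0.4), "pump_h38_j80": (1.1, 0.4)},
    "httpd-2.0.43": {"local_01": (139.8, 117.2), "dist_h38_j40": (85.1, 51.3),
                     "pump_h38_j40": (80.1, 33.7), "pump_h38_j80": (81.7, 33.7)},
    "linux-2.6.25": {"local_01": (818.3, 543.9), "dist_h38_j40": (203.2, 185.3),
                     "pump_h38_j40": (134.9, 96.5), "pump_h38_j80": (135.6, 97.7)},
    "samba-3.0.20": {"local_01": (314.0, 258.8), "dist_h38_j40": (101.8, 94.0),
                     "pump_h38_j40": (31.4, 21.1), "pump_h38_j80": (31.8, 21.2)},
}


def speedup(project, baseline, mode):
    """Wall-time speedup of mode relative to baseline (values above 1 mean faster)."""
    return RESULTS[project][baseline][0] / RESULTS[project][mode][0]


def cpu_percent(project, mode):
    """CPU time as a percentage of wall time (assumed meaning of the cpu % column)."""
    wall, cpu = RESULTS[project][mode]
    return 100.0 * cpu / wall


if __name__ == "__main__":
    for project in RESULTS:
        print(project,
              "dist/local: %.2fx" % speedup(project, "local_01", "dist_h38_j40"),
              "pump/dist: %.2fx" % speedup(project, "dist_h38_j40", "pump_h38_j40"))
```

With these figures, pump mode is about 3.2 times faster than non-pump distcc for samba-3.0.20 and about 6.1 times faster than a local build for linux-2.6.25, while for hello-2.1.1 the ratio falls below 1 because the distributed builds are slower than the local one.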

Discussion

  • For all but the smallest projects, distcc results in a significant speedup in compilation time over compiling locally. This is because distcc is able to compile many source files in parallel. Such parallelism may not be possible for small projects such as hello, which involves only four small compilations.

  • In almost all cases, distcc's new pump mode results in a significant speedup in compilation time over non-pump mode. Sometimes, as for Samba 3.0, the speedup is more than a factor of three!

  • The parallelism of the various projects' Makefiles affects the obtained speedup significantly. Makefiles that run make sequentially in subdirectories benefit less from distcc. They will also see little added benefit from distcc's pump mode, because their sequential execution allows only a few compilations to be issued nearly simultaneously.

  • All these open-source projects are built by running configure plus make. In each case, we count only the make time, not the configure time. However, projects such as binutils run extra configuration during the make step (for binutils, the initial configure run is trivial, and the make command does more intensive configuration in various subdirectories before building). This will affect times as described above -- especially reducing the benefit of pump mode -- since the configuration steps are not run in parallel.

  • Even when pump mode does not speed up the build much, as for httpd, it still reduces the CPU burden on the local machine, making it more usable during compiles. Note, however, that this is balanced by an increased CPU burden on the server machines (about 10% in our tests), and may also require more memory on the host machine than non-pump mode does.

  • With a multi-processor client machine, the speedups would have been smaller, both for non-pump distcc over local compilation and for pump mode over non-pump distcc. Still, with four-processor client machines, distcc's pump mode is up to 2.5 times faster than its non-pump mode for large projects benchmarked at Google.

  • As a side note, while collecting benchmark results, we sometimes found that pump mode did not give the expected speedup. On analyzing the logs, we discovered the reason: distcc had encountered an error running the test in pump mode and had fallen back to plain distcc mode. This can happen for several reasons. For example, the Linux kernel needed special attention because it rewrites header files during the build.
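
The effect of sequential Makefiles and non-parallel configuration steps noted in the discussion can be illustrated with Amdahl's law, a standard model that is not part of the original analysis. In the sketch below, serial_fraction is the share of build time that cannot be distributed, and parallel_factor is how much faster the remaining part becomes; both values in the example are hypothetical and were not measured in these benchmarks.

```python
def amdahl_speedup(serial_fraction, parallel_factor):
    """Overall speedup when only the non-serial part runs parallel_factor times faster.

    serial_fraction: share of the original build time that stays sequential (0..1).
    parallel_factor: speedup applied to the rest of the build (must be positive).
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if parallel_factor <= 0:
        raise ValueError("parallel_factor must be positive")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / parallel_factor)


if __name__ == "__main__":
    # Hypothetical builds: half sequential versus one tenth sequential,
    # with the parallel part assumed to run 40 times faster.
    print("50%% serial: %.2fx" % amdahl_speedup(0.5, 40))
    print("10%% serial: %.2fx" % amdahl_speedup(0.1, 40))
```

Under this model, a build that is half sequential cannot become even twice as fast, however many servers are available, whereas a build that is one tenth sequential could reach roughly an eightfold speedup with the same assumed factor of 40.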

Copyright © 2008 Google Inc.
distcc is a trademark of Martin Pool.

Send comments to distcc(at)lists.samba.org.