Wednesday, April 12, 2006

Install investigations

There's been a reaonable amount of discussion recently regarding Solaris installation speed. Indeed, there's even some idea of what the problem is.

However, closer investigation reveals that rewriting the contents file isn't the limiting factor in a Solaris install. It's not even in the top 3, which are:
  • The time taken to uncompress the data (they're bzip2 compressed)
  • The time taken write Solaris to the disk
  • The SMF manifest import on first boot

Currently, the contents file rewrite is having a race with the pkg tools overhead for 4th place.

Why this disconnect between the obvious problem - and it is a problem - of rewriting this large file many times and its relatively minor importance to Solaris install times? After all, for a slightly trimmed full install Solaris itself accounts for 3.5G of writes, and rewriting the contents file is twice that.

The point is, though, that rewriting the contents file involves a small number of large writes, which are quick. Solaris itself is many small files, so it generates something like 100 times the number of I/O operations.

Not only that, but it's possible to tweak the package installation order to minimize the amount of rewritten data. Simply install all the small packages early and leave the big packages that bloat the contents file until last. Doing so could - in principle - reduce the contents file rewrites by an order of magnitude.

This affects zone installs as well. For zone install, the uncompress cost doesn't exist at all. And for a sparse root zone, there's no Solaris to write to disk - it's loopback mounted from the global zone. So the contents file is much more important as a limiting factor for zone creation performance. However, I've managed to halve the contents file rewrites by tweaking the package installation order. I've not got that much control over the installation order, as it seems to depend on both dependency calculations and the order that opendir() goes through /var/sadm/pkg, but even then a gain of a factor 2 was fairly easy.

This isn't to say that the management of the contents file isn't an interesting and important subject that can lead to some benefits, but the relative importance of it in install performance can easily be substantially overstated. There's other low-hanging fruit to have a go at!

No comments: