Sunday, August 02, 2015

Blank Zones

I've been playing around with various zone configurations on Tribblix. This is going beyond the normal sparse-root, whole-root, partial-root, and various other installation types, into thinking about other ways you can actually use zones to run software.

One possibility is what I'm tentatively calling a Blank zone. That is, a zone that has nothing running. Or, more precisely, just has an init process but not the normal array of miscellaneous processes that get started up by SMF in a normal boot.

You might be tempted to use 'zoneadm ready' rather than 'zoneadm boot'. This doesn't work, as you can't get into the zone:

zlogin: login allowed only to running zones (test1 is 'ready').
So you do actually need to boot the zone.

Why not simply disable the SMF services you don't need? This is fine if you still want SMF and most of the services, but SMF itself is quite a beast, and the minimal set of service dependencies is both large and extremely complex. In practice, you end up running most things just to keep the SMF dependencies happy.

Now, SMF is started by init using the following line (I've trimmed the redirections) from /etc/inittab

smf::sysinit:/lib/svc/bin/svc.startd

OK, so all we have to do is delete this entry, and we just get init. Right? Wrong! It's not quite that simple. If you try this then you get a boot failure:

INIT: Absent svc.startd entry or bad contract template.  Not starting svc.startd.
Requesting maintenance mode

In practice, this isn't fatal - the zone is still running, but apart from wondering why it's behaving like this it would be nice to have the zone boot without errors.

Looking at the source for init, it soon becomes clear what's happening. The init process is now intimately aware of SMF, so essentially it knows that its only job is to get startd running, and startd will do all the work. However, it's clear from the code that it's only looking for the smf id in the first field. So my solution here is to replace startd with an infinite sleep.

smf::sysinit:/usr/bin/sleep Inf

(As an aside, this led to illumos bug 6019, as the manpage for sleep(1) isn't correct. Using 'sleep infinite' as the manpage suggests led to other failures.)

Then, the zone boots up, and the process tree looks like this:

# ptree -z test1
10210 zsched
  10338 /sbin/init
    10343 /usr/bin/sleep Inf

To get into the zone, you just need to use zlogin. Without anything running, there aren't the normal daemons (like sshd) available for you to connect to. It's somewhat disconcerting to type 'netstat -a' and get nothing back.

For permanent services, you could run them from inittab (in the traditional way), or have an external system that creates the zones and uses zlogin to start the application. Of course, this means that you're responsible for any required system configuration and for getting any prerequisite services running.

In particular, this sort of trick works better with shared-IP zones, in which the network is configured from the global zone. With an exclusive-IP zone, all the networking would need to be set up inside the zone, and there's nothing running to do that for you.

Another thought I had was to use a replacement init. The downside to this is that the name of the init process is baked into the brand definition, so I would have to create a duplicate of each brand to run it like this. Just tweaking the inittab inside a zone is far more flexible.

It would be nice to have more flexibility. At the present time, I either have just init, or the whole of SMF. There's a whole range of potentially useful configurations between these extremes.

The other thing is to come up with a better name. Blank zone. Null zone. Something else?

No comments: