Wednesday, November 25, 2009

Is package management interesting?

My desktop workstation running Solaris 10 has almost 1600 packages installed. Many of the development systems I use have something close to 1000 packages installed. My Ubuntu (8.04, to be exact) desktop install fresh off the CD has just over 1100.

Given 1600 packages, can a sysadmin manage that, or even concisely describe what is installed and what its function is? I suspect not. It's actually so bad that I suspect most of us don't even bother.

Tools don't really help here. If anything, by giving the illusion of ease of use, they encourage the growth in the number of packages, making the underlying problem worse.

Really, though, is managing packages interesting? I submit that it's not, and that looking at the problem as one of managing packages is completely the wrong question.

Instead, we should be managing applications at a higher level of abstraction. Rather than managing the 250 packages that comprise Gnome, we need to collapse that into a single item, which we can then open up into manageable chunks. A package may be the right level of granularity for a packaging system, but it's the wrong level of granularity for an administrator.

We should be thinking of applications and function, not the detail of how packages deliver that functionality. I want to be able to go to a system and ask "What do you do?" and have it come back and say "I'm a web proxy server and mail relay"; I don't want to sift through 500 packages and try to work out which of them are relevant. If I want to set up a wiki for collaborative document editing that authenticates against my Active Directory infrastructure, then I want to phrase the requirement in those terms rather than try to work out the list of components that I need for that task.

From this, the packaging details become uninteresting. What a package contains, which packaging software is used, what the packages are called - these all matter less, because they're just internal implementation detail.

The old Solaris installer had some of this - it defined clusters and metaclusters. The implementation didn't really help much - the definition of the contents of clusters and metaclusters was poor, and there was no support for the clusters once you were managing an already installed system. Also, what you really want is a system that allows for clusters to be structured hierarchically (so you could take something like Gnome, and either manage it as a single unit, or have the option of dealing with subunits like games or libraries), and to overlap (for example, you could imagine that the Apache web server would be in a whole lot of clusters).

One might be tempted to construct packages that use dependency information to bring in the packages they need. This approach is flawed: it doesn't cleanly separate groups from packages; it doesn't allow you to omit subgroups; and it makes removal of a group excessively difficult. Software clusters need to be cleanly layered above packages in order to allow each layer to best meet its own requirements.
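As a rough illustration of that layering, here is a minimal Java sketch of clusters that nest, overlap, and sit above plain package names. Everything in it is hypothetical: the class names, the cluster names, and the package names are invented for the example and don't correspond to any real Solaris cluster definitions or packaging API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: clusters layered above packages, not a real packaging API.
public class ClusterDemo {

    /** A named group holding packages directly and, optionally, subclusters. */
    public static class Cluster {
        private final String name;
        private final Set<String> packages = new LinkedHashSet<>();
        private final List<Cluster> children = new ArrayList<>();

        public Cluster(String name, String... pkgs) {
            this.name = name;
            this.packages.addAll(Arrays.asList(pkgs));
        }

        /** Adds a subcluster and returns this cluster, to allow chaining. */
        public Cluster add(Cluster child) {
            children.add(child);
            return this;
        }

        /** All packages in this cluster, skipping any subcluster named in omit. */
        public Set<String> resolve(Set<String> omit) {
            Set<String> out = new LinkedHashSet<>(packages);
            for (Cluster c : children) {
                if (!omit.contains(c.name)) {
                    out.addAll(c.resolve(omit));
                }
            }
            return out;
        }
    }

    /** Packages safe to remove with target: those no remaining cluster still needs. */
    public static Set<String> removable(Cluster target, List<Cluster> remaining) {
        Set<String> none = Collections.emptySet();
        Set<String> out = target.resolve(none);
        for (Cluster c : remaining) {
            out.removeAll(c.resolve(none));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hierarchy: gnome contains two subclusters.
        Cluster gnome = new Cluster("gnome", "gnome-core")
            .add(new Cluster("gnome-games", "gnome-mines"))
            .add(new Cluster("gnome-libs", "libxml"));
        // Overlap: libxml also belongs to the webserver cluster.
        Cluster web = new Cluster("webserver", "apache", "libxml");

        // Install gnome without the games subcluster.
        System.out.println(gnome.resolve(Collections.singleton("gnome-games")));
        // prints [gnome-core, libxml]

        // Remove gnome while webserver stays: libxml must be kept.
        System.out.println(removable(gnome, Arrays.asList(web)));
        // prints [gnome-core, gnome-mines]
    }
}
```

Because the grouping lives outside the packages themselves, omitting a subgroup or removing a whole group becomes a simple set operation rather than an exercise in unpicking dependency chains.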

Beyond simply delivering files (via packages) a cluster could also contain the details of how to actually take the various bits and pieces and put them together into a functioning system. In fact, the whole area of application configuration is one desperately in need of more attention.

A quick summary, then: package management shouldn't be interesting, and we need to move forward to managing applications and configuration.

Tuesday, November 17, 2009

Not so lucky any more?

The Google home page has seen some changes of late. One of these is the removal of the "I'm feeling lucky" button.

One of the things I've noticed over the past few months is that searching online has become dramatically worse. Google is increasingly failing to find useful results, and when it does find something it tends to present you with multiple instances of the same thing (the same news article syndicated to different sources, for example) rather than the more useful list of independent answers. Commonly I end up going to subsequent search pages, and often don't get to anything useful at all.

Is Google losing its touch? Has dumb search had its day? Perhaps the "feeling lucky" option was removed because it almost never works any more.

Monday, November 16, 2009

So what is wrong with SVR4 packaging, really?

So, as might be predicted I suppose, some people wilfully disregarded the thrust of my argument, and turned it into a debate over specific packaging technologies.

OK, so that brings us to the question: what is so bad about SVR4 packaging, really?

I could go on for pages. One of the reasons I found OpenSolaris attractive was the prospect of being able to fix all the bad things with Installation and Packaging in Solaris. Let's take some of the comments, though:

old and clunky

Guilty as charged. Really, it is old. It is clunky. It's been neglected and unloved. It needs fixing. Those reasons alone aren't enough to dismiss it - the key question should be whether it can actually do the job.

missing lots of features

OK, so what features does dpkg have that SVR4 packaging doesn't? That's really the comparison - versus the dpkg or rpm commands.

enabler of dim-sum patching

Completely and utterly false. The problem with Solaris patching lies fairly and squarely in the domain of patching. This could trivially be solved without any changes to tools - either deliver whole packages, or simply institute a policy saying that you can't deliver changes to a package (or related set of packages) in independent patches. This is a process problem, not something inherent to the underlying package system.




Then there are more material objections:

no repository support

Actually, SVR4 packaging does have the ability to fetch and install packages from remote locations. (A crippling limitation of IPS is that it can't do anything else.) What's wrong with this picture is the lack of repositories - blastwave aside. Wouldn't it have been easier to simply make existing packages available on a web site, without having to retool everything?

lack of dependency resolution

As the success of dpkg/apt demonstrates, having your underlying packager do all the work is neither necessary nor desirable. What that does demonstrate is the requirement for more powerful tools above the base packager. Actually, separating the fancy interface from the tools doing the low-level work is probably a really good thing - it enables compatibility of the system over time even if the higher-level tools change, and it enables innovation by allowing independent components to be developed independently.

arbitrary scripting

Actually, one of the key weaknesses of SVR4 packaging is not that it supports scripting, but that the support it offers is pretty poor. There ought to be a strong scripting framework with a well-defined environment, and predefined functionality for common tasks. Oh, and trigger scripts would be nice. Banning scripting (yet allowing it to sneak in through the back door) fails to solve the problem, and encourages bad workarounds.

poor package names

The actual package names are pretty much immaterial. Assigning real significance to them would be a mistake. What matters is that they're unique and follow a scheme. As a user, you might want to install "php" - all that means is that the software studies the package metadata and works out what packages to really install. Actually having a package with that name isn't necessary, and probably not even desirable (it locks you into current names and prevents evolution).
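To make that concrete, here is a small Java sketch of a front-end resolving a user-facing name through metadata instead of relying on a package with that literal name. The class, the method names, and the package names (XYZphp5core and so on) are all made up for illustration; this isn't how any existing tool stores its metadata.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: user-facing names are looked up in metadata,
// so the real package names stay free to change.
public class NameResolver {
    // Metadata index: user-facing name -> real package names (all invented).
    private final Map<String, List<String>> provides = new HashMap<>();

    /** Records (or replaces) the packages that deliver a user-facing name. */
    public void register(String userName, String... pkgs) {
        provides.put(userName, new ArrayList<>(Arrays.asList(pkgs)));
    }

    /** Returns the real packages to install for a user-facing name. */
    public List<String> resolve(String userName) {
        List<String> pkgs = provides.get(userName);
        if (pkgs == null) {
            throw new IllegalArgumentException("nothing provides " + userName);
        }
        return new ArrayList<>(pkgs);
    }

    public static void main(String[] args) {
        NameResolver resolver = new NameResolver();
        resolver.register("php", "XYZphp5core", "XYZphp5apache");
        System.out.println(resolver.resolve("php"));

        // The packages can be renamed later without the user-facing name changing.
        resolver.register("php", "XYZphp6");
        System.out.println(resolver.resolve("php"));
    }
}
```

The user keeps asking for "php" throughout; only the metadata changes when the underlying packages are split, merged, or renamed.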




So, beyond the lack of any recognition in the codebase that the 21st century has arrived, and the lack of an apt-get/synaptic style front-end - both of which could fairly easily be remedied - what is really wrong with SVR4 packaging?

Friday, November 13, 2009

JKstat 0.32

Friday the 13th, and I wonder whether to hold off for a day.

But no, another JKstat release comes out.

Nothing major here, still the continuing cleanup process. The one change here is better exception handling, particularly in Client-Server mode. It's not perfect, by any means, but in the past I just dropped errors and exceptions straight in the bin.

The reason for doing so was that I wasn't keen on declaring Exceptions to be thrown - it seriously clutters up the whole API. But Fabrice Bacchella (thanks!) pointed out the obvious, that if I were to throw a RuntimeException then I wouldn't have to declare that I threw it. So the fix is to create a subclass of RuntimeException and throw that, and consumers that need to know can check for that (and can check for the specific failure - RuntimeException is far too generic).
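For anyone who hasn't used the trick, a minimal sketch of the pattern looks like this. The names are invented for the example (StatsException and readValue are not the actual JKstat classes or methods); the point is just that an unchecked exception needs no throws clause, yet callers who care can still catch the specific type.

```java
// Hypothetical names throughout: this is not the actual JKstat API.
public class UncheckedDemo {

    /** Unchecked exception, so methods that throw it need no throws clause. */
    public static class StatsException extends RuntimeException {
        private static final long serialVersionUID = 1L;

        public StatsException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** No "throws" declaration is needed, so the method signature stays clean. */
    public static long readValue(String raw) {
        try {
            return Long.parseLong(raw);
        } catch (NumberFormatException e) {
            // Wrap the failure rather than silently dropping it.
            throw new StatsException("bad statistic value: " + raw, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readValue("42"));
        try {
            readValue("oops");
        } catch (StatsException e) {
            // Consumers that need to know can catch the specific subclass.
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Catching the subclass rather than RuntimeException itself keeps unrelated failures, such as a stray NullPointerException, from being swallowed by the same handler.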

Tuesday, November 10, 2009

Into the sunset?

The announcement to EOL Solaris Express Community Edition (SXCE) was telegraphed well in advance, and we're coming to the end of the road with only a handful of planned releases left to look forward to.

But, is this just the end of the road for SXCE, or is it something bigger that's at stake here?

Read the Sun marketing and you might believe this is a glorious new dawn for the Solaris/OpenSolaris world. The reality may be more like sailing off into the sunset and disappearing from view.

The fundamental difference between the old and the new is installation and packaging, which have been ripped out wholesale and replaced with incompatible alternatives. Even if the replacements had been perfect (and, quite frankly, they fall a huge distance short) this would have been a huge challenge. Organisations (and individuals) are under huge pressure to retrench and consolidate. Adding additional technologies that they're expected to support is an uphill battle. Adding brand new (and essentially untested) technologies that they're going to have to learn from scratch makes it doubly hard.

If the next version of Solaris had been based around SXCE, with traditional deployment technologies - traditional packaging and jumpstart - then customers would have been able to start rolling it out tomorrow. Everything a customer knew, all their existing investment in skills and tools, would be preserved. New customers would be able to leverage the skills and expertise of existing customers. All the great features and functionality present in OpenSolaris would be there to be taken advantage of.

Contrast that with the planned OpenSolaris transition. You have to retrain all your staff, replace your entire toolset, rebuild your entire systems deployment and administration infrastructure. Most environments are heterogeneous, so this means you now have an entire extra set of infrastructure to support - you aren't going to transition everything to the new scheme immediately, so you're going to have to shoulder the burden of supporting the extra scheme in parallel for years. Isn't the most likely course of action for a cash-strapped IT department with a CIO breathing down their neck to simply reject that and migrate everything over to RHEL?

Solaris and OpenSolaris contain fantastic technologies that make them a great choice for IT departments - ZFS, Crossbow, CIFS support, zones (especially sparse root zones), DTrace, and many more - but by making deployment such an unattractive proposition we're making it far less likely that customers will try or use these technologies, and are giving organisations and managers every excuse to ignore Solaris and OpenSolaris as an option.

The best thing Oracle could do for Solaris and OpenSolaris would be to scrap the OpenSolaris distribution (but not the rest of OpenSolaris) and redirect our energies into building a better Solaris based on SXCE. If not, then I fear that Solaris will ride off into the sunset and be consigned to the wastebasket of superior technologies that failed due to bad strategic decisions, and that's a prospect that truly saddens me.

Sunday, November 08, 2009

Tinkering with the SVR4 packaging source

Take one look at the SVR4 packaging source in Solaris/OpenSolaris and it's clear that it's suffering from serious neglect. It's old. It's fragile. It's had stuff bolted onto the side over and over until it's a wonder that it works at all.

Yet, it's survived for the best part of two decades and every Solaris system uses it.

I recently had to fix up some packages on Solaris 10 that were in the deadly embrace of webstart. If you tried to remove them with prodreg, it said "oh no, you must use pkgrm", yet when you tried to use pkgrm it said "oh no, you must use prodreg".

So, we have access to the source, right? And, while I could have gone in and wiped out the errant packages by hand, I had a look at the SVR4 source to see if I could put together a version that actually did what it was told.

This was pretty easy. It took a little effort to get enough together that would compile cleanly on Solaris 10 again. And, having done that, and solved the initial problem, I did a bit more tinkering to remove some of the more obviously redundant code and apply some of the performance improvements that are sorely needed.

After thinking up a meaningless acronym, I've made the code I have available here: SPRATE. You're free to use it; anyone feeling masochistic enough to work on it is free to contact me.