Wednesday, October 10, 2007

Solaris on a Laptop

Every so often, things go without a hitch.

Today I got a laptop to look at. It's an old Dell D600, and the requirement was to have something looking like our production Unix environment on it. One of my colleagues had already stuck a Solaris DVD in and got Solaris installed, and then handed it over.

First job was to get the network working. That's easy - off to the Broadcom download site to pick up the BCME driver. (Oddly, it's in the server section of the downloads.) That installed, I was on the wired network.

Next for wireless. A quick look and this machine seemed to have a Broadcom 4320 wireless chip. So off to the OpenSolaris Laptop Community, and it looks as though I need the ndis wrapper. With that and the Windows driver I downloaded from Dell, I just followed the instructions. OK, ifconfig brings up the wireless.

That's a bit crude, and it would be nice to avoid too much CLI intervention. Enter wificonfig and Inetmenu and we're all set. Having given the laptop user the 'Ginetmenu' and 'Primary Administrator' profiles, it's as easy as firing up inetmenu and selecting from the list.

And both wireless and wired networks work (almost) flawlessly.

I didn't have a chance to look at WPA support, which would be needed some of the time. But I was mightily impressed by how easily everything I did try worked out.

Thursday, September 27, 2007

Jumpstart Profile Builder

The latest version of solview includes something I've been working on for a while - the ability to construct a jumpstart profile.

The aim is to get a tool to do the hard work of resolving package dependencies, so that the resulting jumpstart profile is nicely self-consistent. Doing this in the past has been very much a case of trial and error. No more!
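
As a rough illustration of the dependency-resolving idea (not how solview itself does it), here is a minimal Python sketch. It assumes the dependency information has already been read into a dictionary; the package names and the `resolve` and `profile_lines` helpers are made up for the example. Only the `install_type`, `cluster` and `package ... add` keywords are real jumpstart profile syntax.

```python
from collections import deque

def resolve(selected, depends):
    """Return the selected packages plus everything they transitively depend on."""
    resolved = set()
    queue = deque(selected)
    while queue:
        pkg = queue.popleft()
        if pkg in resolved:
            continue  # already handled, also protects against dependency cycles
        resolved.add(pkg)
        queue.extend(depends.get(pkg, ()))
    return resolved

def profile_lines(cluster, cluster_pkgs, wanted, depends):
    """Build profile lines: the metacluster plus any packages it does not already contain."""
    extra = resolve(wanted, depends) - set(cluster_pkgs)
    lines = ["install_type initial_install", f"cluster {cluster}"]
    lines += [f"package {pkg} add" for pkg in sorted(extra)]
    return lines

# Hypothetical dependency data, for illustration only.
DEPENDS = {
    "PKGapp": ["PKGlibA", "PKGlibB"],
    "PKGlibA": ["PKGbase"],
    "PKGlibB": ["PKGlibA"],
}
CLUSTER_PKGS = ["PKGbase"]  # pretend the metacluster already provides this

for line in profile_lines("SUNWCrnet", CLUSTER_PKGS, ["PKGapp"], DEPENDS):
    print(line)
```

The breadth-first walk means each package is visited once, so the result is self-consistent however tangled the dependencies are.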

I'll just show one screenshot here, of adding evolution on top of the reduced networking metacluster:



There's a lot more work that can be done here. The interface, while workable, is rather crude. I would like to get it to construct a reasonable jumpstart profile based on an installed system, and it would be nice to be able to feed it all my old profiles and get it to fix them up for me. And I'm sure there are many other ways it could be improved, or used in different ways.

So if you have any suggestions, give me a shout!

Sunday, September 09, 2007

Solview update

I've just released a new version of solview.

If you've tried to run solview in a non-global zone, then you'll find that it didn't work. (Thanks to Tony Curtis for spotting this!) This was due to an ignorant and lazy programmer. Yup, I hadn't tested it in a non-global zone for a while.

This latest update actually checks that the files it relies on to grab the information it's going to display exist, and behaves properly if they don't. I'm not saying that it works correctly in all circumstances, but it does work in the cases I've tested, whereas previous versions were rather more fragile.

JKstat update

It's been a bit quiet on the JKstat front recently (summer holidays may have had something to do with that).

I've done a little bit of cleanup I've been meaning to for a while - getting rid of fixed text strings and replacing them with resources, making localization possible.

I've also added a little chart feature - when charting a statistic, it's now possible to add other statistics from that kstat to the chart.

Nothing earth-shattering, but it still represents steady progress and so there's now a new version available for download.

Monday, September 03, 2007

X2200M2 fault diagnosis

I'm just setting up some new servers - Sun X2200M2 (running Solaris, of course).

One wasn't happy. Solaris would start to boot via PXE but the system would reset (over and over, in a loop). A quick look and the fault light was on. Not good.

These systems have remote management - ILOM - but after searching and looking through the documentation I couldn't actually see how to persuade it to tell me what was wrong. It's not helped by the fact that there are several ILOM variants in use, but the one on the X2200M2 is one of the more basic ones.

After running through all the options systematically, I stumbled across:

show /SP/AgentInfo/SEL

which tells me

Nonrecoverable ,2007/08/31 17:22:00 ,CPU1 DIMM 3 has multi-bit error

Aha! I've reseated the DIMM after swapping it with its partner, and the installation is proceeding apace.
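
If you wanted to collect these events from several machines, the line above is easy to pick apart. This is only a sketch based on the single event shown here; it assumes every line has the same "severity ,timestamp ,message" shape, which I haven't verified against other ILOM versions. The `parse_sel_line` name is made up.

```python
from datetime import datetime

def parse_sel_line(line):
    """Split a 'severity ,timestamp ,message' event line into its three parts."""
    # Split on the first two commas only, in case the message itself contains one.
    severity, stamp, message = (part.strip() for part in line.split(",", 2))
    return {
        "severity": severity,
        "time": datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S"),
        "message": message,
    }

event = parse_sel_line(
    "Nonrecoverable ,2007/08/31 17:22:00 ,CPU1 DIMM 3 has multi-bit error"
)
print(event["severity"], event["time"].isoformat(), event["message"])
```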

Saturday, September 01, 2007

W2100z update

I've just updated the BIOS on my home W2100z (downloaded from here).

The old BIOS on the machine was antique. I mean, it predates all the old versions on the download page. And I remember updating it at least once (to get PXE boot working), so it wasn't the factory version.

The update was pretty smooth. However, on booting back into Solaris I see a CPU pegged and very high interrupt rates. Not good.

Now, it turns out that I've stumbled across this before. And in response to that, Alan Hargreaves got the thermal zone driver backported to Solaris 10. So it's available in patch 125107.

Unfortunately getting hold of this patch is proving troublesome because sunsolve is broken again.

OK, so I go have a cup of coffee and this time, success. Patch applied and just save this blog before I reboot.

Tuesday, August 28, 2007

Fixed funny fonts in gpdf

If you're running Solaris 10 on x86, you may have noticed that some PDF documents (particularly, and somewhat ironically, those generated by Sun) display as garbage. Instead of readable text you get rubbish - rather like this:



What you need is the recently released patch 119813-06, which fixes bug 6375381. (That's the original bug, which shows it got fixed in nevada ages ago - 2147903 is the one that tracks Solaris 10 integration.)

So now it looks like this - a vast improvement:

Wednesday, August 22, 2007

That's a busy disk...

From iostat -xnz 1 on my desktop:

r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 3.0 0.0 10.5 18444348816.8 18444348816.8 6148914691236.1 \
6148914691236.1 1844434881679 1844434881679 c1d0

Sorry if it wraps too much.

Of course, those numbers look very suspicious. I have seen this sort of silliness several times in iostat output recently though.
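
A script consuming this output could at least flag the impossible values. Here's a minimal Python sketch, assuming the column layout of the header shown above. The `parse_iostat_line` and `suspicious` helpers are made up for the example, and the only hard rule applied is that nothing can be negative and a percentage cannot exceed 100.

```python
FIELDS = ["r/s", "w/s", "kr/s", "kw/s", "wait", "actv",
          "wsvc_t", "asvc_t", "%w", "%b"]

def parse_iostat_line(line):
    """Turn one 'iostat -xn' data line into a dict keyed by column name."""
    parts = line.split()
    row = dict(zip(FIELDS, map(float, parts[:len(FIELDS)])))
    row["device"] = parts[len(FIELDS)]
    return row

def suspicious(row):
    """Return the columns holding values that cannot be right."""
    bad = []
    for name in FIELDS:
        # Percentage columns are capped at 100; the others have no hard upper bound.
        limit = 100.0 if name.startswith("%") else float("inf")
        if row[name] < 0 or row[name] > limit:
            bad.append(name)
    return bad

# The data line from above, joined back onto one line.
SAMPLE = ("0.0 3.0 0.0 10.5 18444348816.8 18444348816.8 "
          "6148914691236.1 6148914691236.1 1844434881679 1844434881679 c1d0")

row = parse_iostat_line(SAMPLE)
print(row["device"], suspicious(row))
```

This only catches the %w and %b columns. The queue lengths and service times are just as silly, but there's no fixed limit to test them against.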

Monday, August 20, 2007

Sunsolve needs a makeover

Sunsolve is an excellent resource. I don't have any issues with the content - it's great stuff.

Where I do have problems is with its reliability and performance. I couldn't get through at all for a while over the weekend, and often have problems with my automated patch scripts - which can time out, report 500 errors, or tell me I'm unauthorized, all at random, and have been that way for a long time.

I even got the following gem off the website just now:


Come on Sun - show us how good your stuff really is by making your own sites work well!

Saturday, August 18, 2007

Fingered!

I found this nugget on indiana-discuss to be quite apt:
one of the keys to identifying old-time Unix admins, particularly of the SunOS persuasion, is their use of tcsh
Yup. That's got my number!

Saturday, July 07, 2007

Laid low :-(

I've been suffering from a fairly severe viral infection for the last week. I've had to take several days off work, and felt pretty atrocious. I'm hoping that it's starting to ease and I'll be feeling better in the next couple of days.

Sunday, June 24, 2007

And solview too

Not content with updating jkstat, I've released a minor update to solview.

New jkstat version - real progress

I've let loose a new version of JKstat, also available from the jkstat project page.

This is version 0.19, and represents real progress. I'll start with some of the fiddly stuff: I've got rid of all the individual utility scripts (they were breeding out of control) and replaced them with a single jkstat script that takes subcommands. I've removed the native methods for looking up individual statistics, as there was no possibility of error checking (such as checking whether the value returned was valid). There's a client for nfsstat and the start of one to emulate the kstat(1M) perl client. I've got a Swing widget that allows you to add a menu to any Java client from which you can launch some of the demos.

The big feature is that I can now detect changes in the underlying kstat chain. I've added another class to handle this, so it's responsible for working out whether something has changed, and which kstats have been added and removed. This wasn't quite as big a job as I had feared.
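
The bookkeeping involved is essentially a set comparison. The sketch below is in Python rather than Java and is not the JKstat code; it assumes each kstat is identified by its module, instance and name, and the `chain_changes` helper and the sample snapshots are made up for illustration.

```python
def chain_changes(old, new):
    """Compare two snapshots of kstat identifiers and report the differences."""
    old, new = set(old), set(new)
    return {
        "changed": old != new,
        "added": new - old,
        "removed": old - new,
    }

# Illustrative snapshots: (module, instance, name) triples.
before = {("cpu_stat", 0, "cpu_stat0"), ("sd", 0, "sd0")}
after = {("cpu_stat", 0, "cpu_stat0"), ("sd", 1, "sd1")}

diff = chain_changes(before, after)
print(diff["changed"], sorted(diff["added"]), sorted(diff["removed"]))
```

Keeping the comparison in one place means the clients only have to react to the added and removed sets, rather than each working out for itself what changed.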

Getting the demos to use this was somewhat more difficult. In particular, the browser took far more work than I ever expected. The kstats are in a tree, they're presented using a JTree, so you just update the tree, right? Wrong! I had originally placed the kstats into a Hashtable (it's a natural thing to do), and passed that to the JTree. Then Tom Erickson helped me out by creating a sorted Map, and a revised JTree that could take a Map. (And why some of the Java Swing components haven't moved into the 21st century and become aware of the Collections framework is beyond me.) I wasted a lot of time thinking I could update the Map and get the tree to refresh its view of the world, but it just didn't work. In the end I had to get rid of that and construct my own model, doing everything manually. Ugh.

It's not finished. Not all the demos handle updates. (Those using tables do - that was easy.) And the browser has a couple of cosmetic flaws: the newly added kstats aren't currently sorted correctly, and if you're watching a kstat that gets removed it doesn't tell you that anything has happened. But it's a major step forward and I'm going to take a deep breath and use this for a while before diving back in.

Monday, June 18, 2007

Solaris metacluster sizes

As part of the Indiana discussion, I was looking at the installed sizes of various Solaris clusters of packages. The idea is to work out where all the space is going so as to be able to trim an OpenSolaris distribution onto a single CD. Depending on compression, this might allow something just over 1G of software.

A minor extension let me do the same for the various metaclusters you can choose from in the Solaris installer. And the sizes of those on a Solaris 10 update 3 x86 system are like this:

157.945M : SUNWCmreq
168.594M : SUNWCrnet
195.372M : SUNWCreq
2010.12M : SUNWCuser
2579.84M : SUNWCprog
2614.19M : SUNWCall
2614.2M : SUNWCXall

This confirms what we've known all along, that the small clusters are tiny, the other install clusters are huge, and there's a great big gap between the two extremes. More than that, the end user cluster misses out on a lot of stuff that would be really valuable to an individual user.

(I would recommend to anyone who wants to customize the packages when installing a Solaris system - either using jumpstart or interactively - that they start with either SUNWCrnet or SUNWCall, as the others don't help you much at all.)

The snag with building a CD sized distribution is that the target size is right in the middle of the gaping chasm between a fully functional (SUNWCprog) and minimized (SUNWCrnet) layout.
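
To put numbers on that gap, here's a small Python sketch using the sizes listed above. The 1024M target is my assumption for "just over 1G", and the `bracket` helper is made up for the example.

```python
# Installed sizes in megabytes, copied from the list above.
SIZES_MB = {
    "SUNWCmreq": 157.945,
    "SUNWCrnet": 168.594,
    "SUNWCreq": 195.372,
    "SUNWCuser": 2010.12,
    "SUNWCprog": 2579.84,
    "SUNWCall": 2614.19,
    "SUNWCXall": 2614.2,
}

def bracket(target_mb, sizes):
    """Return the largest metacluster that fits and the smallest that does not."""
    below = max((size, name) for name, size in sizes.items() if size <= target_mb)
    above = min((size, name) for name, size in sizes.items() if size > target_mb)
    return below, above

TARGET_MB = 1024  # assumed target: roughly 1G of software
below, above = bracket(TARGET_MB, SIZES_MB)
print(f"largest that fits:  {below[1]} ({below[0]}M)")
print(f"smallest too large: {above[1]} ({above[0]}M)")
print(f"gap between them:   {above[0] - below[0]:.1f}M")
```

Nothing sits between SUNWCreq and SUNWCuser, so any CD-sized layout has to be built up by hand from one of the small clusters.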

Fortunately, things aren't quite as bad as they look. These numbers are for a regular Solaris system. This includes a fair amount of software that isn't freely redistributable and so couldn't go into an OpenSolaris distribution anyway, which means all the numbers shrink. Sometimes by a lot: if you start taking StarOffice or Java out then you save significant amounts of disk space.

Sunday, June 03, 2007

solview does patches

I updated solview so it now shows patches. It gets most of this for free, by looking at the installed packages. So you get the standard stuff from showrev -p, with the packages having a short description, plus it will display the readme file for the patch if it can find it. Next step is to parse the patchdiag.xref file to work out what needs updating, like pca does.
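
For comparison, pulling the same information out of showrev -p output by hand might look like the Python sketch below. It assumes each line has the shape "Patch: ... Obsoletes: ... Requires: ... Incompatibles: ... Packages: ..." with comma-separated lists. The sample lines are illustrative (the package names and the second patch ID are made up), as are the helper names, and this is not how solview gets its data.

```python
import re
from collections import defaultdict

PATCH_LINE = re.compile(
    r"^Patch:\s*(?P<patch>\S+)\s+Obsoletes:\s*(?P<obsoletes>.*?)\s*"
    r"Requires:\s*(?P<requires>.*?)\s*Incompatibles:\s*(?P<incompatibles>.*?)\s*"
    r"Packages:\s*(?P<packages>.*)$"
)

def split_list(text):
    """Split a comma-separated field, dropping empty entries."""
    return [item.strip() for item in text.split(",") if item.strip()]

def parse_showrev_line(line):
    """Return a dict for one patch line, or None if the line does not match."""
    match = PATCH_LINE.match(line.strip())
    if match is None:
        return None
    info = {"patch": match.group("patch")}
    for key in ("obsoletes", "requires", "incompatibles", "packages"):
        info[key] = split_list(match.group(key))
    return info

def patches_by_package(lines):
    """Map each package name to the list of patches applied to it."""
    result = defaultdict(list)
    for line in lines:
        info = parse_showrev_line(line)
        if info:
            for pkg in info["packages"]:
                result[pkg].append(info["patch"])
    return dict(result)

# Illustrative input; package names and the second patch ID are hypothetical.
SAMPLE = [
    "Patch: 119813-06 Obsoletes:  Requires:  Incompatibles:  Packages: PKGfontA, PKGfontB",
    "Patch: 100001-02 Obsoletes: 100000-01 Requires: 119813-06 Incompatibles:  Packages: PKGfontA",
]
print(patches_by_package(SAMPLE))
```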

Sunday, May 13, 2007

Solview, JPack

Last week I added contents file parsing to solview.

More recently, Michal Pryc announced that he was working on JPack, an easy software manager and installer written in Java.

Now, the current focus of these tools is slightly different - solview is about analysing what your system has, JPack aims to make adding new software painless.

I was planning to add package management to solview at some point, but probably won't bother now - there seems little point in duplicating the work in JPack so I could just reuse that, and hopefully I'll be able to contribute to JPack as well.

Motivated by the rather better look of the GUI in JPack, I had a little go at reorganising the layout in solview so that the package list is done as a table. It does look better. I also did some refactoring of the code - both to make it more usable by other projects (such as JPack), and to remind myself how it all worked.

So there's a new version of solview now available, with the above mentioned refactoring and layout changes, and with the slow operations done in the background with a SwingWorker.

Monday, May 07, 2007

solview - includes contents

A little bit of work this weekend and I've let out a new version of solview.

No sign of the promised jumpstart profile builder, though.

But what is present now is a contents file parser. This allows me to add a summary of the number of files and space used to each package (and to each cluster). For example:



And in addition I added a filesystem browser so you can choose a file and see what packages and clusters it's included in:



There's a slight amount of lunacy involved in parsing the contents file. Don't be surprised to see OutOfMemory errors, as the memory usage is fairly high (several hundred meg, commonly).
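
If all you want is the per-package totals, the file can be processed a line at a time so that only the running totals stay in memory. The Python sketch below is not the solview parser. It assumes regular-file entries look like "path type class mode owner group size cksum modtime pkg..." and simply skips anything shorter (directories, links and so on). The sample lines and package names are made up.

```python
from collections import defaultdict

def summarize(lines):
    """Count files and sum sizes per package from contents-style lines."""
    files = defaultdict(int)
    size = defaultdict(int)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # Assumed layout: only entries of type f, e or v carry a size field.
        if len(fields) < 10 or fields[1] not in ("f", "e", "v"):
            continue
        nbytes = int(fields[6])
        for pkg in fields[9:]:  # a file can belong to more than one package
            files[pkg] += 1
            size[pkg] += nbytes
    return dict(files), dict(size)

# Hypothetical entries, for illustration only.
SAMPLE = [
    "/opt/demo d none 0755 root bin PKGdemo",
    "/opt/demo/bin/tool f none 0555 root bin 1000 12345 1100000000 PKGdemo",
    "/opt/demo/lib/libx.so f none 0755 root bin 2500 23456 1100000000 PKGdemo PKGother",
    "/opt/demo/link=bin/tool s none PKGdemo",
]
files, size = summarize(SAMPLE)
print(files, size)

# On a real system the input would be the open file itself, for example:
#   with open("/var/sadm/install/contents") as fh:
#       files, size = summarize(fh)
```

This gives up the ability to look up which packages own a given file, which is what the filesystem browser needs, so it's a trade-off rather than a fix.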

Monday, April 23, 2007

Phoronix and Solaris

Interesting: Phoronix now supports Solaris. See the article.

Not only an endorsement of the growth of interest in Solaris, but hopefully another source of information regarding hardware compatibility for Solaris.

Sunday, April 15, 2007

JKstat 0.17 - includes charts

I've been gradually continuing work on JKstat. The latest version (0.17) has a number of code cleanups, but also includes charting capability. (Uses JFreeChart.)

You can right-click on any numerical statistic in kstatbrowser and chart that statistic. Here's an example:

Sunday, March 25, 2007

JKstat 0.15

I've made a new version of JKstat available.

There are quite a few changes underneath. I'm very grateful to Tom Erickson for sending me a bunch of enhancements and fixes. Some of them made it into this version fairly directly - particularly the sorting of kstats in the browser.

This version has also been revamped and now requires Java 5. I had been thinking about modernizing the code for a while, but had always put it off. Tom again provided the impetus, as he went ahead and used generics in the fixed version he sent back to me. I reimplemented it myself, so I could actually understand what I was doing to my code.

I actually went and rewrote solview first. This was a useful exercise in itself, as one of the advantages of using generics is that it makes you think harder about the way you define classes and how your code is structured. In solview I was in the middle of cleaning the code up to define Solaris packages and install clusters as proper objects, and using generics made sure I actually got that right.

JKstat was also accepted as an OpenSolaris project, so I'm going to be moving development over there soon.

Saturday, March 24, 2007

OpenSolaris priorities

Before the OGB election there was the OpenSolaris Community Priorities vote.

Looking at the result, the clear winner in terms of priorities is the public defect management system. Yep, I put that at number 1 too. (I think...)

The OpenSolaris Bugs interface did get revamped recently, and that was a significant improvement. However, it's still not really doing anything for OpenSolaris - it's just a way to give a limited view of a subset of Sun's bug database.

Importantly, there's no other integration with anything else the community might be doing, and no way for a community member to interact with the system. How can it be improved?

Here are a couple of things I would like to see:

Eliminate all the crud. The database is littered with all sorts of things that aren't bugs, and still contains bugs that are antique. We need to take some of the open bugs forward, but most of the database has no relevance to the current codebase.

I think this means we need a completely separate database. And bugs that Sun generate through their support organization need to be put into the public OpenSolaris bug database the same way that Sun currently logs bugs against external projects such as Gnome.

We need to allow community members to participate in the triage process, and allow comments and followups to further refine the bugs and provide sample solutions, test cases, and prioritization. This extra interaction will build a stronger community, and provide ways for more people to get involved.

And the database needs to be linked into the whole workflow, so that there's a way to see who might be working on a particular bug and contact them.

The bug database is the crucial missing link. We want to improve and develop OpenSolaris, and to do that we first need to know what's broken and what's missing.

Polls and voter apathy

I've been sitting on the fringes of the OpenSolaris Governing Board Election. I would have liked to have been more involved, but have been struggling to find time. Strike one for apathy!

(And, while I did consider putting myself forward as a candidate, the fact that I haven't been able to effectively contribute up to this stage also implies that I wouldn't be able to do the job properly in the unlikely event that I were to be elected.)

There does seem to be a fairly low turnout so far. There have been a number of calls for people to get their finger out, and we seem to be creeping closer to a quorum.

Alan Coopersmith had a look at the contributors and communities. This shows what we've known all along - that the different communities have very different levels of participation in the governance process. Looking at the way that core contributors are distributed, it's not clear to me now that relying on communities to generate the initial list was - or is - a good thing.

There seems to be a view that reorganising the communities is going to be a big task for the incoming OGB. (If one gets elected, that is.) I'm not so sure about this. I feel that we're trying to build too much officialdom into the community structure, when in reality the communities are more social structures connected by a common interest. But more on that later.

I'm fortunate that I am classed as a core contributor and, yes, I've voted.

Perhaps core contributor status should automatically be rescinded from those who haven't bothered to vote?

Thursday, March 08, 2007

Spring Clean and Upgrade

I've been doing a bit of a spring clean and upgrade on some of my home systems.

The family PC had its memory doubled. It's survived 3 years with only 512M, which ought to be enough but it's obviously running out. We use fast user switching so there are usually 4 people logged in. That shouldn't be so bad, but it implies 4 web browsers and they guzzle memory like it's going out of fashion. I also replaced the broken DVD-ROM with a working DVD writer.

I think it's the first time the case has been opened for 3 years. And it was filthy inside, so it got a thorough vacuuming.

Next step was a couple of Sun workstations. I use one for OpenSolaris development, and I shuffled that up from build 55 to build 59. I also took a gamble on rebuilding one of the children's systems from an essentially unpatched FCS version of Solaris 10 to the latest Solaris Express release - the same build 59. Only complaint so far is the placement of the tab close buttons in Firefox 2 versus Mozilla.

Next step - run Nexenta using the VMware player.

Tuesday, February 20, 2007

A new use for a D1000 array

I found a new use for a D1000 array last night.

We bought a pallet-load off eBay a while back. We only actually needed one, but the whole pallet was no more expensive. So the first one got used for the purpose for which it was bought, and I've used a couple more filled up with old drives to test ZFS and the like.

We just piled the rest up in a store-room.

Which is where they had stayed for a few weeks harmlessly gathering dust.

Last night I went in there to look for some cables, and was somewhat annoyed to find that the door wouldn't open when I tried to get out. The lock release did absolutely nothing.

What you have to understand is that this was an old comms or server room. So that's a security and fire door in my way. No windows. No phone. My mobile is on my desk. It's almost soundproof as well, and it's getting late.

I did manage to attract someone's attention as they were passing in the corridor, and we tried all sorts of tricks to get the stupid door to open, with no success.

Eventually I piled all the D1000s up and stood on top of them. (Which just goes to show that buying quality kit is always worthwhile. You try this trick with cheap kit and it will fold under the weight.) Originally I was hoping to get into the false ceiling and work my way through, but even that was a bit of a stretch and very cluttered.

Eventually we took out the panel above the door and I got out that way.

See, I knew this old kit would come in useful some day!

Sunday, February 11, 2007

Updated jkstat

It's been a while (almost a year) since I last updated jkstat, but I've just put version 0.13 up for download.

For those that don't know, jkstat is a Java JNI interface to Solaris kstats, allowing you to access a wide range of useful data held by the Solaris kernel from a Java application.

One thing I have done in this version is try to improve the look of the sample applications, most of which has been achieved by replacing the rather clumsy GridLayout with SpringLayout.

Thursday, February 08, 2007

Impressed with Xfce

I've been having a play with Xfce on one of my Solaris workstations.

And I have to say I've been very impressed. It's reasonably fast and light. It has an increasing number of applications. The file manager - Thunar - is quite nice.

There are two things that I think are in favour of Xfce. The first is that because of its common roots with GNOME it can leverage all the GNOME applications and GTK themes. The second is that it is very polished. There is a good selection of window manager themes, and the artwork is very well done.

All in all it's well worth a look.

Monday, January 22, 2007

Sun, Intel - friends?

OK, so Sun and Intel made a big fuss about some deal or other (one which doesn't really set the pulse racing, mind). What I found ironic was the ticker underneath the main feature on the Sun home page, where they put the knife into Itanium.

I remember Sun recently selling a number of Intel powered systems. The LX50 was a nice box - even if it was pushed with SunLinux. The V60x was OK, but I quite liked the V65x, and that sort of system (6 internal drives) is something that would be quite valuable. And the B200x blade was quite a marvel - twin hyperthreaded Xeons and a decent memory configuration in a tiny package.

I expect closer collaboration will improve driver support for Solaris, but I'm also hoping for a wider range of systems designed for Solaris to give us a breadth and variety of system configurations that Sun themselves seem disinclined to supply.

Sunday, January 14, 2007

Backup Blues

Taking system backups is an essential part of any IT operation, but seems to involve an awful lot of pain.

Over the years I've generally been happy with Legato (now EMC) NetWorker, sometimes known as Sun StorEdge Backup or something similar. What I really like about it is that it's very low maintenance - easy to set up and configure, and gets on with its job quietly in the background without any fuss.

I was a little concerned that the latest version has dropped the old nwadmin X-windows based GUI in favour of some Java client-server thingummy. I still don't really like the Java interface, but as I previously mentioned it's pretty low maintenance so you don't need to spend much time using it, and it's generally easy enough to use. The one downside to it is that the licensing is a bit more complicated, and the license registration instructions explicitly refer to the old GUI.

I'm currently using NetWorker for one half of my backups. I put in a new server and tape library and moved the NetWorker license across. The transition has been very successful, although the primary win is that I replaced an L20 with 2 DLT-7000 drives with a C4 with 2 LTO3 drives. The old system simply didn't have the capacity or the performance - backups would take days, and some backup runs came in at over twice the capacity of the library. Now they take hours and fit onto a couple of tapes.

The other half of my backup story is more appropriate for the title of this post. There, we're using an alternative product. OK, specifically, NetBackup. And it's much more trouble than NetWorker has ever been.

I'm trying hard to give a balanced view here. After all, NetBackup is widely used and must have some good points, right? Maybe...

The first thing is that NetBackup is very high maintenance, at least compared to NetWorker. It needs constant attention paid to it, seeming to go off and hide in a corner if you don't give it constant encouragement. I'm not surprised that large organizations have dedicated NetBackup administrators, or even teams thereof.

The second thing is that I (and I know I'm not the only one) have trouble with the way you actually define backup schedules. In NetWorker it's trivially easy - you define what you want backed up, what level of backups you want, and when to do it. In NetBackup you mess around with backup windows, frequencies, policies, and schedules. Again, this is much higher maintenance - I usually need to manually configure half a dozen screens in NetBackup as opposed to simply selecting one option under NetWorker.

The third thing is reliability. The only time I have known NetWorker fail to do a backup is if I've run out of tape. NetBackup seems to throw fits all over the place. We see regular timeouts, errors when it's outside the backup window, and occasional incidents (once or twice a month) where the whole thing freezes up and needs a forceful restart. I've just upgraded the NetBackup server itself, and after tuning got acceptable performance (the default performance was dreadful), but the catalog backup takes over a day for something that ought to take 20 minutes, and renders the machine completely unresponsive while it's at it.

I'm tempted to replace the NetBackup installation with NetWorker, which raises a number of issues. I'm happy that it will be fairly painless on the Unix side, but I'm less sure about the Windows side which also needs backing up. While I have run NetWorker on Windows, it became apparent fairly quickly that it was basically a Unix product and looked to be in alien territory. And my experience of NetWorker's application modules (as needed for databases, Exchange and the like) hasn't been entirely positive. So I'm thinking of a completely separate system for the Windows backups.

I've tested a number of other backup products, but under Unix. And there it was obvious that most of the Windows products looked completely lost under Unix. I haven't really come across a truly cross-platform backup solution, and it's not obvious to me that such a thing actually exists.

Given how conceptually simple backups are, is it asking too much to have solutions that just work?