When you book an airline ticket, you use FOSS

This article on ReadWriteWeb really caught my eye today.

From my previous life in data and telecoms I know a little of the scale of the Sabre network. It’s BIG. By the sounds of things most of it runs on Open Source software too. They have announced a partnership with a commercial Open Source vendor Progress to use a number of their FUSE Open Source products.

By default, Sabre only chooses off-the-shelf software as its last option, when no open-source solution is available. If there is neither an open-source nor an off-the-shelf solution, Sabre’s own technology team will provide an in-house solution.

Sabre, as Progress’s Debbie Moynihan proudly pointed out to us, can’t afford any downtime – and FUSE’s Supplier-Side Gateway, which currently handles about 1.5 million transactions a day, has now run on Sabre’s system for 14 months without any error.

Besides FUSE’s offerings, which are based on Apache products, Sabre also extensively uses Apache’s web server, MySQL, Hibernate, Terracotta and a number of other open source products. Also, two-thirds of Sabre’s 5000 servers currently run Linux and the company expects to expand this number over time.

Nice figures. Good story.

When I hear about really massive and important networks that simply can’t go down using FOSS because it works, and works well, I really wonder why uptake across the whole enterprise space is so shockingly small in comparison. And then I remember why I think it is so.

The Huge Marketing Budgets of one or two proprietary vendors. But you know what? I think the times they are a-changin’…

Bash/Scripting Snippets

Over the last week or so I have been planning, preparing and then doing a migration of the OS on my home server. It used to run a custom-built Linux OS based on Linux From Scratch. Whilst this has been very reliable and fast, I have always been niggled by the maintenance, upgrade and support issues that are part-and-parcel of a home-brew solution with no package management. In Linux From Scratch’s defence, it isn’t really meant to be a distro. It is a fantastic educational tool. Many people do use LFS for their desktops and servers, however.

Anyway, my server is now running Ubuntu Server 8.10 and during the whole migration process I found it necessary to use a few little scripts to keep my sanity and help me do things a bit quicker.

Both as a record for me and for anyone else, here are a few snippets which I found useful.

First comes rsync for backup/restore.

rsync is an open source utility that provides fast incremental file transfer.

I guess many of you will have used this before; I had not. It is now an invaluable tool in my arsenal. A simple line like this:

sudo rsync -azvv --progress -e ssh $SDIR $HOST:.$DDIR

can be used in a script and iterated in a loop if need be. $SDIR is the source directory, $HOST is the host machine and $DDIR is the destination directory. I had some very large files and didn’t want to do the whole backup in one go, so this was a useful way to step through specific directories. If you want to exclude any sub-directories or files, simply add switches like --exclude 'backups' --exclude 'shared_drive' after the $SDIR. The exclude path is relative to the $SDIR.
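As a rough sketch of the loop described above: the host name backuphost, the /backup destination, the DRY_RUN switch and the two function names are all made up for illustration and were not part of my original script. It reuses the same rsync switches, adds the two excludes and stops at the first directory that fails.

```shell
# Sketch only: HOST, DDIR, DRY_RUN and the function names are hypothetical.
HOST="${HOST:-backuphost}"
DDIR="${DDIR:-/backup}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

sync_dir() {
    # $1 is the source directory ($SDIR in the one-liner above).
    if [ "$DRY_RUN" = "1" ]; then run="echo"; else run=""; fi
    # Same switches as the one-liner; the excludes are relative to $1.
    $run sudo rsync -azvv --progress -e ssh \
        --exclude 'backups' --exclude 'shared_drive' \
        "$1" "$HOST:.$DDIR"
}

sync_all() {
    # Step through each directory in turn and stop at the first failure.
    for sdir in "$@"; do
        sync_dir "$sdir" || { echo "rsync failed for $sdir" >&2; return 1; }
    done
}
```

With DRY_RUN left at 1, something like sync_all /home /etc only prints the rsync commands it would run, which is a cheap way to check the paths before setting DRY_RUN=0.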

Using rsync like this over SSH meant I needed no rsync daemon running on either machine. For a cron job, you would probably want to drop the --progress switch and the output verbosity and use SSH Agent.

I had a MySQL engine with lots of databases. Although it is a really brilliant tool, I’ve always been a little wary of phpMyAdmin for large database backups/restores as I have found it susceptible to script timeouts etc. So another job for a little bash script.

# Get a simple list of all the database names from MySQL
DBS="$(mysql -u root -ppassword -Bse 'show databases')"
# $DBDIR is the directory the dumps go into; it must be set beforehand
mkdir -p "$DBDIR"
for DB in $DBS
do
    echo "Doing a dump of $DB ..."
    mysqldump -u root -ppassword "$DB" > "$DBDIR/$DB.sql"
    gzip -9 "$DBDIR/$DB.sql"
done
exit 0

The idea for this came from here. Thanks marchost. The key line in this small script is this one: DBS="$(mysql -u root -ppassword -Bse 'show databases')". It calls the MySQL CLI and retrieves into the $DBS variable the names of all the databases that this MySQL instance is managing, in plain text with no additional details or formatting other than line feeds. Before I found this, I had had to manually create a list of the databases to back up in my scripts. This little line makes it fully automatic 🙂 The rest of the script should be fairly self-explanatory, I hope.

Warning: There is no error checking or anything here. It was only needed for a one-time backup and it worked, but if you plan to use it more frequently you’ll need to add some tests and checks. I would also suggest creating a “backup” user on the MySQL server that has global privileges for just SELECT and LOCK TABLES, which is enough. That way you are not exposing your databases’ root password to all and sundry.
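A minimal sketch of what those tests and checks might look like. I have not run this against my own server: the credentials file ~/.my-backup.cnf, the function name and the decision to skip MySQL’s own schemas are my assumptions. The file is expected to hold a [client] section with the user and password of the restricted “backup” account, and --defaults-extra-file (a standard option of the MySQL clients) has to be the first option on the command line.

```shell
# Sketch only: CNF and dump_all_databases are hypothetical names.
# CNF should contain a [client] section with the backup user's login,
# so that no password appears on the command line.
CNF="${CNF:-$HOME/.my-backup.cnf}"

dump_all_databases() {
    local dbdir="$1" dbs db
    mkdir -p "$dbdir" || return 1

    # Give up early if the list cannot be fetched (server down, bad login...).
    dbs="$(mysql --defaults-extra-file="$CNF" -Bse 'show databases')" || {
        echo "Could not list databases" >&2
        return 1
    }

    for db in $dbs; do
        # Assumption: MySQL's own virtual schemas are not worth dumping.
        case "$db" in
            information_schema|performance_schema) continue ;;
        esac
        echo "Doing a dump of $db ..."
        if ! mysqldump --defaults-extra-file="$CNF" "$db" > "$dbdir/$db.sql"; then
            echo "Dump of $db failed" >&2
            rm -f "$dbdir/$db.sql"   # do not leave a partial dump behind
            return 1
        fi
        gzip -9f "$dbdir/$db.sql" || return 1
    done
}
```

Called as dump_all_databases /some/backup/dir, it returns non-zero as soon as anything goes wrong, so a cron job or a wrapper script can notice the failure instead of quietly keeping a half-finished backup.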

And then, once my new server was up and running, I needed to drag loads of data back and do some judicious chmodding and chowning. Now, I’d always been niggled that I didn’t know how to do a recursive chmod on just files or directories. Well, I found out thanks to Google and someone called AdaHacker.

find $DIRNAME -type f -exec chmod 640 {} \;

This is a recursive chmod command that works on files only. I didn’t use $DIRNAME as I was running this from the command line. Just replace it with ./ to work from the current directory. If you want to do a similar thing for just directories then replace the -type f with -type d.
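Putting the two variants together, a small sketch that does files and directories in one go. The function name fix_perms and the default modes of 640 and 750 are my own choices for illustration, not something from AdaHacker’s tip.

```shell
# Sketch only: fix_perms and its default modes are hypothetical.
fix_perms() {
    local dir="$1" fmode="${2:-640}" dmode="${3:-750}"
    [ -d "$dir" ] || { echo "Not a directory: $dir" >&2; return 1; }
    # Directories need the execute bit to be traversable, so they get their own mode.
    find "$dir" -type d -exec chmod "$dmode" {} + || return 1
    # Regular files only.
    find "$dir" -type f -exec chmod "$fmode" {} +
}
```

Something like fix_perms ./ would then set every file below the current directory to 640 and every directory to 750. I have used + rather than \; here so that find passes many paths to each chmod call instead of starting one chmod per file; the end result is the same.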

Hopefully these will be helpful to some. They were to me.

Atomic Hosting or Virtualisation?

Hi all,

We have a slight dilemma and I thought it would be good to see if anyone “out there” had any thoughts or experience they could help us with.

Bytemark, the company we use for hosting and with whom we are very happy, provide the hardware and support services that this and other sites of ours run on. We are getting to a stage now where our server is quite sluggish at times and has a lot of old baggage that we could really do with cleaning up.

Our plan is to get a second server and then to migrate just the services we need in a controlled fashion. Once that is accomplished and we are happy, we can then shut down the old machine (this one) and stop paying for it.

Unfortunately for us, Bytemark have several attractive options for hosting (all on Linux only BTW).

We currently use a virtual machine package. I am not sure of the physical hardware it runs on but the performance of the VM has not been too bad at all, although the cumulative effect of a couple of years of running multiple web sites, test applications, development services, email and suchlike has taken its toll.

We could just go for another VM with perhaps a bit more RAM to give us some breathing space.

The alternative, for not much more cash, is to get a dedicated host. It has a great deal more RAM (2GB) and disk space than the VM offering, but the processor is one of the new, low-power Intel Atom 230s.

I am concerned that the processing capability of the Atom will not be up to much. From the reviews and benchmarks I have read it certainly isn’t a Core2 and that’s for sure! One test (on Tom’s Hardware) noted that the LAN running hard (just a 100Mb) was eating upwards of 20% of the processor cycles just dealing with the traffic. If you start adding a few PHP web sites and MySQL databases, what’s going to happen then?

The upside of the Atom is the low power consumption and small footprint. But this benefit is probably negated by Virtualisation anyway.

So, my question(s)…

Does anyone have an opinion they’d like to air? I’m personally more in favour of the VM solution currently. I feel that the much higher performing underlying hardware, even though it is being shared with other VMs, will probably give us better throughput than the Atom, even though that would be dedicated hardware and have more RAM. Am I wrong or misguided? Any links to good comparisons you could suggest? Any real world experience to guide us?

OpenOffice.org: Jobs in Redmond?

I came across this pearl on the OpenOffice.org marketing mailing list and it certainly gave me a little chuckle…

It was from Alexandro (CoLo of OO.o Spain) who is looking for a job in the USA working with OpenOffice.org. Here is one I don’t think he’ll be applying for…

US-WA-Redmond-SMB Marketing Manager – Breadth.

The Breadth Team is seeking a talented Marketing Manager to help drive our competitive efforts within the SMS&P-SMB and with other field teams, segment teams and product groups across Microsoft. Our passion is helping our field and partners win against our biggest competitors in this space, particularly OpenOffice and MySQL.

[Emphasis mine]

Or perhaps he should? Why not infiltrate the competition? I’d hazard a guess that M$ do, or have done it.

O.K. I’ll admit that he might not last so long 😉 But I’m sure he could cause a decent amount of “Collateral Repair”. [This is the opposite of Collateral Damage; in this scenario he would help many SMEs to discover the advantages of using OpenOffice.org and Open Source in general, thereby preventing the damage caused by using proprietary software].

Have a nice day.

Alfresco, a bit like Quickr but Bettr

Quickr, for those who are lucky enough not to know, is the morphologically challenged relative of Lotus Quickplace. In reality it is Quickplace with two new themes, two new placetypes and two versions of dojo dumped on the filesystem to make things look a bit more “Web 2.0”, plus some Windows-only integration with Microsoft-only applications. So why am I telling you about proprietary software here on “The Open Sourcerer”? Well, I have a bit of a background in the IBM/Lotus area and I have been developing corporate themes for Quickplace since sometime in the last millennium. It hasn’t changed much, but there is a very serious Free and Open Source alternative now.

In brief, Quickr is a website creating tool. Each site is known as a “place” and within a place you can have folders and rooms. Rooms are like sub-places; they can have their own access control rules and a different style. They can contain rooms as well, so you can have a hierarchy of places. It looks quite pretty, and 10 years ago it was 5 years ahead of its time. It has now got a client install, which integrates with some legacy Windows applications; more on that later.

Alfresco is an Open Source Enterprise Content Management System, which runs as a J2EE application on Linux and other platforms (I would stick to Linux+Apache+Tomcat+MySQL for preference). Like Quickr you create areas for storing stuff, in Alfresco they are called “Spaces”. Spaces can contain files, folders and more spaces.

Inheritance of security to sub-spaces/rooms

So in Quickr you create a place, you add members to that place, you create a room within the place, you carefully check the checkbox labeled “inherit members from parent place” as you create it so that all the members of the place can get into the room. Lovely. Now add another member to the place. You would expect them to be able to access the room wouldn’t you?

No. Inheritance is a one-shot deal when you create a room; it just copies the access control list from the parent room as it creates the subroom. Now imagine a place in an enterprise with 100+ rooms and managing user access to this lot. It gets messy.

In Alfresco, inheritance works just like it should. You can set a space to inherit from the parent space, and override it at will. Nice, friendly and fit for the enterprise user/administrator.

Access as a file system

The big new feature in Quickr (the pretty skins don’t count as they are only skin deep) is the Quickr Connectors. This Windows-only program installs as a Windows Explorer extension and sits alongside the network neighbourhood. It sort of works like a filesystem.

You can’t do linked spreadsheets (OpenOffice.org or Symphony, or the other one) because the files don’t reside at a resolvable UNC path.

Folders are deeply broken. You can create folders, and nested folders, but they look rubbish in most of the web themes which are designed for a single level of folders. If you do use a web theme with a hierarchical folder tree and then use the web interface to move folders between rooms, they break in the connector. Moving them in the web doesn’t update some important UNID field somewhere, I couldn’t figure out which, but I reported it as a bug.

Personal spaces (aka Quickr Entry) were supposed to be a wonderful thing, when you send an email with an attachment from a proprietary email client (Lotus Notes or the other one) it asks you if you want to store the attachment in your Quickr place and send a link instead. This sort of works. With no security. Your place is public, anyone can see stuff you put in it (with a lame security-by-obscurity option which I haven’t figured out how to get to yet). So you want to organise your space, putting stuff in folders etc. well you can’t. Folders aren’t allowed in personal spaces. Tough.

So how does file system access work in Alfresco? Well, it will act as a WebDAV server or a CIFS server or both. There is no mucking about with locally installed connector clients and Windows Explorer extensions to make it look a little bit like a network filesystem. It is a network filesystem. WebDAV is well supported on Linux and Mac and it works on Windows too. Once you connect to your server via WebDAV it just looks like another bit of your filesystem. You can drag and drop documents into and out of it, double click things to open them etc. Linked spreadsheets work fine, and in fact every application that expects to be storing or accessing data on a regular drive works just fine with your remote content management system. It isn’t just any remote drive though; it is still a content management system. If the business rules for a space where you drop a file dictate version control then that is exactly what happens.

Version control

So let’s say you have a document in Quickr created with a form set up for optional version control (which is a bit of a sloppy concept in itself). You are doing some edits and what started as correcting a few typos turns into a major re-factoring session. You now want to save your document as a new version. Tough. Too late. You have to create a new version before you start editing it, otherwise you are just editing and overwriting the existing version. Quickplace always had a published version + working draft system; it now has a sort of revision history stuffed into it. The two models don’t seem to like each other very much.

Version control in Alfresco is somewhat more thought out, it has a very powerful Advanced Versioning Manager, which can track back not just individual files, but directories, it can show you the state of the whole repository at a particular point in time. Very useful for the multiple linked spreadsheets example. It can do way more than this, it is configurable as

So what does work Bettr in Quickr?

Well Quickr has a truly sickening theme/skin engine. It only works in Internet Explorer with ActiveX and you can upload 6 files (stylesheet + 5 HTML files) which it scoops up along with any referenced images. The HTML files basically duplicate each other, or you can upload just one HTML file and have it guess what the others should look like. There is no community site to share and sell Quickr skins that I know of, unlike Joomla! and WordPress etc. However, rubbish as the theme engine is, it is better than Alfresco which doesn’t yet have a skinning capability (you can edit the stylesheet and all the .jsp files, but that isn’t the same as a facility for uploading a package of skin elements so that places can be individually styled.)

Quickr isn’t just for storing files, it has a nice calendar that can show custom forms on it. I haven’t yet seen a calendar view for Alfresco. The Gantt chart view in Quickr isn’t very sophisticated at all, I wouldn’t miss that, but the calendar is useful.

When uploading files through the web interface from some Microsoft Office applications it does an ActiveX/COM control thing that gets the application to save as HTML as well as the native binary format, and it uploads both the HTML version and the native format. It then serves up the HTML version to browser clients, which would be a nice trick if it worked a bit better. It doesn’t do this trick if using the Windows Explorer integration, so if you use a mixture of the Quickr connector and the web client you get a great big muddle and a mess.

In conclusion

If I had to do a 15 minute sales demo, on Windows, I could easily make Quickr look fantastic, but when comparing Quickr against Alfresco as a serious tool for long term use in a modern business, Quickr falls short and Alfresco is the one I would choose.

Sun to buy MySQL for $1b

The title says it all really.

SANTA CLARA, CA January 16, 2008 Sun Microsystems, Inc. (NASDAQ: JAVA) today announced it has entered into a definitive agreement to acquire MySQL AB, an open source icon and developer of one of the world’s fastest growing open source databases for approximately $1 billion in total consideration. The acquisition accelerates Sun’s position in enterprise IT to now include the $15 billion database market. Today’s announcement reaffirms Sun’s position as the leading provider of platforms for the Web economy and its role as the largest commercial open source contributor.

Wow. That really makes sense for Sun and makes the chaps from MySQL nicely rich!

Sun have already shown themselves to be pretty into the Open Source thing. In the last couple of years they have Open Sourced Java and Solaris. Two of their biggest software platforms. I recall Jonathan Schwartz saying how, after giving away the software for free they made more money from it… That makes perfect sense to me, but it doesn’t seem to work for Microsoft yet.