Short Stories

// Tales from software development

When installers go bad…

Branches, Service Packs, QFEs

Late last week we hit a problem that we’ve seen before on this project but this time it was particularly pernicious.

Windows Installer: Don’t remove Components from installed Features

Our installers are built so that most Component definitions contain a single File definition. It seems natural to delete both the Component and the File definitions from the installer when a file becomes redundant in a later version. However, we discovered the hard way that this is a very bad idea.

The problem is patching. If you generate a patch from two versions of an installer in which the later one has a Component removed, the Windows Installer gets very confused when it applies the patch. The Windows Installer documentation does warn against removing Components in this way but unfortunately no warnings are issued by any of the standard installer and patch generation tools, or by the Windows Installer itself when it applies such a patch.

What seems to happen is that after the patch is applied, the Windows Installer considers the installation to have been changed from ‘Installed’ to something else because a Component that was an integral part of the installer before the patch was applied is now missing. However, because most of the installer’s files are still deployed on the file system, the Windows Installer decides that the status of the install is not ‘Uninstalled’ either. As there isn’t a ‘half-installed’ or ‘a bit broken’ status, the Windows Installer decides that the status of the installation must be ‘Advertise’.

We’ve seen this issue before and so we’re careful not to remove Component definitions from the installers. If a file is redundant then we remove the File definition from the Component but leave the Component definition in place.
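Since none of the tools warn about a removed Component, a check of this kind can be scripted before a patch is generated. The following is a minimal sketch only, assuming the installers are authored as WiX-style XML in which each Component element carries an Id attribute. The function names and the sample markup are hypothetical and are not taken from our build.

```python
import xml.etree.ElementTree as ET


def component_ids(installer_xml):
    """Return the set of Component Id values found in WiX-style XML text."""
    root = ET.fromstring(installer_xml)
    ids = set()
    for element in root.iter():
        # Drop any XML namespace prefix so the schema version does not matter.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "Component" and "Id" in element.attrib:
            ids.add(element.attrib["Id"])
    return ids


def removed_components(old_installer_xml, new_installer_xml):
    """List Component Ids present in the old installer but absent in the new one."""
    return sorted(component_ids(old_installer_xml) - component_ids(new_installer_xml))


# Hypothetical sample markup: the later version drops Component "B" entirely.
OLD = (
    '<Wix><Product>'
    '<Component Id="A"><File Id="a.sql"/></Component>'
    '<Component Id="B"><File Id="b.sql"/></Component>'
    '</Product></Wix>'
)
NEW = '<Wix><Product><Component Id="A"><File Id="a.sql"/></Component></Product></Wix>'

if __name__ == "__main__":
    print(removed_components(OLD, NEW))
```

A non-empty result would be a reason to stop and restore the Component definition (with its File removed if the file really is redundant) before building the patch.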

Branching Strategy

The team is engaged in a ‘sustaining engineering’ effort. In simple terms this means that we’re delivering new functionality and fixes via service pack releases on a regular basis.

The source code management strategy that we use is to create a new service pack branch of the previously delivered service pack branch. For example, when we’ve delivered SP1, we create a new branch from the SP1 branch and call it SP2. This ensures that all changes made in SP1 are in the new branch.

Of course, life is never quite that simple and we often need to open the new branch before completing the previous service pack. During the overlap we regularly integrate changes made in the current branch into the new branch.

Our build process also creates branches. This is to ensure that we have a snapshot of the source exactly as it was at the time of the build.

This gives us the following branch structure:

//depot/SP1/Src
//depot/SP1/1.1.100.001
//depot/SP1/1.1.100.002
//depot/SP1/1.1.100.003
//depot/SP1/1.1.100.004
etc

where the Src branch is the branch that we make code changes in and the numbered branches are created by the build. So, for example, if the build of SP1 that we deliver to the customer is 1.1.100.003 then the //depot/SP1/1.1.100.003 branch is a snapshot of the source code used to create that build.

When we create a new source code branch for the next service pack, we branch the ‘Src’ branch, i.e. //depot/SP1/Src is branched to //depot/SP2/Src.

The next issue is QFEs. These are high severity bug fixes that must be delivered to the customer outside of the normal service pack delivery mechanism. Because the code changes in a QFE must relate to the specific version of the code that the customer has installed and is using, we create QFE source branches from the specific branch that was created by our build process for the build of the service pack that was delivered to the customer.

So, for example, if QFE100 applies to build 1.1.100.003 of SP1, then we create a QFE branch of the source code by branching //depot/SP1/1.1.100.003 to //depot/QFE100/Src.
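The branch naming rules described so far can be summarised in a few lines of code. This is only a sketch of the convention, using the depot paths from the examples above. The helper names are hypothetical and nothing here talks to the source control system.

```python
def src_branch(name):
    """Working branch for a service pack or QFE, e.g. //depot/SP2/Src."""
    return f"//depot/{name}/Src"


def build_snapshot_branch(service_pack, build_number):
    """Snapshot branch created by the build, e.g. //depot/SP1/1.1.100.003."""
    return f"//depot/{service_pack}/{build_number}"


def next_service_pack_plan(current_sp, next_sp):
    """A new service pack is branched from the previous service pack's Src branch."""
    return src_branch(current_sp), src_branch(next_sp)


def qfe_plan(qfe_name, service_pack, delivered_build):
    """A QFE is branched from the snapshot of the build the customer actually has."""
    return build_snapshot_branch(service_pack, delivered_build), src_branch(qfe_name)


if __name__ == "__main__":
    print(next_service_pack_plan("SP1", "SP2"))
    print(qfe_plan("QFE100", "SP1", "1.1.100.003"))
```

The point of the asymmetry is that a service pack continues from wherever development has reached, whereas a QFE must start from exactly what was delivered.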

Once we know a QFE has been applied by the customer, we roll up the fix into the service pack branch that we’re currently working on by integrating the change from the QFE branch to that service pack branch. In this example, we’d integrate the changes for the fix from //depot/QFE100/Src to //depot/SP2/Src.

However, we need to apply some discretion in making this integration. Some of the changes in the QFE branch will not be related to the fix itself. Some will be build configuration changes required to build the QFE, some will be installer configuration changes that are specific to the QFE and not generally appropriate for service pack delivery, etc.
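One way to make that discretion explicit is to sort the changed files into the three groups before integrating, so that each group gets a deliberate decision. The sketch below assumes a directory layout in which build and installer files live under folders named Build and Installer. That layout, the patterns and the sample paths are all hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical path patterns; a real project would substitute its own layout.
CATEGORY_PATTERNS = {
    "build": ["*/Build/*"],
    "installer": ["*/Installer/*"],
}


def triage(changed_paths):
    """Group changed paths into fix, build and installer changes."""
    groups = {"fix": [], "build": [], "installer": []}
    for path in changed_paths:
        category = "fix"  # anything not recognised is treated as part of the fix
        for name, patterns in CATEGORY_PATTERNS.items():
            if any(fnmatchcase(path, pattern) for pattern in patterns):
                category = name
                break
        groups[category].append(path)
    return groups


if __name__ == "__main__":
    changes = [
        "//depot/QFE100/Src/Database/fix.sql",
        "//depot/QFE100/Src/Build/qfe.settings",
        "//depot/QFE100/Src/Installer/Database.xml",
    ]
    for category, paths in triage(changes).items():
        print(category, paths)
```

Listing the installer group separately does not decide anything by itself, but it makes it harder to reject those changes wholesale without looking at what they contain.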

How it all goes wrong

Our problem last week was caused by the way that we integrated changes from a QFE into the service pack branch.

The QFE applied to a database and our database installers use a script sequencing file that controls which SQL scripts will be run by the installer and in which order they will run. A patch or a QFE will use a sequencing file that contains only the scripts required to make the required changes. We usually create a new sequencing file for each patch or QFE and call this from an installer custom action that runs when the Windows Installer has deployed changed files to the file system.

Old patch and QFE sequence files are often, but not always, removed from service pack branches as they are usually redundant. Can you see where this is going? When the changes in the QFE branch were integrated into the next service pack branch, the code changes were all integrated as required but the installer changes were rejected because they were considered unnecessary.

A few months later we entered the test phase of the delivery lifecycle for this service pack and it became clear that something was wrong with one of the database patches. Initial diagnosis showed that the sequencing file to deploy the changed SQL scripts was missing some of the required scripts. This was correct but it wasn’t actually the cause of the problem, as we’d soon find out.

Because we were late in the test cycle it was decided that, rather than create a new build of the service pack with the corrected sequence file, we’d create a QFE for the service pack. The problem wasn’t the effort required to create a new build but the fact that at this stage the service pack was 90% tested and it was too late to start the test cycle again with a new build as a full test pass takes around 6 days.

A QFE, in the form of a patch, was created to fix the issue with the troublesome service pack patch. When this patch was applied its behaviour was bizarre to say the least. Analysis of the log output from the custom action that runs the sequencing script showed that it was the sequencing script from the original QFE that was executing. Examination of the Windows Installer log revealed the real problem: the installation was in the ‘Advertise’ state and so the file changes in the patch were being ignored. The same thing had happened earlier when we’d applied the service pack patch, but we hadn’t noticed it then. Because the custom action was still being invoked even though the patch was failing, the version of the sequencing file that was being executed was the one that was still in place on the machine’s disk from the original QFE and not the one being deployed in the patch.

This latter point may seem obvious when explained but it’s very counterintuitive to apply a patch containing a particular version of a file and see the patch execute a completely different version of the file.
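The effect is easier to see in a toy model. The sketch below is not Windows Installer logic. It simply encodes the behaviour we observed: file changes are applied only when the installation is in the ‘Installed’ state, while the custom action runs regardless and reads whatever is on disk. The file name and script names are hypothetical.

```python
SEQUENCE_FILE = "sequence.txt"  # hypothetical name for the sequencing file


def run_sequencing_custom_action(disk):
    """Return the scripts listed in whichever sequencing file is on disk."""
    return list(disk[SEQUENCE_FILE])


def apply_patch(disk, patch_files, installation_state):
    """Toy model of applying a patch and then running the custom action."""
    if installation_state == "Installed":
        disk.update(patch_files)  # file changes are deployed
    # In any other state the file changes are skipped, but the
    # custom action still runs against the files already on disk.
    return run_sequencing_custom_action(disk)


if __name__ == "__main__":
    on_disk = {SEQUENCE_FILE: ["qfe100_fix.sql"]}
    patch = {SEQUENCE_FILE: ["sp_fix_a.sql", "sp_fix_b.sql"]}

    print(apply_patch(dict(on_disk), patch, "Installed"))
    print(apply_patch(dict(on_disk), patch, "Advertise"))
```

In the ‘Advertise’ case the model runs the scripts from the original QFE even though the patch carries a newer sequencing file, which matches what the logs showed.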

No solution

We tried a number of things to repair the broken installation but the bottom line seems to be that once an installation is broken in this way, there is no way of recovering it. After a couple of days of trying every potential solution that we could think of, we admitted defeat. The only options were to avoid applying the broken patch in the first place or, if it had already been applied, to uninstall and re-install.

Written by Sea Monkey

June 16, 2008 at 7:15 am

Posted in Deployment
