Short Stories

// Tales from software development

Archive for June 2008

When installers go bad…

Branches, Service Packs, QFEs

Late last week we hit a problem that we’ve seen before on this project, but this time it was particularly pernicious.

Windows Installer: Don’t remove Components from installed Features

Our installers are built so that most files are packaged as a single File inside their own Component definition. When a file becomes redundant in a later version of an installer, it seems natural to delete both the Component and the File definitions. However, as we discovered the hard way, this is a very bad idea.

The problem is patching. If you generate a patch from two versions of an installer in which the later one has a Component removed, the Windows Installer gets very confused when it applies the patch. The Windows Installer documentation does warn against removing Components in this way but unfortunately no warnings are issued by any of the standard installer and patch generation tools, or by the Windows Installer itself when it applies such a patch.

What seems to happen is that after the patch is applied, the Windows Installer considers the installation to have changed from ‘Installed’ to something else, because a Component that was an integral part of the installation before the patch was applied is now missing. However, because most of the installer’s files are still deployed on the file system, the Windows Installer decides that the status of the install is not ‘Uninstalled’ either. As there isn’t a ‘half-installed’ or ‘a bit broken’ status, the Windows Installer decides that the status of the installation must be ‘Advertise’.

We’ve seen this issue before and so we’re careful not to remove Component definitions from the installers. If a file is redundant then we remove the File definition from the Component but leave the Component definition in place.
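To make this concrete, here’s a minimal before-and-after sketch assuming WiX-style authoring (an assumption on my part — any MSI authoring tool would face the same rule; the IDs, GUID, and file names are hypothetical):

  <!-- Before: the file is shipped as the single File inside its own Component -->
  <Component Id="ObsoleteScriptComponent" Guid="{00000000-0000-0000-0000-000000000000}">
   <File Id="ObsoleteScript" Name="Obsolete.sql" Source="Scripts\Obsolete.sql" />
  </Component>

  <!-- After: the file is redundant, so only the File definition is removed.
       The now-empty Component, and its GUID, stay in place so that patches
       generated against earlier versions of the installer remain valid. -->
  <Component Id="ObsoleteScriptComponent" Guid="{00000000-0000-0000-0000-000000000000}" />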

Branching Strategy

The team is engaged in a ‘sustaining engineering’ effort. In simple terms this means that we’re delivering new functionality and fixes via service pack releases on a regular basis.

The source code management strategy that we use is to create each new service pack branch from the previously delivered service pack branch. For example, when we’ve delivered SP1, we create a new branch from the SP1 branch and call it SP2. This ensures that all changes made in SP1 are in the new branch.

Of course, life is never quite that simple and it’s common that we need to open the new branch before completing the previous service pack. During the overlap we regularly integrate changes made in the current branch to the new branch.

Our build process also creates branches. This is to ensure that we have a snapshot of the source exactly as it was at the time of the build.

This gives us the following branch structure:

//depot/SP1/Src
//depot/SP1/1.1.100.001
//depot/SP1/1.1.100.002
//depot/SP1/1.1.100.003
//depot/SP1/1.1.100.004
etc

where the Src branch is the branch that we make code changes in and the numbered branches are created by the build. So, for example, if the build of SP1 that we deliver to the customer is 1.1.100.003 then the //depot/SP1/1.1.100.003 branch is a snapshot of the source code used to create that build.

When we create a new source code branch for the next service pack, we branch the ‘Src’ branch, i.e. //depot/SP1/Src is branched to //depot/SP2/Src.

The next issue is QFEs. These are high severity bug fixes that must be delivered to the customer outside of the normal service pack delivery mechanism. Because the code changes in a QFE must be made against the specific version of the code that the customer has installed and is using, we create QFE source branches from the build branch that corresponds to the service pack build delivered to the customer.

So, for example, if QFE100 applies to build 1.1.100.003 of SP1, then we create a QFE branch of the source code by branching //depot/SP1/1.1.100.003 to //depot/QFE100/Src.

Once we know a QFE has been applied by the customer, we roll the fix up into the current service pack branch by integrating the change from the QFE branch. In this example, we’d integrate the changes for the fix from //depot/QFE100/Src to //depot/SP2/Src.
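In Perforce terms (an assumption based on the //depot paths shown above; changelist details omitted), the whole flow looks something like this:

  # Open the SP2 branch from the SP1 source branch
  p4 integrate //depot/SP1/Src/... //depot/SP2/Src/...
  p4 submit

  # Branch the QFE from the build snapshot delivered to the customer
  p4 integrate //depot/SP1/1.1.100.003/... //depot/QFE100/Src/...
  p4 submit

  # Roll the QFE fix up into the current service pack branch
  p4 integrate //depot/QFE100/Src/... //depot/SP2/Src/...
  p4 resolve -am
  p4 submit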

We need to apply some discretion in making this integration, however, because some of the changes in the QFE branch will not be related to the fix itself: some will be build configuration changes required to build the QFE, others will be installer configuration changes that are specific to the QFE and not appropriate for service pack delivery, and so on.

How it all goes wrong

Our problem last week was caused by the way that we integrated changes from a QFE into the service pack branch.

The QFE applied to a database, and our database installers use a script sequencing file that controls which SQL scripts the installer runs and in which order. A patch or a QFE uses a sequencing file that contains only the scripts needed to make the required changes. We usually create a new sequencing file for each patch or QFE and call it from an installer custom action that runs after the Windows Installer has deployed the changed files to the file system.
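The installer authoring isn’t shown in this post, but the shape of it is roughly as follows, again assuming WiX-style syntax with hypothetical IDs and file names. The important point is that the custom action is sequenced after InstallFiles, i.e. after the changed files have reached the file system:

  <!-- Run the script runner against the patch's sequencing file... -->
  <CustomAction Id="RunSqlScriptSequence" FileKey="ScriptRunnerExe"
                ExeCommand="[INSTALLDIR]QFE100.sequence"
                Execute="deferred" Return="check" />
  <InstallExecuteSequence>
   <!-- ...after the changed files have been deployed -->
   <Custom Action="RunSqlScriptSequence" After="InstallFiles" />
  </InstallExecuteSequence>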

Old patch and QFE sequence files are often, but not always, removed from service pack branches as they are usually redundant. Can you see where this is going? When the changes in the QFE branch were integrated into the next service pack branch, the code changes were all integrated as required, but the installer changes were rejected because they were considered unnecessary.

A few months later we entered the test phase of the delivery lifecycle for this service pack and it became clear that something was wrong with one of the database patches. Initial diagnosis showed that the sequencing file to deploy the changed SQL scripts was missing some of the required scripts. This was true but, as we’d soon find out, it wasn’t actually the cause of the problem.

Because we were late in the test cycle it was decided that, rather than create a new build of the service pack with the corrected sequence file, we’d create a QFE for the service pack. The problem wasn’t the effort required to create a new build; it was that the service pack was 90% tested, and with a full test pass taking around six days it was too late to start the test cycle again with a new build.

A QFE, in the form of a patch, was created to fix the issue with the troublesome service pack patch. When this patch was applied its behaviour was bizarre, to say the least. Analysis of the log output from the custom action that runs the sequencing script showed that it was the sequencing script from the original QFE that was executing. Examination of the Windows Installer log revealed the real problem: the installation was in the ‘Advertise’ state, so the file changes in the patch were being ignored, just as they had been when we’d applied the service pack patch, although we hadn’t noticed it then. Because the custom action was still being invoked even though the patch was failing, the version of the sequencing file that was executed was the one still on the machine’s disk from the original QFE, not the one being deployed in the patch.

This latter point may seem obvious when explained but it’s very counterintuitive to apply a patch containing a particular version of a file and see the patch execute a completely different version of the file.

No solution

We tried a number of things to repair the broken installation but the bottom line seems to be that once an installation is broken in this way, there is no way of recovering it. After a couple of days of trying every potential solution we could think of, we admitted defeat. The only ways forward were either to avoid applying the broken patch in the first place or, where it had already been applied, to uninstall and re-install.
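For the record, the recovery amounts to something like this (the product code and package names below are hypothetical placeholders):

  msiexec /x {PRODUCT-CODE-GUID}    (uninstall the broken installation)
  msiexec /i OurProduct.msi         (re-install the delivered service pack build)
  msiexec /p CorrectedFix.msp       (apply the corrected patch)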

Written by Sea Monkey

June 16, 2008 at 7:15 am

Posted in Deployment

MSBuild: Behaviour of the CreateItem task

The CreateItem task allows you to create item lists dynamically. It’s a very useful and powerful feature, but it has some quirks that are worth understanding.

A quick introduction

Just to refresh your memory, the CreateItem task is typically coded like this:

  <CreateItem Include="@(MyItems)" >
   <Output TaskParameter="Include" ItemName="MyOutputItems" />
  </CreateItem>

The items in the MyItems list are copied to the MyOutputItems item list.

CreateItem does not replace items in a list

The first thing to note is that CreateItem always adds items to a list and does not overwrite any existing items if the item list being written to already exists.
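A minimal sketch demonstrates this append behaviour (the target and item names here are mine, not from a real project):

 <Target Name="AppendDemo">
  <CreateItem Include="A;B">
   <Output TaskParameter="Include" ItemName="Letters" />
  </CreateItem>
  <CreateItem Include="C">
   <Output TaskParameter="Include" ItemName="Letters" />
  </CreateItem>
  <!-- Prints 'A;B;C' - the second CreateItem appends to Letters, it doesn't replace it -->
  <Message Text="@(Letters)" Importance="High" />
 </Target>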

Executing CreateItem in a target

When the current target exits, all of the added items will be in the item list. This is obvious in non-batched targets, but in a target that is batched on the same item list as the one used as the input to the CreateItem task, the visibility of added items within the target is constrained by the batching process.

You might expect that at the end of the execution of the target the item list contains only the item that was added in the last batched execution of the target. This is what you’d expect in a procedural language that was setting a variable on each iteration of a loop, but MSBuild is more of a declarative language than a procedural one.

This behaviour is not always obvious and is of significance in some batching scenarios as explained next.

Batching considerations

In a non-batched target, or a target that is batched on something other than the input to the CreateItem task, the result of CreateItem is an item list with all input items added.

In a target that is batched on the same item list as the input to the CreateItem task, the item list created will only ever appear to have one item added to it within the scope of the target. If you’ve used batching then hopefully this will make sense – it’s the only logical way that this could work. When the target is exited the item list will contain all the added items. Again, this makes sense but it is a little counterintuitive in some situations.

But here’s the catch… while batching, the item list being added to will appear to have only one item added from the item list that is being batched on, but it will contain all the items that were already in the list when the target was called. The examples in the next section illustrate this.

Examples

The first example demonstrates the behaviour in non-batched targets:

<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Full" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
 <ItemGroup> 
  <ItemList1 Include="Item 1" />
  <ItemList1 Include="Item 2" />
  <ItemList1 Include="Item 3" /> 
 </ItemGroup>
 <Target Name="Full" DependsOnTargets="List1;UnbatchedTarget;List2">
 </Target>
 
 <Target Name="List1">
  <Message Text="List1: @(item)" Importance="High" />
 </Target>
 
 <Target Name="List2">
  <Message Text="List2: @(item)" Importance="High" />
 </Target>
 <Target Name="UnbatchedTarget" >
  <CreateItem Include="@(ItemList1)">
   <Output TaskParameter="Include" ItemName="item" />
  </CreateItem>
  
  <Message Text="UnbatchedTarget: @(item)" Importance="High" />
 </Target>
</Project>

The output from this is:

Target List1:
    List1:
Target UnbatchedTarget:
    UnbatchedTarget: Item 1;Item 2;Item 3
Target List2:
    List2: Item 1;Item 2;Item 3
Build succeeded.
    0 Warning(s)
    0 Error(s)

The output from the List1 target shows that the item list is empty at this point.

The output from the Message task in the UnbatchedTarget shows that all three items defined in ItemList1 have now been added to the item list ‘item’ by the CreateItem task.

The output from the List2 target is the same as the UnbatchedTarget output.

The second example demonstrates the behaviour in a batched scenario:

<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Full" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
 <ItemGroup>
 
  <ItemList1 Include="Item 1" />
  <ItemList1 Include="Item 2" />
  <ItemList1 Include="Item 3" />
 
  <ItemList2 Include="Item A" />
  <ItemList2 Include="Item B" />
  <ItemList2 Include="Item C" /> 
 
 </ItemGroup>
 <Target Name="Full" DependsOnTargets="List1;BatchedTarget1;List2;BatchedTarget2;List3">
 </Target>
 
 <Target Name="List1">
  <Message Text="List1: @(item)" Importance="High" />
 </Target>
 
 <Target Name="List2">
  <Message Text="List2: @(item)" Importance="High" />
 </Target>
 <Target Name="List3">
  <Message Text="List3: @(item)" Importance="High" />
 </Target>
 <Target Name="BatchedTarget1" Inputs="%(ItemList1.Identity)Dummy" Outputs="%(ItemList1.Identity)Dummy" >
  <CreateItem Include="@(ItemList1)">
   <Output TaskParameter="Include" ItemName="item" />
  </CreateItem>
  
  <Message Text="BatchedTarget1: @(item)" Importance="High" />
 </Target>
 <Target Name="BatchedTarget2" Inputs="%(ItemList2.Identity)Dummy" Outputs="%(ItemList2.Identity)Dummy" >
  <CreateItem Include="@(ItemList2)">
   <Output TaskParameter="Include" ItemName="item" />
  </CreateItem>
  
  <Message Text="BatchedTarget2: @(item)" Importance="High" />
 </Target>
</Project>

The output from this project is:

Target List1:
    List1:
Target BatchedTarget1:
    BatchedTarget1: Item 1
Target BatchedTarget1:
    BatchedTarget1: Item 2
Target BatchedTarget1:
    BatchedTarget1: Item 3
Target List2:
    List2: Item 1;Item 2;Item 3
Target BatchedTarget2:
    BatchedTarget2: Item 1;Item 2;Item 3;Item A
Target BatchedTarget2:
    BatchedTarget2: Item 1;Item 2;Item 3;Item B
Target BatchedTarget2:
    BatchedTarget2: Item 1;Item 2;Item 3;Item C
Target List3:
    List3: Item 1;Item 2;Item 3;Item A;Item B;Item C
Build succeeded.
    0 Warning(s)
    0 Error(s)

As before, the List1 target shows that the item list being created is empty at this point.

BatchedTarget1 executes once for each item in the list and displays a single item value each time. This is what you would expect, as the target is batched on the item list that is being used as the input to the CreateItem task.

The output from the List2 target shows that the item list now contains all three items from the ItemList1 item list.

Now it gets interesting… What happens if you use the item list created in another CreateItem task? The output from BatchedTarget2 shows exactly what happens: the item list contains the items that were already in it when the target was called, plus the item from the batched list (ItemList2 in this case) that the current batch is operating on.

Again, when the target exits, all the items added during its execution are retained, so the item list now contains all of the items in ItemList1 (added by BatchedTarget1) and all of the items in ItemList2 (added by BatchedTarget2).

Does it matter?

Do you need to understand this behaviour? Maybe, maybe not… It’s useful to remember the basic concept because it explains behaviour that you might otherwise regard as odd in some circumstances.

To give an example, I recently wrote a project to collect together some source files for delivery to a client. Some of the files had to be retrieved from a source control system while others were on a file server.

I wrote a couple of targets. The first accesses the source control system, syncs the required files, and copies them to the target folder. The second simply copies specific files from the file server to the target folder.

The files for each target are defined using separate item lists. Each list contains several items, each of which describes a set of files to be copied. Each item includes metadata for files to be excluded, the location under the target folder where the files are to be copied, etc.
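The item definitions look something like the following sketch. The item names SourceDepotCopy and FileSystemCopy and the DepotPath, SourceFiles and Exclude metadata appear in the snippets below; the values, and the TargetDir metadata name, are hypothetical stand-ins:

 <ItemGroup>
  <SourceDepotCopy Include="ServerComponents">
   <DepotPath>D:\p4\depot\Server</DepotPath>
   <Exclude>D:\p4\depot\Server\**\*.pdb</Exclude>
   <TargetDir>Server</TargetDir>
  </SourceDepotCopy>
  <FileSystemCopy Include="ThirdPartyLibraries">
   <SourceFiles>\\fileserver\libs\**\*.dll</SourceFiles>
   <Exclude>\\fileserver\libs\**\Test\*.dll</Exclude>
   <TargetDir>Lib</TargetDir>
  </FileSystemCopy>
 </ItemGroup>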

The two targets use the CreateItem task to dynamically create an item list that is then passed to the MSBuild Copy task. The first is in the target that copies from the source control system:

  <CreateItem Include="%(SourceDepotCopy.DepotPath)\**\*.*" Exclude="%(SourceDepotCopy.Exclude)" >
   <Output TaskParameter="Include" ItemName="SourceFiles" />
  </CreateItem>

The second is in the target that copies the files from the file server:

  <CreateItem Include="%(FileSystemCopy.SourceFiles)" Exclude="%(FileSystemCopy.Exclude)" >
   <Output TaskParameter="Include" ItemName="SourceFiles" />
  </CreateItem>

Note that both CreateItem tasks output to the SourceFiles item list.

The result isn’t what I intended. The first target executes correctly but the second doesn’t. As explained above, when the second batched target executes, the output item list of its CreateItem task already contains all the items added during the execution of the first target, and batched items are then added each time the task executes, as determined by the batching on the target.

In short, while the second target is executing, the SourceFiles item list is full of the items added in the first target, and so the Copy task copies hundreds of files from the source control system to the locations specified in the metadata of the items that were meant to control the copying of files from the file server.

The solution is to simply use a different name for the item list used in the second target:

  <CreateItem Include="%(FileSystemCopy.SourceFiles)" Exclude="%(FileSystemCopy.Exclude)" >
   <Output TaskParameter="Include" ItemName="FilesToCopy" />
  </CreateItem>

The project now does what was intended.

I deliberately used the same name for the output of both CreateItem tasks; I should have realised what would happen, but I overlooked it. More worryingly, it would be easy to reuse an item list name in different places in a project by accident and never realise it, because MSBuild gives no warning about this kind of reuse.

Written by Sea Monkey

June 6, 2008 at 6:00 pm

Posted in Development
