Short Stories

// Tales from software development

Archive for April 2008

What credentials should you use when joining a domain?


As part of our build and test environment we have a machine that runs Microsoft Virtual Server 2005 R2 and hosts a number of virtual machines (VMs). These VMs are configured with Undo disks on which all changes to the file system take place. When we’ve finished our tests we shut the VMs down and discard the Undo disks. As a result, the domain that the VMs belong to treats them as inactive and after a month or so they drop off the domain. This issue is discussed in Virtual PC Guy’s blog:

http://blogs.msdn.com/virtual_pc_guy/archive/2006/03/28/561508.aspx

One of the build guys, Ed, configured and maintained our VMs but recently left the team. When I started up one of the VMs to test some new installers yesterday I followed Ed’s notes on how to rejoin the VM to the domain. There are two ways of doing this. The obvious approach is to go to Control Panel | System and select the Computer Name tab. Click the Change button to display the Computer Name Changes dialog, select the Member of Workgroup option, and then click the OK button. You will be prompted to restart the machine for the change to take effect. After doing this, repeat the process but this time select the Member of Domain option to rejoin the domain. This time, when you click the OK button you are prompted to enter security credentials. This is where the problem lies and I’ll come back to this shortly. If the credentials are accepted then you’ll see the message “Welcome to the domain-name domain”. The machine has to be restarted again for the change to take effect.

Ed had documented a very useful but less obvious way to rejoin the domain. This involves replacing the fully qualified domain name that is displayed when you first open the Computer Name Changes dialog with the NetBIOS name. For example, when I opened the dialog the domain name was displayed as europe.corp.xxxxxxxx.com (where xxxxxxxx is the name of the company I’m working for). The NetBIOS name for this domain is EUROPE and entering this accomplishes the task of rejoining the domain with a single restart.

When you first join a domain you are prompted to supply the credentials of an account that is authorised to do this. This is a domain account with administrator permissions for the machine.

When you attempt to rejoin the domain you are also prompted to supply credentials but, and this isn’t indicated in the prompt, you are actually being asked to supply the credentials of the owner of the machine object in Active Directory, i.e. the credentials that were first used to join the machine to the domain.

The problem that I encountered was that when I entered my own credentials I got an “Access is denied” message even though I was an administrator on the machine. I then tried the credentials of the account that we use for all our builds, as this account has administrator permissions on all the machines in our team. This also failed.

After 15 minutes of trying various workarounds I admitted defeat and asked one of our infrastructure and environments experts. He immediately diagnosed it as an issue with the ownership of the machine object in the domain Active Directory. The problem was that when Ed set up the VMs he used his own security credentials and this established him as the owner of the corresponding machine object in Active Directory. Although I could remove the machine from the domain, any attempt to rejoin the domain would fail because Active Directory would match up the new request with the existing machine object that was owned by Ed’s userid.

If you have administrator permissions in Active Directory it’s possible to change the ownership of the machine object for the machine you’re trying to rejoin to the domain or to delete it. In my case I had to contact the company’s infrastructure support group and request that the objects for all our VMs be deleted.

There was one more hurdle to overcome. Presumably because of caching of Active Directory information, I still couldn’t rejoin the original domain but I could join a different domain within the organisation. After joining the REDMOND domain and restarting the machine, I was then finally able to rejoin the EUROPE domain.

When rejoining the VMs to the domain I used our build account credentials rather than my userid and password so that we wouldn’t have the same issue again when a member of the team left.


Written by Sea Monkey

April 24, 2008 at 9:44 pm

Posted in Environments


Ken's First Law of Problem Diagnosis


A long time ago in a former career in mainframe system software development and support I worked in a team alongside Mark and led by Ken. The three of us spent most of our time diagnosing customer issues with the company’s products.

Typically, we’d have to work our way through several hundred pages of hardcopy printout of a mainframe core dump to diagnose the cause of a system crash. We’d start by looking at the CPU registers to determine where the failure occurred and what was going on at the time. Then we’d examine the OS’s and our product’s control blocks and work our way backwards through the code to see how the failure had occurred. Sometimes we were lucky and recognised a problem that we’d seen previously and could short circuit much of the diagnosis and just confirm that the failure was the same. However, all too often it was a process that could take anywhere from a few hours to several weeks.

Ken had been doing this longer than Mark and I had and had a number of observations about it that we called “Ken’s Laws”. The first law was that the length of time it takes to diagnose a problem is inversely proportional to the size of the patch code required to fix it.

At first this seems like a very odd observation but it reflects the fact that the problems that take a long time to diagnose are usually very subtle and are often resolved by a very small change in the code.

The most dramatic example of this was a problem that I spent two weeks working on. I came close to giving up on it a few times because there’s no guarantee that the answer will be found in the core dump. But I kept at it with help from Ken and Mark and finally the cause of the problem began to appear.

It came down to a single branch instruction in a piece of code that dealt with the recognition of physical disk devices. The developer who wrote the code had been aware that IBM was introducing a new device and had included support for it. Unfortunately, the preliminary documentation did not correctly reflect the device identifier codes used for all versions of the new disk type. The branch instruction that passed control to the code to deal with the new disk type had been coded as an assembler BNEH (branch not higher or equal) instruction and it should have been a BNH (branch not higher). The assembler generated the same machine code branch instruction for both assembler instructions with a 4 bit mask defining the branch condition. The difference between the two instructions was a single bit in the mask.

So, after about 70-80 hours of work on this problem the cause was identified as a single incorrect bit. Patches work by replacing bytes so although the problem was one bit in a 4 bit mask the smallest patch I could write was one byte long.
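The point about patch granularity can be illustrated with a minimal Python sketch. It assumes the System/360-style BC (branch on condition) encoding, in which opcode X'47' is followed by a byte whose high four bits hold the condition mask. Mask 13 is the standard value behind the BNH mnemonic; the faulty mask of 12, the base register and the displacement are hypothetical values chosen only for illustration, since the actual bytes are not given here.

```python
def apply_patch(code: bytes, offset: int, replacement: bytes) -> bytes:
    """Return a copy of `code` with whole bytes replaced starting at `offset`."""
    patched = bytearray(code)
    patched[offset:offset + len(replacement)] = replacement
    return bytes(patched)


def bits_changed(before: bytes, after: bytes) -> int:
    """Count the individual bits that differ between two equal-length byte strings."""
    return sum(bin(a ^ b).count("1") for a, b in zip(before, after))


# Hypothetical 4-byte BC instruction: opcode, mask/index byte, base/displacement.
# 0xC0 carries mask 12 (an assumed wrong value); 0xD0 carries mask 13 (BNH).
faulty = bytes([0x47, 0xC0, 0xB1, 0x00])

# The patch replaces one whole byte even though only one bit needs to change.
fixed = apply_patch(faulty, 1, bytes([0xD0]))
```

Here `bits_changed(faulty, fixed)` reports a single differing bit, while the patch itself is still one full byte long.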

A corollary of this example came a few months later. A system dump arrived and within 10 minutes I’d identified that the failure was due to a new variant of a disk that our software did not support. The solution was a patch that added support for the new disk but, as this required information about the geometry of the device, the patch ran to several hundred bytes.

Few developers have to spend time with system dumps these days but I think Ken’s First Law still holds – the most subtle and pernicious issues are the ones that take the longest to identify.

Written by Sea Monkey

April 18, 2008 at 4:17 pm

Posted in Debugging


Running multiple versions of .NET in IIS 6.0


I run a private but publicly accessible web site that I wrote about four years ago that provides me with access to a webcam set up in my home and various other services such as webmail. It’s run without a glitch for years and then today it stopped working. Whenever I tried to access it I got this message in my browser:

Server Application Unavailable
The web application you are attempting to access on this web server is currently unavailable.  Please hit the “Refresh” button in your web browser to retry your request.

Administrator Note: An error message detailing the cause of this specific request failure can be found in the application event log of the web server. Please review this log entry to discover what caused this error to occur.

I checked the event log and found this error:

Source: ASP.NET 1.1.4322.0
EventID: 1062
Description:
It is not possible to run two different versions of ASP.NET in the same IIS process. Please use the IIS Administration Tool to reconfigure your server to run the application in a separate process.

What puzzled me was that I couldn’t understand why the problem was occurring now: I’d been using the web site regularly over the past few weeks without any problems and I hadn’t changed anything recently. What had changed to cause the problem?

The last thing I’d done on the machine that hosts the web site was to install a new web service that reports on the status of the machine: free disk space, etc. The web service is written in .NET 2.0, so presumably this was what the message was referring to: IIS was objecting to an attempt to run two different versions of the .NET CLR in the same process. However, I’d written and installed that service about three weeks ago so why didn’t the problem show up then?

I typically only use the web site when I’m not at home, usually when I’m at work. The only application that uses the web service is an application that runs on my desktop computer at home to show me the status of my servers. So, requests against the web site and the web service were never being made at the same time. What was different about today was that I was at home using my desktop computer (with the application that calls the web service) and I’d tried to login to the web site.

Presumably, both processes were also being unloaded after a period of inactivity so I’d even managed to use both the web site and the web service within hours of each other over the past three weeks without problems. It was only when I tried to use both at the same time or within a short period of each other that the conflicting .NET runtimes became an issue.
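The reasoning above can be captured in a toy model. This is only a sketch of the presumed behaviour, not of how IIS is implemented: the WorkerProcess class, its handle method and the 20 minute idle timeout are all assumptions made for illustration.

```python
class WorkerProcess:
    """Toy model of a single worker process shared by everything in one AppPool."""

    def __init__(self, idle_timeout_minutes: int = 20):
        self.idle_timeout = idle_timeout_minutes  # assumed value, for illustration
        self.loaded_clr = None                    # CLR version currently loaded, if any
        self.last_request = None                  # time of the last request, in minutes

    def handle(self, clr_version: str, now: int) -> str:
        # After a period of inactivity the process is unloaded, taking its CLR with it.
        if self.last_request is not None and now - self.last_request >= self.idle_timeout:
            self.loaded_clr = None
        self.last_request = now
        if self.loaded_clr is None:
            self.loaded_clr = clr_version
        # A process that already hosts one CLR version cannot host another.
        if self.loaded_clr != clr_version:
            return "Server Application Unavailable"
        return "OK"


pool = WorkerProcess()
first = pool.handle("1.1", now=0)      # web site used from work
second = pool.handle("2.0", now=180)   # web service called hours later
third = pool.handle("1.1", now=185)    # web site again, five minutes after that
```

In this model `first` and `second` both succeed because the process was unloaded in between, while `third` fails because the process still holds the 2.0 runtime.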

The solution was simple enough. Both the web site and the web service used the DefaultAppPool and so were running in the same process. So, I created a new AppPool based on the DefaultAppPool with a suitable name and then configured the web site to use the new AppPool. The steps are worth documenting:

First, create the new AppPool:

1. Start the IIS Manager using Start | Administrative Tools | Internet Information Services (IIS) Manager

2. Expand the local computer and the Application Pools folders.

3. Right-click the Application Pools folder and select New | Application Pool…

4. Enter the name of your new application pool, select the option to Use existing application pool as template, and select DefaultAppPool in the drop down list.

5. Click OK.

Now set the web site to use the new AppPool:

6. Expand the Web Sites folder and then the web site that hosts your application (e.g. Default Web Site).

7. Right-click the folder for your web site and select Properties.

8. Select your newly created AppPool in the Application Pool drop down list and click OK.

If you tend to use the DefaultAppPool for your web applications and you want to run multiple versions of .NET then you might want to create a default AppPool for each version of .NET that you intend to run.

Written by Sea Monkey

April 11, 2008 at 4:43 pm

Posted in Deployment


Using long paths in .NET


Long paths

One of the limitations of the .NET Framework is that the Directory, DirectoryInfo, File, FileInfo, and Path classes in the System.IO namespace are limited to a maximum path length of MAX_PATH (defined as 260) characters. This used to be a limitation imposed by MS-DOS and Windows but the Unicode versions of many Windows API functions now support paths of up to approximately 32,767 characters.

So why does the .NET Framework continue to have this restriction? Sadly, it’s just down to effort, or rather the lack of time to implement long path support in the framework. This is discussed in the .NET Framework team’s blog in Long Paths in .NET, Part 1 of 3.

What the \\?\ ?

Long paths are supported in the Unicode versions of some of the Windows API functions using a special syntax. Prepending the character sequence \\?\ to a path passed into a function indicates that it’s a long path.

UNC paths are handled slightly differently and use this syntax: \\?\UNC\<machine-name>\<share-name>. So, for example, the UNC path \\Neptune\Files becomes \\?\UNC\Neptune\Files.

Note that each directory component of a long path cannot exceed 255 characters in length. You also need to be aware that using the long path format turns off some of the massaging that the API functions perform, e.g. resolving references to ‘.’ and ‘..’, and converting relative paths to full paths.
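The prefixing rules are simple enough to sketch in a few lines of Python. This is an illustration only: the function and constant names are hypothetical, and the sketch assumes that the caller passes a full path with no ‘.’ or ‘..’ components, because the API functions will not tidy these up once the prefix is present.

```python
LONG_PREFIX = "\\\\?\\"            # prefix for drive-letter paths
UNC_LONG_PREFIX = "\\\\?\\UNC\\"   # prefix for UNC paths
MAX_COMPONENT = 255                # per-component limit described above


def to_long_path(path: str) -> str:
    """Convert a full Windows path to its long path form."""
    if path.startswith(LONG_PREFIX):
        return path  # already in long path form
    for component in path.split("\\"):
        if len(component) > MAX_COMPONENT:
            raise ValueError("path component exceeds 255 characters")
    if path.startswith("\\\\"):
        # UNC path: drop the leading two backslashes and add the UNC prefix
        return UNC_LONG_PREFIX + path[2:]
    return LONG_PREFIX + path
```

For example, passing the UNC path for the Files share on Neptune returns the same path prefixed with the UNC long path sequence, matching the example above.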

There’s no indication that long paths will be supported by the .NET Framework in the near future so if you need to use them you’ll have to use P/Invoke to call the Windows API functions directly.

Most of the time you’re unlikely to need long path support as much of Windows itself doesn’t support it. For example, Windows Explorer does not support long paths. However, you might run into situations where you do need it. I recently wrote a small application to back up files on a network attached storage (NAS) device. This device runs a UNIX-based OS and uses Samba to provide network file storage. Some of the paths that my application needed to process exceeded MAX_PATH in length and the .NET Framework classes that I was using (System.IO’s Directory and File classes) threw exceptions.

My solution was to implement slimmed down versions of File, FileInfo, Directory, and DirectoryInfo. These classes mapped the paths passed to their methods to long path formatted strings (i.e. prepended with \\?\ and \\?\UNC\ as appropriate) that were used internally when calling the Unicode versions of the various Windows API functions required. The paths returned by methods such as Directory.GetDirectories() and Directory.GetFiles() are normal paths (i.e. without the \\?\ prefix) which may, technically, be invalid as they may be longer than MAX_PATH. However, as long as these paths are only ever passed to my classes rather than the System.IO classes, the paths are handled correctly.
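The reverse mapping, used when returning paths to the caller, might look something like the following Python sketch. The names are hypothetical and this is not the code from the classes described above; it only shows the order in which the two prefixes need to be tested.

```python
LONG_PREFIX = "\\\\?\\"            # prefix for drive-letter paths
UNC_LONG_PREFIX = "\\\\?\\UNC\\"   # prefix for UNC paths


def from_long_path(path: str) -> str:
    """Strip the long path prefix so callers see a normal-looking path."""
    # Test the UNC form first because it also starts with the shorter prefix.
    if path.startswith(UNC_LONG_PREFIX):
        return "\\\\" + path[len(UNC_LONG_PREFIX):]
    if path.startswith(LONG_PREFIX):
        return path[len(LONG_PREFIX):]
    return path  # not a long path; return unchanged
```

The result may still be longer than MAX_PATH, which is why such paths should only be handed back to classes that re-apply the prefix internally.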

The classes and methods that you’re likely to need to implement, as a minimum, are:

  • Directory
    • Exists(path)
    • GetDirectories(path)
    • GetFiles(path)
  • File
    • Copy(sourceFileName, destFileName)
    • Exists(path)
    • GetCreationTime(path)
    • GetLastAccessTime(path)
    • GetLastWriteTime(path)

For the most part it’s quite easy to implement these classes and methods using the Windows API functions but you need to be aware of some differences in the behaviour of API functions and their equivalent System.IO methods. For example, the .NET Framework Directory.CreateDirectory() will create all the required directories in a path whereas the CreateDirectory Windows API function will only create a single directory within an existing path. It’s not difficult to implement the .NET Framework behaviour but you need to be aware of where the differences lie.
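As an illustration of closing that gap, here is a minimal Python sketch that builds the create-all-directories behaviour on top of a call that only creates one level at a time. Python’s os.mkdir stands in for the CreateDirectory API function here, and create_directory_tree is a hypothetical name.

```python
import os


def create_directory_tree(path: str) -> None:
    """Create `path` and any missing parents using only single-level creates."""
    missing = []
    current = os.path.abspath(path)
    # Walk up the path until an existing directory (or the root) is reached.
    while not os.path.isdir(current):
        missing.append(current)
        parent = os.path.dirname(current)
        if parent == current:
            break  # reached the root without finding an existing directory
        current = parent
    # Create the missing directories from the top down, one level per call.
    for directory in reversed(missing):
        os.mkdir(directory)
```

Calling the function on a path that already exists does nothing, which mirrors how Directory.CreateDirectory() behaves.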

Summary

  • The .NET Framework does not support paths longer than MAX_PATH (260) characters
  • Use the Unicode versions of Windows API functions such as FindFirstFile, FindNextFile, FindClose, CreateDirectory, etc.
  • Use \\?\ and \\?\UNC\ prefixes to enable long path support in these Windows API functions.

References

Long Paths in .NET, Part 1 of 3 – The .NET Framework team blog entry about long paths.

Naming a file – MSDN page describing file and path names and the use of the \\?\ prefix.

PInvoke wiki – A wiki of PInvoke signatures for .NET languages.

Written by Sea Monkey

April 6, 2008 at 9:35 pm

Posted in Development
