Short Stories

// Tales from software development

Archive for September 2010

Threading issues with legacy unmanaged DLLs


One of the interfaces that I’ve recently written allows a desktop application to request data from a remote data store.

The desktop application sends its request to a remoting server running as a Windows Service on a LAN server that then calls the customer’s interface to retrieve the data. The customer’s interface is implemented as a Delphi 6 DLL that was written in 2002 and has not been updated since. The Windows Service uses P/Invoke to call the DLL.

The development of the interface was difficult because there was no test environment and the customer would not allow any testing in their live environment. The call to the DLL had to be replaced with a mock function.

The initial implementation of the client/server interface was deployed in April and was tested successfully. There were suspicions that the customer’s testing wasn’t very rigorous but this was their responsibility and we didn’t believe that there were any problems with the interface.

Inevitably, it was only a couple of weeks ago, in the week before the project’s go-live date, that the customer tested the interface thoroughly and discovered serious problems. The DLL failed with errors and exceptions on approximately 40% of the calls made to it. Curiously, one of the error messages was “KILLED :”. The customer didn’t know what this signified and it had never been seen in any other interface that called the DLL.

After a bit of head scratching I began to think that this was a threading issue. The Windows Service was receiving requests from multiple clients and each request would be running on its own thread.

The first step was to synchronize the calls to the DLL so that only one request was made at a time. This was a quick fix – all that had to be done was to wrap the code that called the DLL in a lock(object) {…} block.
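A minimal sketch of that first fix might look like the following. The class name, the lock field, and the CallDll stand-in are all hypothetical and are not taken from the real service; the stand-in simply returns a string where the real code would make the P/Invoke call. Note that a lock only serialises the calls. It does nothing to control which thread each call runs on.

```csharp
public static class SynchronizedCaller
{
    // One private lock object shared by every caller (hypothetical name).
    private static readonly object SyncRoot = new object();

    // Hypothetical stand-in for the P/Invoke call into the customer's DLL.
    private static string CallDll(string argument)
    {
        return "response for " + argument;
    }

    public static string Request(string argument)
    {
        // Only one thread at a time can be inside this block, so calls to
        // the DLL never overlap. Each call still runs on whichever thread
        // the request happened to arrive on.
        lock (SyncRoot)
        {
            return CallDll(argument);
        }
    }
}
```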

When this version was deployed the error rate improved slightly, but between 25% and 30% of the calls to the DLL were still failing.

If the problem was in the way my code called the DLL rather than in the DLL itself, then the most likely reason was that the DLL needed to be called on the same thread each time, i.e. once it was loaded and initialised in the service’s process memory it would only work when called on the thread it was first called on.

The solution required goes against the grain. These days we tend to look for ways to implement multi-threaded processing, but this problem required requests arriving on multiple threads to be handled by a single thread.

It’s not the kind of code that you’d want to write on a regular basis but it does provide a solution to this particular problem. It’s a static class that starts a dedicated thread on which all calls to the DLL are made. The thread runs continuously but is dispatched using a ManualResetEvent. Separate ManualResetEvent objects are also used to signal that the call to the DLL has completed and that the thread should exit. When the Windows Service is stopped it should call the Stop() method to terminate the DLL caller thread.

Once this code was implemented the error rate for calling the DLL dropped to less than 0.5%. It’s likely that whatever problems remain are in the DLL rather than a problem with the way it’s being called. 
 

namespace Vitality.Integration.DataInterface
{
    using System;
    using System.Runtime.InteropServices; // Required for DllImport and MarshalAs
    using System.Threading;

    public static class DLLCaller
    {
        private static object lockObject;
        private static Thread dllThread;
        private static string functionArgument;
        private static string functionResponse;
        private static ManualResetEvent callRequested = new ManualResetEvent(false);
        private static ManualResetEvent callComplete = new ManualResetEvent(false);
        private static ManualResetEvent stopRequested = new ManualResetEvent(false);

        [DllImport("DataSvr.dll", CharSet = CharSet.Ansi)]
        private static extern string RequestData([MarshalAs(UnmanagedType.AnsiBStr)] string functionArgument);

        static DLLCaller()
        {
            lockObject = new object();
            Start();
        }

        public static string HandleIFSRequest(string functionArgument)
        {
            lock (lockObject)
            {
                DLLCaller.functionArgument = functionArgument;
                DLLCaller.functionResponse = string.Empty;

                // Signal the DLL caller thread to make the call to the DLL:
                DLLCaller.callComplete.Reset();
                DLLCaller.callRequested.Set();

                // Wait for the call to complete:
                DLLCaller.callComplete.WaitOne();
                return DLLCaller.functionResponse;
            }
        }

        public static void Start()
        {
            // Reset the wait handles:
            DLLCaller.callRequested.Reset();
            DLLCaller.callComplete.Reset();
            DLLCaller.stopRequested.Reset();

            // Start the DLL call processing thread:
            ThreadStart threadStart = new ThreadStart(CallDataRequest);
            DLLCaller.dllThread = new Thread(threadStart);
            DLLCaller.dllThread.Start();
        }

        public static void Stop()
        {
            // Signal that a stop has been requested:
            DLLCaller.stopRequested.Set();
        }

        private static void CallDataRequest()
        {
            do
            {
                // Wait on the exit requested and call requested handles:
                if (WaitHandle.WaitAny(new WaitHandle[] { stopRequested, callRequested }) == 0)
                {
                    // If WaitAny returns 0 it indicates that the first element in the array was
                    // signalled, i.e. stop has been requested.
                    break;
                }

                try
                {
                    DLLCaller.functionResponse = RequestData(functionArgument);
                }
                catch (Exception exception)
                {
                    DLLCaller.functionResponse = exception.Message;
                }

                DLLCaller.callRequested.Reset();
                DLLCaller.callComplete.Set();
            }
            while (true);
        }
    }
}
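
The same handshake can be tried out without the customer’s DLL. The sketch below is an illustration only: every name in it (SingleThreadDemo, MockRequestData, Run and so on) is hypothetical, and a mock function that reports the managed thread ID it ran on stands in for the DLL call. Four client threads make requests and the collected IDs show whether every call landed on the one worker thread.

```csharp
using System.Collections.Generic;
using System.Threading;

public static class SingleThreadDemo
{
    private static readonly object Gate = new object();
    private static readonly ManualResetEvent Requested = new ManualResetEvent(false);
    private static readonly ManualResetEvent Completed = new ManualResetEvent(false);
    private static readonly ManualResetEvent StopRequested = new ManualResetEvent(false);
    private static string argument;
    private static string response;

    // Mock standing in for the DLL call: reports the thread it ran on.
    private static string MockRequestData(string arg)
    {
        return arg + "@" + Thread.CurrentThread.ManagedThreadId;
    }

    // Worker loop: WaitAny returns 0 when the stop handle is signalled.
    private static void Worker()
    {
        while (WaitHandle.WaitAny(new WaitHandle[] { StopRequested, Requested }) != 0)
        {
            response = MockRequestData(argument);
            Requested.Reset();
            Completed.Set();
        }
    }

    // Called from any thread; the lock ensures one request at a time.
    private static string Call(string arg)
    {
        lock (Gate)
        {
            argument = arg;
            Completed.Reset();
            Requested.Set();
            Completed.WaitOne();
            return response;
        }
    }

    // Makes four requests from four client threads and returns the set of
    // thread IDs on which the mock was actually executed.
    public static HashSet<string> Run()
    {
        Thread worker = new Thread(Worker);
        worker.Start();

        HashSet<string> threadIds = new HashSet<string>();
        Thread[] clients = new Thread[4];
        for (int i = 0; i < clients.Length; i++)
        {
            clients[i] = new Thread(() =>
            {
                string result = Call("req");
                lock (threadIds) { threadIds.Add(result.Split('@')[1]); }
            });
            clients[i].Start();
        }

        foreach (Thread client in clients)
        {
            client.Join();
        }

        StopRequested.Set();
        worker.Join();
        return threadIds;
    }
}
```

If the pattern behaves as intended, the set returned by Run() contains a single thread ID however many client threads make requests.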

Written by Sea Monkey

September 30, 2010 at 8:18 am

Posted in Development


It’s not an issue, it’s a bug.


Written by Sea Monkey

September 28, 2010 at 8:00 am

Posted in Comment


Fortunately, Unfortunately



Unfortunately, the Samsung SpinPoint system drive in my virtual machine host server failed yesterday morning even though it was working fine the day before and had given no warning signs of its imminent demise. It spun up but it was dead in all other respects.

Fortunately, I’d created a copy of the system drive and left it installed but not connected inside the case of the server for precisely this scenario.

Unfortunately, I’d imaged it with a version of Norton Ghost that doesn’t create usable copies of Windows Server 2008 boot disks.

Fortunately, all the data, including virtual machine images, an SVN repository, and a Perforce repository, was on the RAID-protected D:\ drive.

Unfortunately, that still meant re-installing Windows, Virtual Server, SVN, Perforce, and a few other programs.

Fortunately, Windows Server 2008 installs fairly quickly.

Unfortunately, I couldn’t find the driver CD for the RAID storage controller because I’d put it somewhere safe but couldn’t remember where that was.

Fortunately, it was downloadable from the Adaptec web site.

Unfortunately, it still took more than a day to get the server back to full functionality.

I guess it’s time to look at creating an image of the system drive again but this time I won’t be using Norton Ghost even if it does now support Windows Server 2008.

Written by Sea Monkey

September 24, 2010 at 1:00 pm

Posted in Hardware


The eight fallacies of distributed computing


I was reading Oren Eini’s article on Building Distributed Apps with NHibernate and Rhino Service Bus in the July issue of MSDN when I came across his reference to the eight Fallacies of Distributed Computing.

It’s all obvious stuff but it’s nice that someone has taken the time to list the issues and formalise the problem of thinking that remote procedure calls are the same as in-process calls.

Written by Sea Monkey

September 16, 2010 at 8:00 am

Posted in Development, Uncategorized


Windows Explorer slow to open – virtual CD/DVD ?


Occasionally, Windows Explorer on my development PC is a bit slow to open. When I investigate why, it usually turns out to be a misbehaving plugin. The last time it happened it was a particular version of TortoiseSVN that was the problem. Installing a later version resolved it.

This morning I woke my PC up from standby and every time I opened a new Explorer window the window frame would show but the contents would be blank for around 30 seconds before I’d see my drives.

Sometimes this type of problem is due to a loop in an Explorer plugin and it’s obvious which process is the culprit when examining CPU usage. So, I checked processor usage using both Task Manager and SysInternals’ Process Explorer but nothing was obviously wrong.

Fortunately, I guessed what the problem was before wasting much time and effort on it.

I use Daemon Tools to mount CD image (.iso) files. The previous day I’d had a clean-up and got rid of some of the .iso files lying around the file system, including the one that was currently mounted by Daemon Tools. I’m surprised that I could do this: I’d assumed that I wouldn’t be able to delete a file that was in use, but apparently that isn’t the case.

Whenever I opened a new Explorer window it was trying to read the volume information from the virtual CD and, after 30 seconds or so, failing. Hence the delay in seeing the Explorer window contents.

I clicked on the Daemon Tools tray icon and ejected and unmounted the .iso file and the problem went away.

Written by Sea Monkey

September 13, 2010 at 7:00 pm

Posted in Debugging
