Short Stories

// Tales from software development

The limits of ServicesPipeTimeout and ServiceBase.RequestAdditionalTime

with 7 comments

The project that I’m currently working on uses a Windows Service. We deployed the project to the customer’s acceptance testing environment earlier this week and the service failed to start. Each attempt just resulted in an “Error 1053: The service did not respond to the start or control request in a timely fashion” message.

There’s some code in the executable’s entry point that validates the service’s configuration data and, as part of this, runs a query against a database. In the development and test environments this query completes in around 5-6 seconds but in the production environment it takes around 50 seconds. The Service Control Manager (SCM) expects a service to start in 30 seconds, or less, and aborts the start if it takes longer.

If you search the internet for similar problems and possible solutions you’ll see that there’s a registry hack to extend the time the SCM allows for a service to start. You can set the ServicesPipeTimeout DWORD value in HKLM\System\CurrentControlSet\Control\ to the value in milliseconds that you want the SCM to use as a timeout. Except… It didn’t work for me.
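For reference, the registry change described above can be sketched in C# as follows. This is only an illustration: the 120000 millisecond figure is an arbitrary example, the class name is made up, and the code needs to run with administrative rights. As far as I know the SCM only reads the value when it starts, so a reboot is normally needed before the change has any effect.

```csharp
using Microsoft.Win32;

// Hypothetical helper; the class name is illustrative only.
static class SetServicesPipeTimeout
{
    static void Main()
    {
        // Example value only (two minutes), not a recommendation.
        const int timeoutMilliseconds = 120000;

        // Writes the DWORD value under HKLM\SYSTEM\CurrentControlSet\Control.
        // The setting is machine-wide: it applies to every service, not just one.
        Registry.SetValue(
            @"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control",
            "ServicesPipeTimeout",
            timeoutMilliseconds,
            RegistryValueKind.DWord);
    }
}
```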

There are two actions that the SCM takes to start a service. First it loads the executable and calls its main entry point. In a .NET assembly this code creates an instance of a class derived from ServiceBase and passes it to the static ServiceBase.Run method, which registers the service with the SCM. The SCM then sends the service a start request, which is handled in the derived class’s overridden OnStart method.
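The two stages map onto a .NET service roughly as in the skeleton below. It is a minimal sketch rather than the code from my project: ExampleService and Program are hypothetical names, and the method bodies are left empty.

```csharp
using System.ServiceProcess;

// Hypothetical service class; the name is illustrative only.
public class ExampleService : ServiceBase
{
    public ExampleService()
    {
        ServiceName = "ExampleService";
    }

    protected override void OnStart(string[] args)
    {
        // Stage two: runs when the SCM sends the start request.
    }

    protected override void OnStop()
    {
        // Runs when the SCM sends the stop request.
    }
}

public static class Program
{
    public static void Main()
    {
        // Stage one: anything placed here runs before the service
        // has connected to the SCM.
        ServiceBase.Run(new ExampleService());
    }
}
```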

It appears that the ServicesPipeTimeout value doesn’t have any bearing on how long the SCM will wait when it calls the service executable’s main entry point and, of course, this is where my validation code was. I haven’t confirmed this yet but I’m guessing that the ServicesPipeTimeout value only affects how long the SCM waits when it’s called the Start method.

I moved the startup validation code from the main entry point into the OnStart method. At this point increasing the value of ServicesPipeTimeout would probably have resolved the problem but it would also apply to all the services on the machine and I wondered if there was an alternative that would only affect my service. A quick browse of the methods exposed by the ServiceBase class revealed the promisingly named RequestAdditionalTime method.
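A minimal sketch of that arrangement is shown below. ExampleService and ValidateConfiguration are hypothetical names standing in for my service and its database-backed validation, and the 60000 millisecond figure is just an example.

```csharp
using System.ServiceProcess;

// Hypothetical service class; the name is illustrative only.
public class ExampleService : ServiceBase
{
    protected override void OnStart(string[] args)
    {
        // Ask the SCM to allow up to 60 seconds for the pending start.
        // The value is in milliseconds and is an example only.
        RequestAdditionalTime(60000);

        ValidateConfiguration();
    }

    // Hypothetical placeholder for the slow validation query.
    private void ValidateConfiguration()
    {
    }
}

public static class Program
{
    public static void Main()
    {
        // No slow work here; the validation now happens in OnStart.
        ServiceBase.Run(new ExampleService());
    }
}
```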

I wrote a test service that waits for configurable amounts of time in the entry point code and in the OnStart and OnStop handlers. I found that RequestAdditionalTime does indeed cause the SCM to wait longer than the default timeout period specified by ServicesPipeTimeout, but only up to a point.

The MSDN documentation doesn’t mention this but it appears that the value specified in RequestAdditionalTime is not actually ‘additional’ time. Instead, it replaces the value in ServicesPipeTimeout. Worse still, any value greater than two minutes (120000 milliseconds) is capped at two minutes.

Two minutes is still a lot more useful than 30 seconds but there’s no excuse for Microsoft not documenting this limitation. Even the BCL Team’s blog entry on the subject doesn’t mention it.

Written by Sea Monkey

December 18, 2009 at 10:00 am

Posted in Development

7 Responses

  1. Thanks for figuring out whether RequestAdditionalTime gives you more time on top of the default 30 seconds, or gives you whatever time value you pass to the method. The documentation really should be more explicit on that point, as well as the 2-min max.

    Chris Tybur

    January 15, 2010 at 11:22 pm

  2. Have you learned anything in the time since this post that might contradict the limit on RequestAdditionalTime? I’m in a situation where I’m hoping that there is a mechanism for arbitrarily long stop timeouts. See http://stackoverflow.com/questions/30003825/what-is-a-safe-overhead-for-requestadditionaltime

    Lars Kemmann

    May 2, 2015 at 4:16 pm

    • Lars, the short answer is no.

      What I would say is that I now realise that Windows Services ought to be designed to start and terminate processing quickly when requested to do so.

      As developers, we tend to focus on the implementation of the processing and then package it up and deliver it as a Windows Service.

      However, this really isn’t the correct approach to designing Windows Services. Services must be able to respond quickly to requests to start and stop not only when an administrator is making the request from the services console but also when the operating system is requesting a start as part of its start up processing or a stop because it is shutting down.

      Consider what happens when Windows is configured to shut down when a UPS signals that the power has failed. It’s not appropriate for the service to respond with “I need a few more minutes…”.

      It’s possible to write services that react quickly to stop requests even when they implement long running processing tasks. Usually a long running process will consist of batch processing of data and the processing should check if a stop has been requested at the level of the smallest unit of work that ensures data consistency.

      As an example, the first service where I found the stop timeout was a problem involved the processing of a notifications queue on a remote server. The processing retrieves a notification from the queue, calls a web service to retrieve data related to the subject of the notification, and then writes a data file for processing by another application.

      I implemented the processing as a timer driven call to a single method. Once the method is called it doesn’t return until all the notifications in the queue have been processed. I realised this was a mistake for a Windows Service because occasionally there might be tens of thousands of notifications in the queue and processing might take several minutes.

      The method is capable of processing 50 notifications per second. So, what I should have done was implement a check to see if a stop had been requested before processing each notification. This would have allowed the method to return when it has completed the processing of a notification but before it has started to process the next notification. This would allow the service to respond quickly to a stop request and any pending notifications would remain queued for processing when the service is restarted.
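A minimal sketch of that per-notification check might look like the following. All the names are hypothetical, and an in-memory Queue<string> stands in for the remote notifications queue; only the placement of the stop check matters.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical class; names are illustrative only.
public class NotificationProcessor
{
    private readonly ManualResetEvent _stopRequested = new ManualResetEvent(false);

    // Intended to be called from the service's OnStop override.
    public void RequestStop()
    {
        _stopRequested.Set();
    }

    // Intended to be called by the timer. Returns the number of
    // notifications processed before the queue emptied or a stop was seen.
    public int ProcessQueue(Queue<string> queue)
    {
        int processed = 0;
        while (queue.Count > 0)
        {
            // Check between units of work, never in the middle of one.
            // WaitOne(0) polls the event without blocking.
            if (_stopRequested.WaitOne(0))
            {
                break;
            }

            string notification = queue.Dequeue();
            Process(notification);
            processed++;
        }
        return processed;
    }

    // Placeholder for: call the web service, then write the data file.
    private void Process(string notification)
    {
    }
}
```

Anything still in the queue when the loop exits is simply left there for the next run.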

      Sea Monkey

      May 3, 2015 at 10:56 am

    • Wow, thank you so much for the detailed answer on an old post! I appreciate the design insight, and you’re absolutely right. I was planning on very quick shutdown but now I just have to make it a policy for all child activities as well.

      Lars Kemmann

      May 6, 2015 at 3:23 am

  3. For what it’s worth, I observed (checked on Windows 8.1 and 10) a different behavior from what you described: if I hang (Thread.Sleep) in the OnStart method, the service stays in the “Starting” state indefinitely, and if I hang in the Main method, the service is terminated after 30 seconds, or after the ServicesPipeTimeout period no matter how high it is.
    What does happen after about 2 minutes is that the Services.msc snap-in shows a timeout error message, but it doesn’t affect the state of the service.

    Evgeny

    July 4, 2015 at 3:26 pm

    • Hello All,
      Just to complement this page with my recent experience of NT service startup:

      I can confirm Evgeny’s comment: when a service startup times out, Services.msc always shows the timeout error message after 2 minutes, but the service process keeps running, so the service does eventually start.
      The progress bar shown by Services.msc when starting a service is totally fake and always waits a maximum of 2 minutes before the timeout error message pops up…

      If your service is configured to start automatically after OS startup, the system can behave differently: if a service reaches the timeout after being asked to start, it can be arbitrarily killed by the Service Control Manager.
      In that scenario the SCM logs an error event in the System log with a clear message containing the service name and the timeout used, and the service remains stopped after the system has started…
      Maybe this happens only if the OS needs resources at startup, because I have timed some “OnStart” executions which were slightly longer than the timeout, but my service was not always killed at OS startup…
      We faced this scenario lately because some Windows 10 update had removed the “ServicesPipeTimeout” registry key that our product setup used to increase to 120000…

      After reading this post, I’m trying to add a “ServiceBase.RequestAdditionalTime(120000)” at the top of our “OnStart” method and hope this will solve our timeout issue.

      Gegesse

      January 21, 2016 at 1:27 pm

  4. Amos

    January 25, 2016 at 4:18 pm

