Atlas 5: Bad calls make good calls time out

If two HTTP calls get stuck for too long, those two bad calls will
also make some good calls expire, namely the calls that were queued
behind them in the meantime. Here’s an example:

function timeoutTest()
{
    PageMethods.Timeout( { timeoutInterval : 3000,
        onMethodTimeout: function() { debug.dump("Call 1 timed out"); } } );

    PageMethods.Timeout( { timeoutInterval : 3000,
        onMethodTimeout: function() { debug.dump("Call 2 timed out"); } } );

    PageMethods.DoSomething( 'Call 1', { timeoutInterval : 3000,
        onMethodTimeout: function() { debug.dump("DoSomething 1 timed out"); } } );

    PageMethods.DoSomething( 'Call 2', { timeoutInterval : 3000,
        onMethodTimeout: function() { debug.dump("DoSomething 2 timed out"); } } );

    PageMethods.DoSomething( 'Call 3', { timeoutInterval : 3000,
        onMethodTimeout: function() { debug.dump("DoSomething 3 timed out"); } } );
}

I am calling a server method named “Timeout”, which does nothing
but wait for a long time so that the call times out. After that, I
am calling a method that does not time out. But guess what the
output is:

Only one call, “DoSomething 1”, succeeded. Try again and you
might see this:

Now two calls succeeded. So if at any moment the browser’s two
connections get jammed, you can expect the other waiting calls to
time out as well.
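
The queuing effect described above can be illustrated with a small, idealized simulation. This is only a sketch under stated assumptions: the function name simulateQueue, the call durations, and the rule that every call's timeout clock starts the moment the call is issued (even while it is still waiting for a free connection) are all assumptions made for illustration, not Atlas or browser internals.

```javascript
// Idealized model (assumption): all calls are issued at t = 0, the browser
// allows `maxConnections` concurrent connections, and each call's timeout
// clock starts at t = 0 even while the call is still queued.
function simulateQueue(calls, maxConnections) {
  var freeAt = [];                       // time at which each connection frees up
  for (var i = 0; i < maxConnections; i++) { freeAt.push(0); }

  var results = [];
  calls.forEach(function (call) {
    // Calls are served in order; each takes the connection that frees up first.
    var slot = 0;
    for (var j = 1; j < freeAt.length; j++) {
      if (freeAt[j] < freeAt[slot]) { slot = j; }
    }
    var start = freeAt[slot];
    var deadline = call.timeout;         // measured from t = 0

    if (start >= deadline) {
      // The call expired while still waiting in the queue.
      results.push({ name: call.name, timedOut: true });
      return;
    }
    var finish = start + call.duration;
    // A timed-out call releases its connection at the deadline.
    freeAt[slot] = Math.min(finish, deadline);
    results.push({ name: call.name, timedOut: finish > deadline });
  });
  return results;
}

// Two stuck calls followed by three fast ones (durations are made up).
var demoCalls = [
  { name: 'Timeout 1',     duration: 10000, timeout: 3000 },
  { name: 'Timeout 2',     duration: 10000, timeout: 3000 },
  { name: 'DoSomething 1', duration: 100,   timeout: 3000 },
  { name: 'DoSomething 2', duration: 100,   timeout: 3000 },
  { name: 'DoSomething 3', duration: 100,   timeout: 3000 }
];
var demoResults = simulateQueue(demoCalls, 2);
demoResults.forEach(function (r) {
  console.log(r.name + (r.timedOut ? ' timed out' : ' succeeded'));
});
```

In this strict model all three fast calls expire, because both connections only free up at the exact moment the queued calls' own timeouts fire. A real browser is racier than this, which would be consistent with one or two of the good calls sometimes squeezing through.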

In Pageflakes, we used to get nearly 400 to 600 timeout error
reports from users’ browsers. We could never figure out how this
could happen. First we suspected slow internet connections, but
that could not be the case for so many users. Then we suspected
that something was wrong with the hosting provider’s network. We
did a lot of network analysis to find out whether there was any
problem on the network, but we could not detect any. We used SQL
Profiler to see whether any long-running query was exceeding the
ASP.NET request execution timeout. But no luck. We finally
discovered that it mostly happened because some bad calls got stuck
and made the good calls expire too. So we modified the Atlas
runtime and introduced automatic retry in it, and the problem
disappeared completely.

However, this auto retry requires sophisticated open-heart bypass
surgery on the Atlas runtime JavaScript code, which you have to
perform again and again whenever Microsoft releases a newer version
of the Atlas runtime. You also can no longer use the tag that
produces the Atlas runtime references; instead you have to manually
put in links to the Atlas runtime and compatibility JavaScript
files. So you had better do auto retry yourself in your own code
from day 1. In the onMethodTimeout handler, always make one retry
to be on the safe side.
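
A retry-once wrapper in your own code might look roughly like the sketch below. The helper name callWithRetry and its parameters are hypothetical and not part of Atlas. The only Atlas details it relies on are the ones visible in the example above: an options object carrying timeoutInterval and onMethodTimeout.

```javascript
// Hypothetical helper (not part of Atlas). `invoke` receives an options
// object and is expected to pass it to the actual call. If the call times
// out, it is re-issued up to `maxRetries` times before the caller's own
// onMethodTimeout handler is finally invoked.
function callWithRetry(invoke, options, maxRetries) {
  var attempts = 0;

  function attempt() {
    attempts++;

    // Copy the caller's options so the original object is left untouched.
    var callOptions = {};
    for (var key in options) {
      if (options.hasOwnProperty(key)) { callOptions[key] = options[key]; }
    }

    callOptions.onMethodTimeout = function () {
      if (attempts <= maxRetries) {
        attempt();                                   // retry the call
      } else if (options.onMethodTimeout) {
        options.onMethodTimeout.apply(this, arguments);  // give up
      }
    };

    invoke(callOptions);
  }

  attempt();
}

// Usage, assuming the PageMethods proxy from the example above:
//
// callWithRetry(
//   function (opts) { PageMethods.DoSomething('Call 1', opts); },
//   { timeoutInterval: 3000,
//     onMethodTimeout: function () { debug.dump("DoSomething 1 timed out twice"); } },
//   1);
```

Keeping the retry in a wrapper like this leaves the Atlas runtime files unmodified, so nothing has to be re-patched when a new runtime version ships.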
