10 IT Admin skills every .NET developer should have before going live

My talk at the London .NET User Group on a couple of essential skills every .NET developer should have under their belt before going live. If you are a startup or a developer who has to code, deploy and maintain .NET apps in production, you will find these techniques very handy.

http://skillsmatter.com/podcast/design-architecture/10-it-admin-skills-every-dot-net-developer-should-have-before-going-live/mh-6871

Watch the video at the Skills Matter website.

Here are the slides from the talk.

99.99% available ASP.NET and SQL Server SaaS Production Architecture

You have a hot ASP.NET + SQL Server product growing at a thousand users per day, and you have hit the limit of your own garage hosting capability. Now that you have enough VC money in your pocket, you are planning to go out and host at a real hosting facility, maybe a colocation or managed hosting. So, you are thinking: how do you design a physical architecture that will ensure performance, scalability, security and availability of your product? How can you achieve four-nine (99.99%) availability? How do you securely let your development team connect to production servers? How do you choose the right hardware for the web and database servers? Should you use a Storage Area Network (SAN) or just local disks on RAID? How do you securely connect your office computers to the production environment?

Here I will answer all these questions. Let me first show you a diagram that I made for Pageflakes, where we ensured we get four-nine availability. Since Pageflakes is a Level 3 SaaS, it's absolutely important that we build a high performance, highly available product that can be used from anywhere in the world 24/7, where the end user gets quick access to their content with complete personalization and customization, and can share it with others and with the world. So, you can take this production architecture as a very good candidate for a Level 3 SaaS:


Figure: Hosting environment

Here’s a CodeProject article that explains all the
ideas:

99.99% available ASP.NET and SQL Server SaaS Production
Architecture

Hope you like it. Appreciate your vote.



Fast, Streaming AJAX proxy – continuously download from cross domain

Due to the browser’s prohibition on cross-domain XMLHTTP calls, all AJAX websites must have a server-side proxy to fetch content from external domains like Flickr or Digg. From client-side JavaScript code, an XMLHTTP call goes to the server-side proxy hosted on the same domain, and the proxy downloads the content from the external server and sends it back to the browser. In general, all AJAX websites on the Internet that show content from external domains follow this proxy approach, except some rare ones that use JSONP. Such a proxy gets a very large number of hits when a lot of components on the website are downloading content from external domains. So, it becomes a scalability issue when the proxy starts getting millions of hits. Moreover, the web page’s overall load performance largely depends on the performance of the proxy, as it delivers content to the page. In this article, we will take a look at how we can take a conventional AJAX proxy and make it faster, asynchronous and continuously streaming, and thus make it more scalable.

You can see such a proxy in action when you go to Pageflakes.com. You will see flakes (widgets) loading many different kinds of content like weather feeds, Flickr photos, YouTube videos and RSS from many different external domains. All of this is done via a Content Proxy. The Content Proxy served about 42.3 million URLs last month, which makes it quite an engineering challenge for us to keep it both fast and scalable. Sometimes the Content Proxy serves megabytes of data, which poses an even greater engineering challenge. As such a proxy gets a large number of hits, if we can save on average 100ms from each call, we can save 4.23 million seconds of download/upload/processing time every month. That’s about 1175 man hours wasted throughout the world by millions of people staring at the browser waiting for content to download.

Such a content proxy takes an external server’s URL as a query parameter. It downloads the content from that URL and then writes the content back to the browser as the response.



Figure: Content Proxy working as a middleman between browser and
external domain

The above timeline shows how the request goes to the server, then the server makes a request to the external server, downloads the response and transmits it back to the browser. The response arrow from proxy to browser is longer than the response arrow from external server to proxy because generally the proxy server’s hosting environment has better download speed than the user’s Internet connectivity.

Such a content proxy is also available in my open source Ajax web portal, Dropthings.com. You can see from its code how such a proxy is implemented.

The following is a very simple synchronous, non-streaming,
blocking Proxy:

[WebMethod]
[ScriptMethod(UseHttpGet = true)]
public string GetString(string url)
{
    using (WebClient client = new WebClient())
    {
        // Blocks the ASP.NET thread until the entire content is downloaded
        string response = client.DownloadString(url);
        return response;
    }
}

Although it shows the general principle, it’s nowhere close to a real proxy because:

  • It’s a synchronous proxy and thus not scalable. Every call to this web method causes the ASP.NET thread to wait until the call to the external URL completes.
  • It’s non-streaming. It first downloads the entire content on the server, storing it in a string, and then uploads that entire content to the browser. If you pass the MSDN feed URL, it will download that gigantic 220 KB RSS XML on the server and store it in a 220 KB string (actually double the size, as .NET strings are Unicode) and then write 220 KB to the ASP.NET Response buffer, consuming another 220 KB UTF-8 byte array in memory. Then that 220 KB will be passed to IIS in chunks so that it can transmit it to the browser.
  • It does not produce proper response headers to cache the response on the browser. Nor does it deliver important headers like Content-Type from the source.
  • If the external URL provides gzipped content, it decompresses the content into a string representation and thus wastes server memory.
  • It does not cache the content on the server. So, repeated calls to the same external URL within the same second or minute will download the content from the external URL again and thus waste bandwidth on your server.

So, we need an asynchronous streaming proxy that transmits the content to the browser while it is still being downloaded from the external domain server. It will download bytes from the external URL in small chunks and immediately transmit them to the browser. As a result, the browser will see a continuous transmission of bytes right after calling the web service. There will be no delay while the content is fully downloaded on the server.
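
To make that idea concrete before we get there, here is a minimal sketch of the core streaming loop. It is only a sketch of the “transmit while downloading” part: it is still synchronous (the ASP.NET thread stays blocked), it has none of the caching, header handling or reader thread described below, and the class name is my own illustration, not from the actual Pageflakes/Dropthings code.

<%@ WebHandler Language="C#" Class="StreamingSketch" %>

using System.IO;
using System.Net;
using System.Web;

public class StreamingSketch : IHttpHandler {

    public void ProcessRequest(HttpContext context) {
        string url = context.Request["url"];

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream source = response.GetResponseStream())
        {
            // Don't buffer the whole response; push each chunk out as it arrives
            context.Response.Buffer = false;
            context.Response.ContentType = response.ContentType;

            byte[] buffer = new byte[8 * 1024];
            int bytesRead;
            while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                if (!context.Response.IsClientConnected) break;
                context.Response.OutputStream.Write(buffer, 0, bytesRead);
                context.Response.Flush();   // the browser sees this chunk immediately
            }
        }
    }

    public bool IsReusable { get { return false; } }
}

With this loop, the browser starts receiving bytes as soon as the first 8 KB chunk arrives from the external server, instead of waiting for the whole download to finish on the proxy.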

Before I show you the complex streaming proxy code, let’s take an evolutionary approach. Let’s build a better Content Proxy than the simple GetString proxy shown earlier, one which is still synchronous and non-streaming but does not have the other problems mentioned above. We will build an HTTP handler named RegularProxy.ashx which takes url as a query parameter. It also takes cache as a query parameter, which it uses to produce proper response headers in order to cache the content on the browser. Thus it saves the browser from downloading the same content again and again.

<%@ WebHandler Language="C#" Class="RegularProxy" %>

using System;
using System.Web;
using System.Web.Caching;
using System.Net;
using ProxyHelpers;

public class RegularProxy : IHttpHandler {

    public void ProcessRequest(HttpContext context) {
        string url = context.Request["url"];
        int cacheDuration = Convert.ToInt32(context.Request["cache"] ?? "0");
        string contentType = context.Request["type"];

        // We don't want to buffer because we want to save memory
        context.Response.Buffer = false;

        // Serve from cache if available
        if (context.Cache[url] != null)
        {
            context.Response.BinaryWrite(context.Cache[url] as byte[]);
            context.Response.Flush();
            return;
        }

        using (WebClient client = new WebClient())
        {
            if (!string.IsNullOrEmpty(contentType))
                client.Headers["Content-Type"] = contentType;

            client.Headers["Accept-Encoding"] = "gzip";
            client.Headers["Accept"] = "*/*";
            client.Headers["Accept-Language"] = "en-US";
            client.Headers["User-Agent"] =
                "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6";

            byte[] data = client.DownloadData(url);

            context.Cache.Insert(url, data, null,
                Cache.NoAbsoluteExpiration,
                TimeSpan.FromMinutes(cacheDuration),
                CacheItemPriority.Normal, null);

            if (!context.Response.IsClientConnected) return;

            // Deliver content type, encoding and length as received from the external URL
            context.Response.ContentType = client.ResponseHeaders["Content-Type"];
            string contentEncoding = client.ResponseHeaders["Content-Encoding"];
            string contentLength = client.ResponseHeaders["Content-Length"];

            if (!string.IsNullOrEmpty(contentEncoding))
                context.Response.AppendHeader("Content-Encoding", contentEncoding);
            if (!string.IsNullOrEmpty(contentLength))
                context.Response.AppendHeader("Content-Length", contentLength);

            if (cacheDuration > 0)
                HttpHelper.CacheResponse(context, cacheDuration);

            // Transmit the exact bytes downloaded
            context.Response.BinaryWrite(data);
        }
    }

    public bool IsReusable {
        get {
            return false;
        }
    }
}
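
The handler calls HttpHelper.CacheResponse from the ProxyHelpers namespace to emit the browser-side cache headers, but that helper is not shown in this post. As a rough sketch of what such a helper might do (my own illustration, assuming it only sets the standard HttpCachePolicy properties; the real Pageflakes/Dropthings helper may differ):

using System;
using System.Web;

namespace ProxyHelpers
{
    public static class HttpHelper
    {
        // Emit response headers so the browser (and any proxy in between)
        // can cache the content for the given number of minutes.
        public static void CacheResponse(HttpContext context, int durationInMinutes)
        {
            TimeSpan duration = TimeSpan.FromMinutes(durationInMinutes);
            HttpCachePolicy cache = context.Response.Cache;

            cache.SetCacheability(HttpCacheability.Public);
            cache.SetExpires(DateTime.Now.Add(duration));
            cache.SetMaxAge(duration);
        }
    }
}

With headers like these, a repeated request for the same URL within the cache window is served straight from the browser cache without hitting the proxy at all.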

There are several enhancements in this proxy:

  • It allows server-side caching of content. The same URL requested by a different browser within a time period will not be downloaded on the server again; instead, it will be served from the cache.
  • It generates proper response cache headers so that the content can be cached on the browser.
  • It does not decompress the downloaded content in memory. It keeps the original byte stream intact, which saves memory allocation.
  • It transmits the data in a non-buffered fashion, which means the ASP.NET Response object does not buffer the response and thus saves memory.

However, this is a blocking proxy. We need to make a streaming
asynchronous proxy for better performance. Here’s why:



Figure: Continuous streaming proxy

As you can see, when data is transmitted from the server to the browser while the server is still downloading the content, the delay of the server-side download is eliminated. So, if the server takes 300ms to download something from the external source and then 700ms to send it back to the browser, you can save up to 300ms of network latency between the server and the browser. The situation gets even better when the external server that serves the content is slow and takes quite some time to deliver the content. The slower the external site is, the more you save with this continuous streaming approach. It is significantly faster than the blocking approach when the external server is in Asia or Australia and your server is in the USA.

The approach for the continuous proxy is:

  • Read bytes from the external server in chunks of 8 KB from a separate thread (the reader thread) so that the ASP.NET thread is not blocked
  • Store the chunks in an in-memory queue
  • Write the chunks to the ASP.NET Response from that same queue
  • If the queue is empty, wait until more bytes are downloaded by the reader thread



The pipe stream needs to be thread-safe, and it needs to support blocking reads. By blocking read, I mean that if a thread tries to read a chunk while the stream is empty, the call suspends that thread until another thread writes something to the stream. Once a write happens, the reader thread resumes and reads the data. I have taken the code of PipeStream from a CodeProject article by James Kolpack and extended it to make sure it’s high performance, supports storing chunks of bytes instead of single bytes, supports timeouts on waits, and so on.
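
For reference, a bare-bones version of such a thread-safe, blocking pipe can be built on a Queue plus Monitor.Wait/Pulse. This is only a sketch of the idea under those assumptions; the real PipeStream used in the article also behaves as a proper Stream and supports timeouts and a bounded buffer.

using System.Collections.Generic;
using System.Threading;

// Bare-bones thread-safe pipe: the reader thread (downloading from the
// external server) pushes byte chunks in, and the ASP.NET thread pops them
// out. Reading from an empty pipe blocks until a chunk is written or the
// producer signals completion.
public class ChunkPipe
{
    private readonly Queue<byte[]> _chunks = new Queue<byte[]>();
    private bool _completed;

    // Producer side: called by the download thread for every chunk received.
    public void Write(byte[] chunk)
    {
        lock (_chunks)
        {
            _chunks.Enqueue(chunk);
            Monitor.Pulse(_chunks);   // wake up a waiting reader
        }
    }

    // Producer side: no more data will arrive.
    public void Complete()
    {
        lock (_chunks)
        {
            _completed = true;
            Monitor.Pulse(_chunks);
        }
    }

    // Consumer side: returns the next chunk, or null when the pipe is
    // completed and fully drained. Blocks while the pipe is empty.
    public byte[] Read()
    {
        lock (_chunks)
        {
            while (_chunks.Count == 0 && !_completed)
                Monitor.Wait(_chunks);

            return _chunks.Count > 0 ? _chunks.Dequeue() : null;
        }
    }
}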

I did some comparison between the Regular Proxy (blocking, synchronous, download-all-then-deliver) and the Streaming Proxy (continuous transmission from external server to browser). Both proxies download the MSDN feed and deliver it to the browser. The time taken here shows the total duration from the browser making the request to the proxy until it receives the entire response.



Figure: Time taken by Streaming Proxy vs Regular Proxy while
downloading MSDN feed

It is not a very scientific graph, and the response time varies with the link speed between the browser and the proxy server and then from the proxy server to the external server. But it shows that most of the time, the Streaming Proxy outperformed the Regular Proxy.



Figure: Test client to compare between Regular Proxy and Streaming
Proxy

You can also test both proxies’ response times by going to http://labs.dropthings.com/AjaxStreamingProxy. Put in your URL, hit the Regular/Stream button, and see the “Statistics” text box for the total duration. You can turn on “Cache response” and hit a URL from one browser. Then go to another browser and hit the same URL to see the response coming directly from the server cache. Also, if you hit the URL again in the same browser, you will see the response comes instantly without ever making a call to the server. That’s the browser cache at work.

Learn more about Http Response caching from my blog post:

Making best use of cache for high performance website

A Visual Studio Web Test run inside a Load Test shows a better
picture:



Figure: Regular Proxy load test result shows Average Requests/Sec 0.79 and Avg Response Time 2.5 sec



Figure: Streaming Proxy load test result shows Avg Req/Sec 1.08 and Avg Response Time 1.8 sec

From the above load test results, the Streaming Proxy’s Requests/Sec is 26% better and its Average Response Time is 29% better. The numbers may sound small, but at Pageflakes, a 29% better response time means 1.29 million seconds saved per month for all the users on the website. So, we are effectively saving 353 man hours per month that were otherwise wasted staring at the browser screen while it downloads content.

Building the Streaming Proxy

The details of how the Streaming Proxy is built are quite long and not suitable for a blog post. So, I have written a CodeProject article:

Fast, Scalable,
Streaming AJAX Proxy – continuously deliver data from cross
domain

Please read the article, and please vote for me if you find it useful.

My first book – Building a Web 2.0 Portal with ASP.NET 3.5

My first book, “Building a Web 2.0 Portal with ASP.NET 3.5” from O’Reilly, is published and available in stores. This book explains in detail the architecture design, development, test, deployment, performance and scalability challenges of my open source web portal Dropthings.com. Dropthings is a prototype of a web portal similar to iGoogle or Pageflakes, but it is developed using recently released technologies like ASP.NET 3.5, C# 3.0, LINQ to SQL, LINQ to XML, and Windows Workflow Foundation. It makes heavy use of ASP.NET AJAX 1.0. Throughout my career I have built several state-of-the-art personal, educational, enterprise and mass consumer web portals. This book collects my experience in building all of those portals.

O’Reilly Website:
http://www.oreilly.com/catalog/9780596510503/

Amazon:

http://www.amazon.com/Building-Web-2-0-Portal-ASP-NET/dp/0596510500

Disclaimer: This book does not show you how to build Pageflakes.
Dropthings is entirely different in terms of architecture,
implementation and the technologies involved.

You learn how to:

  • Implement a highly decoupled architecture following the popular
    n-tier, widget-based application model
  • Provide drag-and-drop functionality, and use ASP.NET 3.5 to
    build the server-side part of the web layer
  • Use LINQ to build the data access layer, and Windows Workflow
    Foundation to build the business layer as a collection of
    workflows
  • Build client-side widgets using JavaScript for faster
    performance and better caching
  • Get maximum performance out of the ASP.NET AJAX Framework for
    faster, more dynamic, and scalable sites
  • Build a custom web service call handler to overcome
    shortcomings in ASP.NET AJAX 1.0 for asynchronous, transactional,
    cache-friendly web services
  • Overcome JavaScript performance problems, and help the user
    interface load faster and be more responsive
  • Solve various scalability and security problems as your site
    grows from hundreds to millions of users
  • Deploy and run a high-volume production site while solving
    software, hardware, hosting, and Internet infrastructure
    problems

If you’re ready to build state-of-the-art, high-volume web applications that can withstand millions of hits per day, this book has exactly what you need.

A significant part of sql server process memory has been paged out. This may result in performance degradation

If you are using SQL Server Standard Edition 64-bit on Windows Server 2003 64-bit, you will frequently encounter this problem, where SQL Server says:

A significant part of sql server process memory has been paged
out. This may result in performance degradation. Duration 0
seconds. Working set (KB) 25432, committed (KB) 11296912, memory
utilization 0%

The numbers for working set and duration will vary. What happens here is that SQL Server is forced to release memory to the operating system because some other application, or the OS itself, needs to allocate RAM.

We went through many support articles like:

  • 918483:
    How to reduce paging of buffer pool memory in the 64-bit version of
    SQL Server 2005
  • 905865:
    The sizes of the working sets of all the processes in a console
    session may be trimmed when you use Terminal Services to log on to
    or log off from a computer that is running Windows Server 2003
  • 920739:
    You may experience a decrease in overall system performance when
    you are copying files that are larger than approximately 500 MB in
    Windows Server 2003 Service Pack 1

But nothing solved the problem. We still have the page out
problem happening every day.

The server has 16 GB RAM, where 12 GB is the maximum limit allocated to SQL Server. 4 GB is left for the OS and other applications. We have also turned off antivirus and any large backup jobs. 12 GB should be plenty because there’s no other app running on the dedicated SQL Server box. But the page out still happens. When it happens, SQL Server becomes very slow. Queries time out, the website throws errors, transactions abort. Sometimes this problem goes on for 30 to 40 minutes and the website becomes slow or unresponsive during that time.

I have found what causes SQL Server to page out: the file system cache somehow grows really large and forces SQL Server’s working set to be trimmed down.


You can see the system cache resident bytes are very high. During this time SQL Server gets much less RAM than it needs. Queries time out at a very high rate, around 15 per second. Moreover, there is also a high SQL Lock Timeouts/sec (around 15/sec, not captured in the screenshot).


SQL Server max memory is configured at 12 GB, but here it shows it’s getting less than 8 GB.
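
If you want to watch these numbers from code instead of from Performance Monitor, you can read SQL Server’s memory manager counters programmatically. A rough sketch (assuming a default instance, where the performance object is named "SQLServer:Memory Manager"; named instances use "MSSQL$InstanceName:Memory Manager" instead):

using System;
using System.Diagnostics;

class SqlMemoryCounters
{
    static void Main()
    {
        // Dump every counter in the memory manager object, which includes
        // the target vs. total server memory counters, so you can see how
        // much RAM SQL Server wants compared to what it is actually getting.
        PerformanceCounterCategory category =
            new PerformanceCounterCategory("SQLServer:Memory Manager");

        foreach (PerformanceCounter counter in category.GetCounters())
        {
            Console.WriteLine("{0} = {1}", counter.CounterName, counter.NextValue());
        }
    }
}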

While the file system cache is really high, there’s no
process that’s taking significant RAM.


After I used Sysinternals’ CacheSet to reset the file system cache and set around 500 MB as the max limit, memory started to free up.


SQL Server started to see more RAM free:


Then I hit the “Clear” button to clear the file system cache, and it came down dramatically.


Paging stopped. The system cache stayed at around 175 MB only. SQL Server lock timeouts came back to zero. Everything went back to normal.

So, I believe either some faulty driver or the OS itself is leaking file system cache in this 64-bit environment.

What we have done is assign a dedicated person who goes to the production database servers every hour, runs the CacheSet program and clicks the “Clear” button. This clears the file system cache and prevents it from growing too high.

There are lots of articles written about this problem. However,
the most informative one I have found is from the SQL Server PSS
team:


http://blogs.msdn.com/psssql/archive/2007/05/31/the-sql-server-working-set-message.aspx

UPDATE – THE FINAL SOLUTION!

The final solution is to run this program on Windows
Startup:

SetSystemFileCacheSize 128 256

This sets the lower and upper limits for the system cache. You need to run this on every Windows startup because a restart resets the cache limit to unlimited.

You can run the program without any parameters to see the current setting.
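
If you are curious what such a tool does under the hood, it is essentially a thin wrapper around the Win32 GetSystemFileCacheSize/SetSystemFileCacheSize APIs. Below is a rough C# sketch of that idea, my own illustration rather than the actual tool’s source; the process must run elevated and hold the SeIncreaseQuotaPrivilege, and you should verify the flag value against the Windows SDK headers before relying on it.

using System;
using System.Runtime.InteropServices;

class CacheLimit
{
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool GetSystemFileCacheSize(
        out UIntPtr minimumFileCacheSize, out UIntPtr maximumFileCacheSize, out uint flags);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool SetSystemFileCacheSize(
        UIntPtr minimumFileCacheSize, UIntPtr maximumFileCacheSize, uint flags);

    // Enforce the maximum as a hard limit
    const uint FILE_CACHE_MAX_HARD_ENABLE = 0x1;

    static void Main()
    {
        // Show the current limits first (like running the tool with no parameters)
        UIntPtr min, max;
        uint flags;
        if (GetSystemFileCacheSize(out min, out max, out flags))
            Console.WriteLine("Current: min={0} max={1} flags={2}", min, max, flags);

        // Cap the system file cache at 128-256 MB, like "SetSystemFileCacheSize 128 256"
        bool ok = SetSystemFileCacheSize(
            (UIntPtr)(128UL * 1024 * 1024),
            (UIntPtr)(256UL * 1024 * 1024),
            FILE_CACHE_MAX_HARD_ENABLE);

        Console.WriteLine(ok
            ? "File system cache limit applied."
            : "Failed, Win32 error " + Marshal.GetLastWin32Error());
    }
}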

Download the program from this page:

http://www.uwe-sieber.de/ntcacheset_e.html

Go to the end of the page and you will find the link to SetSystemFileCacheSize.zip.