Last year at Pageflakes,
when we were getting millions of hits per day, we were having query
timeout due to lock timeout and Transaction Deadlock errors. These
locks were produced from aspnet_Users
and
aspnet_Membership
tables. Since both of these tables
are very high read (almost every request causes a read on these
tables) and high write (every anonymous visit creates a row on
aspnet_Users
), there were just way too many locks
created on these tables per second. SQL Counters showed thousands
of locks per second being created. Moreover, we had queries that
would select thousands of rows from these tables frequently and
thus produced more locks for longer period, forcing other queries
to timeout and thus throw errors on the website.
If you have read my last blog post, you know why such locks
happen. Basically every table when it grows up to hold millions of
records and becomes popular goes through this trouble. It’s
just a part of scalability problem that is common to database. But
we rarely take prevention about it in our early design.
The solution is simple, you should either have WITH
or
(NOLOCK)SET TRANSACTION ISOLATION LEVEL READ
before SELECT queries. Either of this will do.
UNCOMMITTED
They tell SQL Server not to hold any lock on the table while it is
reading the table. If some row is locked while the read is
happening, it will just ignore that row. When you are reading a
table thousand times per second, without these options, you are
issuing lock on many places around the table thousand times per
second. It not only makes read from table slower, but also so many
lock prevents insert, update, delete from happening timely and thus
queries timeout. If you have queries like “show the currently
online users from last one hour based on
LastActivityDate
field”, that is going to issue
such a wide lock that even other harmless select queries will
timeout. And did I tell you that there’s no index on
LastActivityDate
on aspnet_Users
table?
Now don’t blame yourself for not putting either of these
options on your every stored proc and every dynamically generated
SQL from the very first day. ASP.NET developers made the same
mistake. You won’t see either of these used in any of the
stored procs used by ASP.NET Membership. For example, the following
stored proc gets called whenever you access Profile
object:
ALTER PROCEDURE [dbo].[aspnet_Profile_GetProperties]
@ApplicationName nvarchar(256),
@UserName nvarchar(256),
@CurrentTimeUtc datetime
AS
BEGIN
DECLARE @ApplicationId uniqueidentifier
SELECT @ApplicationId = NULL
SELECT @ApplicationId = ApplicationId FROM
dbo.aspnet_Applications WHERE LOWER(@ApplicationName) = LoweredApplicationName
IF (@ApplicationId IS NULL)
RETURN
DECLARE @UserId uniqueidentifier
DECLARE @LastActivityDate datetime
SELECT @UserId = NULL
SELECT @UserId = UserId, @LastActivityDate = LastActivityDate
FROM dbo.aspnet_Users
WHERE ApplicationId = @ApplicationId AND LoweredUserName = LOWER(@UserName)
IF (@UserId IS NULL)
RETURN
SELECT TOP 1 PropertyNames, PropertyValuesString, PropertyValuesBinary
FROMdbo.aspnet_Profile
WHERE UserId = @UserId
IF (@@ROWCOUNT > 0)
BEGIN
UPDATE dbo.aspnet_Users
SET LastActivityDate=@CurrentTimeUtc
WHERE UserId = @UserId
END
END
There are two
SELECT operations that hold lock on two very high read tables
– aspnet_Users
and aspnet_Profile
.
Moreover, there’s a nasty UPDATE statement. It tries to
update the LastActivityDate
of a user whenever you
access Profile
object for the first time within a http
request.
This stored proc alone is enough to bring your site down. It did
to us because we are using Profile
Provider
everywhere. This stored proc was called around 300 times/sec. We
were having nightmarish slow performance on the website and many
lock timeouts and transaction deadlocks. So, we added the
transaction isolation level and we also modified the UPDATE
statement to only perform an update when the
LastActivityDate
is over an hour. So, this means, the
same user’s LastActivityDate
won’t be
updated if the user hits the site within the same hour.
So, after the modifications, the stored proc looked like
this:
ALTER PROCEDURE [dbo].[aspnet_Profile_GetProperties]
@ApplicationName nvarchar(256),
@UserName nvarchar(256),
@CurrentTimeUtc datetime
AS
BEGIN
-- 1. Please no more locks during reads
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
DECLARE @ApplicationId uniqueidentifier
--SELECT @ApplicationId = NULL
--SELECT @ApplicationId = ApplicationId FROM dbo.aspnet_Applications
WHERE LOWER(@ApplicationName) = LoweredApplicationName
--IF (@ApplicationId IS NULL)
-- RETURN
-- 2. No more call to Application table. We have only one app dude!
SET @ApplicationId = dbo.udfGetAppId()
DECLARE @UserId uniqueidentifier
DECLARE @LastActivityDate datetime
SELECT @UserId = NULL
SELECT @UserId = UserId, @LastActivityDate = LastActivityDate
FROM dbo.aspnet_Users
WHERE ApplicationId = @ApplicationId AND LoweredUserName = LOWER(@UserName)
IF (@UserId IS NULL)
RETURN
SELECT TOP 1 PropertyNames, PropertyValuesString, PropertyValuesBinary
FROM dbo.aspnet_Profile
WHERE UserId = @UserId
IF (@@ROWCOUNT > 0)
BEGIN
-- 3. Do not update the same user within an hour
IF DateDiff(n, @LastActivityDate, @CurrentTimeUtc) > 60
BEGIN
-- 4. Use ROWLOCK to lock only a row since we know this query
-- is highly selective
UPDATE dbo.aspnet_Users WITH(ROWLOCK)
SET LastActivityDate=@CurrentTimeUtc
WHERE UserId = @UserId
END
END
END
The changes I
made are numbered and commented. No need for further explanation.
The only tricky thing here is, I have eliminate call to Application
table just to get the ApplicationID from ApplicationName. Since
there’s only one application in a database (ever heard of
multiple applications storing their user separately on the same
database and the same table?), we don’t need to look up the
ApplicationID on every call to every Membership stored proc. We can
just get the ID and hard code it in a function.
CREATE FUNCTION dbo.udfGetAppId()
RETURNS uniqueidentifier
WITH EXECUTE AS CALLER
AS
BEGIN
RETURN CONVERT(uniqueidentifier, 'fd639154-299a-4a9d-b273-69dc28eb6388')
END;
This UDF returns the ApplicationID
that I have
hardcoded copying from the Application table. Thus it eliminates
the need for quering on the Application table.
Similarly you should do the changes in all other stored
procedures that belong to Membership Provider. All the stroc procs
are missing proper locking, issues aggressive lock during update
and too frequent updates than practical need. Most of them also try
to resolve ApplicationID from ApplicationName, which is unnecessary
when you have only one web application per database. Make these
changes and enjoy lock contention free super performance from
Membership Provider!
I'd be nice if “performance” tweaks made it back into ASP.NET via an update, so everybody could benefit from others learning curves. I'm sure there are a ton of default implemenations that would perk up with this change. Great Post!
ever heard of multiple applications storing their user separately on the same database and the same table?
umm…yes – that's why there is an ApplicationId
Joe, you are right this is done sometimes. But in my experience it is the exception rather than the rule and a good optimisation if the situation suits you.
I agree – great article 🙂
Joe
nice, really nice!
Hi Omar,
I was wondering if you tried using Snapshot Isolation level- msdn.microsoft.com/…/ms130975.aspx to solve this issue?
The guy from PlentyOfFish.com had some performance issues with DB's too- you might be interested in listening to the Podcast:
channel9.msdn.com/…/ARCastTV-PlentyOfFishcom-How-one-man-beat-the-big-guys
Hi Omar,
How large is the web site (in this case study) in terms of monthly page impressions, requiring authentication?
I am investigating the impact of ramping up the use of membership on our sites. Your case study may be a good reference point for anyone undertaking this kind of work.
Thanks,
Vi
Thank you for submitting this cool story – Trackback from PimpThisBlog.com
Hi Omar,
What’s your take on the Membership API and the use of GUID, SQL NewID(), etc. as it relates to performance and large numbers of users? Have you noticed any performance issues when using the default random GUIDs? See here: http://www.ericswann.org/blog/archive/2008/03/15/asp.net-sql-server-2005-membership-provider-and-newsequentialid.aspx
Thanks
Hi! This is a tremendous post, thanks for sharing it! I was having some issues with my future website… testing it with Jmeter I could only get 50 threads at the same time… because logon and logoff were taking so much time…
thanks to this I can now have more than 500 users at the same time! (I made some other changes, but this is 80% of the problem)
Thanks!