Tech

Managing SQL Server Concurrency

CIOL Bureau

13 May 2009 00:00 IST

Updated On 13 May 2009 21:44 IST

New Update

BANGALORE, INDIA: Gone are the days when we used to write applications for the desktop which would be used by only one user at a time. Almost all applications that we write today are multi-user systems which would be used by many users concurrently.

Advertisment

And today we have sophisticated frameworks and application infrastructures that let the application developer focus on solving the business problem at hand, where the frameworks and infrastructures take care of concurrency. For any data-driven application, if you use a database server like Microsoft SQL Server, the server handles the database concurrency issues for you, whereas you, the developer, focuses on your core task of solving your client's business problem.

You will not have to write code to manage database level concurrency in most projects you would be involved in. SQL Server would figure out how to let multiple users access the same piece of data without them overwriting each other's changes. However, there would be times when you would need better control over how the server manages concurrency and then you would need to provide guidelines to the server as to how you want it to manage concurrency for you.

Advertisment

For example, when you start tuning your database server for performance, you will realize performance tuning is actually a tight rope walk. And more often than not, you gain performance at the expense of concurrency and you gain concurrency at the expense of performance. Managing database concurrency is quite an involved topic in itself.

For the remainder of the article, I would focus on one of the most important aspects of managing concurrency for Microsoft SQL Server, configuring the transaction isolation level. I will explain the term as we go along. I am running my code samples on a Microsoft SQL Server 2008 database, but most of what I am discussing would work on the older versions also.

Advertisment

Prepare the data

Let us first get some sample data ready that we will later try to modify concurrently.

CREATE TABLE Employee

(

empid int primary key IDENTITY(1, 1),

empname nvarchar(20),

city nvarchar(15)

)

INSERT INTO Employee(empname, city)

VALUES('Ashish', 'Mumbai')

INSERT INTO Employee(empname, city)

VALUES('Asha', 'Pune')

To keep matters simple, we will not have a front end with multiple users who try to modify data simultaneously. Instead, I will open up multiple query Windows in SQL Server Management Studio to simulate multiple users.

Advertisment

Start Concurrent updates

In one query window, start updating the Employee table you just created inside a transaction.

BEGIN TRANSACTION

UPDATE Employee

SET city = 'Delhi'

WHERE empid = 1

Advertisment

Do not commit or rollback the transaction right away. Instead, while this connection is still open, start another query window by clicking on the 'New Query' button on the toolbar. You will normally find this button on the top left corner of Management Studio. In the new query window, let us now try to access the same row which is under the update transaction in the old query window:

SELECT empid, city FROM Employee

WHERE empid = 1

You will see that this activitywill get blocked. This query will complete only when you finish the transaction in the first query editor window:

Advertisment

ROLLBACK TRANSACTION

Management Studio neatly tabs both the query windows for easy navigation. You could alternately have committed the transaction. Either way, the query in the second query window will be unblocked and will proceed to completion.

In this case, the second query window waited for the first query window to complete the transaction before reading the data the first query window was trying to modify. This is the default behavior with SQL Server.

Advertisment

Configuring Concurrent Updates

What if you are in a hurry and want to read the data even before it is committed or rolled back in the previous transaction. You would need to state your intent then explicitly. Let us repeat the entire exercise then. This is what you do in the first query window, as before:

BEGIN TRANSACTION

UPDATE Employee

SET city = 'Delhi'

WHERE empid = 1

This time, in the second query window, you specify that you want to read the data before it has been committed.

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

SELECT empid, city FROM Employee

WHERE empid = 1

You will now find your query responds immediately. Now the second query is not blocked by the first query by setting the transaction isolation level to READ UNCOMMITTED. Reading the result of an uncommitted transaction of a different connection, as we have done now, is called as dirty read. What did we gain? The job is done faster, because we did not have to wait for the first query window to complete its transaction. What did we lose? The transaction could be committed or rolled back. If it is committed, data as you read in the second query window is the final data. If it is rolled back, data read in the second query window would be different after the rollback.

Now please rollback the first transaction.

ROLLBACK TRANSACTION

Execute the query in the second query window again.The result is different because the transaction was rolled back.

If you recollect, before we set the transaction isolation level, the default behavior was to wait till the other transaction completes. This is known as the transaction isolation level READ COMMITTED. Instead of relying on the default behavior, you can explicitly set the transaction isolation level to READ COMMITTED.Let us repeat the concurrent updates that way then. This is what you do in the first query editor window:

BEGIN TRANSACTION

UPDATE Employee

SET city = 'Delhi'

WHERE empid = 1

And this is what you do in the second query editor window:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED

SELECT empid, city FROM Employee

WHERE empid = 1

As expected, this query is blocked. Let us roll back the first transaction.

ROLLBACK TRANSACTION

This will unblock the second query.

SNAPSHOT

Remember what had happened when we set the transaction isolation level to READ UNCOMMITTED? We were getting the data after the update, even if it was not committed. What if we would want the data before the update? We would set the transaction isolation level to SNAPSHOT. You would first need to configure the database to allow SNAPHOT isolation. You can execute the following statement in any of the query windows. Only take care not to run this inside a transaction boundary.

ALTER DATABASE payroll

SET ALLOW_SNAPSHOT_ISOLATION ON

My database name is payroll; substitute whatever is the name of your database. Now let me attempt the same transaction in the first query window.

BEGIN TRANSACTION

UPDATE Employee

SET city = 'Delhi'

WHERE empid = 1

Set the transaction isolation level to SNAPSHOT in the second query window.

SET TRANSACTION ISOLATION LEVEL SNAPSHOT

SELECT empid, city FROM Employee

WHERE empid = 1

The query runs without blocking and shows the data as it was before the modification. But if you choose the commit the transaction in the first query window and read the data again, you will get the new values.

COMMIT TRANSACTION

Either in the case of your transaction level being READ UNCOMMITED or SNAPSHOT, you say you want to read the data before the other transaction has finished modification. You can take the choice of whether you want to see the data as it was before the modification or after the modification, there is still no guarantee the data will remain the same after the transaction completes, because it may either commit or rollback.

REPEATABLE READ

Let us now run the query in the second window also as a transaction. We will set the transaction isolation level back to the default ? READ COMMITTED

SET TRANSACTION ISOLATION LEVEL READ COMMITTED

BEGIN TRANSACTION

SELECT empid, city FROM Employee

WHERE empid = 1

Run a transaction in the first query window.

BEGIN TRANSACTION

UPDATE Employee

SET city = 'Mumbai'

WHERE empid = 1

COMMIT TRANSACTION

Come back and execute the SELECT statement in the transaction of the second query window.

SELECT empid, city FROM Employee

WHERE empid = 1

You will get the value after the modification.

What we notice is within the transaction of the second query window, the same SELECT statement is giving two different results at two different points in time. We may however want to make the read repeatable, that is, we want to be assured that within the same transaction if I read the same piece of data, the results would be the same, i.e., repeatable.

For that we need to set the transaction isolation level of the second query window to REPEATABLE READ.

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ

BEGIN TRANSACTION

SELECT empid, city FROM Employee

WHERE empid = 1

Now run the update in the first query window. Now the first query is blocked by the second query to make sure its read is repeatable. Complete the second transaction so that the first query proceeds to completion.

COMMIT TRANSACTION

SERIALIZABLE

Start the transaction of the second query window again. Let us now see all rows instead of only one row.

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ

BEGIN TRANSACTION

SELECT empid, empname, city FROM Employee

We now know the first query window would be blocked if it tries to do an update. However, let us try something different.

BEGIN TRANSACTION

INSERT INTO Employee

VALUES('Atul', 'Bangalore')

COMMIT TRANSACTION

This would proceed to completion. The transaction in the second query window is not complete. Let us run the SELECT statement again.

SELECT empid, empname, city FROM Employee

The same query is returning an extra row in the same transaction. This extra row is what is called a phantom row. Let us complete the second transaction.

COMMIT TRANSACTION

If we do not want phantom rows to appear within the same transaction, we would set the transaction isolation level to SERIALIZABLE.

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

BEGIN TRANSACTION

SELECT empid, empname, city FROM Employee

Insert a different row in the first query window.

BEGIN TRANSACTION

INSERT INTO Employee

VALUES('Avinash', 'Bangalore')

COMMIT TRANSACTION

The INSERT would now be blocked by the second query window.

Complete the second transaction.

COMMIT TRANSACTION

The first query will complete now.

Summary

We learnt five transaction isolation levels in Microsoft SQL Server 2005 and 2008 ? READ UNCOMMITTED, SNAPSHOT, READ COMMITTED, REPEATABLE READ and SERIALIZABLE. SNAPSHOT isolation level was introduced in SQL 2005, but the other four would work on SQL Server 2000 also.

In principle, the higher the concurrency (SERIALIZABLE would be the highest), the lower the performance because other users would be blocked for a long duration. And higher the performance, lower the concurrency (READ UNCOMMITTED would be the lowest). You would need to set the right balance between the two, depending on your priorities for the specific application you are developing.

Amaresh Patnaik

tech-news