BANGALORE, INDIA: Gone are the days when we used to write applications for the desktop which would be used by only one user at a time. Almost all applications that we write today are multi-user systems which would be used by many users concurrently.
And today we have sophisticated frameworks and application infrastructures that let the application developer focus on solving the business problem at hand, where the frameworks and infrastructures take care of concurrency. For any data-driven application, if you use a database server like Microsoft SQL Server, the server handles the database concurrency issues for you, whereas you, the developer, focuses on your core task of solving your client's business problem.
Prepare the data
Let us first get some sample data ready that we will later try to modify concurrently.
CREATE TABLE Employee
(
empid int primary key IDENTITY(1, 1),
empname nvarchar(20),
city nvarchar(15)
)
INSERT INTO Employee(empname, city)
VALUES('Ashish', 'Mumbai')
INSERT INTO Employee(empname, city)
VALUES('Asha', 'Pune')
To keep matters simple, we will not have a front end with multiple users who try to modify data simultaneously. Instead, I will open up multiple query Windows in SQL Server Management Studio to simulate multiple users.
Start Concurrent updates
In one query window, start updating the Employee table you just created inside a transaction.
BEGIN TRANSACTION
UPDATE Employee
SET city = 'Delhi'
WHERE empid = 1
Do not commit or rollback the transaction right away. Instead, while this connection is still open, start another query window by clicking on the 'New Query' button on the toolbar. You will normally find this button on the top left corner of Management Studio. In the new query window, let us now try to access the same row which is under the update transaction in the old query window:
SELECT empid, city FROM Employee
WHERE empid = 1
You will see that this activitywill get blocked. This query will complete only when you finish the transaction in the first query editor window:
ROLLBACK TRANSACTION
Management Studio neatly tabs both the query windows for easy navigation. You could alternately have committed the transaction. Either way, the query in the second query window will be unblocked and will proceed to completion.
In this case, the second query window waited for the first query window to complete the transaction before reading the data the first query window was trying to modify. This is the default behavior with SQL Server.
Configuring Concurrent Updates
What if you are in a hurry and want to read the data even before it is committed or rolled back in the previous transaction. You would need to state your intent then explicitly. Let us repeat the entire exercise then. This is what you do in the first query window, as before:
BEGIN TRANSACTION
UPDATE Employee
SET city = 'Delhi'
WHERE empid = 1
This time, in the second query window, you specify that you want to read the data before it has been committed.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT empid, city FROM Employee
WHERE empid = 1
You will now find your query responds immediately. Now the second query is not blocked by the first query by setting the transaction isolation level to READ UNCOMMITTED. Reading the result of an uncommitted transaction of a different connection, as we have done now, is called as dirty read. What did we gain? The job is done faster, because we did not have to wait for the first query window to complete its transaction. What did we lose? The transaction could be committed or rolled back. If it is committed, data as you read in the second query window is the final data. If it is rolled back, data read in the second query window would be different after the rollback.
Now please rollback the first transaction.
ROLLBACK TRANSACTION
Execute the query in the second query window again.The result is different because the transaction was rolled back.
If you recollect, before we set the transaction isolation level, the default behavior was to wait till the other transaction completes. This is known as the transaction isolation level READ COMMITTED. Instead of relying on the default behavior, you can explicitly set the transaction isolation level to READ COMMITTED.Let us repeat the concurrent updates that way then. This is what you do in the first query editor window:
BEGIN TRANSACTION
UPDATE Employee
SET city = 'Delhi'
WHERE empid = 1
And this is what you do in the second query editor window:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
SELECT empid, city FROM Employee
WHERE empid = 1
As expected, this query is blocked. Let us roll back the first transaction.
ROLLBACK TRANSACTION
This will unblock the second query.
SNAPSHOT
Remember what had happened when we set the transaction isolation level to READ UNCOMMITTED? We were getting the data after the update, even if it was not committed. What if we would want the data before the update? We would set the transaction isolation level to SNAPSHOT. You would first need to configure the database to allow SNAPHOT isolation. You can execute the following statement in any of the query windows. Only take care not to run this inside a transaction boundary.
ALTER DATABASE payroll
SET ALLOW_SNAPSHOT_ISOLATION ON
My database name is payroll; substitute whatever is the name of your database. Now let me attempt the same transaction in the first query window.
BEGIN TRANSACTION
UPDATE Employee
SET city = 'Delhi'
WHERE empid = 1
Set the transaction isolation level to SNAPSHOT in the second query window.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
SELECT empid, city FROM Employee
WHERE empid = 1
The query runs without blocking and shows the data as it was before the modification. But if you choose the commit the transaction in the first query window and read the data again, you will get the new values.
COMMIT TRANSACTION
Either in the case of your transaction level being READ UNCOMMITED or SNAPSHOT, you say you want to read the data before the other transaction has finished modification. You can take the choice of whether you want to see the data as it was before the modification or after the modification, there is still no guarantee the data will remain the same after the transaction completes, because it may either commit or rollback.
REPEATABLE READ
Let us now run the query in the second window also as a transaction. We will set the transaction isolation level back to the default ? READ COMMITTED
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRANSACTION
SELECT empid, city FROM Employee
WHERE empid = 1
Run a transaction in the first query window.
BEGIN TRANSACTION
UPDATE Employee
SET city = 'Mumbai'
WHERE empid = 1
COMMIT TRANSACTION
Come back and execute the SELECT statement in the transaction of the second query window.
SELECT empid, city FROM Employee
WHERE empid = 1
You will get the value after the modification.
What we notice is within the transaction of the second query window, the same SELECT statement is giving two different results at two different points in time. We may however want to make the read repeatable, that is, we want to be assured that within the same transaction if I read the same piece of data, the results would be the same, i.e., repeatable.
For that we need to set the transaction isolation level of the second query window to REPEATABLE READ.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRANSACTION
SELECT empid, city FROM Employee
WHERE empid = 1
Now run the update in the first query window. Now the first query is blocked by the second query to make sure its read is repeatable. Complete the second transaction so that the first query proceeds to completion.
COMMIT TRANSACTION
SERIALIZABLE
Start the transaction of the second query window again. Let us now see all rows instead of only one row.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRANSACTION
SELECT empid, empname, city FROM Employee
We now know the first query window would be blocked if it tries to do an update. However, let us try something different.
BEGIN TRANSACTION
INSERT INTO Employee
VALUES('Atul', 'Bangalore')
COMMIT TRANSACTION
This would proceed to completion. The transaction in the second query window is not complete. Let us run the SELECT statement again.
SELECT empid, empname, city FROM Employee
The same query is returning an extra row in the same transaction. This extra row is what is called a phantom row. Let us complete the second transaction.
COMMIT TRANSACTION
If we do not want phantom rows to appear within the same transaction, we would set the transaction isolation level to SERIALIZABLE.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
SELECT empid, empname, city FROM Employee
Insert a different row in the first query window.
BEGIN TRANSACTION
INSERT INTO Employee
VALUES('Avinash', 'Bangalore')
COMMIT TRANSACTION
The INSERT would now be blocked by the second query window.
Complete the second transaction.
COMMIT TRANSACTION
The first query will complete now.
Summary
We learnt five transaction isolation levels in Microsoft SQL Server 2005 and 2008 ? READ UNCOMMITTED, SNAPSHOT, READ COMMITTED, REPEATABLE READ and SERIALIZABLE. SNAPSHOT isolation level was introduced in SQL 2005, but the other four would work on SQL Server 2000 also.
In principle, the higher the concurrency (SERIALIZABLE would be the highest), the lower the performance because other users would be blocked for a long duration. And higher the performance, lower the concurrency (READ UNCOMMITTED would be the lowest). You would need to set the right balance between the two, depending on your priorities for the specific application you are developing.
Amaresh Patnaik