
The Dangers of Dirty Reads

by Craig S. Mullins

If you build an application program that accesses a database, it is crucial to understand the potential for concurrency problems. When one program attempts to read data that is in the process of being changed by another, the database management system must forbid access until the modification is complete in order to ensure data integrity. Most DBMS products use a locking mechanism for all data items being changed. Therefore, when one task is updating data on a page, another task cannot access data (read or update) on that same page until the data modification is complete and committed.
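The blocking behavior described above can be observed in miniature with Python's sqlite3 module. This is only a sketch: SQLite's shared-cache mode locks at the table level rather than the page level, and the table name and values are invented for illustration, but the effect is the same — a reader is shut out while another connection holds an uncommitted change.

```python
import sqlite3

# Two connections to the same shared-cache in-memory database:
# one that updates data, and one that tries to read it.
writer = sqlite3.connect("file::memory:?cache=shared", uri=True,
                         isolation_level=None)
reader = sqlite3.connect("file::memory:?cache=shared", uri=True,
                         isolation_level=None)

writer.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, salary INTEGER)")
writer.execute("INSERT INTO emp VALUES (10020, 50000)")

# The writer starts a transaction and modifies a row without committing,
# so it holds a write lock on the table.
writer.execute("BEGIN")
writer.execute("UPDATE emp SET salary = 60000 WHERE empno = 10020")

# A normal (locking) read cannot proceed until the modification
# is committed or rolled back.
try:
    reader.execute("SELECT salary FROM emp WHERE empno = 10020")
    blocked = False
except sqlite3.OperationalError:
    blocked = True

# Once the writer commits, the reader sees the new, committed value.
writer.execute("COMMIT")
committed = reader.execute(
    "SELECT salary FROM emp WHERE empno = 10020").fetchone()[0]
```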

Programs that read database data can access numerous rows and are therefore susceptible to concurrency problems. To get around this issue, most major RDBMS products support read-through locks, also known as “dirty read” or “uncommitted read.” When using uncommitted reads (UR), an application program can access data that has been changed but not yet committed. Dirty read capability is commonly implemented as an isolation level, but the exact naming and implementation differ by DBMS vendor.

A program using dirty reads will read data without taking locks. This enables the application program to read data contained in the table as it is being manipulated. And it generally improves performance and availability of data because no locking mechanism is invoked during the process. Consider the following sequence of events:

1. At 9:00 AM, a transaction is executed containing SQL such as the following to change a specific value (the table and column names here are illustrative):

   UPDATE EMP
      SET SALARY = 55000
    WHERE EMPNO = 10020;

The transaction is a long-running one and continues to execute without issuing a COMMIT.

2. At 9:01 AM, a second transaction attempts to SELECT the data that was changed, but not committed.

With a dirty read, the second transaction selects the changed data even though it has yet to be committed. Obviously, if the program need not wait to take a lock and merely reads the data in whatever state it happens to be at that moment, it will execute faster than if it had to wait for locks to be taken and for resources to be freed before processing.
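How a program requests an uncommitted read differs by vendor. For example, Db2 supports a statement-level WITH UR clause, while SQL Server sets the isolation level for the session (the table and column names below are illustrative):

```sql
-- Db2: request uncommitted read for a single statement
SELECT EMPNO, LASTNAME
  FROM EMP
 WHERE EMPNO = 10020
  WITH UR;

-- SQL Server: set the isolation level for the session
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT EMPNO, LASTNAME FROM EMP WHERE EMPNO = 10020;
```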

However, the implications of reading uncommitted data must be carefully examined before it is implemented. Several types of “problems” can occur. A dirty read can cause duplicate rows to be returned where none exist. Alternatively, a dirty read can cause no rows to be returned when one (or more) actually exists. In some cases, dirty reads can return data that was never in the database at all (for example, data that was changed and then rolled back before it was ever committed). Obviously, these “problems” must be taken into consideration before using the UR isolation level.
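That last anomaly — reading data that was never really in the database — can be demonstrated with the same sqlite3 sketch as before. SQLite's PRAGMA read_uncommitted (effective only in shared-cache mode) is its analogue of the UR isolation level; the table and values are again invented for illustration.

```python
import sqlite3

writer = sqlite3.connect("file::memory:?cache=shared", uri=True,
                         isolation_level=None)
reader = sqlite3.connect("file::memory:?cache=shared", uri=True,
                         isolation_level=None)

writer.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, salary INTEGER)")
writer.execute("INSERT INTO emp VALUES (10020, 50000)")

# Put the reader into SQLite's equivalent of uncommitted read:
# it will no longer take read locks before reading.
reader.execute("PRAGMA read_uncommitted = 1")

# The writer changes the row but does NOT commit.
writer.execute("BEGIN")
writer.execute("UPDATE emp SET salary = 60000 WHERE empno = 10020")

# The dirty read sees the uncommitted change...
dirty = reader.execute(
    "SELECT salary FROM emp WHERE empno = 10020").fetchone()[0]

# ...but the writer then rolls back, so the value the reader
# acted on was never committed data at all.
writer.execute("ROLLBACK")
after = reader.execute(
    "SELECT salary FROM emp WHERE empno = 10020").fetchone()[0]
```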

Benefits and Drawbacks

So when might you want to consider using dirty reads in your applications? The general rule of thumb is to avoid dirty reads whenever the results must be 100% accurate. For example, avoid UR if calculations must balance, data is being retrieved from one source to modify another, or for any production, mission-critical work that cannot tolerate data integrity problems. Succinctly stated, it is my opinion that most production, mission-critical applications are not candidates for dirty reads. Yet I see UR used in production programs more frequently than is probably wise.

There are a few specific situations in which the dirty read capability may make sense: for example, when the data being accessed is read-only or historical and no concurrent updates are expected, or when statistical or analytical processing is being performed across large amounts of data and an estimated, approximate answer will suffice.

The dirty read capability can provide relief to concurrency problems and deliver faster performance in very specific situations. Be certain to understand the implications of the UR isolation level and the “problems” it can cause before diving headlong into implementing it in your production applications.

From Database Trends and Applications, August 2014.

© 2014 Craig S. Mullins


DBA Corner