Common Concurrency Abstractions in Haskell, IORef
2015-02-15
Haskell is one of those languages that serve as a laboratory for all sorts of crazy ideas in software development, including the most challenging areas of parallel and concurrent programming. We parallelize tasks using threads: heavyweight operating system threads or lightweight runtime threads (so-called green threads). Some of the most interesting questions in this area, then, are about coordinating work between threads and doing so efficiently.
Haskell has many concurrency abstractions built into the base library, as well as many more in the form of libraries. This series of blog posts starts with the mutable reference in the IO monad, IORef.
Mutable Reference in the IO Monad, IORef
The official Data.IORef module documentation can be found here.
IORef is the lowest-level non-blocking synchronization mechanism available in Haskell. Its atomic operations rely on the compare-and-swap (CAS) instruction, which is supported at the hardware level on a multitude of existing platforms.
Here are some important functions from the Data.IORef module:
-- | Creates a new IORef referencing the given value (an IORef must always reference some value).
newIORef :: a -> IO (IORef a)
-- | Reads a value from IORef.
readIORef :: IORef a -> IO a
-- | Writes a new value into IORef (ordering is NOT guaranteed).
writeIORef :: IORef a -> a -> IO ()
-- | Safely writes a new value into IORef (ordering is guaranteed).
atomicWriteIORef :: IORef a -> a -> IO ()
-- | Modifies a value in IORef in a thread-safe way (ordering is guaranteed).
atomicModifyIORef :: IORef a -> (a -> (a, b)) -> IO b
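To make these signatures concrete, here is a small usage sketch (the name basicDemo and the values are mine, not from the module documentation): create a reference, overwrite it, then do an atomic read-modify-write that returns the old value as its result.

```haskell
import Data.IORef

-- | Create a reference, overwrite it, and do an atomic read-modify-write.
basicDemo :: IO (Int, Int)
basicDemo = do
  ref <- newIORef (0 :: Int)  -- reference initially holding 0
  writeIORef ref 41           -- plain write (no ordering guarantee)
  -- atomically bump the value; the snd component of the pair is returned
  old <- atomicModifyIORef ref (\x -> (x + 1, x))
  new <- readIORef ref
  return (old, new)

main :: IO ()
main = basicDemo >>= print  -- prints (41,42)
```

Note that the function passed to atomicModifyIORef produces a pair: the first component becomes the new stored value and the second becomes the result of the whole call.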
A very important and very inconvenient property of the writeIORef function is that it does not guarantee ordering: operations are not always observed in the same order in which they appear in the text of your program. This inconvenience stems from the memory model of the underlying computer architecture. With this drawback alone, IORef would be too unreliable to use. To ensure ordering, writeIORef should be replaced with atomicWriteIORef, which has an ordering guarantee built in, meaning that operations occur in strict program order.
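As a sketch of why this guarantee matters (the names publish and consume are mine), consider the classic publish-then-flag pattern: a writer publishes a payload and then sets a ready flag with atomicWriteIORef, so a reader that observes the flag set is also guaranteed to see the payload.

```haskell
import Control.Concurrent (forkIO, yield)
import Data.IORef

-- | Writer: publish the payload first, then set the flag.
-- atomicWriteIORef acts as a barrier, so the payload write
-- cannot be reordered past the flag write.
publish :: IORef Int -> IORef Bool -> IO ()
publish dataRef readyRef = do
  writeIORef dataRef 42
  atomicWriteIORef readyRef True

-- | Reader: spin until the flag is set; once it is,
-- the payload is guaranteed to be visible.
consume :: IORef Int -> IORef Bool -> IO Int
consume dataRef readyRef = do
  ready <- readIORef readyRef
  if ready
    then readIORef dataRef
    else yield >> consume dataRef readyRef

main :: IO ()
main = do
  dataRef  <- newIORef 0
  readyRef <- newIORef False
  _ <- forkIO (publish dataRef readyRef)
  consume dataRef readyRef >>= print  -- prints 42
```

Had the writer used a plain writeIORef for the flag, the reader could in principle observe the flag before the payload.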
Most importantly, to be useful and thread-safe, IORef must make it possible to guarantee the atomicity of read-modify-write operations. This is achieved with the atomicModifyIORef function, which relies on the CAS instruction. atomicModifyIORef guarantees ordering of operations and is implemented similarly to the following pseudo-implementation:
-- | Stands in for a CAS instruction.
-- Compares the supplied old value with the value stored in the IORef:
-- 1. When they are equal, swaps the old value for the new one in the IORef and returns True;
-- 2. When they are no longer equal, returns False.
compareAndSwap
  :: IORef a -- ^ IORef holding a value.
  -> a       -- ^ Old value.
  -> a       -- ^ New value.
  -> IO Bool -- ^ True if the old value was still in the IORef, False otherwise.
compareAndSwap = undefined

-- | Pseudo-implementation of atomicModifyIORef.
atomicModifyIORef :: IORef a -> (a -> (a, b)) -> IO b
atomicModifyIORef ref f = do
  oldValue <- readIORef ref -- begin transaction
  let (newValue, result) = f oldValue
  succeeded <- compareAndSwap ref oldValue newValue
  if succeeded
    then return result           -- commit transaction
    else atomicModifyIORef ref f -- retry transaction
It is somewhat like a mini-transaction over the single variable inside an IORef: it reads the value from the IORef, calls a function to mutate it, and then calls out to the CAS implementation. In case CAS fails, it simply retries until it succeeds. It is easy to see here that the IORef mechanism is non-blocking with respect to updating shared state: even if other threads are stuck, the current thread can proceed with its modification of the shared state without being blocked.
The IORef abstraction can be very efficient, but it has a critical limitation: it is not composable, i.e. it is not possible to combine two IORef variables and get the same guarantee of atomicity for a modification across both of them (this is possible with STM, as we will see later).
As we just saw, atomicity within a single IORef is guaranteed by using the atomicModifyIORef function. Atomicity across several different instances of IORef, on the other hand, cannot be guaranteed. To address this issue it is sometimes possible to:

- Merge several IORefs into a single one whose values are also merged into one, thus becoming unnecessarily complex. This approach is more of a workaround and may lead to other inefficiencies.
- Use other concurrency abstractions such as MVar or STM, with a potential performance overhead.
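The first option can be sketched as follows (the transfer example and its names are my own): two balances that would otherwise live in two separate IORefs are merged into one IORef holding a pair, so both can be updated in a single atomic step.

```haskell
import Data.IORef

-- | Both balances live in one IORef, so a transfer can update
-- them together in a single atomic step.
transfer :: IORef (Int, Int) -> Int -> IO ()
transfer accounts amount =
  atomicModifyIORef accounts (\(a, b) -> ((a - amount, b + amount), ()))

demoTransfer :: IO (Int, Int)
demoTransfer = do
  accounts <- newIORef (100, 0)
  transfer accounts 30
  readIORef accounts

main :: IO ()
main = demoTransfer >>= print  -- prints (70,30)
```

With two separate IORefs, another thread could observe the state between the debit and the credit; with the merged pair, it sees either the state before the transfer or the state after it.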
Another limitation of IORef, or a benefit depending on how you see it, is that the critical section can only be a pure function and cannot launch any missiles for you (note the signature of the atomicModifyIORef function again).
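To illustrate, here is a sketch (the name pop is mine) where the atomic step is a pure decision over a list held in an IORef, and all side effects, such as printing, happen outside of it.

```haskell
import Data.IORef

-- | Atomically pop the head of a list held in an IORef.
-- The decision is a pure function of the old value; any side
-- effects must happen outside the atomic section.
pop :: IORef [a] -> IO (Maybe a)
pop ref = atomicModifyIORef ref $ \xs ->
  case xs of
    []       -> ([], Nothing)  -- empty stack: state unchanged, no result
    (y : ys) -> (ys, Just y)   -- new state and the popped element

main :: IO ()
main = do
  stack <- newIORef [1, 2, 3 :: Int]
  pop stack >>= print       -- prints Just 1
  readIORef stack >>= print -- prints [2,3]
```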
As a very simple example, several threads can update shared state through an IORef by performing each update with atomicModifyIORef, so that no modification is lost.
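A minimal sketch of such an example might look like this (all names are mine; MVar, covered later in the series, is used here only to wait for the workers, and liftM is interchangeable with fmap, as the commented line shows):

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (liftM, replicateM, replicateM_)
import Data.IORef

numWorkers, incrementsPerWorker :: Int
numWorkers = 10
incrementsPerWorker = 1000

-- | A worker bumps the shared counter; atomicModifyIORef makes sure
-- no increment is lost to a race between workers.
worker :: IORef Int -> IO ()
worker counter =
  replicateM_ incrementsPerWorker $
    atomicModifyIORef counter (\n -> (n + 1, ()))

-- | Fork the workers, wait for all of them, and return the final count.
runDemo :: IO Int
runDemo = do
  counter <- newIORef 0
  dones <- replicateM numWorkers $ do
    done <- newEmptyMVar
    _ <- forkIO (worker counter >> putMVar done ())
    return done
  mapM_ takeMVar dones  -- block until every worker has signalled
  readIORef counter

main :: IO ()
main = do
  -- total <- runDemo >>= \n -> return (show n)  -- equivalent to the next line
  total <- liftM show runDemo                    -- liftM here is just fmap
  putStrLn total                                 -- prints 10000
```

Because every increment goes through atomicModifyIORef, the final count is deterministic; replacing it with a plain readIORef followed by writeIORef would lose updates under contention.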
In the next post, we will look into an intermediate-level blocking synchronization mechanism, MVar.
“Common Concurrency Abstractions in Haskell” Series
- Mutable reference in IO monad, IORef
- Synchronized mutable variable, MVar
- Unbounded channel, Chan
- Simple quantity semaphore, QSem
- Parameterized quantity semaphore, QSemN
- Sample variable, SampleVar
- Software Transactional Memory, STM
- Actor-based Model