The Wayback Machine - https://web.archive.org/web/20201101025814/http://effbot.org/zone/thread-synchronization.htm

Thread Synchronization Mechanisms in Python

Fredrik Lundh | July 2007

This article discusses how to synchronize access to shared resources, and otherwise coordinate execution of threads.

Synchronizing Access to Shared Resources #

One important issue when using threads is avoiding conflicts when more than one thread needs to access a single variable or other resource. If you’re not careful, overlapping accesses or modifications from multiple threads may cause all kinds of problems, and what’s worse, those problems tend to appear only under heavy load, on your production servers, or on some faster hardware that’s only used by one of your customers.

For example, consider a program that does some kind of processing, and keeps track of how many items it has processed:

counter = 0

def process_item(item):
    global counter
    ... do something with item ...
    counter += 1

If you call this function from more than one thread, you’ll find that the counter isn’t necessarily accurate. It works in most cases, but sometimes misses one or more items. The reason for this is that the increment operation is actually executed in three steps: first, the interpreter fetches the current value of the counter; then it calculates the new value; and finally, it writes the new value back to the variable.

If another thread gets control after the current thread has fetched the variable, it may fetch the variable, increment it, and write it back, before the current thread does the same thing. And since they’re both seeing the same original value, only one item will be accounted for.
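The lost update can be made concrete by simulating the interleaving by hand. This is a deterministic sketch; the local variables a and b stand in for the two threads’ private copies of the counter:

```python
counter = 0

# thread A fetches the current value (0)
a = counter
# thread B gets control and fetches the same value (0)
b = counter
# both threads compute their new value independently
a = a + 1
b = b + 1
# thread A writes back 1
counter = a
# thread B writes back 1, overwriting A's update
counter = b

# two increments ran, but only one is accounted for
assert counter == 1
```

With a real thread switch at the wrong moment, the same sequence of steps produces the same lost update.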

Another common problem is access to incomplete or inconsistent state, which can happen if one thread is initializing or updating some non-trivial data structure, and another thread attempts to read the structure while it’s being updated.

Atomic Operations #

The simplest way to synchronize access to shared variables or other resources is to rely on atomic operations in the interpreter. An atomic operation is an operation that is carried out in a single execution step, without any chance that another thread gets control.

In general, this approach only works if the shared resource consists of a single instance of a core data type, such as a string variable, a number, or a list or dictionary. Here are some thread-safe operations:

  • reading or replacing a single instance attribute
  • reading or replacing a single global variable
  • fetching an item from a list
  • modifying a list in place (e.g. adding an item using append)
  • fetching an item from a dictionary
  • modifying a dictionary in place (e.g. adding an item, or calling the clear method)

Note that as mentioned earlier, operations that read a variable or attribute, modify it, and then write it back are not thread-safe. Another thread may update the variable after it’s been read by the current thread, but before it’s been updated.

Also note that Python code may be executed when objects are destroyed, so even seemingly simple operations may cause other threads to run, and may thus cause conflicts. When in doubt, use explicit locks.
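Following that advice, the process_item counter from earlier can be protected with an explicit lock from the threading module (covered in the next section). A minimal sketch; the counter_lock name is an illustrative addition:

```python
import threading

counter = 0
counter_lock = threading.Lock()

def process_item(item):
    global counter
    # guard the whole read-modify-write sequence with the lock
    counter_lock.acquire()
    try:
        counter += 1
    finally:
        counter_lock.release()

# run the function from many threads; no updates are lost
threads = [threading.Thread(target=process_item, args=(i,)) for i in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert counter == 100
```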

Locks #

Locks are the most fundamental synchronization mechanism provided by the threading module. At any time, a lock can be held by a single thread, or by no thread at all. If a thread attempts to hold a lock that’s already held by some other thread, execution of the first thread is halted until the lock is released.

Locks are typically used to synchronize access to a shared resource. For each shared resource, create a Lock object. When you need to access the resource, call acquire to hold the lock (this will wait for the lock to be released, if necessary), and call release to release it:

lock = Lock()

lock.acquire() # will block if lock is already held
... access shared resource
lock.release()

For proper operation, it’s important to release the lock even if something goes wrong when accessing the resource. You can use try-finally for this purpose:

lock.acquire()
try:
    ... access shared resource
finally:
    lock.release() # release lock, no matter what

In Python 2.5 and later, you can also use the with statement. When used with a lock, this statement automatically acquires the lock before entering the block, and releases it when leaving the block:

from __future__ import with_statement # 2.5 only

with lock:
    ... access shared resource

The acquire method takes an optional wait flag, which can be used to avoid blocking if the lock is held by someone else. If you pass in False, the method never blocks, but returns False if the lock was already held:

if not lock.acquire(False):
    ... failed to lock the resource
else:
    try:
        ... access shared resource
    finally:
        lock.release()
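A runnable sketch of the non-blocking pattern above; the worker function and results list are illustrative additions used to observe the return value from another thread:

```python
import threading

lock = threading.Lock()
lock.acquire()  # hold the lock in the main thread

results = []

def worker():
    # non-blocking attempt: returns False instead of waiting
    results.append(lock.acquire(False))

t = threading.Thread(target=worker)
t.start()
t.join()
assert results == [False]   # the lock was held, so the attempt failed

lock.release()
assert lock.acquire(False)  # now the lock is free, so this succeeds
lock.release()
```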

You can use the locked method to check if the lock is held. Note that you cannot use this method to determine if a call to acquire would block or not; some other thread may have acquired the lock between the method call and the next statement.

if not lock.locked():
    # some other thread may run before we get
    # to the next line
    lock.acquire() # may block anyway

Problems with Simple Locking #

The standard lock object doesn’t care which thread is currently holding the lock; if the lock is held, any thread that attempts to acquire the lock will block, even if the same thread is already holding the lock. Consider the following example:

lock = threading.Lock()

def get_first_part():
    lock.acquire()
    try:
        ... fetch data for first part from shared object
    finally:
        lock.release()
    return data

def get_second_part():
    lock.acquire()
    try:
        ... fetch data for second part from shared object
    finally:
        lock.release()
    return data

Here, we have a shared resource, and two access functions that fetch different parts from the resource. The access functions both use locking to make sure that no other thread can modify the resource while we’re accessing it.

Now, if we want to add a third function that fetches both parts, we quickly get into trouble. The naive approach is to simply call the two functions, and return the combined result:

def get_both_parts():
    first = get_first_part()
    second = get_second_part()
    return first, second

The problem here is that if some other thread modifies the resource between the two calls, we may end up with inconsistent data. The obvious solution to this is to grab the lock in this function as well:

def get_both_parts():
    lock.acquire()
    try:
        first = get_first_part()
        second = get_second_part()
    finally:
        lock.release()
    return first, second

However, this won’t work; the individual access functions will get stuck, because the outer function already holds the lock. To work around this, you can add flags to the access functions that enable the outer function to disable locking, but this is error-prone, and can quickly get out of hand. Fortunately, the threading module contains a more practical lock implementation: re-entrant locks.

Re-Entrant Locks (RLock) #

The RLock class is a version of simple locking that only blocks if the lock is held by another thread. While simple locks will block if the same thread attempts to acquire the same lock twice, a re-entrant lock only blocks if another thread currently holds the lock. If the current thread is trying to acquire a lock that it’s already holding, execution continues as usual.

lock = threading.Lock()
lock.acquire()
lock.acquire() # this will block

lock = threading.RLock()
lock.acquire()
lock.acquire() # this won't block

The main use for this is nested access to shared resources, as illustrated by the example in the previous section. To fix the access methods in that example, just replace the simple lock with a re-entrant lock, and the nested calls will work just fine.

lock = threading.RLock()

def get_first_part():
    ... see above

def get_second_part():
    ... see above

def get_both_parts():
    ... see above

With this in place, you can fetch either the individual parts, or both parts at once, without getting stuck or getting inconsistent data.

Note that this lock keeps track of the recursion level, so you still need to call release once for each call to acquire.
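Putting the pieces together, here is a minimal runnable sketch of the two-part resource guarded by a re-entrant lock. The dictionary standing in for the shared object is an assumption for illustration:

```python
import threading

lock = threading.RLock()
data = {"first": 1, "second": 2}  # stand-in for the shared object

def get_first_part():
    with lock:
        return data["first"]

def get_second_part():
    with lock:
        return data["second"]

def get_both_parts():
    # re-entrant: the nested acquires inside the two calls
    # don't block, because this thread already holds the lock
    with lock:
        return get_first_part(), get_second_part()

assert get_both_parts() == (1, 2)
```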

Semaphores #

A semaphore is a more advanced lock mechanism. A semaphore has an internal counter rather than a lock flag, and it only blocks if more than a given number of threads have attempted to hold the semaphore. Depending on how the semaphore is initialized, this allows multiple threads to access the same code section simultaneously.

semaphore = threading.BoundedSemaphore()

semaphore.acquire() # decrements the counter
... access the shared resource
semaphore.release() # increments the counter

The counter is decremented when the semaphore is acquired, and incremented when the semaphore is released. If the counter reaches zero when acquired, the acquiring thread will block. When the semaphore is incremented again, one of the blocking threads (if any) will run.

Semaphores are typically used to limit access to a resource with limited capacity, such as a network connection or a database server. Just initialize the counter to the maximum number, and the semaphore implementation will take care of the rest.

max_connections = 10

semaphore = threading.BoundedSemaphore(max_connections)

If you don’t pass in a value, the counter is initialized to 1.

Python’s threading module provides two semaphore implementations: the Semaphore class provides an unlimited semaphore, which allows you to call release any number of times to increment the counter. To avoid simple programming errors, it’s usually better to use the BoundedSemaphore class, which considers it to be an error to call release more often than you’ve called acquire.
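As a sketch, a bounded semaphore caps how many threads run a section at once. The bookkeeping lists and the small lock here are illustrative additions used only to observe the peak concurrency:

```python
import threading
import time

max_workers = 3
semaphore = threading.BoundedSemaphore(max_workers)

active = []   # one entry per thread currently inside the section
peak = []     # concurrency level observed on each entry
bookkeeping = threading.Lock()

def worker():
    with semaphore:  # at most max_workers threads get past this point
        with bookkeeping:
            active.append(1)
            peak.append(len(active))
        time.sleep(0.01)  # simulate work on the limited resource
        with bookkeeping:
            active.pop()

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# every observation stayed within the semaphore's limit
assert max(peak) <= max_workers
```

Note that semaphores, like locks, also work as context managers: each with-block acquires on entry and releases on exit.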

Synchronization Between Threads #

Locks can also be used for synchronization between threads. The threading module contains several classes designed for this purpose.

Events #

An event is a simple synchronization object; the event represents an internal flag, and threads can wait for the flag to be set, or set or clear the flag themselves.

event = threading.Event()

# a client thread can wait for the flag to be set
event.wait()

# a server thread can set or reset it
event.set()
event.clear()

If the flag is set, the wait method doesn’t do anything. If the flag is cleared, wait will block until it becomes set again. Any number of threads may wait for the same event.
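A minimal runnable sketch of a waiting thread being released by set; the waiter function and results list are illustrative additions:

```python
import threading

event = threading.Event()
results = []

def waiter():
    event.wait()            # blocks until the flag is set
    results.append("go")

t = threading.Thread(target=waiter)
t.start()
event.set()                 # release the waiting thread
t.join()

assert results == ["go"]
```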

Conditions #

A condition is a more advanced version of the event object. A condition represents some kind of state change in the application, and a thread can wait for a given condition, or signal that the condition has happened. Here’s a simple consumer/producer example. First, you need a condition object:

# represents the addition of an item to a resourcecondition = threading.Condition()

The producing thread needs to acquire the condition before it can notify the consumers that a new item is available:

# producer thread
... generate item
condition.acquire()
... add item to resource
condition.notify() # signal that a new item is available
condition.release()

The consumers must acquire the condition (and thus the related lock), and can then attempt to fetch items from the resource:

# consumer thread
condition.acquire()
while True:
    ... get item from resource
    if item:
        break
    condition.wait() # sleep until item becomes available
condition.release()
... process item

The wait method releases the lock, blocks the current thread until another thread calls notify or notifyAll on the same condition, and then reacquires the lock. If multiple threads are waiting, the notify method only wakes up one of the threads, while notifyAll always wakes them all up.

To avoid blocking in wait, you can pass in a timeout value, as a floating-point value in seconds. If given, the method will return after the given time, even if notify hasn’t been called. If you use a timeout, you must inspect the resource to see if something actually happened.

Note that the condition object is associated with a lock, and that lock must be held before you can access the condition. Likewise, the condition lock must be released when you’re done accessing the condition. In production code, you should use try-finally or with, as shown earlier.
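The producer/consumer fragments above can be combined into a small runnable sketch, using the with statement for the condition and a deque standing in for the shared resource (both illustrative choices):

```python
import threading
from collections import deque

items = deque()   # stand-in for the shared resource
condition = threading.Condition()
consumed = []

def consumer():
    with condition:
        while not items:
            condition.wait()  # releases the lock while sleeping
        consumed.append(items.popleft())

def producer():
    with condition:
        items.append("item")
        condition.notify()    # signal that a new item is available

c = threading.Thread(target=consumer)
c.start()
producer()
c.join()

assert consumed == ["item"]
```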

To associate the condition with an existing lock, pass the lock to the Condition constructor. This is also useful if you want to use several conditions for a single resource:

lock = threading.RLock()
condition_1 = threading.Condition(lock)
condition_2 = threading.Condition(lock)
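As an illustrative sketch of two conditions sharing one lock, consider a tiny bounded buffer with separate not_empty and not_full conditions; all names here are assumptions, not part of the original article:

```python
import threading

lock = threading.RLock()
not_empty = threading.Condition(lock)
not_full = threading.Condition(lock)
buffer, capacity = [], 2

def put(item):
    with not_full:
        while len(buffer) >= capacity:
            not_full.wait()       # wait for space to free up
        buffer.append(item)
        not_empty.notify()        # wake a consumer, if any

def get():
    with not_empty:
        while not buffer:
            not_empty.wait()      # wait for an item to arrive
        item = buffer.pop(0)
        not_full.notify()         # wake a producer, if any
        return item

put("a")
put("b")
assert get() == "a"
assert buffer == ["b"]
```

Because both conditions wrap the same lock, producers and consumers never touch the buffer concurrently, yet each side only wakes the threads that care about its state change.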
 