Proper Locking Under a Preemptible Kernel: Keeping Kernel Code Preempt-Safe
Author: Robert Love <rml@tech9.net>
Introduction
A preemptible kernel creates new locking issues. The issues are the same as those under SMP: concurrency and reentrancy. Thankfully, the Linux preemptible kernel model leverages existing SMP locking mechanisms. Thus, the kernel requires explicit additional locking for very few additional situations.
This document is for all kernel hackers. Developing code in the kernel requires protecting these situations.
RULE #1: Per-CPU data structures need explicit protection
Two similar problems arise. An example code snippet:
    struct this_needs_locking tux[NR_CPUS];
    tux[smp_processor_id()] = some_value;
    /* task is preempted here... */
    something = tux[smp_processor_id()];
First, since the data is per-CPU, it may not have explicit SMP locking, but it requires protection otherwise. Second, when a preempted task is finally rescheduled, the previous value of smp_processor_id may not equal the current. You must protect these situations by disabling preemption around them.
You can also use get_cpu() and put_cpu(), which disable and re-enable preemption, respectively.
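The snippet from this rule could be made preempt-safe roughly as follows. This is a minimal sketch, not code from the kernel tree: `tux_update()` and the `int` element type are hypothetical, and the `#else` branch holds user-space stand-ins (a single fake CPU and a plain counter) that exist only so the sketch builds outside the kernel.

```c
#ifdef __KERNEL__
#include <linux/smp.h>		/* get_cpu(), put_cpu() */
#include <linux/threads.h>	/* NR_CPUS */
#else
/* User-space stand-ins (assumptions) so the sketch builds standalone. */
#define NR_CPUS 4
static int model_preempt_count;	/* models the preempt counter */
static int get_cpu(void) { model_preempt_count++; return 0; }
static void put_cpu(void) { model_preempt_count--; }
#endif

static int tux[NR_CPUS];	/* hypothetical per-CPU array */

/* Store a value in this CPU's slot and read it back from the same slot. */
static int tux_update(int some_value)
{
	int cpu = get_cpu();	/* disables preemption, returns the CPU id */
	int something;

	tux[cpu] = some_value;
	/* the task cannot be preempted here, so 'cpu' is still our CPU */
	something = tux[cpu];
	put_cpu();		/* re-enables preemption */

	return something;
}
```

Reading the CPU number once and reusing it keeps both accesses on the same slot, and the get_cpu()/put_cpu() pair keeps the task on that CPU in between.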
RULE #2: CPU state must be protected
Under preemption, the state of the CPU must be protected. This is arch-dependent, but includes CPU structures and state not preserved over a context switch. For example, on x86, entering and exiting FPU mode is now a critical section that must occur while preemption is disabled. Think what would happen if the kernel is executing a floating-point instruction and is then preempted. Remember, the kernel does not save FPU state except for user tasks. Therefore, upon preemption, the FPU registers will be sold to the lowest bidder. Thus, preemption must be disabled around such regions.
Note, some FPU functions are already explicitly preempt safe. For example, kernel_fpu_begin and kernel_fpu_end will disable and enable preemption.
RULE #3: Lock acquire and release must be performed by same task
A lock acquired in one task must be released by the same task. This means you can’t do oddball things like acquire a lock and go off to play while another task releases it. If you want to do something like this, acquire and release the lock in the same code path and have the caller wait on an event by the other task.
Solution
Data protection under preemption is achieved by disabling preemption for the duration of the critical region.
    preempt_enable()              decrement the preempt counter
    preempt_disable()             increment the preempt counter
    preempt_enable_no_resched()   decrement, but do not immediately preempt
    preempt_check_resched()       if needed, reschedule
    preempt_count()               return the preempt counter
The functions are nestable. In other words, you can call preempt_disable n-times in a code path, and preemption will not be reenabled until the n-th call to preempt_enable. The preempt statements define to nothing if preemption is not enabled.
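The nesting rule can be illustrated with a small user-space model. This is only a sketch of the counting behaviour described here, not the kernel's implementation: every `model_` name is hypothetical, and no real scheduling takes place.

```c
/* User-space model of the preempt counter; all names are hypothetical. */
static int model_count;

static void model_preempt_disable(void) { model_count++; }
static void model_preempt_enable(void)  { model_count--; }

/* Preemption is possible only when the counter is back at zero. */
static int model_preemptible(void) { return model_count == 0; }

/*
 * Disable n times, then enable until preemption is possible again.
 * Returns how many enable calls that took.
 */
static int model_enables_needed(int n)
{
	int i, calls = 0;

	for (i = 0; i < n; i++)
		model_preempt_disable();

	while (!model_preemptible()) {
		model_preempt_enable();
		calls++;
	}
	return calls;
}
```

Starting from a counter of zero, model_enables_needed(3) returns 3: the first two enable calls leave the counter above zero, and only the third makes the model preemptible again.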
Note that you do not need to explicitly prevent preemption if you are holding any locks or interrupts are disabled, since preemption is implicitly disabled in those cases.
But keep in mind that ‘irqs disabled’ is a fundamentally unsafe way of disabling preemption - any cond_resched() or cond_resched_lock() might trigger a reschedule if the preempt count is 0. A simple printk() might trigger a reschedule. So use this implicit preemption-disabling property only if you know that the affected codepath does not do any of this. Best policy is to use this only for small, atomic code that you wrote and which calls no complex functions.
Example:
    cpucache_t *cc; /* this is per-CPU */
    preempt_disable();
    cc = cc_data(searchp);
    if (cc && cc->avail) {
            __free_block(searchp, cc_entry(cc), cc->avail);
            cc->avail = 0;
    }
    preempt_enable();
    return 0;

Notice how the preemption statements must encompass every reference of the critical variables. Another example:
    int buf[NR_CPUS];
    set_cpu_val(buf);
    if (buf[smp_processor_id()] == -1) printk(KERN_INFO "wee!\n");
    spin_lock(&buf_lock);
    /* ... */
This code is not preempt-safe, but see how easily we can fix it by simply moving the spin_lock up two lines.
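With the lock moved up, the example might look like the sketch below. Several pieces are assumptions added for illustration: the body of `set_cpu_val()` (the original does not show it), the wrapper `check_buf()`, the matching spin_unlock(), and the whole `#else` branch, which provides user-space stand-ins so the sketch builds outside the kernel.

```c
#ifdef __KERNEL__
#include <linux/printk.h>
#include <linux/smp.h>
#include <linux/spinlock.h>
#include <linux/threads.h>
static DEFINE_SPINLOCK(buf_lock);
#else
/* User-space stand-ins (assumptions) so the sketch builds standalone. */
#include <stdio.h>
#define NR_CPUS 4
#define KERN_INFO ""
#define printk printf
typedef struct { int held; } spinlock_t;
static spinlock_t buf_lock;
static void spin_lock(spinlock_t *lock) { lock->held = 1; }
static void spin_unlock(spinlock_t *lock) { lock->held = 0; }
static int smp_processor_id(void) { return 0; }
#endif

/* Hypothetical body: mark the current CPU's slot. */
static void set_cpu_val(int *buf)
{
	buf[smp_processor_id()] = -1;
}

/* Returns 1 if the current CPU's slot held -1, else 0. */
static int check_buf(void)
{
	int buf[NR_CPUS];	/* on the stack, as in the original example */
	int hit;

	spin_lock(&buf_lock);	/* moved up: now covers every per-CPU access */
	set_cpu_val(buf);
	hit = (buf[smp_processor_id()] == -1);
	if (hit)
		printk(KERN_INFO "wee!\n");
	/* ... */
	spin_unlock(&buf_lock);

	return hit;
}
```

Because holding the spinlock implicitly disables preemption, both smp_processor_id() calls are made on the same CPU.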
Preventing preemption using interrupt disabling
It is possible to prevent a preemption event using local_irq_disable and local_irq_save. Note, when doing so, you must be very careful to not cause an event that would set need_resched and result in a preemption check. When in doubt, rely on locking or explicit preemption disabling.
Note in 2.5 interrupt disabling is now only per-CPU (i.e. local).
An additional concern is proper usage of local_irq_disable and local_irq_save. These may be used to protect from preemption; however, on exit, if preemption may be enabled, a test to see if preemption is required should be done. If these are called from the spin_lock and read/write lock macros, the right thing is done. They may also be called within a spin-lock protected region; however, if they are ever called outside of this context, a test for preemption should be made. Do note that calls from interrupt context or bottom halves/tasklets are also protected by preemption locks and so may use the versions which do not check preemption.