10. KERNEL SYNCHRONIZATION METHODS
Critical Region: A critical region (also called a critical section) is a piece of code that must be executed
under mutual exclusion. If two threads update the same variable in the parent process’s address space,
the code in which both threads access or update that shared variable/resource is called a critical region.
Protecting the critical region is essential to avoid such collisions corrupting the code/system.
Race Condition: A race condition is a flaw that occurs when the timing or ordering of
events affects a program’s correctness. Using an appropriate synchronization
mechanism, i.e., properly protecting the critical region, avoids or reduces the chance
of this flaw.
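As a minimal illustration (hypothetical kernel code, not tied to any particular driver), consider two threads running the function below on a shared counter; the increment is the critical region, and without protection one update can be lost:
static int shared_count;        /* resource shared by two threads */

static void do_work(void)
{
        /*
         * Critical region: a read-modify-write of shared_count.
         * Both threads can load the same old value, each add one,
         * and store it back, so one increment is lost (a race condition).
         */
        shared_count++;
}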
Deadlock: Another flaw that can arise from not using a proper synchronization
mechanism. It is a situation in which two threads/processes sharing the same
resources effectively prevent each other from accessing them, resulting in both
ceasing to make progress.
IMPORTANT TERMS
Simple yet powerful mechanisms that provide synchronization without locking.
Less overhead than the usual synchronization mechanisms such as semaphores and mutexes.
Atomic operations are indivisible and uninterruptible (they complete as a single, uninterruptible step).
The atomic integer methods operate on a special data type, atomic_t.
The kernel provides two sets of interfaces for atomic operations: one that operates
on integers and another that operates on individual bits.
All atomic functions are inline functions.
ATOMIC OPERATIONS
#include <asm/atomic.h>
void atomic_set(atomic_t *a, int i);   // Set the atomic variable a to integer value i
int atomic_read(atomic_t *a);          // Return the value of atomic variable a
void atomic_add(int i, atomic_t *a);   // Add i to atomic variable a
void atomic_sub(int i, atomic_t *a);   // Subtract i from atomic variable a
void atomic_inc(atomic_t *a);          // Increment operation
void atomic_dec(atomic_t *a);          // Decrement operation
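A minimal usage sketch (hypothetical open counter; atomic_dec_and_test() is a related helper from the same atomic family, not listed above):
#include <asm/atomic.h>
#include <linux/kernel.h>

static atomic_t open_count = ATOMIC_INIT(0);    /* compile-time initialization */

void my_open(void)                              /* hypothetical open path */
{
        atomic_inc(&open_count);                /* safe without any lock */
}

void my_release(void)                           /* hypothetical release path */
{
        if (atomic_dec_and_test(&open_count))   /* true when the count hits zero */
                printk(KERN_INFO "last user closed the device\n");
}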
To flag some condition, a single bit may serve the purpose well.
An atomic_t variable does not work well for manipulating individual bits.
The kernel operations for this are given below:
#include <asm/bitops.h>
void set_bit(int nr, void *a); // Set the bit at position nr in the value pointed to by a
void clear_bit(int nr, void *a); // Clear the bit at position nr in the value pointed to by a
void change_bit(int nr, void *a); // Toggle the bit at position nr
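A short sketch (hypothetical device flags; test_bit(), which reads a bit, comes from the same interface):
#include <asm/bitops.h>

#define DEV_BUSY   0                    /* bit positions (hypothetical) */
#define DEV_ERROR  1

static unsigned long dev_flags;         /* one word holds all the flags */

void start_io(void)
{
        set_bit(DEV_BUSY, &dev_flags);  /* atomically mark the device busy */
}

void finish_io(int failed)
{
        if (failed)
                set_bit(DEV_ERROR, &dev_flags);
        clear_bit(DEV_BUSY, &dev_flags);
}

int device_busy(void)
{
        return test_bit(DEV_BUSY, &dev_flags);  /* read the busy flag */
}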
ATOMIC BIT OPERATIONS
•In real-world scenarios, critical sections can involve:
•Multiple functions
•Complex operations (e.g., removing, formatting, and inserting data)
•Such operations need to be atomic (i.e., uninterrupted).
•Therefore, simple atomic ops are insufficient, and locks are required for
synchronization.
•A spin lock is the most common lock in the kernel.
•Only one thread can hold the spin lock at a time.
•If another thread tries to acquire a held lock, it will:
•Spin (busy wait) until the lock is free.
•This is called lock contention.
SPIN-LOCK
•Analogy
•If no one is in the room: you take the key and enter.
•If someone is in: you keep checking (spinning) until they leave and give you the key.
Important Characteristics of Spin Locks
• Lightweight and used for short critical sections.
• Should not be held for long because:
• Spinning wastes CPU time.
• Thread does no useful work while waiting.
• Ideally, a spin lock should be held for less time than the duration of two context switches.
Sleeping Locks (e.g., Semaphores)
• Instead of spinning, the thread sleeps when the lock is
contended.
• Uses context switches (saves CPU time but adds
overhead).
• Used when critical sections are longer or involve
sleeping/I/O.
SEMAPHORES
DEFINE_SPINLOCK(mr_lock);
spin_lock(&mr_lock);
/* critical region */
spin_unlock(&mr_lock);
• Only one thread can hold the lock at a time → ensures mutual
exclusion.
• On uniprocessor systems, spin locks:
• Compile away
• Act as markers to disable/enable preemption.
SPINLOCK METHODS
•Spin locks CAN be used in interrupt handlers.
•Semaphores CANNOT (they sleep, which is not allowed in interrupt context).
• If a lock is used in an interrupt handler:
•Must disable local interrupts before acquiring it.
•Prevents deadlock (double-acquire deadlock) where:
•A thread holds the lock.
•An interrupt handler tries to reacquire it.
•Both wait forever (deadlock).
DEFINE_SPINLOCK(mr_lock);
unsigned long flags;

spin_lock_irqsave(&mr_lock, flags);
/* critical region */
spin_unlock_irqrestore(&mr_lock, flags);

spin_lock_irqsave(): saves the current interrupt state, disables local interrupts, and acquires the lock.
spin_unlock_irqrestore(): releases the lock and restores the previous interrupt state.
TO USE SPIN LOCK IN INTERRUPT CONTEXT
•Bottom halves are deferred work mechanisms in the Linux kernel (e.g., tasklets, softirqs).
•They run after interrupt handling, to finish work that doesn't need to be done immediately.
spin_lock_bh() and spin_unlock_bh()
•spin_lock_bh(&lock):
•Acquires the spin lock
•Disables all bottom halves
•spin_unlock_bh(&lock):
•Releases the spin lock
•Re-enables bottom halves
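A minimal sketch (hypothetical statistics shared between process context and a bottom half), assuming the calls above:
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(stats_lock);
static unsigned long packets_seen;      /* updated by a bottom half */

/* process-context reader: disabling bottom halves prevents the BH
 * from preempting us on this CPU while we hold the lock */
unsigned long read_packet_count(void)
{
        unsigned long n;

        spin_lock_bh(&stats_lock);
        n = packets_seen;
        spin_unlock_bh(&stats_lock);

        return n;
}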
SPINLOCKS & BOTTOMHALVES
1. Process Context Bottom Half
• Use both:
• Spin lock
• Disable bottom halves
• Rationale: A bottom half can preempt process context.
2. Interrupt Handler Bottom Half
• Use both:
• Spin lock
• Disable interrupts
• Rationale: Interrupts can preempt bottom halves.
Tasklets
Safe Case:
• Same type of tasklet → No locking needed.
• Reason: Tasklets of the same type do not run simultaneously.
Shared Data Between Different Tasklets:
• Use a normal spin lock (see the sketch after this list).
• No need to disable bottom halves.
• Tasklets cannot preempt each other on the same CPU.
SoftIRQs
• Always protect shared data with a spin lock.
• No need to disable bottom halves.
• Reason:
• SoftIRQs of the same or different types can run concurrently on
different CPUs.
• But they do not preempt each other on the same CPU.
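A sketch of the tasklet case above (hypothetical tasklets, using the classic DECLARE_TASKLET() interface in which the handler takes an unsigned long argument):
#include <linux/interrupt.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(shared_lock);
static unsigned long shared_events;

/* two different tasklets touch the same counter: a plain spin lock is
 * enough, because tasklets never preempt one another on the same CPU */
static void tasklet_a_func(unsigned long data)
{
        spin_lock(&shared_lock);
        shared_events++;
        spin_unlock(&shared_lock);
}

static void tasklet_b_func(unsigned long data)
{
        spin_lock(&shared_lock);
        shared_events += 2;
        spin_unlock(&shared_lock);
}

static DECLARE_TASKLET(tasklet_a, tasklet_a_func, 0);
static DECLARE_TASKLET(tasklet_b, tasklet_b_func, 0);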
•Useful when:
•Multiple readers can safely access data concurrently.
•Writers require exclusive access.
•Classic use case: Shared data structures like task lists or caches.
READER-WRITER LOCKS
Declare
DEFINE_RWLOCK(mr_rwlock);
Reader code
read_lock(&mr_rwlock);
/* read-only critical section */
read_unlock(&mr_rwlock);
Writer Code
write_lock(&mr_rwlock);
/* read-write critical section */
write_unlock(&mr_rwlock);
Reader Preference (Starvation)
• Readers are favored over writers.
• Writers wait until all readers release the lock.
• Continuous readers can cause writer starvation.
When to Use RW Spin Locks
• Shared data accessed often by many readers but rarely
modified.
• Critical sections are short and non-blocking.
• Example: Routing tables, process/task lists, lookup caches.
When NOT to Use
• Complex reader/writer logic where roles are unclear.
• Need to sleep or block inside the critical section → Use a
semaphore instead.
• Do NOT upgrade a read lock to a write lock
• read_lock(&mylock); write_lock(&mylock); // This causes a deadlock!
• The program will hang because the writer is waiting for the reader (itself) to finish,
which never happens!
Writer Must Wait for All Readers
If many threads keep reading, the writer may never get a turn.
This is called writer starvation.
Reader-writer spin locks favor readers.
Be careful if writers are also important in your program.
If you’re using these locks in interrupt handlers, you may need to disable interrupts
to avoid deadlocks.
•For reading in interrupt handlers: plain read_lock() is enough as long as the data is never written from interrupt context.
•For writing: use write_lock_irqsave() and write_unlock_irqrestore()
to disable interrupts safely while modifying the data.
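A short sketch of the writer side (hypothetical table; readers in interrupt handlers keep using plain read_lock()):
#include <linux/spinlock.h>

static DEFINE_RWLOCK(table_lock);
static int table_entries;

/* writer in process context: disabling local interrupts prevents an
 * interrupt handler on this CPU from trying to take the lock we hold */
void add_entry(void)
{
        unsigned long flags;

        write_lock_irqsave(&table_lock, flags);
        table_entries++;
        write_unlock_irqrestore(&table_lock, flags);
}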
USAGE IN INTERRUPT HANDLERS
•Semaphores are sleeping locks: When unavailable, tasks are put to sleep
and placed in a wait queue.
•When the semaphore becomes free, one task is awakened from the wait queue.
Door and Key Analogy:
• A key represents the semaphore.
• The room is the critical section.
• If the key is taken, the next person waits (sleeps) by taking a number.
• When the key is returned, the next person is woken up and allowed in.
• This avoids busy-waiting, unlike spinlocks.
SEMAPHORES
•Better CPU utilization than spinlocks (no time wasted in spinning).
•Ideal for locks held for a long time.
•Can sleep while holding a semaphore (won’t cause deadlock).
•Do not disable kernel preemption → do not increase scheduling latency.
ADVANTAGES
• Higher overhead than spinlocks (due to sleeping and waking).
• Not suitable for short-duration locks (overhead > lock hold time).
• Cannot be used in interrupt context (sleeping is not allowed
there).
DISADVANTAGES
•Use semaphores when sleeping is necessary
(e.g., synchronizing with user-space).
•Avoid acquiring a semaphore while holding a spin lock: the semaphore may sleep, and sleeping is not allowed while a spin lock is held (deadlock risk).
•Choose between semaphore and spinlock based on expected lock hold time:
•Short time → Spinlock
•Long time / sleep required → Semaphore
•Semaphores can allow more than one simultaneous lock holder,
unlike spinlocks (which only allow one).
•The maximum number of simultaneous holders is set during declaration —
this value is called the count or usage count.
•Binary Semaphore (or Mutex)
•Count initialized to 1.
•Allows only one task at a time → used for mutual exclusion.
•Most common in kernel code.
•Counting Semaphore
•Count initialized to a value greater than 1.
•Allows multiple tasks in the critical section simultaneously.
•Used to enforce limits, not mutual exclusion.
•Rare in kernel usage.
TYPES OF SEMAPHORES
•Introduced by Edsger Dijkstra in 1968.
•Two atomic operations:
•P() → From Dutch "Proberen" (to test)
•V() → From Dutch "Verhogen" (to increment)
•In Linux, these are called:
•down() → Acquire the semaphore (decrement count)
•up() → Release the semaphore (increment count)
Semaphore Operations
• down():
• Decrements count by 1.
• If count ≥ 0 → task gets the lock.
• If count < 0 → task is put to sleep on the wait queue.
• up():
• Increments count by 1.
• If there are waiting tasks → one task is awakened to acquire the
lock.
BEHAVIOR OF DOWN() AND UP()
•Binary semaphores (mutexes) = used for mutual exclusion.
•Counting semaphores are used for resource limits, not mutual exclusion.
•When a task is blocked in down(), CPU moves on to other tasks → better CPU utilization.
Architecture-dependent, defined in:
<asm/semaphore.h>
Declared using:
struct semaphore name;
Semaphore Initialization
•Use sema_init() to initialize with a specific count (usage count):
sema_init(&name, count);
E.g.:
struct semaphore sem;
sema_init(&sem, 1); // acts like a mutex
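A counting semaphore is initialized the same way, just with a larger count (a hypothetical limit of four concurrent holders):
struct semaphore pool_sem;
sema_init(&pool_sem, 4); // counting semaphore: up to four tasks may hold it at once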
Semaphore acquisition functions (behavior, process state, return value, use case):

down()
  Behavior: Blocks if the semaphore is unavailable
  Process state: TASK_UNINTERRUPTIBLE
  Return value: Always 0
  Use case: Only if the task must not be interrupted (rarely recommended)

down_interruptible()
  Behavior: Blocks if the semaphore is unavailable
  Process state: TASK_INTERRUPTIBLE
  Return value: -EINTR if interrupted by a signal, 0 on success
  Use case: Preferred for user-interacting or signal-aware processes

down_trylock()
  Behavior: Non-blocking (never sleeps)
  Process state: Unchanged
  Return value: 0 if the semaphore was acquired, non-zero otherwise
  Use case: Try to acquire a semaphore without sleeping
Releasing the Semaphore
•Use up(&sem); to release a semaphore (regardless of how it was acquired).
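A usage sketch (hypothetical driver configuration, assuming the calls above and a semaphore initialized elsewhere with sema_init(&data_sem, 1)):
#include <asm/semaphore.h>
#include <linux/errno.h>

static struct semaphore data_sem;       /* initialized elsewhere: sema_init(&data_sem, 1) */
static int shared_config;

int update_config(int value)
{
        /* sleep until the semaphore is available; a signal wakes us early */
        if (down_interruptible(&data_sem))
                return -EINTR;          /* interrupted by a signal, lock not taken */

        shared_config = value;          /* critical section; sleeping here is allowed */

        up(&data_sem);                  /* release, regardless of how it was acquired */
        return 0;
}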
Reader-writer semaphores are a special kind of lock used in the
Linux kernel that allows:
• Multiple readers to access a shared resource at the same time, if
no writer is active.
• Only one writer to access the resource, if no readers or writers are
active.
They help improve performance when reads are frequent but
writes are rare.
READER-WRITER SEMAPHORES
They are used just like reader-writer spin locks, and are better than
regular semaphores or spinlocks when:
• Shared data is read frequently.
• Write operations are rare.
• Concurrency is needed, especially for read-heavy scenarios.
1.Static Declaration (known at compile time):
static DECLARE_RWSEM(my_rwsem);
2.Dynamic Initialization:
struct rw_semaphore my_rwsem;
init_rwsem(&my_rwsem);
DECLARATION AND USE
Use the Lock For Reading:
down_read(&my_rwsem); // Acquire read lock
// Critical section (read-only)
up_read(&my_rwsem); // Release read lock
For Writing:
down_write(&my_rwsem); // Acquire write lock
// Critical section (read-write)
up_write(&my_rwsem); // Release write lock
Mutexes are available in the kernel as a simpler way to get sleeping-lock (semaphore-like) behavior.
Mutexes are binary semaphores:
only one task may hold the mutex at a time, and only that task can unlock it.
There is no recursive locking or unlocking of mutexes, and mutexes may not be used within interrupt
context.
But mutexes are faster and more compact than the current kernel semaphore option, so if they fit the
need, they're the choice to use.
Create and initialize a mutex in one operation through the DEFINE_MUTEX macro. This creates a new
mutex and initializes the structure
DEFINE_MUTEX( my_mutex );
MUTEXES
To dynamically initialize a mutex, call
mutex_init(&mutex);
Locking and unlocking the mutex is easy:
mutex_lock(&mutex);
/* critical region ... */
mutex_unlock(&mutex);
Requirement                             Recommended lock
Low overhead locking                    Spin lock is preferred
Short lock hold time                    Spin lock is preferred
Long lock hold time                     Mutex is preferred
Need to lock from interrupt context     Spin lock is required
Need to sleep while holding lock        Mutex is required
SPINLOCK VS MUTEX
Completion variables are used for synchronization between two
tasks where:
• One task waits until another task signals that something has
finished.
• They're lightweight and simple — perfect when you just need to
wait for one specific event.
COMPLETION VARIABLES
•Create a completion variable:
•DECLARE_COMPLETION(my_comp);
•One task (e.g., a parent process) waits for the work to finish using:
• wait_for_completion(&my_comp);
•The other task (e.g., a child process) calls complete(&my_comp) when it finishes its job.
•The first task then wakes up.
•This keeps the parent from running too early and interfering with the child's memory.
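A minimal sketch (hypothetical parent/worker pair, assuming the calls above):
#include <linux/completion.h>

static DECLARE_COMPLETION(setup_done);

/* worker: does the job, then signals that it has finished */
int worker_fn(void *unused)
{
        /* ... perform the setup work ... */
        complete(&setup_done);          /* wake up whoever is waiting */
        return 0;
}

/* parent: blocks until the worker has signalled completion */
void wait_for_worker(void)
{
        wait_for_completion(&setup_done);
        /* safe to proceed: the worker has finished its setup */
}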
WORKING
Global kernel lock
• A global spin lock used in early SMP (Symmetric MultiProcessing)
Linux kernels.
• Introduced to ease the transition from single-threaded kernel
code to multi-threaded (SMP-safe) kernel code.
BIG KERNEL LOCK OR BKL
Sleep Allowed: You can sleep while holding the BKL; it is automatically released when the process is scheduled out and reacquired when the process runs again.
Recursive Lock: The same process can acquire the BKL multiple times without deadlocking.
Process Context Only: The BKL can only be used in process context, not in interrupt context.
Preemption Disabled: Holding the BKL disables kernel preemption.
No Real Locking on UP Systems: On uniprocessor systems the BKL does not actually do any locking, since only one process can be in the kernel anyway.
•Modern kernels support fine-grained locking (different locks for different tasks).
•The BKL is too broad — it blocks CPUs even if they are doing work that
would not interfere with each other.
•So, it wastes the potential of modern multi-core CPUs.
•As a result, the system doesn’t scale well: performance drops as more CPUs are added.
•The BKL is now considered a scalability bottleneck, because it prevents efficient parallelism in the kernel.
Function            Description
lock_kernel()       Acquires the BKL.
unlock_kernel()     Releases the BKL.
kernel_locked()     Returns nonzero if the lock is held and zero otherwise. (On UP systems it always returns nonzero.)
• A lightweight locking mechanism introduced in Linux 2.6.
• Ideal for many readers and few writers.
• Uses a sequence counter to manage synchronization between
readers and writers.
SEQUENTIAL LOCKS
•Writers acquire a lock and increment a sequence number.
•Readers:
•Read the sequence number before reading the data.
•Read the data.
•Check if the sequence number changed (i.e., a write happened during read).
•If changed → retry the read.
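A sketch using the kernel's seqlock interface (hypothetical 64-bit value; write_seqlock()/write_sequnlock() and read_seqbegin()/read_seqretry() are the standard entry points):
#include <linux/seqlock.h>
#include <linux/types.h>

static DEFINE_SEQLOCK(my_seqlock);
static u64 shared_value;

/* writer: takes the lock and bumps the sequence counter */
void update_value(u64 v)
{
        write_seqlock(&my_seqlock);
        shared_value = v;
        write_sequnlock(&my_seqlock);
}

/* reader: retries if the sequence number changed during the read */
u64 read_value(void)
{
        unsigned int seq;
        u64 v;

        do {
                seq = read_seqbegin(&my_seqlock);
                v = shared_value;
        } while (read_seqretry(&my_seqlock, seq));

        return v;
}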
WORKING
•Highly scalable when there are many readers.
•Writers are favored (no starvation).
•Efficient, since readers don't block each other or the writer.
•Good for simple data structures (like a 64-bit integer).
ADVANTAGES
• Readers may spin (retry) if writes happen during reads.
• Not suitable for complex data or large critical sections.
• Appropriate only when readers can afford to retry until they obtain a consistent snapshot of the data.
LIMITATIONS
•The kernel is preemptive, meaning a task running in kernel mode can be interrupted
if a higher-priority task needs to run.
•This ensures better responsiveness but can cause concurrency issues in critical sections.
KERNEL PREEMPTION
• Spin locks are used as markers:
• If a spin lock is held, kernel preemption is disabled.
• This ensures critical regions are not preempted, making them safe
even under SMP (Symmetric Multiprocessing).
PREEMPTION CONTROL
• Some data is specific to each CPU (per-processor data).
• Normally, it doesn't require a lock because only one processor
accesses it.
• But if preemption is allowed, a process might be moved to
another CPU and access another processor’s data, causing
problems.
PER-PROCESSOR DATA
(1) Disable Preemption
preempt_disable();
/* safely access per-CPU data */
preempt_enable();
•Ensures the current process won't be moved to another CPU during the critical section.
•Must call preempt_enable() for each preempt_disable() to re-enable preemption.
HANDLING
Use get_cpu()
int cpu = get_cpu();
/* safely use per-CPU data for 'cpu' */
put_cpu(); // Re-enables preemption
•get_cpu() disables preemption and returns the current CPU number.
•put_cpu() re-enables preemption.
CONTD…
•Tracks how many locks or preempt_disable() calls are active.
•If preempt_count() == 0, kernel can be preempted.
•Used for debugging atomicity and sleep issues.
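A tiny debugging sketch (hypothetical check before a call that might sleep):
#include <linux/preempt.h>
#include <linux/kernel.h>

static void about_to_sleep(void)
{
        /* preempt_count() == 0 means no spin locks are held and preemption
         * is enabled, so sleeping is allowed; otherwise complain */
        if (preempt_count() != 0)
                printk(KERN_WARNING "sleeping while atomic (preempt_count=%d)\n",
                       preempt_count());
}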
PREEMPTION COUNT