Non blocking programming
and waiting
Advanced non-blocking concurrency
Presented at GeekOut 2017
Roman Elizarov @ JetBrains
Speaker: Roman Elizarov
• 16+ years of experience
• Previously developed high-perf trading software @ Devexperts
• Teach concurrent & distributed programming @ St. Petersburg ITMO University
• Chief judge @ Northeastern European Region of ACM ICPC
• Now work on Kotlin @ JetBrains
Why concurrency?
• Key motivators
• Performance
• Scalability
• Unless you need both, don’t bother with concurrency:
• Write single-threaded
• Scale by running multiple copies of code
[Diagram: Thread 1, Thread 2, …, Thread N all access one Shared Object]
Practical examples: queue, cache, dictionary, state, statistics, log, etc.
Share nothing and sleep well
Basic concepts
Lock-freedom and waiting
What is blocking?
What is a non-blocking algorithm?
Blocking (aka locking)
• Semi-formally: an algorithm is called non-blocking (lock-free) if suspension of any thread cannot cause suspension of another thread
• In Java practice, non-blocking algorithms
• just read/write volatile variables and/or use
• j.u.c.a.AtomicXXX classes with compareAndSet and other methods
• Blocking algorithms (with locks) use
• synchronized (…), which compiles to monitorenter/monitorexit instructions
• j.u.c.l.Lock lock/unlock methods
• NOTE: You can write blocking code without realizing it
Toy problem solved with locks
Locks are the easiest way to make your object linearizable (aka thread-safe)
Just protect all operations on a shared state with the same lock (or monitor)
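The code from this slide is not preserved in the transcript. A minimal sketch of the idea, assuming a holder of a single value: every operation on the shared state is a synchronized method, so all of them use the same monitor. The class name LockedDataHolder and the getValue method are assumptions made for illustration; only updateValue and the DataHolder name appear elsewhere in the slides.

```java
// Hypothetical sketch: a single shared value protected by one monitor.
class LockedDataHolder<T> {
    // Shared state; every access goes through a synchronized method.
    private T value;

    // Reads the current value under the lock.
    public synchronized T getValue() {
        return value;
    }

    // Replaces the current value under the same lock.
    public synchronized void updateValue(T newValue) {
        value = newValue;
    }
}
```

Because both methods lock the same object, a thread suspended inside either of them holds up every other caller, which is exactly what makes this code blocking.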
What is waiting [for condition]?
What is a waiting operation?
(sometimes aka “blocking”, too)
Waiting for condition
• Formally: a partial function from object state set X to result set Y is defined only on a subset X’ of X. Method invocation can complete only when object state is in X’ (when the condition is satisfied).
• Waiting is orthogonal to blocking/non-blocking
• For example, let’s implement a partial takeValue operation that is defined only when there is value != null in DataHolder
Waiting is easy with monitors
If in doubt, always use notifyAll instead of notify (though notify is sufficient here)
This is code with locks (synchronized): suspension of one thread on any of these lines causes suspension of all other threads that attempt to do any operation
This is waiting code (partial function): it is only defined when value != null
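The slide's code image is missing, so the following is only a sketch of what monitor-based waiting could look like under the assumptions above: takeValue waits until value != null, and updateValue wakes the waiters. The class name MonitorDataHolder is hypothetical.

```java
// Hypothetical sketch: a partial takeValue implemented with wait/notifyAll.
class MonitorDataHolder<T> {
    private T value;

    // Stores a value and wakes up every thread waiting in takeValue.
    public synchronized void updateValue(T newValue) {
        value = newValue;
        notifyAll();
    }

    // Defined only when value != null: otherwise the caller waits.
    public synchronized T takeValue() throws InterruptedException {
        while (value == null) {
            wait(); // releases the monitor while waiting; loop guards against spurious wakeups
        }
        T result = value;
        value = null;
        return result;
    }
}
```

The waiting itself is fine here; what makes the class blocking is that every operation still runs under the monitor.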
Non-blocking
Why and how
Why go non-blocking (aka lock-free)?
• Performance
• Locking is expensive when contended
• Actually, context switches are expensive
• Deadlock avoidance
• Too much locking can get you into trouble
• Sometimes it is just easier to get rid of locks
Let’s go lock-free
Lock-free loop: compareAndSet(expect, update)
Look ma, no locks!
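The original listing is not in the transcript. As a hedged illustration of the lock-free loop, the sketch below keeps the value in an AtomicReference and retries compareAndSet(expect, update) until it succeeds. The class name LockFreeDataHolder and the non-waiting pollValue operation are assumptions introduced for this example.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: the same holder without locks.
class LockFreeDataHolder<T> {
    private final AtomicReference<T> valueRef = new AtomicReference<>();

    // A plain volatile read.
    public T getValue() {
        return valueRef.get();
    }

    // A plain volatile write: never retries, never waits.
    public void updateValue(T newValue) {
        valueRef.set(newValue);
    }

    // Non-waiting take: returns null when there is nothing to take.
    public T pollValue() {
        for (;;) {                                   // lock-free loop
            T expect = valueRef.get();
            if (expect == null) return null;         // nothing to take
            if (valueRef.compareAndSet(expect, null)) {
                return expect;                       // CAS succeeded: we own the value
            }
            // CAS failed: another thread changed the value, so retry
        }
    }
}
```

A failed CAS only happens because some other thread made progress, so a thread suspended at any point in this code cannot hold up the others.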
Powerful we have become!
Lock-free waiting (aka parking)
“wait”
Lock-free wakeup (aka unparking)
• Note: in lock-free code order is important (first update, then unpark)
• Updaters are 100% wait-free (never locked out by other threads)
• Taker (takeValue) can get starved in CAS loop, but still non-blocking
(formally, lock-free)
“notify”
Park/unpark magic
LockSupport.unpark(T): “Makes available the
permit for the given thread, if it was not
already available. If the thread was blocked on
park then it will unblock. Otherwise, its next
call to park is guaranteed not to block.”
Thread U (updateValue): update state, then unpark(T)
Thread T (takeValue): check state (value == null), then park()
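The parking and unparking listings are lost in the transcript. The following sketch shows one way the pieces could fit together, assuming that only a single thread ever calls takeValue. The class name ParkingDataHolder, the waiter field, and the interrupt handling are assumptions, not the speaker's code.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

// Hypothetical sketch: lock-free waiting for ONE taker thread.
class ParkingDataHolder<T> {
    private final AtomicReference<T> valueRef = new AtomicReference<>();
    // The single thread that calls takeValue (assumption of this sketch).
    private volatile Thread waiter;

    // "notify": order matters, first update the state, then unpark.
    public void updateValue(T newValue) {
        valueRef.set(newValue);
        LockSupport.unpark(waiter); // has no effect when waiter is still null
    }

    // "wait": park until a value can be taken.
    public T takeValue() throws InterruptedException {
        waiter = Thread.currentThread(); // publish before checking the state
        for (;;) {
            T value = valueRef.get();
            if (value == null) {
                if (Thread.interrupted()) throw new InterruptedException();
                LockSupport.park(this);  // returns at once if a permit is available
            } else if (valueRef.compareAndSet(value, null)) {
                return value;
            }
        }
    }
}
```

If the update lands between the taker's state check and its call to park, the earlier unpark has already made the permit available, so park returns immediately and the loop sees the new value.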
Lock-free waiting from multiple threads
• Must maintain wait queue of threads in a lock-free way
• This is non-trivial
• j.u.c.l.AbstractQueuedSynchronizer is a good place to start
• It is used to implement a number of j.u.c.* classes:
- ReentrantLock
- ReentrantReadWriteLock
- Semaphore
- CountDownLatch
You can use it for your own needs, too
Anatomy of AbstractQueuedSynchronizer
(1) private state
int state; // optionally use for state
wait queue <Node>; // nodes reference threads
(2) state access
int getState()
void setState(int newState)
boolean compareAndSetState(int expect, int update)
(3) override
boolean tryAcquire(int arg)
boolean tryRelease(int arg)
int tryAcquireShared(int arg)
boolean tryReleaseShared(int arg)
(4) use
void acquire(int arg)
void acquireInterruptibly(int arg)
boolean tryAcquireNanos(int arg, long nanos)
boolean release(int arg)
void acquireShared(int arg)
// and others
These are almost separate aspects
Anatomy of AbstractQueuedSynchronizer (2)
[Code listing not preserved. Surviving callouts: “adds to wait queue” and “unlinks from wait queue”.]
Our own synchronizer
Use synchronizer to implement notify/wait
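Neither listing survives in the transcript, so what follows is a reconstruction under stated assumptions rather than the speaker's code. A Sync class extends AbstractQueuedSynchronizer, tryAcquire tries a single CAS on valueRef, and updateValue always calls release after setting the value. The class name AqsDataHolder, the ThreadLocal used to hand the taken value back, and the final re-check in takeValue are all additions made for this sketch.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Hypothetical sketch: AQS used only as a ready-made wait queue.
class AqsDataHolder<T> {
    private final AtomicReference<T> valueRef = new AtomicReference<>();
    // Carries the value from tryAcquire back to takeValue (assumption of this sketch).
    private final ThreadLocal<T> taken = new ThreadLocal<>();
    private final Sync sync = new Sync();

    private final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int ignored) {
            T value = valueRef.get();
            // One CAS attempt; on failure the caller is queued or parks again.
            if (value == null || !valueRef.compareAndSet(value, null)) return false;
            taken.set(value);
            return true;
        }

        @Override
        protected boolean tryRelease(int ignored) {
            return true; // always let release() unpark the first queued thread
        }
    }

    // "notify": first update the state, then release.
    public void updateValue(T newValue) {
        valueRef.set(newValue);
        sync.release(0);
    }

    // "wait": acquire succeeds only when tryAcquire took a value.
    public T takeValue() throws InterruptedException {
        sync.acquireInterruptibly(0);
        T result = taken.get();
        taken.remove();
        // Precaution added in this sketch: if a new value arrived while this
        // thread was leaving the queue, pass the wakeup on to the next waiter.
        if (valueRef.get() != null) sync.release(0);
        return result;
    }
}
```

The int argument of acquire and release is unused here, and the state field of AQS is left untouched; only the wait queue is borrowed.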
Why double check? (more internals)
// Simplified code
void doAcquireXXX(int arg) {
    addToWaitQueue();
    for (;;) {
        if (isFirstInQueue() && tryAcquire(arg)) {
            unlinkFromWaitQueue();
            return;
        }
        doPark();
    }
}
Thread U (updateValue): update state, then release -> unparkSuccessor()
Thread T (takeValue): tryAcquire (valueRef.CAS(oldValue, null) == true), then unlinkFromWaitQueue()
Naïve “performance improvement”
• The idea is to unpark just one thread when setting value for the first
time only (and avoid unparking on subsequent updates)
• IT FAILS SUBTLY(!).
Thread T (updateValue): set; release -> unparkSuccessor() is skipped because oldValue != null
Thread U (takeValue): get, then CAS fails (valueRef.CAS(oldValue, null) == false), then park()
updateValue may cause a concurrent tryAcquire to fail on CAS and park, but we no longer call release in this case, so the parked thread will never be unparked.
Corrected Sync.tryAcquire method
• Use CAS-loop idiom to retry in the case of contention
• Optimal version in terms of context switching
Use the loop,
Luke!
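The corrected listing is missing from the transcript. The sketch below is one possible reading of the fix, assuming an AQS-based holder in which updateValue calls release only when the previous value was null: tryAcquire now retries its CAS in a loop, so a concurrent update can no longer make it fail and park. The class name LoopingDataHolder, the ThreadLocal hand-off, and the re-check at the end of takeValue are assumptions of this example.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Hypothetical sketch: CAS loop in tryAcquire plus "release on first set only".
class LoopingDataHolder<T> {
    private final AtomicReference<T> valueRef = new AtomicReference<>();
    // Carries the value from tryAcquire back to takeValue (assumption of this sketch).
    private final ThreadLocal<T> taken = new ThreadLocal<>();
    private final Sync sync = new Sync();

    private final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int ignored) {
            for (;;) {                                   // CAS-loop idiom
                T value = valueRef.get();
                if (value == null) return false;         // really nothing to take: wait
                if (valueRef.compareAndSet(value, null)) {
                    taken.set(value);
                    return true;
                }
                // CAS failed because of a concurrent update or take: retry, do not park
            }
        }

        @Override
        protected boolean tryRelease(int ignored) {
            return true;
        }
    }

    // Unparks a waiter only when the value goes from null to non-null.
    public void updateValue(T newValue) {
        if (valueRef.getAndSet(newValue) == null) sync.release(0);
    }

    public T takeValue() throws InterruptedException {
        sync.acquireInterruptibly(0);
        T result = taken.get();
        taken.remove();
        // Precaution added in this sketch: hand the wakeup on if a value is present.
        if (valueRef.get() != null) sync.release(0);
        return result;
    }
}
```

With the loop in place, tryAcquire returns false only after it has actually observed null, and the next update that replaces null is the one that calls release.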
This is optimal, but not fair!
• Let’s take a closer look at AQS.acquireXXX
• Thread might jump ahead of the queue
• Good or bad? – depends on the problem being solved
Make it fair (if needed)
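The slide's code is not preserved. One way to get fairness with AbstractQueuedSynchronizer is to refuse the acquire whenever another thread is already queued ahead, using its hasQueuedPredecessors method. The sketch below assumes the same AQS-based holder design as the rest of the deck; the class name FairDataHolder, the ThreadLocal hand-off, and the re-check in takeValue are assumptions.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Hypothetical sketch: a fair variant where takers are served in queue order.
class FairDataHolder<T> {
    private final AtomicReference<T> valueRef = new AtomicReference<>();
    // Carries the value from tryAcquire back to takeValue (assumption of this sketch).
    private final ThreadLocal<T> taken = new ThreadLocal<>();
    private final Sync sync = new Sync();

    private final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int ignored) {
            // Fairness: do not jump ahead of threads that are already waiting.
            if (hasQueuedPredecessors()) return false;
            for (;;) {
                T value = valueRef.get();
                if (value == null) return false;
                if (valueRef.compareAndSet(value, null)) {
                    taken.set(value);
                    return true;
                }
            }
        }

        @Override
        protected boolean tryRelease(int ignored) {
            return true;
        }
    }

    // Unparks a waiter only when the value goes from null to non-null.
    public void updateValue(T newValue) {
        if (valueRef.getAndSet(newValue) == null) sync.release(0);
    }

    public T takeValue() throws InterruptedException {
        sync.acquireInterruptibly(0);
        T result = taken.get();
        taken.remove();
        // Precaution added in this sketch: hand the wakeup on if a value is present.
        if (valueRef.get() != null) sync.release(0);
        return result;
    }
}
```

The price of fairness is that a newly arrived taker may have to queue and park even though a value is available, which means more context switches than the unfair version.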
Conclusion
• Waiting can be implemented in a non-blocking way
• Recap non-blocking: suspension of any thread (on any line of code) cannot
cause suspension of another thread
• Bonus: context switch only when you really need to wait or wake up
• Fairness is an optional aspect of waiting
• AbstractQueuedSynchronizer
• is designed for writing custom lock-like classes
• but can be repurposed as a ready wait-queue impl for other cases
Lock-free programming is extremely
error-prone
Learn the patterns of concurrent code
You shall, young Padawan.
Thank you
Any questions?
Slides are available at www.slideshare.net/elizarov
email me to elizarov at gmail
relizarov
