- MongoDB's concurrency control uses multiple-granularity locking at the instance, database, and collection level. This allows finer-grained locking than previous approaches.
- The storage engine handles concurrency control at lower levels like the document level, using either MVCC or locking depending on the engine. WiredTiger uses MVCC while MMAPv1 uses locking at the collection level.
- Intent locks signal the intention to access lower levels without taking full shared or exclusive locks at the higher levels, which improves concurrency compared with acquiring those locks directly. The lock manager enforces the locking protocol and ensures consistency.
In this document
Overview of the presentation and its audience, including operations engineers, developers, and curious individuals.
Definition of concurrency control involving locking, data consistency, and MVCC (Multi-Version Concurrency Control).
Learning objectives for top-level concurrency control, cooperation with the storage engine, and an outline of the talk.
Description of locking mechanisms in MongoDB 2.0 and 2.2, including top-level locks and database-level locks with intents.
Explanation of intents in MongoDB, their compatibility, and why they are needed to optimize the locking mechanism.
Role of the Lock Manager that ensures protocol adherence with low overhead, fairness, and statistics collection.
Overview of the Storage Engine API and a comparison of how operations run under WiredTiger's MVCC (Multi-Version Concurrency Control).
Discussion of MMAP V1's handling of concurrency through collection-level locking and the roles of intents.
Conclusions on storage engines controlling concurrency, decoupling from top-level locking, and improvements in MMAP V1.
Opportunity for audience questions regarding the topics covered in the presentation.
Additional backup slides covering implementation details such as the Locker API and the lock manager.
38
Conclusion
• Storage engines have direct control over concurrency
– Completely decoupled from top-level locking
– Enabled document-level control
39
Conclusion
• Retrofitted MMAPV1 to use multi-granularity locks
– Support for collection-level locking
– Got rid of the global lock usage for journaling
#2 Thank the audience for coming to MongoDB World
Introduce myself and the team I work on
Talk about what I did before MongoDB (Microsoft SQL Server and then AWS)
Say what I worked on in MongoDB 3.0
Explain what I will be talking about
#3 Explain who this talk is intended for and why it is useful for that particular audience.
Leave the questions for the end. I will also be available in the Ask the Experts area.
#4 What is concurrency control? PAUSE
Locks
Consistent data (ensuring the correct amount of money is in a bank account)
High throughput
#5 In database systems, concurrency control is all of those things, but also more.
Before I tell you what more means, let’s do a quick recap of the logical composition of MongoDB.
#6 You all know the logical parts of the database, but let’s recap.
#7 In MongoDB 3.0:
Call out the top part – the responsibility of MongoDB
Call out the bottom part (the actual documents) – the responsibility of the storage engine
The top and bottom parts talk to each other through the storage engine API
PAUSE
#8 Now that I have highlighted the parts I’ll be talking about…
#11 CHANGE THE PACE OF TALKING
Single “top” lock
Protected both the data and the metadata
#12 Allowed two operations on two different databases to run in parallel (this is for 2.2, 2.4 and 2.6)
Synchronized access to the same database
Question: How is the instance protected?
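A minimal sketch of that database-level model, assuming one reader-writer lock per database (std::shared_mutex is just a stand-in here; the instance-level lock is omitted): writes to different databases run in parallel, while operations on the same database synchronize with each other.

```cpp
// Illustrative only: one reader-writer lock per database, as a stand-in for
// the 2.2-2.6 database-level locking model. The instance-level lock is omitted.
#include <iostream>
#include <map>
#include <shared_mutex>
#include <string>
#include <thread>

std::map<std::string, std::shared_mutex> g_dbLocks;  // one lock per database

void writeTo(const std::string& db) {
    std::unique_lock lk(g_dbLocks.at(db));  // exclusive: writers to the same DB serialize
    std::cout << "writing in " << db << "\n";
}

void readFrom(const std::string& db) {
    std::shared_lock lk(g_dbLocks.at(db));  // shared: readers of the same DB run together
    std::cout << "reading from " << db << "\n";
}

int main() {
    // Create the per-database locks up front (a real server names them dynamically).
    g_dbLocks["sales"];
    g_dbLocks["inventory"];

    // Writes to two different databases can run in parallel ...
    std::thread t1(writeTo, "sales"), t2(writeTo, "inventory");
    t1.join();
    t2.join();

    // ... while a write and a read on the same database synchronize.
    std::thread t3(writeTo, "sales"), t4(readFrom, "sales");
    t3.join();
    t4.join();
}
```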
#13 Before I formally define what intents are, let me give a couple of examples of how they work.
#16 WAIT A FEW SECONDS BEFORE SHOWING THE FIRST BULLET
#17 Let’s look at this hypothetical scenario. It kind of looks like it works.
#18 Now, consider an operation which wants to have the entire server in read-only mode.
PAUSE BEFORE WALKING THROUGH THE ANIMATION
QUESTION: Has it made the server read-only? PAUSE
#19 Now let’s see how intents help make this correct.
#20 At some point the insert operation will complete and the S lock will be granted.
PAUSE BEFORE CONTINUING
#24 For completeness, this is the compatibility matrix between locks and intents.
PAUSE
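A minimal sketch of that compatibility matrix, assuming the usual multi-granularity modes IS, IX, S and X; the helper function below is illustrative, not MongoDB's code. It also replays the earlier scenario: a pending S request waits for a granted IX, and once S is granted, new IX requests wait in turn.

```cpp
// A toy compatibility check for the multi-granularity modes IS, IX, S, X.
// The matrix is the standard one; the code is illustrative, not MongoDB's.
#include <array>
#include <iostream>

enum class LockMode { IS, IX, S, X };

// kCompat[requested][held] == true means the two modes can coexist.
constexpr std::array<std::array<bool, 4>, 4> kCompat = {{
    //           IS     IX     S      X      (already granted)
    /* IS */  {{ true,  true,  true,  false }},
    /* IX */  {{ true,  true,  false, false }},
    /* S  */  {{ true,  false, true,  false }},
    /* X  */  {{ false, false, false, false }},
}};

bool compatible(LockMode requested, LockMode held) {
    return kCompat[static_cast<int>(requested)][static_cast<int>(held)];
}

int main() {
    std::cout << std::boolalpha;

    // The scenario from the earlier slides: an insert holds IX on the
    // instance, and another operation requests S to make it "read-only".
    std::cout << "S compatible with granted IX? "
              << compatible(LockMode::S, LockMode::IX) << "\n";   // false -> S waits

    // Once the insert completes and S is granted, new inserts asking for IX
    // conflict with the granted S and wait, so the server really is read-only.
    std::cout << "IX compatible with granted S? "
              << compatible(LockMode::IX, LockMode::S) << "\n";   // false -> IX waits

    // Two intents never conflict with each other.
    std::cout << "IX compatible with granted IX? "
              << compatible(LockMode::IX, LockMode::IX) << "\n";  // true
}
```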
#25 CHANGE PACE OF TALKING
This was the world up until MongoDB version 2.6. Now, in 3.0 we wanted to add document-level locking.
QUESTION AND PAUSE BEFORE CLICKING: Can you guess what the first thing we did was?
#26 This started to look like multi-granularity locking, which is a very well studied and tested concept from database systems.
Multi-granularity locking requires locks to be acquired in a top-to-bottom fashion.
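A minimal sketch of that top-to-bottom rule, with hypothetical resource names and a tiny Locker type (not MongoDB's classes): a resource may only be locked once its parent in the hierarchy is already locked, at least with an intent.

```cpp
// Illustrative only: a tiny locker that enforces the "parent before child" rule.
#include <cassert>
#include <iostream>
#include <map>
#include <string>

enum class LockMode { IS, IX, S, X };

struct Locker {
    std::map<std::string, LockMode> held;

    void lock(const std::string& parent, const std::string& resource, LockMode mode) {
        // Multi-granularity locking: a resource may only be locked if its
        // parent in the hierarchy is already locked (at least with an intent).
        assert(parent.empty() || held.count(parent) > 0);
        held[resource] = mode;
        std::cout << "locked " << resource << "\n";
    }
};

int main() {
    Locker writer;  // e.g. an insert into sales.orders

    writer.lock("", "GLOBAL", LockMode::IX);                                  // instance first
    writer.lock("GLOBAL", "database:sales", LockMode::IX);                    // then the database
    writer.lock("database:sales", "collection:sales.orders", LockMode::IX);   // then the collection

    // The document level itself is left to a storage engine that supports
    // document-level concurrency control (locking or MVCC).
}
```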
#27 EXPLAIN THE BULLETS
PAUSE A LITTLE BIT: Now let me show you how multi-granularity locking works in the document world.
CHANGE OF PACE
#28 EXPLAIN THE BULLETS
PAUSE A LITTLE BIT: Now let me show you how multi-granularity locking works in the document world.
CHANGE OF PACE
#29 PAUSE A LITTLE BIT AND WALK THROUGH A STORY
STORY: It turned out that adapting a data structure that was never intended to be concurrent is extremely difficult. Documents are not as independent as we thought (indexes).
#30 Instead we did something even better. We scrapped this document-level locking…
#31 … and implemented a storage engine API.
PAUSE A BIT TO LET THAT SINK IN
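A minimal sketch of what such a boundary could look like, with hypothetical names and signatures (this is not MongoDB's actual API): the query and catalog layers only see this interface, and each engine decides for itself how to control concurrent access to documents underneath it.

```cpp
// Hypothetical, simplified storage-engine boundary; names and signatures are
// illustrative only and do not match MongoDB's real interfaces.
#include <cstdint>
#include <string>

// Opaque identifier for a stored document ("record").
struct RecordId {
    int64_t repr = 0;
};

// The "bottom part": the engine owns the documents and decides how to control
// concurrent access to them (document-level locking, MVCC, ...).
class RecordStore {
public:
    virtual ~RecordStore() = default;
    virtual RecordId insertRecord(const std::string& doc) = 0;
    virtual bool findRecord(RecordId id, std::string* out) const = 0;
    virtual void deleteRecord(RecordId id) = 0;
};

// The "top part" (query execution, catalog, replication) talks to the engine
// only through this interface, so it does not care whether the engine uses
// locks or multiple versions underneath.
class StorageEngine {
public:
    virtual ~StorageEngine() = default;
    virtual RecordStore* getRecordStore(const std::string& ns) = 0;
};
```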
#33 Many things can go wrong - such as WT accessing a null pointer.
#34 This showed how top-level concurrency control works with WiredTiger (and any other storage engine that supports document-level locking). How does MMAP V1 work?
Wait a few seconds.
#35 Operation 1 and Operation 2 both access completely different parts of the storage engine.
QUESTION AND PAUSE: Protection against the collection being dropped?
#36 AT THE END – protection against the database being dropped remains exactly the same.
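A minimal sketch of the difference at the collection level, using a hypothetical helper and flag: an engine with document-level concurrency control only needs intents on the collection, while an MMAPv1-style engine is driven to full S/X collection locks, so conflicting operations on the same collection serialize there, and operations on different collections still run in parallel.

```cpp
// Illustrative only: how the collection-level lock mode might differ depending
// on whether the storage engine supports document-level concurrency control.
#include <iostream>

enum class LockMode { IS, IX, S, X };

const char* name(LockMode m) {
    switch (m) {
        case LockMode::IS: return "IS";
        case LockMode::IX: return "IX";
        case LockMode::S:  return "S";
        case LockMode::X:  return "X";
    }
    return "?";
}

// Chooses the collection-level lock mode for an operation.
LockMode collectionMode(bool isWrite, bool engineSupportsDocLocking) {
    if (engineSupportsDocLocking) {
        // Document-level engine: only intents at the collection level; the
        // engine itself isolates concurrent access to individual documents.
        return isWrite ? LockMode::IX : LockMode::IS;
    }
    // MMAPv1-style engine: the collection lock is the finest level, so
    // conflicting access to the same collection serializes here, while
    // operations on different collections still run in parallel.
    return isWrite ? LockMode::X : LockMode::S;
}

int main() {
    std::cout << "WiredTiger-style write: " << name(collectionMode(true, true)) << "\n";   // IX
    std::cout << "MMAPv1-style write:     " << name(collectionMode(true, false)) << "\n";  // X
    std::cout << "MMAPv1-style read:      " << name(collectionMode(false, false)) << "\n"; // S
}
```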
#39 Separated responsibilities for concurrency control between MongoDB and the pluggable storage engine.
#43 This slide zooms in on what the Locker API is and what the state of a particular locker object is at runtime. This locker is for an operation that does some kind of write action (insert/update/delete) against a storage engine that supports document-level locking.
The locker has intent on the GLOBAL resource, intent on the ‘sales’ database resource, and intent on the ‘orders’ collection resource. If there were another thread doing a read, for example, it would locally have similar state, but with IS instead of IX.
So these two threads never need to synchronize at the catalog level and all the concurrency control is left to the storage engine instead.
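A minimal sketch of that locker state, with illustrative names: a writer holding IX on the global, database, and collection resources, and a reader holding IS on the same chain; since intents are mutually compatible, the two never block each other at the catalog level.

```cpp
// Illustrative only: per-operation lock state, mirroring the slide's example.
#include <iostream>
#include <map>
#include <string>

enum class LockMode { IS, IX, S, X };

// Which mode an operation holds on which named resource.
struct Locker {
    std::map<std::string, LockMode> held;
};

// Intent modes never conflict with each other.
bool intentsCompatible(LockMode a, LockMode b) {
    return (a == LockMode::IS || a == LockMode::IX) &&
           (b == LockMode::IS || b == LockMode::IX);
}

int main() {
    Locker writer;  // e.g. an insert into sales.orders
    writer.held = {{"GLOBAL", LockMode::IX},
                   {"database:sales", LockMode::IX},
                   {"collection:sales.orders", LockMode::IX}};

    Locker reader;  // e.g. a query on the same collection
    reader.held = {{"GLOBAL", LockMode::IS},
                   {"database:sales", LockMode::IS},
                   {"collection:sales.orders", LockMode::IS}};

    // At every level the two lockers hold compatible intents, so all of the
    // real isolation between them happens inside the storage engine.
    bool conflict = false;
    for (const auto& [resource, mode] : writer.held) {
        auto it = reader.held.find(resource);
        if (it != reader.held.end() && !intentsCompatible(mode, it->second)) {
            conflict = true;
        }
    }
    std::cout << "Conflict at the catalog level? " << (conflict ? "yes" : "no") << "\n";
}
```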
#44 Because the lock manager uses named resources, and we do not know the names of these resources in advance, it is essentially a chained hash table with an entry for each resource containing the resource’s type and name, as shown in this diagram.
Like any hash table, it contains a fixed (non-growing) array of lock buckets, and each bucket is either empty or contains a linked list of something we call a LockHead. There is a single LockBuckets array for the entire instance and one LockHead per lockable resource.
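A minimal sketch of that layout, assuming a hypothetical lookupOrCreate helper and an arbitrary bucket count of 128: a fixed array of buckets, each holding a chain of LockHead entries, one per named resource. The granted and waiting request queues a real LockHead would carry, as well as synchronization and cleanup, are omitted.

```cpp
// Illustrative only: the chained-hash-table layout described above.
#include <cstddef>
#include <functional>
#include <string>

enum class ResourceType { Global, Database, Collection };

struct ResourceId {
    ResourceType type;
    std::string name;
};

// One LockHead per lockable resource; the granted and waiting request queues
// it would carry are omitted in this sketch.
struct LockHead {
    ResourceId resource;
    LockHead* next = nullptr;  // chaining within a bucket
};

struct LockBucket {
    LockHead* head = nullptr;  // empty, or the start of the chain
};

class LockManager {
public:
    // Finds the LockHead for a named resource, creating it on first use.
    // (Locking of the bucket and cleanup are omitted in this sketch.)
    LockHead* lookupOrCreate(const ResourceId& res) {
        LockBucket& bucket = _buckets[bucketFor(res)];
        for (LockHead* lh = bucket.head; lh != nullptr; lh = lh->next) {
            if (lh->resource.type == res.type && lh->resource.name == res.name) {
                return lh;
            }
        }
        LockHead* lh = new LockHead{res, bucket.head};
        bucket.head = lh;
        return lh;
    }

private:
    static constexpr std::size_t kNumBuckets = 128;  // fixed array, never grows

    std::size_t bucketFor(const ResourceId& res) const {
        return std::hash<std::string>{}(res.name) % kNumBuckets;
    }

    LockBucket _buckets[kNumBuckets];  // one LockBuckets array per instance
};

int main() {
    LockManager lm;
    lm.lookupOrCreate({ResourceType::Database, "sales"});
    lm.lookupOrCreate({ResourceType::Collection, "sales.orders"});
}
```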
#45
----- Meeting Notes (5/21/15 15:35) -----
REMOVE THESE SLIDES ABOUT IMPLEMENTATION