MongoDB Manual Master
Release 3.2.4
MongoDB, Inc.
Contents
1 Introduction to MongoDB 3
1.1 What is MongoDB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Install MongoDB 5
2.1 Supported Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Deprecation of 32-bit Versions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3 Tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.4 Additional Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5 Administration 199
5.1 Administration Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
5.2 Administration Tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
5.3 Administration Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
5.4 Production Checklist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
6 Security 315
6.1 Security Checklist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
6.2 Authentication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
6.3 Role-Based Access Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
6.4 Encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
6.5 Auditing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
6.6 Security Hardening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
6.7 Security Tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
6.8 Security Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
6.9 Additional Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
7 Aggregation 443
7.1 Aggregation Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443
7.2 Map-Reduce . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
7.3 Single Purpose Aggregation Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
7.4 Additional Features and Behaviors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
7.5 Additional Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485
8 Indexes 487
8.1 Index Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
8.2 Index Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492
8.3 Indexing Tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
8.4 Indexing Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579
9 Storage 587
9.1 Storage Engines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587
9.2 Journaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 598
9.3 GridFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 603
9.4 FAQ: MongoDB Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606
10 Replication 613
10.1 Replication Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613
10.2 Replication Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
10.3 Replica Set Tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 655
10.4 Replication Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 708
11 Sharding 725
11.1 Sharding Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 725
11.2 Sharding Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 731
11.3 Sharded Cluster Tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756
11.4 Sharding Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 814
MongoDB Documentation, Release 3.2.4
Note: This version of the PDF does not include the reference section; see the MongoDB Reference Manual [1] for a PDF
edition of all MongoDB reference material.
1 http://docs.mongodb.org/master/MongoDB-reference-manual.pdf
CHAPTER 1
Introduction to MongoDB
On this page
What is MongoDB (page 3)
Welcome to MongoDB. This document provides a brief introduction to MongoDB and some key concepts. See the
installation guides (page 5) for information on downloading and installing MongoDB.
MongoDB is an open-source document database that provides high performance, high availability, and automatic
scaling.
A record in MongoDB is a document, which is a data structure composed of field and value pairs. MongoDB
documents are similar to JSON objects. The values of fields may include other documents, arrays, and arrays of documents.
High Performance
High Availability
To provide high availability, MongoDB's replication facility, called replica sets, provides:
automatic failover.
data redundancy.
A replica set (page 613) is a group of MongoDB servers that maintain the same data set, providing redundancy and
increasing data availability.
Automatic Scaling
Install MongoDB
On this page
Supported Platforms (page 5)
Deprecation of 32-bit Versions (page 5)
Tutorials (page 6)
Additional Resources (page 59)
Changed in version 3.2: Starting in MongoDB 3.2, 32-bit binaries are deprecated and will be unavailable in future
releases.
Changed in version 3.0: Commercial support is no longer provided for MongoDB on 32-bit platforms (Linux and
Windows). See Platform Support (page 944).
In addition, the 32-bit versions of MongoDB have the following limitations:
2.3 Tutorials
Install on Linux (page 6) Install MongoDB Community Edition and required dependencies on Linux.
Install on OS X (page 26) Install MongoDB Community Edition on OS X systems from Homebrew packages or from
MongoDB archives.
Install on Windows (page 28) Install MongoDB Community Edition on Windows systems and optionally start
MongoDB as a Windows service.
Install on Linux (page 34) Install the official builds of MongoDB Enterprise on Linux-based systems.
Install on OS X (page 51) Install the official build of MongoDB Enterprise on OS X.
Install on Windows (page 52) Install MongoDB Enterprise on Windows using the .msi installer.
On this page
Recommended (page 7)
Manual Installation (page 7)
1 http://blog.mongodb.org/post/137788967/32-bit-limitations
These documents provide instructions to install MongoDB Community Edition for various Linux systems.
Note: Starting in MongoDB 3.2, 32-bit binaries are deprecated and will be unavailable in future releases.
Recommended For the best installation experience, MongoDB provides packages for popular Linux distributions.
These packages, which support specific platforms and provide improved performance and TLS/SSL support, are the
preferred way to run MongoDB. The following guides detail the installation process for these systems:
Install on Red Hat (page 7) Install MongoDB Community Edition on Red Hat Enterprise and related Linux systems
using .rpm packages.
Install on SUSE (page 11) Install MongoDB Community Edition on SUSE Linux systems using .rpm packages.
Install on Amazon Linux (page 14) Install MongoDB Community Edition on Amazon Linux systems using .rpm
packages.
Install on Ubuntu (page 17) Install MongoDB Community Edition on Ubuntu Linux systems using .deb packages.
Install on Debian (page 20) Install MongoDB Community Edition on Debian systems using .deb packages.
For systems without supported packages, refer to the Manual Installation tutorial.
Manual Installation For Linux systems without supported packages, MongoDB provides a generic Linux release.
These versions of MongoDB do not include TLS/SSL, and may not perform as well as the targeted packages, but are
compatible with most contemporary Linux systems. See the following guides for installation:
Install MongoDB From Tarball (page 23) Install the official build of MongoDB Community Edition on other Linux
systems from MongoDB archives.
Overview Use this tutorial to install MongoDB Community Edition on Red Hat Enterprise Linux or CentOS Linux
versions 6 and 7 using .rpm packages. While some of these distributions include their own MongoDB packages, the
official MongoDB Community Edition packages are generally more up to date.
Platform Support
This installation guide only supports 64-bit systems. See Platform Support (page 944) for details.
MongoDB 3.2 deprecates support for Red Hat Enterprise Linux 5.
Packages MongoDB provides officially supported packages in their own repository. This repository contains the
following packages:
mongodb-org A metapackage that will automatically install the four component packages listed below.
mongodb-org-server Contains the mongod daemon and associated configuration and init scripts.
mongodb-org-mongos Contains the mongos daemon.
mongodb-org-shell Contains the mongo shell.
mongodb-org-tools Contains the following MongoDB tools: mongoimport, bsondump, mongodump, mongoexport, mongofiles, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.
The default /etc/mongod.conf configuration file supplied by the packages has bind_ip set to 127.0.0.1.
Modify this setting as needed for your environment before initializing a replica set.
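Before changing the bind address, it can help to see what the installed file currently contains. The following is a minimal sketch, not part of the packages: the function name show_bind_ip is hypothetical, and the pattern assumes the setting appears either under the legacy bind_ip key or under the YAML bindIp key, depending on the configuration file format in use.

```shell
#!/usr/bin/env bash
# Print the bind-address line from a mongod configuration file.
# Matches the legacy "bind_ip" key and the YAML "bindIp" key.
# The path is a parameter so the function can be tried on a copy first.
show_bind_ip() {
    local conf="${1:-/etc/mongod.conf}"
    grep -E '^[[:space:]]*(bind_ip|bindIp)[[:space:]]*[:=]' "$conf"
}
```

Calling show_bind_ip with no argument reads /etc/mongod.conf. The function exits with a non-zero status when neither key is present.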
Init Scripts The mongodb-org package includes various init scripts, including the init script
/etc/rc.d/init.d/mongod. You can use these scripts to stop, start, and restart daemon processes.
The package configures MongoDB using the /etc/mongod.conf file in conjunction with the init scripts. See the
Configuration File reference for documentation of settings available in the configuration file.
As of version 3.2.4, there are no init scripts for mongos. The mongos process is used only in sharding (page 731).
You can use the mongod init script to derive your own mongos init script for use in such environments. See the
mongos reference for configuration details.
For the latest stable release of MongoDB Use the following repository file:
[mongodb-org-3.2]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/3.2/x86_64/
gpgcheck=0
enabled=1
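One way to create this repository file from a script is sketched below. The function name write_mongodb_repo is hypothetical, and the default path /etc/yum.repos.d/mongodb-org-3.2.repo is an assumption made by analogy with the 2.6 file name used later in this step. Writing under /etc requires root privileges.

```shell
#!/usr/bin/env bash
# Write the MongoDB 3.2 yum repository definition shown above.
# The target path is a parameter so the output can be checked in a
# scratch file first; the default path is an assumption.
write_mongodb_repo() {
    local target="${1:-/etc/yum.repos.d/mongodb-org-3.2.repo}"
    # The quoted heredoc delimiter keeps $releasever literal, so that
    # yum (not the shell) expands it.
    cat > "$target" <<'EOF'
[mongodb-org-3.2]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/3.2/x86_64/
gpgcheck=0
enabled=1
EOF
}
```

For a dry run, pass a temporary path such as /tmp/mongodb-org-3.2.repo and inspect the result before writing to the yum directory.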
For versions of MongoDB earlier than 3.0 To install the packages from an earlier release series (page 1061), such
as 2.4 or 2.6, you can specify the release series in the repository configuration. For example, to restrict your system
to the 2.6 release series, create a /etc/yum.repos.d/mongodb-org-2.6.repo file to hold the following
configuration information for the MongoDB 2.6 repository:
[mongodb-org-2.6]
name=MongoDB 2.6 Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
gpgcheck=0
enabled=1
2 https://docs.mongodb.org/v3.0/tutorial/install-mongodb-on-red-hat/
You can find .repo files for each release in the repository itself [3]. Remember that odd-numbered minor release
versions (e.g. 2.5) are development versions and are unsuitable for production use.
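The odd/even rule can be checked mechanically before a release series is written into a repository file. This is only an illustrative sketch; the function name is_development_series is hypothetical, and it assumes the series is given in major.minor or major.minor.patch form.

```shell
#!/usr/bin/env bash
# Succeed when the minor number of a release series such as "2.5" is
# odd, which marks a development series under the rule described above.
is_development_series() {
    local series="$1"
    local minor="${series#*.}"   # drop the major number and the first dot
    minor="${minor%%.*}"         # drop a patch component, if present
    [ $(( minor % 2 )) -eq 1 ]
}
```

For example, is_development_series 2.5 succeeds, while is_development_series 2.6 and is_development_series 3.2.4 fail.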
Step 2: Install the MongoDB packages and associated tools. When you install the packages, you choose whether
to install the current release or a previous one. This step provides the commands for both.
To install the latest stable version of MongoDB, issue the following command:
sudo yum install -y mongodb-org
To install a specific release of MongoDB, specify each component package individually and append the version number
to the package name, as in the following example:
sudo yum install -y mongodb-org-3.2.4 mongodb-org-server-3.2.4 mongodb-org-shell-3.2.4 mongodb-org-mo
You can specify any available version of MongoDB. However, yum will upgrade the packages when a newer version
becomes available. To prevent unintended upgrades, pin the package. To pin a package, add the following exclude
directive to your /etc/yum.conf file:
exclude=mongodb-org,mongodb-org-server,mongodb-org-shell,mongodb-org-mongos,mongodb-org-tools
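The directive can also be added from a script. The sketch below appends it only when no exclude line mentioning mongodb-org exists yet, so running it twice does not duplicate the entry. The function name pin_mongodb_packages is hypothetical. If the file already carries an exclude line for other packages, merging the lists by hand is the safer route.

```shell
#!/usr/bin/env bash
# Append the exclude directive shown above to a yum configuration file,
# unless an exclude line already mentions mongodb-org.
pin_mongodb_packages() {
    local conf="${1:-/etc/yum.conf}"
    local directive='exclude=mongodb-org,mongodb-org-server,mongodb-org-shell,mongodb-org-mongos,mongodb-org-tools'
    if grep -q '^exclude=.*mongodb-org' "$conf" 2>/dev/null; then
        echo "already pinned in $conf"
    else
        printf '%s\n' "$directive" >> "$conf"
        echo "pinned in $conf"
    fi
}
```

Editing /etc/yum.conf requires root privileges; passing a copy of the file as the first argument allows a trial run.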
Prerequisites
Configure SELinux
Important: You must configure SELinux to allow MongoDB to start on Red Hat Linux-based systems (Red Hat
Enterprise Linux or CentOS Linux).
Note: All three options require root privileges. The first two options each require a system reboot and may have
larger implications for your deployment.
Note: You can use setenforce to change to permissive mode; this method does not require a reboot but is
not persistent.
Enable access to the relevant ports (e.g. 27017) for SELinux if in enforcing mode. See
https://docs.mongodb.org/manual/reference/default-mongodb-port for more information
on MongoDB's default ports. For default settings, this can be accomplished by running
semanage port -a -t mongod_port_t -p tcp 27017
3 https://repo.mongodb.org/yum/redhat/
Warning: On RHEL 7.0, if you change the data path, the default SELinux policies will prevent mongod
from having write access on the new data path if you do not change the security context.
You may alternatively choose not to install the SELinux packages when you are installing your Linux operating system,
or choose to remove the relevant packages. This option is the most invasive and is not recommended.
Data Directories and Permissions
Warning: On RHEL 7.0, if you change the data path, the default SELinux policies will prevent mongod from
having write access on the new data path if you do not change the security context.
The MongoDB instance stores its data files in /var/lib/mongo and its log files in /var/log/mongodb
by default, and runs using the mongod user account. You can specify alternate log and data file directories in
/etc/mongod.conf. See systemLog.path and storage.dbPath for additional information.
If you change the user that runs the MongoDB process, you must modify the access control rights to the
/var/lib/mongo and /var/log/mongodb directories to give this user access to these directories.
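A quick ownership check can show which directories still need attention after such a change. The sketch below is one possible approach, treating directory ownership as the access control in question; the function name dirs_not_owned_by is hypothetical, and stat -c is the GNU coreutils form found on Linux.

```shell
#!/usr/bin/env bash
# Print every listed directory whose owner differs from the expected
# user. Prints nothing when all directories are owned by that user.
dirs_not_owned_by() {
    local user="$1"; shift
    local dir owner
    for dir in "$@"; do
        owner=$(stat -c '%U' "$dir") || return 1
        [ "$owner" = "$user" ] || echo "$dir (owned by $owner)"
    done
}
```

For the default layout this would be called as dirs_not_owned_by mongod /var/lib/mongo /var/log/mongodb. Any directory it reports can then be handed to the new user, for instance with chown -R run as root.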
Procedure
Step 1: Start MongoDB. You can start the mongod process by issuing the following command:
sudo service mongod start
Step 2: Verify that MongoDB has started successfully. You can verify that the mongod process has started
successfully by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading
[initandlisten] waiting for connections on port <port>
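The log check can be scripted with grep. This is a minimal sketch that looks only for the line quoted above; the function name mongod_is_listening is hypothetical, and reading the log may require elevated privileges on some systems.

```shell
#!/usr/bin/env bash
# Succeed when the log file contains the startup line quoted above,
# with any numeric port.
mongod_is_listening() {
    local log="${1:-/var/log/mongodb/mongod.log}"
    grep -q '\[initandlisten\] waiting for connections on port [0-9][0-9]*' "$log"
}
```

A typical use is mongod_is_listening && echo "mongod is accepting connections". Note that the log may also hold matching lines from earlier starts, so the check is most meaningful right after a fresh start.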
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:
sudo service mongod stop
Step 4: Restart MongoDB. You can restart the mongod process by issuing the following command:
sudo service mongod restart
You can follow the state of the process for errors or important messages by watching the output in the
/var/log/mongodb/mongod.log file.
Step 5: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, use the service command shown in Step 3. Pressing Control+C applies only when mongod was started directly in a terminal rather than as a service.
Uninstall MongoDB Community Edition To completely remove MongoDB from a system, you must remove the
MongoDB applications themselves, the configuration files, and any directories containing data and logs. The following
section guides you through the necessary steps.
Warning: This process will completely remove MongoDB, its configuration, and all databases. This process is
not reversible, so ensure that all of your configuration and data is backed up before proceeding.
Step 1: Stop MongoDB. Stop the mongod process by issuing the following command:
sudo service mongod stop
Step 2: Remove Packages. Remove any MongoDB packages that you had previously installed.
sudo yum erase $(rpm -qa | grep mongodb-org)
Step 3: Remove Data Directories. Remove MongoDB databases and log files.
sudo rm -r /var/log/mongodb
sudo rm -r /var/lib/mongo
Install MongoDB Community Edition on SUSE
On this page
Overview (page 11)
Packages (page 11)
Init Scripts (page 12)
Install MongoDB Community Edition (page 12)
Run MongoDB Community Edition (page 13)
Uninstall MongoDB Community Edition (page 14)
Overview Use this tutorial to install MongoDB Community Edition on SUSE Linux from .rpm packages. While
SUSE distributions include their own MongoDB Community Edition packages, the official MongoDB Community
Edition packages are generally more up to date.
Platform Support
This installation guide only supports 64-bit systems. See Platform Support (page 944) for details.
Packages MongoDB provides officially supported packages in their own repository. This repository contains the
following packages:
mongodb-org A metapackage that will automatically install the four component packages listed below.
mongodb-org-server Contains the mongod daemon and associated configuration and init scripts.
mongodb-org-mongos Contains the mongos daemon.
mongodb-org-shell Contains the mongo shell.
mongodb-org-tools Contains the following MongoDB tools: mongoimport, bsondump, mongodump, mongoexport, mongofiles, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.
These packages conflict with the mongodb, mongodb-server, and mongodb-clients packages provided by
Ubuntu.
The default /etc/mongod.conf configuration file supplied by the packages has bind_ip set to 127.0.0.1.
Modify this setting as needed for your environment before initializing a replica set.
Init Scripts The mongodb-org package includes various init scripts, including the init script
/etc/rc.d/init.d/mongod. You can use these scripts to stop, start, and restart daemon processes.
The package configures MongoDB using the /etc/mongod.conf file in conjunction with the init scripts. See the
Configuration File reference for documentation of settings available in the configuration file.
As of version 3.2.4, there are no init scripts for mongos. The mongos process is used only in sharding (page 731).
You can use the mongod init script to derive your own mongos init script for use in such environments. See the
mongos reference for configuration details.
Note: SUSE Linux Enterprise Server and potentially other SUSE distributions ship with virtual memory address
space limited to 8 GB by default. You must adjust this in order to prevent virtual memory allocation failures as the
database grows.
The SLES packages for MongoDB adjust these limits in the default scripts, but you will need to make this change
manually if you are using custom scripts and/or the tarball release rather than the SLES packages.
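When using custom scripts or the tarball release, the current limit can be inspected before starting mongod. The sketch below classifies the value reported by the shell's ulimit -v builtin, which bash prints in kilobytes or as the word unlimited. The function name check_vmem_limit is hypothetical, and treating unlimited as the desired value is an assumption rather than something this section specifies.

```shell
#!/usr/bin/env bash
# Classify a virtual memory limit as reported by `ulimit -v`
# (kilobytes, or the word "unlimited"). With no argument, the limit of
# the current shell is used.
check_vmem_limit() {
    local limit="${1:-$(ulimit -v)}"
    if [ "$limit" = "unlimited" ]; then
        echo "ok: virtual memory is unlimited"
    else
        echo "warning: virtual memory limited to ${limit} kB"
    fi
}
```

Run check_vmem_limit as the account that will start mongod, since limits are applied per user and per session.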
Step 1: Configure the package management system (zypper). Add the repository so that you can install
MongoDB using zypper.
Changed in version 3.0: MongoDB Linux packages are in a new repository beginning with 3.0.
For the latest stable release of MongoDB Use the following command:
sudo zypper addrepo --no-gpgcheck https://repo.mongodb.org/zypper/suse/$(sed -rn 's/VERSION=.*([0-9]{
For versions of MongoDB earlier than 3.2 To install MongoDB packages from a previous release series
(page 1061), such as 3.0, you can specify the release series in the repository configuration. For example, to restrict
your SUSE 11 system to the 3.0 release series, use the following command:
sudo zypper addrepo --no-gpgcheck https://repo.mongodb.org/zypper/suse/11/mongodb-org/3.0/x86_64/ mon
Step 2: Install the MongoDB packages and associated tools. When you install the packages, you choose whether
to install the current release or a previous one. This step provides the commands for both.
To install the latest stable version of MongoDB, issue the following command:
sudo zypper -n install mongodb-org
4 https://docs.mongodb.org/v3.0/tutorial/install-mongodb-on-suse/
To install a specific release of MongoDB, specify each component package individually and append the version number
to the package name, as in the following example:
sudo zypper install mongodb-org-3.2.4 mongodb-org-server-3.2.4 mongodb-org-shell-3.2.4 mongodb-org-mo
You can specify any available version of MongoDB. However, zypper will upgrade the packages when a newer
version becomes available. To prevent unintended upgrades, pin the packages by running the following command:
sudo zypper addlock mongodb-org-3.2.4 mongodb-org-server-3.2.4 mongodb-org-shell-3.2.4 mongodb-org-mo
Previous versions of MongoDB packages use a different repository location. Refer to the version of the documentation
appropriate for your MongoDB version.
Prerequisites The MongoDB instance stores its data files in /var/lib/mongo and its log files in
/var/log/mongodb by default, and runs using the mongod user account. You can specify alternate log and
data file directories in /etc/mongod.conf. See systemLog.path and storage.dbPath for additional
information.
If you change the user that runs the MongoDB process, you must modify the access control rights to the
/var/lib/mongo and /var/log/mongodb directories to give this user access to these directories.
Procedure
Step 1: Start MongoDB. You can start the mongod process by issuing the following command:
sudo service mongod start
Step 2: Verify that MongoDB has started successfully. You can verify that the mongod process has started
successfully by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading
[initandlisten] waiting for connections on port <port>
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:
sudo service mongod stop
Step 4: Restart MongoDB. You can restart the mongod process by issuing the following command:
sudo service mongod restart
You can follow the state of the process for errors or important messages by watching the output in the
/var/log/mongodb/mongod.log file.
Step 5: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, use the service command shown in Step 3. Pressing Control+C applies only when mongod was started directly in a terminal rather than as a service.
Uninstall MongoDB Community Edition To completely remove MongoDB from a system, you must remove the
MongoDB applications themselves, the configuration files, and any directories containing data and logs. The following
section guides you through the necessary steps.
Warning: This process will completely remove MongoDB, its configuration, and all databases. This process is
not reversible, so ensure that all of your configuration and data is backed up before proceeding.
Step 1: Stop MongoDB. Stop the mongod process by issuing the following command:
sudo service mongod stop
Step 2: Remove Packages. Remove any MongoDB packages that you had previously installed.
sudo zypper remove $(rpm -qa | grep mongodb-org)
Step 3: Remove Data Directories. Remove MongoDB databases and log files.
sudo rm -r /var/log/mongodb
sudo rm -r /var/lib/mongo
Install MongoDB Community Edition on Amazon Linux
On this page
Overview (page 14)
Packages (page 14)
Init Scripts (page 15)
Install MongoDB Community Edition (page 15)
Run MongoDB Community Edition (page 16)
Uninstall MongoDB Community Edition (page 17)
Overview Use this tutorial to install MongoDB Community Edition on Amazon Linux from .rpm packages.
This installation guide only supports 64-bit systems. See Platform Support (page 944) for details.
Packages MongoDB provides officially supported packages in their own repository. This repository contains the
following packages:
mongodb-org A metapackage that will automatically install the four component packages listed below.
mongodb-org-server Contains the mongod daemon and associated configuration and init scripts.
mongodb-org-mongos Contains the mongos daemon.
mongodb-org-shell Contains the mongo shell.
mongodb-org-tools Contains the following MongoDB tools: mongoimport, bsondump, mongodump, mongoexport, mongofiles, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.
The default /etc/mongod.conf configuration file supplied by the packages has bind_ip set to 127.0.0.1.
Modify this setting as needed for your environment before initializing a replica set.
Init Scripts The mongodb-org package includes various init scripts, including the init script
/etc/rc.d/init.d/mongod. You can use these scripts to stop, start, and restart daemon processes.
The package configures MongoDB using the /etc/mongod.conf file in conjunction with the init scripts. See the
Configuration File reference for documentation of settings available in the configuration file.
As of version 3.2.4, there are no init scripts for mongos. The mongos process is used only in sharding (page 731).
You can use the mongod init script to derive your own mongos init script for use in such environments. See the
mongos reference for configuration details.
For the latest stable release of MongoDB Use the following repository file:
[mongodb-org-3.2]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/amazon/2013.03/mongodb-org/3.2/x86_64/
gpgcheck=0
enabled=1
For versions of MongoDB earlier than 3.0 To install the packages from an earlier release series (page 1061), such
as 2.4 or 2.6, you can specify the release series in the repository configuration. For example, to restrict your system
to the 2.6 release series, create a /etc/yum.repos.d/mongodb-org-2.6.repo file to hold the following
configuration information for the MongoDB 2.6 repository:
[mongodb-org-2.6]
name=MongoDB 2.6 Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
gpgcheck=0
enabled=1
You can find .repo files for each release in the repository itself [6]. Remember that odd-numbered minor release
versions (e.g. 2.5) are development versions and are unsuitable for production use.
Step 2: Install the MongoDB packages and associated tools. When you install the packages, you choose whether
to install the current release or a previous one. This step provides the commands for both.
To install the latest stable version of MongoDB, issue the following command:
sudo yum install -y mongodb-org
5 https://docs.mongodb.org/v3.0/tutorial/install-mongodb-on-amazon/
6 https://repo.mongodb.org/yum/amazon/
To install a specific release of MongoDB, specify each component package individually and append the version number
to the package name, as in the following example:
sudo yum install -y mongodb-org-3.2.4 mongodb-org-server-3.2.4 mongodb-org-shell-3.2.4 mongodb-org-mo
You can specify any available version of MongoDB. However, yum will upgrade the packages when a newer version
becomes available. To prevent unintended upgrades, pin the package. To pin a package, add the following exclude
directive to your /etc/yum.conf file:
exclude=mongodb-org,mongodb-org-server,mongodb-org-shell,mongodb-org-mongos,mongodb-org-tools
Run MongoDB Community Edition The MongoDB instance stores its data files in /var/lib/mongo and its
log files in /var/log/mongodb by default, and runs using the mongod user account. You can specify alternate log
and data file directories in /etc/mongod.conf. See systemLog.path and storage.dbPath for additional
information.
If you change the user that runs the MongoDB process, you must modify the access control rights to the
/var/lib/mongo and /var/log/mongodb directories to give this user access to these directories.
Step 1: Start MongoDB. You can start the mongod process by issuing the following command:
sudo service mongod start
Step 2: Verify that MongoDB has started successfully. You can verify that the mongod process has started
successfully by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading
[initandlisten] waiting for connections on port <port>
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:
sudo service mongod stop
Step 4: Restart MongoDB. You can restart the mongod process by issuing the following command:
sudo service mongod restart
You can follow the state of the process for errors or important messages by watching the output in the
/var/log/mongodb/mongod.log file.
Step 5: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, use the service command shown in Step 3. Pressing Control+C applies only when mongod was started directly in a terminal rather than as a service.
Uninstall MongoDB Community Edition To completely remove MongoDB from a system, you must remove the
MongoDB applications themselves, the configuration files, and any directories containing data and logs. The following
section guides you through the necessary steps.
Warning: This process will completely remove MongoDB, its configuration, and all databases. This process is
not reversible, so ensure that all of your configuration and data is backed up before proceeding.
Step 1: Stop MongoDB. Stop the mongod process by issuing the following command:
sudo service mongod stop
Step 2: Remove Packages. Remove any MongoDB packages that you had previously installed.
sudo yum erase $(rpm -qa | grep mongodb-org)
Step 3: Remove Data Directories. Remove MongoDB databases and log files.
sudo rm -r /var/log/mongodb
sudo rm -r /var/lib/mongo
Install MongoDB Community Edition on Ubuntu
On this page
Overview (page 17)
Packages (page 17)
Init Scripts (page 18)
Install MongoDB Community Edition (page 18)
Run MongoDB Community Edition (page 19)
Uninstall MongoDB Community Edition (page 20)
Overview Use this tutorial to install MongoDB Community Edition on LTS Ubuntu Linux systems from .deb
packages. While Ubuntu includes its own MongoDB packages, the official MongoDB Community Edition packages
are generally more up-to-date.
Platform Support
MongoDB only provides packages for 64-bit long-term support Ubuntu releases. Currently, this means 12.04 LTS
(Precise Pangolin) and 14.04 LTS (Trusty Tahr). While the packages may work with other Ubuntu releases, this is not
a supported configuration.
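A script can guard against unsupported targets before adding the repository. The following sketch encodes only the two releases named above, using their codenames precise (12.04) and trusty (14.04), and requires a 64-bit x86 architecture string. The function name is_supported_ubuntu is hypothetical.

```shell
#!/usr/bin/env bash
# Succeed only for a 64-bit system running one of the two LTS releases
# named above. Arguments: release codename, machine architecture.
is_supported_ubuntu() {
    local codename="$1" arch="$2"
    [ "$arch" = "x86_64" ] || return 1
    case "$codename" in
        precise|trusty) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a live system the arguments can be taken from standard tools, for example is_supported_ubuntu "$(lsb_release -sc)" "$(uname -m)".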
Packages MongoDB provides officially supported packages in their own repository. This repository contains the
following packages:
mongodb-org A metapackage that will automatically install the four component packages listed below.
mongodb-org-server Contains the mongod daemon and associated configuration and init scripts.
mongodb-org-mongos Contains the mongos daemon.
mongodb-org-shell Contains the mongo shell.
mongodb-org-tools Contains the following MongoDB tools: mongoimport, bsondump, mongodump, mongoexport, mongofiles, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.
These packages conflict with the mongodb, mongodb-server, and mongodb-clients packages provided by
Ubuntu.
The default /etc/mongod.conf configuration file supplied by the packages has bind_ip set to 127.0.0.1.
Modify this setting as needed for your environment before initializing a replica set.
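To see which address the packaged configuration binds to before you change it, a quick check along these lines may help. This is only a sketch: show_bind_setting is a hypothetical helper name, and the pattern matches both a bind_ip key and a YAML-style bindIp key on the assumption that either spelling may appear depending on the configuration file format in use.

```shell
# Print the line(s) that set the bind address in a mongod configuration file.
# show_bind_setting is a hypothetical helper name, not part of MongoDB.
show_bind_setting() {
  conf="${1:-/etc/mongod.conf}"
  # Match either a "bind_ip" key or a YAML-style "bindIp" key (assumption).
  grep -nE 'bind_ip|bindIp' "$conf"
}

# Example: inspect the packaged default, if it exists on this system.
if [ -f /etc/mongod.conf ]; then
  show_bind_setting /etc/mongod.conf || echo "no bind setting found"
fi
```

The line number in the output shows where to edit the file; restart mongod after changing the value.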
Init Scripts The mongodb-org package includes various init scripts, including the init script
/etc/init.d/mongod. You can use these scripts to stop, start, and restart daemon processes.
The package configures MongoDB using the /etc/mongod.conf file in conjunction with the init scripts. See the
Configuration File reference for documentation of settings available in the configuration file.
As of version 3.2.4, there are no init scripts for mongos. The mongos process is used only in sharding (page 731).
You can use the mongod init script to derive your own mongos init script for use in such environments. See the
mongos reference for configuration details.
Step 1: Import the public key used by the package management system. The Ubuntu package management tools
(i.e. dpkg and apt) ensure package consistency and authenticity by requiring that distributors sign packages with
GPG keys. Issue the following command to import the MongoDB public GPG Key [8]:
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
Ubuntu 14.04
echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.2 multiverse" | sudo tee /etc/apt/s
Step 3: Reload local package database. Issue the following command to reload the local package database:
sudo apt-get update
Step 4: Install the MongoDB packages. You can install either the latest stable version of MongoDB or a specific
version of MongoDB.
7 https://docs.mongodb.org/v3.0/tutorial/install-mongodb-on-ubuntu/
8 https://www.mongodb.org/static/pgp/server-3.2.asc
Install the latest stable version of MongoDB. Issue the following command:
sudo apt-get install -y mongodb-org
Install a specific release of MongoDB. To install a specific release, you must specify each component package
individually along with the version number, as in the following example:
sudo apt-get install -y mongodb-org=3.2.4 mongodb-org-server=3.2.4 mongodb-org-shell=3.2.4 mongodb-or
If you only install mongodb-org=3.2.4 and do not include the component packages, the latest version of each
MongoDB package will be installed regardless of what version you specified.
Pin a specific version of MongoDB. Although you can specify any available version of MongoDB, apt-get will
upgrade the packages when a newer version becomes available. To prevent unintended upgrades, pin the package. To
pin the version of MongoDB at the currently installed version, issue the following command sequence:
echo "mongodb-org hold" | sudo dpkg --set-selections
echo "mongodb-org-server hold" | sudo dpkg --set-selections
echo "mongodb-org-shell hold" | sudo dpkg --set-selections
echo "mongodb-org-mongos hold" | sudo dpkg --set-selections
echo "mongodb-org-tools hold" | sudo dpkg --set-selections
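The five commands above differ only in the package name, so the same selections can be generated in a loop. The following is a minimal sketch, assuming the package names from the Packages list; hold_selections is a hypothetical helper name. It only prints the selection lines, and the comments show how they could be applied and checked.

```shell
# Emit one "<package> hold" selection line per MongoDB package.
# hold_selections is a hypothetical helper name.
hold_selections() {
  for pkg in mongodb-org mongodb-org-server mongodb-org-shell \
             mongodb-org-mongos mongodb-org-tools; do
    echo "$pkg hold"
  done
}

# Preview the lines. To apply them, pipe the output to dpkg:
#   hold_selections | sudo dpkg --set-selections
# and confirm afterwards with:
#   dpkg --get-selections | grep mongodb-org
hold_selections
```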
Run MongoDB Community Edition The MongoDB instance stores its data files in /var/lib/mongodb and
its log files in /var/log/mongodb by default, and runs using the mongodb user account. You can specify alternate
log and data file directories in /etc/mongod.conf. See systemLog.path and storage.dbPath for
additional information.
If you change the user that runs the MongoDB process, you must modify the access control rights to the
/var/lib/mongodb and /var/log/mongodb directories to give this user access to these directories.
Step 2: Verify that MongoDB has started successfully. Verify that the mongod process has started successfully
by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading
[initandlisten] waiting for connections on port <port>
where <port> is the port configured in /etc/mongod.conf, 27017 by default.
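The log check can be scripted rather than done by eye. The sketch below assumes the default log path given above; mongod_is_listening is a hypothetical helper name. Note that the log may still contain the message from an earlier run, so treat a match as a hint rather than proof that the current process is up.

```shell
# Return success if the given mongod log contains the startup message.
# mongod_is_listening is a hypothetical helper name.
mongod_is_listening() {
  log="${1:-/var/log/mongodb/mongod.log}"
  grep -q 'waiting for connections on port' "$log" 2>/dev/null
}

if mongod_is_listening; then
  echo "mongod reports that it is accepting connections"
else
  echo "startup message not found yet; inspect the log for errors"
fi
```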
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:
sudo service mongod stop
Step 5: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, press Control+C in the terminal where the mongod instance is running.
Uninstall MongoDB Community Edition To completely remove MongoDB from a system, you must remove the
MongoDB applications themselves, the configuration files, and any directories containing data and logs. The following
section guides you through the necessary steps.
Warning: This process will completely remove MongoDB, its configuration, and all databases. This process is
not reversible, so ensure that all of your configuration and data are backed up before proceeding.
Step 1: Stop MongoDB. Stop the mongod process by issuing the following command:
sudo service mongod stop
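The warning above asks for a backup before anything is removed. With mongod stopped, one way to sketch that is a file-level archive of the data directory. This is an illustration under stated assumptions, not an official backup procedure: backup_dir is a hypothetical helper name and /root/backups is a hypothetical destination.

```shell
# Archive a directory into a timestamped .tgz file and print the archive path.
# backup_dir is a hypothetical helper; arguments: <source dir> <destination dir>.
backup_dir() {
  src="$1"
  dest="$2"
  mkdir -p "$dest" || return 1
  archive="$dest/$(basename "$src")-$(date +%Y%m%d%H%M%S).tgz"
  # -C switches to the parent directory so the archive holds a relative path.
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" && echo "$archive"
}

# Example (run with sufficient privileges, after mongod has stopped):
#   backup_dir /var/lib/mongodb /root/backups
#   cp /etc/mongod.conf /root/backups/
```

Listing the archive with tar -tzf before continuing is a cheap way to confirm that it contains the expected files.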
Step 2: Remove Packages. Remove any MongoDB packages that you had previously installed.
sudo apt-get purge 'mongodb-org*'
Step 3: Remove Data Directories. Remove MongoDB databases and log files.
sudo rm -r /var/log/mongodb
sudo rm -r /var/lib/mongodb
Install MongoDB Community Edition on Debian
On this page
Overview (page 20)
Packages (page 20)
Init Scripts (page 21)
Install MongoDB Community Edition (page 21)
Run MongoDB Community Edition (page 22)
Uninstall MongoDB Community Edition (page 23)
Overview Use this tutorial to install MongoDB Community Edition from .deb packages on Debian 7 Wheezy.
While Debian includes its own MongoDB packages, the official MongoDB Community Edition packages are more up
to date.
MongoDB only provides packages for 64-bit Debian Wheezy. These packages may work with other Debian releases,
but this is not a supported configuration.
Packages MongoDB provides officially supported packages in their own repository. This repository contains the
following packages:
mongodb-org          A metapackage that will automatically install the four component packages listed below.
mongodb-org-server   Contains the mongod daemon and associated configuration and init scripts.
mongodb-org-mongos   Contains the mongos daemon.
mongodb-org-shell    Contains the mongo shell.
mongodb-org-tools    Contains the following MongoDB tools: mongoimport, bsondump, mongodump, mongoexport, mongofiles, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.
These packages conflict with the mongodb, mongodb-server, and mongodb-clients packages provided by
Debian.
The default /etc/mongod.conf configuration file supplied by the packages has bind_ip set to 127.0.0.1.
Modify this setting as needed for your environment before initializing a replica set.
Init Scripts The mongodb-org package includes various init scripts, including the init script
/etc/init.d/mongod. You can use these scripts to stop, start, and restart daemon processes.
The package configures MongoDB using the /etc/mongod.conf file in conjunction with the init scripts. See the
Configuration File reference for documentation of settings available in the configuration file.
As of version 3.2.4, there are no init scripts for mongos. The mongos process is used only in sharding (page 731).
You can use the mongod init script to derive your own mongos init script for use in such environments. See the
mongos reference for configuration details.
This installation guide only supports 64-bit systems. See Platform Support (page 944) for details.
Step 1: Import the public key used by the package management system. The Debian package management tools
(i.e. dpkg and apt) ensure package consistency and authenticity by requiring that distributors sign packages with
GPG keys. Issue the following command to import the MongoDB public GPG Key [10]:
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
Step 3: Reload local package database. Issue the following command to reload the local package database:
sudo apt-get update
Step 4: Install the MongoDB packages. You can install either the latest stable version of MongoDB or a specific
version of MongoDB.
9 https://docs.mongodb.org/v3.0/tutorial/install-mongodb-on-debian/
10 https://www.mongodb.org/static/pgp/server-3.2.asc
Install the latest stable version of MongoDB. Issue the following command:
sudo apt-get install -y mongodb-org
Install a specific release of MongoDB. To install a specific release, you must specify each component package
individually along with the version number, as in the following example:
sudo apt-get install -y mongodb-org=3.2.4 mongodb-org-server=3.2.4 mongodb-org-shell=3.2.4 mongodb-or
If you only install mongodb-org=3.2.4 and do not include the component packages, the latest version of each
MongoDB package will be installed regardless of what version you specified.
Pin a specific version of MongoDB. Although you can specify any available version of MongoDB, apt-get will
upgrade the packages when a newer version becomes available. To prevent unintended upgrades, pin the package. To
pin the version of MongoDB at the currently installed version, issue the following command sequence:
echo "mongodb-org hold" | sudo dpkg --set-selections
echo "mongodb-org-server hold" | sudo dpkg --set-selections
echo "mongodb-org-shell hold" | sudo dpkg --set-selections
echo "mongodb-org-mongos hold" | sudo dpkg --set-selections
echo "mongodb-org-tools hold" | sudo dpkg --set-selections
Run MongoDB Community Edition The MongoDB instance stores its data files in /var/lib/mongodb and
its log files in /var/log/mongodb by default, and runs using the mongodb user account. You can specify alternate
log and data file directories in /etc/mongod.conf. See systemLog.path and storage.dbPath for
additional information.
If you change the user that runs the MongoDB process, you must modify the access control rights to the
/var/lib/mongodb and /var/log/mongodb directories to give this user access to these directories.
Step 2: Verify that MongoDB has started successfully. Verify that the mongod process has started successfully
by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading
[initandlisten] waiting for connections on port <port>
where <port> is the port configured in /etc/mongod.conf, 27017 by default.
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:
sudo service mongod stop
Step 5: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, press Control+C in the terminal where the mongod instance is running.
Uninstall MongoDB Community Edition To completely remove MongoDB from a system, you must remove the
MongoDB applications themselves, the configuration files, and any directories containing data and logs. The following
section guides you through the necessary steps.
Warning: This process will completely remove MongoDB, its configuration, and all databases. This process is
not reversible, so ensure that all of your configuration and data are backed up before proceeding.
Step 1: Stop MongoDB. Stop the mongod process by issuing the following command:
sudo service mongod stop
Step 2: Remove Packages. Remove any MongoDB packages that you had previously installed.
sudo apt-get purge 'mongodb-org*'
Step 3: Remove Data Directories. Remove MongoDB databases and log files.
sudo rm -r /var/log/mongodb
sudo rm -r /var/lib/mongodb
Overview Compiled versions of MongoDB Community Edition for Linux provide a simple option for installing
MongoDB Community Edition for other Linux systems without supported packages.
Install MongoDB Community Edition MongoDB provides archives for both 64-bit and 32-bit (deprecated) builds
of Linux. Follow the installation procedure appropriate for your system.
Note: To install a version of MongoDB prior to 3.2, please refer to that version's documentation. For example, see
version 3.0 [11].
Install for 64-bit Linux
Step 1: Download the binary files for the desired release of MongoDB. Download the binaries from
https://www.mongodb.org/downloads.
For example, to download the latest release through the shell, issue the following:
curl -O https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-3.2.4.tgz
Step 2: Extract the files from the downloaded archive. For example, from a system shell, you can extract through
the tar command:
tar -zxvf mongodb-linux-x86_64-3.2.4.tgz
Step 3: Copy the extracted archive to the target directory. Copy the extracted folder to the location from which
MongoDB will run.
mkdir -p mongodb
cp -R -n mongodb-linux-x86_64-3.2.4/ mongodb
Step 4: Ensure the location of the binaries is in the PATH variable. The MongoDB binaries are in the bin/
directory of the archive. To ensure that the binaries are in your PATH, you can modify your PATH.
For example, you can add the following line to your shell's rc file (e.g. ~/.bashrc):
export PATH=<mongodb-install-directory>/bin:$PATH
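If the rc file is sourced more than once, a plain export keeps prepending the same directory. The following sketch guards against that and then checks whether mongod resolves. It makes assumptions: add_to_path is a hypothetical helper name, and $HOME/mongodb/bin stands in for <mongodb-install-directory>/bin.

```shell
# Hypothetical install location; substitute your <mongodb-install-directory>/bin.
MONGODB_BIN="${MONGODB_BIN:-$HOME/mongodb/bin}"

# Prepend a directory to PATH only if it is not already present.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                # already present, do nothing
    *) PATH="$1:$PATH" ;;
  esac
}

add_to_path "$MONGODB_BIN"
export PATH

# Confirm that the shell can now find the mongod binary.
command -v mongod || echo "mongod not found on PATH"
```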
Install for 32-bit Linux
Step 1: Download the binary files for the desired release of MongoDB. Download the binaries from
https://www.mongodb.org/downloads.
For example, to download the latest release through the shell, issue the following:
curl -O https://fastdl.mongodb.org/linux/mongodb-linux-i686-3.2.4.tgz
Step 2: Extract the files from the downloaded archive. For example, from a system shell, you can extract through
the tar command:
tar -zxvf mongodb-linux-i686-3.2.4.tgz
Step 3: Copy the extracted archive to the target directory. Copy the extracted folder to the location from which
MongoDB will run.
mkdir -p mongodb
cp -R -n mongodb-linux-i686-3.2.4/ mongodb
Step 4: Ensure the location of the binaries is in the PATH variable. The MongoDB binaries are in the bin/
directory of the archive. To ensure that the binaries are in your PATH, you can modify your PATH.
For example, you can add the following line to your shell's rc file (e.g. ~/.bashrc):
export PATH=<mongodb-install-directory>/bin:$PATH
Step 1: Create the data directory. Before you start MongoDB for the first time, create the directory to which
the mongod process will write data. By default, the mongod process uses the /data/db directory. If you create a
directory other than this one, you must specify that directory in the dbpath option when starting the mongod process
later in this procedure.
The following example command creates the default /data/db directory:
mkdir -p /data/db
Step 2: Set permissions for the data directory. Before running mongod for the first time, ensure that the user
account running mongod has read and write permissions for the directory.
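A quick way to confirm the permissions is to test the directory as the user who will run mongod. The sketch below is an illustration: can_use_dbpath is a hypothetical helper name, and the chown suggestion in the comment assumes that you have sudo rights and want the current user to own the directory.

```shell
# Succeed only if the current user can read, write, and enter the directory.
# can_use_dbpath is a hypothetical helper name.
can_use_dbpath() {
  dir="${1:-/data/db}"
  [ -d "$dir" ] && [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]
}

if can_use_dbpath /data/db; then
  echo "/data/db is usable by $(id -un)"
else
  # One possible fix, assuming you have sudo rights:
  #   sudo chown "$(id -un)" /data/db
  echo "/data/db is missing or not readable and writable by $(id -un)"
fi
```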
Step 3: Run MongoDB. To run MongoDB, run the mongod process at the system prompt. If necessary, specify the
path of the mongod or the data directory. See the following examples.
Run without specifying paths If your system PATH variable includes the location of the mongod binary and if you
use the default data directory (i.e., /data/db), simply enter mongod at the system prompt:
mongod
Specify the path of the mongod If your PATH does not include the location of the mongod binary, enter the full
path to the mongod binary at the system prompt:
<path to binary>/mongod
Specify the path of the data directory If you do not use the default data directory (i.e., /data/db), specify the
path to the data directory using the --dbpath option:
mongod --dbpath <path to data directory>
Step 4: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, press Control+C in the terminal where the mongod instance is running.
On this page
Overview (page 26)
Install MongoDB Community Edition (page 26)
Run MongoDB (page 27)
Platform Support
Starting in version 3.0, MongoDB only supports OS X versions 10.7 (Lion) and later on Intel x86-64.
MongoDB Community Edition is available through the popular OS X package manager Homebrew [12] or through the
MongoDB Download site [13].
You can install MongoDB Community Edition with Homebrew [15] or manually. This section describes both methods.
Install MongoDB Community Edition with Homebrew Homebrew [16] installs binary packages based on published
formulae. This section describes how to update brew to the latest packages and install MongoDB Community
Edition. Homebrew requires some initial setup and configuration, which is beyond the scope of this document.
Step 1: Update Homebrew's package database. In a system shell, issue the following command:
brew update
Step 2: Install MongoDB. You can install MongoDB via brew with several different options. Use one of the
following operations:
Install the MongoDB Binaries To install the MongoDB binaries, issue the following command in a system shell:
brew install mongodb
Build MongoDB from Source with TLS/SSL Support To build MongoDB from the source files and include
TLS/SSL support, issue the following from a system shell:
brew install mongodb --with-openssl
12 http://brew.sh/
13 http://www.mongodb.org/downloads
14 https://docs.mongodb.org/v3.0/tutorial/install-mongodb-on-os-x/
15 http://brew.sh/
16 http://brew.sh/
Install the Latest Development Release of MongoDB To install the latest development release for use in testing
and development, issue the following command in a system shell:
brew install mongodb --devel
Install MongoDB Community Edition Manually Only install MongoDB Community Edition using this procedure
if you cannot use homebrew (page 26).
Step 1: Download the binary files for the desired release of MongoDB. Download the binaries from
https://www.mongodb.org/downloads.
For example, to download the latest release through the shell, issue the following:
curl -O https://fastdl.mongodb.org/osx/mongodb-osx-x86_64-3.2.4.tgz
Step 2: Extract the files from the downloaded archive. For example, from a system shell, you can extract through
the tar command:
tar -zxvf mongodb-osx-x86_64-3.2.4.tgz
Step 3: Copy the extracted archive to the target directory. Copy the extracted folder to the location from which
MongoDB will run.
mkdir -p mongodb
cp -R -n mongodb-osx-x86_64-3.2.4/ mongodb
Step 4: Ensure the location of the binaries is in the PATH variable. The MongoDB binaries are in the bin/
directory of the archive. To ensure that the binaries are in your PATH, you can modify your PATH.
For example, you can add the following line to your shell's rc file (e.g. ~/.bashrc):
export PATH=<mongodb-install-directory>/bin:$PATH
Run MongoDB
Step 1: Create the data directory. Before you start MongoDB for the first time, create the directory to which
the mongod process will write data. By default, the mongod process uses the /data/db directory. If you create a
directory other than this one, you must specify that directory in the dbpath option when starting the mongod process
later in this procedure.
The following example command creates the default /data/db directory:
mkdir -p /data/db
Step 2: Set permissions for the data directory. Before running mongod for the first time, ensure that the user
account running mongod has read and write permissions for the directory.
Step 3: Run MongoDB. To run MongoDB, run the mongod process at the system prompt. If necessary, specify the
path of the mongod or the data directory. See the following examples.
Run without specifying paths If your system PATH variable includes the location of the mongod binary and if you
use the default data directory (i.e., /data/db), simply enter mongod at the system prompt:
mongod
Specify the path of the mongod If your PATH does not include the location of the mongod binary, enter the full
path to the mongod binary at the system prompt:
<path to binary>/mongod
Specify the path of the data directory If you do not use the default data directory (i.e., /data/db), specify the
path to the data directory using the --dbpath option:
mongod --dbpath <path to data directory>
Step 4: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, press Control+C in the terminal where the mongod instance is running.
On this page
Overview (page 28)
Requirements (page 28)
Get MongoDB Community Edition (page 29)
Install MongoDB Community Edition (page 29)
Run MongoDB Community Edition (page 30)
Configure a Windows Service for MongoDB Community Edition (page 31)
Manually Create a Windows Service for MongoDB Community Edition (page 32)
Additional Resources (page 33)
Overview Use this tutorial to install MongoDB Community Edition on Windows systems.
Platform Support
Starting in version 2.2, MongoDB does not support Windows XP. Please use a more recent version of Windows to use
more recent releases of MongoDB.
Important: If you are running any edition of Windows Server 2008 R2 or Windows 7, please install a hotfix to
resolve an issue with memory mapped files on Windows [17].
Requirements MongoDB Community Edition requires Windows Server 2008 R2, Windows Vista, or later. The
.msi installer includes all other software dependencies and will automatically upgrade any older version of MongoDB
installed using an .msi file.
17 http://support.microsoft.com/kb/2731284
Step 1: Determine which MongoDB build you need. There are three builds of MongoDB for Windows:
MongoDB for Windows 64-bit runs only on Windows Server 2008 R2, Windows 7 64-bit, and newer versions of
Windows. This build takes advantage of recent enhancements to the Windows Platform and cannot operate on older
versions of Windows.
MongoDB for Windows 32-bit runs on any 32-bit version of Windows newer than Windows Vista. 32-bit versions
of MongoDB are only intended for older systems and for use in testing and development systems. 32-bit versions of
MongoDB only support databases smaller than 2GB.
Note: Starting in MongoDB 3.2, 32-bit binaries are deprecated and will be unavailable in future releases.
MongoDB for Windows 64-bit Legacy runs on Windows Vista, Windows Server 2003, and Windows Server 2008
and does not include recent performance enhancements.
To find which version of Windows you are running, enter the following commands in the Command Prompt or
PowerShell:
wmic os get caption
wmic os get osarchitecture
Step 2: Download MongoDB for Windows. Download the latest production release of MongoDB from the
MongoDB downloads page [19]. Ensure you download the correct version of MongoDB for your Windows system. The
64-bit versions of MongoDB do not work with 32-bit Windows.
Interactive Installation
Step 1: Install MongoDB for Windows. In Windows Explorer, locate the downloaded MongoDB .msi file, which
typically is located in the default Downloads folder. Double-click the .msi file. A set of screens will appear to
guide you through the installation process.
You may specify an installation directory if you choose the Custom installation option.
Note: These instructions assume that you have installed MongoDB to C:\mongodb.
MongoDB is self-contained and does not have any other system dependencies. You can run MongoDB from any folder
you choose. You may install MongoDB in any folder (e.g. D:\test\mongodb).
Unattended Installation You may install MongoDB Community unattended on Windows from the command line
using msiexec.exe.
18 https://docs.mongodb.org/v3.0/tutorial/install-mongodb-on-windows/
19 http://www.mongodb.org/downloads
Step 1: Open an Administrator command prompt. Press the Win key, type cmd.exe, and press Ctrl +
Shift + Enter to run the Command Prompt as Administrator.
Execute the remaining steps from the Administrator command prompt.
Step 2: Install MongoDB for Windows. Change to the directory containing the .msi installation binary of your
choice and invoke:
msiexec.exe /q /i mongodb-win32-x86_64-2008plus-ssl-3.2.4-signed.msi ^
INSTALLLOCATION="C:\mongodb" ^
ADDLOCAL="all"
You can specify the installation location for the executable by modifying the INSTALLLOCATION value.
By default, this method installs all MongoDB binaries. To install specific MongoDB component sets, you can specify
them in the ADDLOCAL argument using a comma-separated list including one or more of the following component
sets:
Component Set        Binaries
Server               mongod.exe
Router               mongos.exe
Client               mongo.exe
MonitoringTools      mongostat.exe, mongotop.exe
ImportExportTools    mongodump.exe, mongorestore.exe, mongoexport.exe, mongoimport.exe
MiscellaneousTools   bsondump.exe, mongofiles.exe, mongooplog.exe, mongoperf.exe
For instance, to install only the MongoDB utilities, invoke:
msiexec.exe /q /i mongodb-win32-x86_64-2008plus-ssl-3.2.4-signed.msi ^
INSTALLLOCATION="C:\mongodb" ^
ADDLOCAL="MonitoringTools,ImportExportTools,MiscellaneousTools"
Run MongoDB Community Edition
Warning: Do not make mongod.exe visible on public networks without running in "Secure Mode" with the
auth setting. MongoDB is designed to be run in trusted environments, and the database does not enable "Secure
Mode" by default.
Step 1: Set up the MongoDB environment. MongoDB requires a data directory to store all data. MongoDB's
default data directory path is \data\db. Create this folder using the following commands from a Command Prompt:
md \data\db
You can specify an alternate path for data files using the --dbpath option to mongod.exe, for example:
C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data
If your path includes spaces, enclose the entire path in double quotes, for example:
C:\mongodb\bin\mongod.exe --dbpath "d:\test\mongo db data"
Step 2: Start MongoDB. To start MongoDB, run mongod.exe. For example, from the Command Prompt:
C:\mongodb\bin\mongod.exe
This starts the main MongoDB database process. The waiting for connections message in the console
output indicates that the mongod.exe process is running successfully.
Depending on the security level of your system, Windows may pop up a Security Alert dialog box about blocking
some features of C:\mongodb\bin\mongod.exe from communicating on networks. All users should select
Private Networks, such as my home or work network and click Allow access. For additional
information on security and MongoDB, please see the Security Documentation (page 315).
Step 3: Connect to MongoDB. To connect to MongoDB through the mongo.exe shell, open another Command
Prompt.
C:\mongodb\bin\mongo.exe
If you want to develop applications using .NET, see the documentation of C# and MongoDB [20] for more information.
Step 4: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, press Control+C in the terminal where the mongod instance is running.
Step 1: Open an Administrator command prompt. Press the Win key, type cmd.exe, and press Ctrl +
Shift + Enter to run the Command Prompt as Administrator.
Execute the remaining steps from the Administrator command prompt.
Step 2: Create directories. Create directories for your database and log files:
mkdir c:\data\db
mkdir c:\data\log
Step 3: Create a configuration file. Create a configuration file. The file must set systemLog.path. Include
additional configuration options as appropriate.
For example, create a file at C:\mongodb\mongod.cfg that specifies both systemLog.path and
storage.dbPath:
systemLog:
    destination: file
    path: c:\data\log\mongod.log
storage:
    dbPath: c:\data\db
20 https://docs.mongodb.org/ecosystem/drivers/csharp
Install the MongoDB service by starting mongod.exe with the --install option and the --config option to
specify the previously created configuration file.
"C:\mongodb\bin\mongod.exe" --config "C:\mongodb\mongod.cfg" --install
To use an alternate dbpath, specify the path in the configuration file (e.g. C:\mongodb\mongod.cfg) or on the
command line with the --dbpath option.
If needed, you can install services for multiple instances of mongod.exe or mongos.exe. Install each service with
a unique --serviceName and --serviceDisplayName. Use multiple instances only when sufficient system
resources exist and your system design requires it.
Step 6: Stop or remove the MongoDB service as needed. To stop the MongoDB service use the following com-
mand:
net stop MongoDB
Manually Create a Windows Service for MongoDB Community Edition You can set up the MongoDB server as
a Windows Service that starts automatically at boot time.
The following procedure assumes you have installed MongoDB Community using the .msi installer with the path
C:\mongodb\.
If you have installed in an alternative directory, you will need to adjust the paths as appropriate.
Step 1: Open an Administrator command prompt. Press the Win key, type cmd.exe, and press Ctrl +
Shift + Enter to run the Command Prompt as Administrator.
Execute the remaining steps from the Administrator command prompt.
Step 2: Create directories. Create directories for your database and log files:
mkdir c:\data\db
mkdir c:\data\log
Step 3: Create a configuration file. Create a configuration file. The file must set systemLog.path. Include
additional configuration options as appropriate.
For example, create a file at C:\mongodb\mongod.cfg that specifies both systemLog.path and
storage.dbPath:
systemLog:
    destination: file
    path: c:\data\log\mongod.log
storage:
    dbPath: c:\data\db
sc.exe requires a space between = and the configuration values (e.g. binPath= ), and a \ to escape double
quotes.
If successfully created, the following log message will display:
[SC] CreateService SUCCESS
Step 6: Stop or remove the MongoDB service as needed. To stop the MongoDB service, use the following com-
mand:
net stop MongoDB
To remove the MongoDB service, first stop the service and then run the following command:
sc.exe delete MongoDB
Additional Resources
MongoDB for Developers Free Course [21]
MongoDB for .NET Developers Free Online Course [22]
MongoDB Architecture Guide [23]
Install on Red Hat (page 34) Install the MongoDB Enterprise build and required dependencies on Red Hat Enterprise
or CentOS Systems using packages.
Install on Ubuntu (page 38) Install the MongoDB Enterprise build and required dependencies on Ubuntu Linux
Systems using packages.
Install on Debian (page 41) Install the MongoDB Enterprise build and required dependencies on Debian Linux
Systems using packages.
Install on SUSE (page 44) Install the MongoDB Enterprise build and required dependencies on SUSE Enterprise
Linux.
Install on Amazon AMI (page 48) Install the MongoDB Enterprise build and required dependencies on Amazon
Linux AMI.
Install From Tarball (page 49) Install the official build of MongoDB Enterprise from a tarball.
Install MongoDB Enterprise on Red Hat Enterprise or CentOS
On this page
Overview (page 34)
Install MongoDB Enterprise (page 35)
Install MongoDB Enterprise From Tarball (page 36)
Run MongoDB Enterprise (page 36)
Uninstall MongoDB (page 38)
Overview Use this tutorial to install MongoDB Enterprise24 on Red Hat Enterprise Linux or CentOS Linux versions
6 and 7 from .rpm packages.
Platform Support
This installation guide only supports 64-bit systems. See Platform Support (page 944) for details.
MongoDB 3.2 deprecates support for Red Hat Enterprise Linux 5.
MongoDB provides officially supported Enterprise packages in their own repository. This repository contains the
following packages:
mongodb-enterprise          A metapackage that will automatically install the four component packages listed below.
mongodb-enterprise-server   Contains the mongod daemon and associated configuration and init scripts.
mongodb-enterprise-mongos   Contains the mongos daemon.
mongodb-enterprise-shell    Contains the mongo shell.
mongodb-enterprise-tools    Contains the following MongoDB tools: mongoimport, bsondump, mongodump, mongoexport, mongofiles, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.
The default /etc/mongod.conf configuration file supplied by the packages has bind_ip set to 127.0.0.1.
Modify this setting as needed for your environment before initializing a replica set.
24 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
When you install the packages for MongoDB Enterprise, you choose whether to install the current release or a previous
one. This procedure describes how to do both.
Use the provided distribution packages as described in this page if possible. These packages will automatically install
all of MongoDB's dependencies, and are the recommended installation method.
For the latest stable release of MongoDB Enterprise Use the following repository file:
[mongodb-enterprise]
name=MongoDB Enterprise Repository
baseurl=https://repo.mongodb.com/yum/redhat/$releasever/mongodb-enterprise/stable/$basearch/
gpgcheck=0
enabled=1
For a specific version of MongoDB Enterprise To install MongoDB Enterprise packages from a specific
release series (page 1061), such as 2.4 or 2.6, you can specify the release series in the
repository configuration. For example, to restrict your system to the 2.6 release series, create a
/etc/yum.repos.d/mongodb-enterprise-2.6.repo file to hold the following configuration information
for the MongoDB Enterprise 2.6 repository:
[mongodb-enterprise-2.6]
name=MongoDB Enterprise 2.6 Repository
baseurl=https://repo.mongodb.com/yum/redhat/$releasever/mongodb-enterprise/2.6/$basearch/
gpgcheck=0
enabled=1
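As a rough sketch, the two repository files shown above could be generated with a small shell function. The function name write_mongodb_repo and its arguments are hypothetical, and the file name used for the stable series (mongodb-enterprise.repo) is an assumption; the file contents mirror the examples above. The target directory is passed in so that the function can be tried against a scratch directory before writing to /etc/yum.repos.d as root.

```shell
# Hypothetical helper: write a MongoDB Enterprise yum repository file.
#   $1  release series: "stable" or a series such as "2.6"
#   $2  directory to write into (/etc/yum.repos.d on a real system, as root)
write_mongodb_repo() {
    series="$1"
    target_dir="$2"
    if [ "$series" = "stable" ]; then
        repo_id="mongodb-enterprise"
        repo_name="MongoDB Enterprise Repository"
    else
        repo_id="mongodb-enterprise-$series"
        repo_name="MongoDB Enterprise $series Repository"
    fi
    # \$releasever and \$basearch are escaped so that yum, not the shell,
    # expands them when it reads the file.
    cat > "$target_dir/$repo_id.repo" <<EOF
[$repo_id]
name=$repo_name
baseurl=https://repo.mongodb.com/yum/redhat/\$releasever/mongodb-enterprise/$series/\$basearch/
gpgcheck=0
enabled=1
EOF
}

# Try it against a scratch directory rather than /etc/yum.repos.d.
scratch_dir=$(mktemp -d)
write_mongodb_repo 2.6 "$scratch_dir"
cat "$scratch_dir/mongodb-enterprise-2.6.repo"
```

Passing the series as an argument keeps the stable and pinned-series cases in one place, so the two files cannot drift apart.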
.repo files for each release can also be found in the repository itself26 . Remember that odd-numbered minor release
versions (e.g. 2.5) are development versions and are unsuitable for production deployment.
Step 2: Install the MongoDB Enterprise packages and associated tools. You can install either the latest stable
version of MongoDB Enterprise or a specific version of MongoDB Enterprise.
To install the latest stable version of MongoDB Enterprise, issue the following command:
sudo yum install -y mongodb-enterprise
Install a specific release of MongoDB Enterprise. Specify each component package individually and append the
version number to the package name, as in the following example that installs the 2.6.1 release of MongoDB:
sudo yum install -y mongodb-enterprise-2.6.1 mongodb-enterprise-server-2.6.1 mongodb-enterprise-shell
25 https://docs.mongodb.org/v3.0/tutorial/install-mongodb-enterprise-on-red-hat/
26 https://repo.mongodb.com/yum/redhat/
Pin a specific version of MongoDB Enterprise. Although you can specify any available version of MongoDB
Enterprise, yum will upgrade the packages when a newer version becomes available. To prevent unintended upgrades,
pin the package. To pin a package, add the following exclude directive to your /etc/yum.conf file:
exclude=mongodb-enterprise,mongodb-enterprise-server,mongodb-enterprise-shell,mongodb-enterprise-mong
Previous versions of MongoDB packages use different naming conventions. See the 2.4 version of documentation for
more information27 .
Install MongoDB Enterprise From Tarball While you should use the .rpm packages as previously described, you
may also manually install MongoDB using the tarballs.
First you must install any dependencies as appropriate:
Version 5
yum install perl cyrus-sasl cyrus-sasl-plain cyrus-sasl-gssapi krb5-libs \
lm_sensors net-snmp openssl popt rpm-libs tcp_wrappers zlib
Version 6
yum install cyrus-sasl cyrus-sasl-plain cyrus-sasl-gssapi krb5-libs \
net-snmp openssl
Version 7
yum install cyrus-sasl cyrus-sasl-plain cyrus-sasl-gssapi krb5-libs \
lm_sensors-libs net-snmp-agent-libs net-snmp openssl rpm-libs \
tcp_wrappers-libs
To perform the installation, see Install MongoDB Enterprise From Tarball (page 49).
Prerequisites
Configure SELinux
Important: You must configure SELinux to allow MongoDB to start on Red Hat Linux-based systems (Red Hat
Enterprise Linux or CentOS Linux).
Note: All three options require root privileges. The first two options each require a system reboot and may have
larger implications for your deployment.
SELINUX=permissive
Note: You can use setenforce to change to permissive mode; this method does not require a reboot but is
not persistent.
Enable access to the relevant ports (e.g. 27017) for SELinux if in enforcing mode. See
https://docs.mongodb.org/manual/reference/default-mongodb-port for more information
on MongoDB's default ports. For default settings, this can be accomplished by running
semanage port -a -t mongod_port_t -p tcp 27017
Warning: On RHEL 7.0, if you change the data path, the default SELinux policies will prevent mongod
from having write access on the new data path if you do not change the security context.
You may alternatively choose not to install the SELinux packages when you are installing your Linux operating system,
or choose to remove the relevant packages. This option is the most invasive and is not recommended.
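Before choosing among these options, it can help to see which mode the system is configured to boot into. The sketch below reads the SELINUX= line from an SELinux configuration file. The function name is hypothetical, and /etc/selinux/config is assumed to be the configuration file location, as on stock Red Hat Enterprise Linux and CentOS systems.

```shell
# Hypothetical helper: print the mode set by the SELINUX= line of an
# SELinux configuration file (defaults to /etc/selinux/config).
selinux_config_mode() {
    config_file="${1:-/etc/selinux/config}"
    # Match only "SELINUX=..." lines; comments and SELINUXTYPE= are skipped.
    # If the setting appears more than once, report the last occurrence.
    sed -n 's/^SELINUX=\([a-z]*\).*$/\1/p' "$config_file" | tail -n 1
}

# Demonstration against a sample file in a scratch location.
sample=$(mktemp)
printf '%s\n' '# sample file' 'SELINUX=permissive' 'SELINUXTYPE=targeted' > "$sample"
selinux_config_mode "$sample"
```

This reports only the persistent setting; a mode changed at runtime with setenforce is not reflected in the file.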
Data Directories and Permissions
The MongoDB instance stores its data files in /var/lib/mongo and its log files in /var/log/mongodb
by default, and runs using the mongod user account. You can specify alternate log and data file directories in
/etc/mongod.conf. See systemLog.path and storage.dbPath for additional information.
If you change the user that runs the MongoDB process, you must modify the access control rights to the
/var/lib/mongo and /var/log/mongodb directories to give this user access to these directories.
Procedure
Step 1: Start MongoDB. You can start the mongod process by issuing the following command:
sudo service mongod start
Step 2: Verify that MongoDB has started successfully. You can verify that the mongod process has started
successfully by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading
[initandlisten] waiting for connections on port <port>
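The log check described above can be scripted. The following is a minimal sketch, assuming the default log path from this page; the function name mongod_is_listening is hypothetical, and the demonstration uses a sample log line in a scratch file rather than a live log.

```shell
# Hypothetical helper: succeed if the log shows mongod accepting connections.
# Defaults to the packaged log location; pass another path as $1 if needed.
mongod_is_listening() {
    log_file="${1:-/var/log/mongodb/mongod.log}"
    grep -q 'waiting for connections on port' "$log_file"
}

# Demonstration against a sample log line in a scratch file.
sample_log=$(mktemp)
echo '[initandlisten] waiting for connections on port 27017' > "$sample_log"
if mongod_is_listening "$sample_log"; then
    echo "mongod is accepting connections"
else
    echo "no startup message found yet"
fi
```

On a real system, call mongod_is_listening with no argument after starting the service; reading the log may require elevated privileges depending on its permissions.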
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:
sudo service mongod stop
Step 4: Restart MongoDB. You can restart the mongod process by issuing the following command:
sudo service mongod restart
You can follow the state of the process for errors or important messages by watching the output in the
/var/log/mongodb/mongod.log file.
Step 5: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, press Control+C in the terminal where the mongod instance is running.
Uninstall MongoDB To completely remove MongoDB from a system, you must remove the MongoDB applications
themselves, the configuration files, and any directories containing data and logs. The following section guides you
through the necessary steps.
Warning: This process will completely remove MongoDB, its configuration, and all databases. This process is
not reversible, so ensure that all of your configuration and data is backed up before proceeding.
Step 1: Stop MongoDB. Stop the mongod process by issuing the following command:
sudo service mongod stop
Step 2: Remove Packages. Remove any MongoDB packages that you had previously installed.
sudo yum erase $(rpm -qa | grep mongodb-enterprise)
Step 3: Remove Data Directories. Remove MongoDB databases and log files.
sudo rm -r /var/log/mongodb
sudo rm -r /var/lib/mongo
Install MongoDB Enterprise on Ubuntu
On this page
Overview (page 38)
Install MongoDB Enterprise (page 39)
Install MongoDB Enterprise From Tarball (page 40)
Run MongoDB Enterprise (page 40)
Uninstall MongoDB (page 41)
Overview Use this tutorial to install MongoDB Enterprise28 on LTS Ubuntu Linux systems from .deb packages.
Platform Support
28 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
MongoDB only provides packages for 64-bit long-term support Ubuntu releases. Currently, this means 12.04 LTS
(Precise Pangolin) and 14.04 LTS (Trusty Tahr). While the packages may work with other Ubuntu releases, this is not
a supported configuration.
MongoDB provides officially supported Enterprise packages in their own repository. This repository contains the
following packages:
mongodb-enterprise          A metapackage that will automatically install the four component packages listed below.
mongodb-enterprise-server   Contains the mongod daemon and associated configuration and init scripts.
mongodb-enterprise-mongos   Contains the mongos daemon.
mongodb-enterprise-shell    Contains the mongo shell.
mongodb-enterprise-tools    Contains the following MongoDB tools: mongoimport, bsondump, mongodump, mongoexport, mongofiles, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.
Use the provided distribution packages as described in this page if possible. These packages will automatically install
all of MongoDB's dependencies, and are the recommended installation method.
Step 1: Import the public key used by the package management system. The Ubuntu package management tools
(i.e. dpkg and apt) ensure package consistency and authenticity by requiring that distributors sign packages with
GPG keys. Issue the following command to import the MongoDB public GPG Key30 :
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
Ubuntu 14.04
echo "deb http://repo.mongodb.com/apt/ubuntu trusty/mongodb-enterprise/stable multiverse" | sudo tee
If you'd like to install MongoDB Enterprise packages from a particular release series (page 1061), such as 2.4 or 2.6,
you can specify the release series in the repository configuration. For example, to restrict your system to the 2.6 release
series, add the following repository:
echo "deb http://repo.mongodb.com/apt/ubuntu "$(lsb_release -sc)"/mongodb-enterprise/2.6 multiverse"
29 https://docs.mongodb.org/v3.0/tutorial/install-mongodb-enterprise-on-ubuntu/
30 https://www.mongodb.org/static/pgp/server-3.2.asc
Step 3: Reload local package database. Issue the following command to reload the local package database:
sudo apt-get update
Step 4: Install the MongoDB Enterprise packages. You can install either the latest stable version of MongoDB or
a specific version of MongoDB.
Install the latest stable version of MongoDB Enterprise. Issue the following command:
sudo apt-get install -y mongodb-enterprise
Install a specific release of MongoDB Enterprise. To install a specific release, you must specify each component
package individually along with the version number, as in the following example:
sudo apt-get install -y mongodb-enterprise=3.2.4 mongodb-enterprise-server=3.2.4 mongodb-enterprise-s
If you only install mongodb-enterprise=3.2.4 and do not include the component packages, the latest version
of each MongoDB package will be installed regardless of what version you specified.
Pin a specific version of MongoDB Enterprise. Although you can specify any available version of MongoDB,
apt-get will upgrade the packages when a newer version becomes available. To prevent unintended upgrades, pin
the package. To pin the version of MongoDB at the currently installed version, issue the following command sequence:
echo "mongodb-enterprise hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-server hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-shell hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-mongos hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-tools hold" | sudo dpkg --set-selections
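The five commands above differ only in the package name, so they can be collapsed into a loop. This is a sketch rather than the documented procedure: the function name mongodb_hold_selections is hypothetical, and it only prints the selection lines so that the output can be inspected before being piped to dpkg.

```shell
# Hypothetical helper: print one "<package> hold" selection line for each
# MongoDB Enterprise package named in the commands above.
mongodb_hold_selections() {
    for pkg in mongodb-enterprise mongodb-enterprise-server \
               mongodb-enterprise-shell mongodb-enterprise-mongos \
               mongodb-enterprise-tools; do
        printf '%s hold\n' "$pkg"
    done
}

# Inspect the selections first.
mongodb_hold_selections

# Then apply them on a real system (requires root):
#   mongodb_hold_selections | sudo dpkg --set-selections
```

dpkg --set-selections reads any number of "package state" lines from standard input, so a single invocation covers all five packages.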
Versions of the MongoDB packages before 2.6 use a different repository location. Refer to the version of the
documentation appropriate for your MongoDB version.
Install MongoDB Enterprise From Tarball While you should use the .deb packages as previously described, you
may also manually install MongoDB using the tarballs.
First you must install any dependencies as appropriate:
sudo apt-get install libgssapi-krb5-2 libsasl2-2 libssl1.0.0 libstdc++6 snmp
To perform the installation, see Install MongoDB Enterprise From Tarball (page 49).
Run MongoDB Enterprise The MongoDB instance stores its data files in /var/lib/mongodb and its log files
in /var/log/mongodb by default, and runs using the mongodb user account. You can specify alternate log
and data file directories in /etc/mongod.conf. See systemLog.path and storage.dbPath for additional
information.
If you change the user that runs the MongoDB process, you must modify the access control rights to the
/var/lib/mongodb and /var/log/mongodb directories to give this user access to these directories.
Step 2: Verify that MongoDB has started successfully. Verify that the mongod process has started successfully
by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading
[initandlisten] waiting for connections on port <port>
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:
sudo service mongod stop
Step 5: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, press Control+C in the terminal where the mongod instance is running.
Uninstall MongoDB To completely remove MongoDB from a system, you must remove the MongoDB applications
themselves, the configuration files, and any directories containing data and logs. The following section guides you
through the necessary steps.
Warning: This process will completely remove MongoDB, its configuration, and all databases. This process is
not reversible, so ensure that all of your configuration and data is backed up before proceeding.
Step 1: Stop MongoDB. Stop the mongod process by issuing the following command:
sudo service mongod stop
Step 2: Remove Packages. Remove any MongoDB packages that you had previously installed.
sudo apt-get purge mongodb-enterprise*
Step 3: Remove Data Directories. Remove MongoDB databases and log files.
sudo rm -r /var/log/mongodb
sudo rm -r /var/lib/mongodb
Install MongoDB Enterprise on Debian
On this page
Overview (page 42)
Install MongoDB Enterprise (page 42)
Install MongoDB Enterprise From Tarball (page 43)
Run MongoDB Enterprise (page 43)
Uninstall MongoDB (page 44)
Overview Use this tutorial to install MongoDB Enterprise31 from .deb packages on Debian 7 Wheezy.
Platform Support
This installation guide only supports 64-bit systems. See Platform Support (page 944) for details.
MongoDB provides officially supported Enterprise packages in their own repository. This repository contains the
following packages:
mongodb-enterprise          A metapackage that will automatically install the four component packages listed below.
mongodb-enterprise-server   Contains the mongod daemon and associated configuration and init scripts.
mongodb-enterprise-mongos   Contains the mongos daemon.
mongodb-enterprise-shell    Contains the mongo shell.
mongodb-enterprise-tools    Contains the following MongoDB tools: mongoimport, bsondump, mongodump, mongoexport, mongofiles, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.
Use the provided distribution packages as described in this page if possible. These packages will automatically install
all of MongoDB's dependencies, and are the recommended installation method.
Step 1: Import the public key used by the package management system. The Debian package management tools
(i.e. dpkg and apt) ensure package consistency and authenticity by requiring that distributors sign packages with
GPG keys. Issue the following command to import the MongoDB public GPG Key33 :
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
If you'd like to install MongoDB Enterprise packages from a particular release series (page 1061), such as 2.6, you
can specify the release series in the repository configuration. For example, to restrict your system to the 2.6 release
series, add the following repository:
echo "deb http://repo.mongodb.com/apt/debian wheezy/mongodb-enterprise/2.6 main" | sudo tee /etc/apt/
Step 3: Reload local package database. Issue the following command to reload the local package database:
sudo apt-get update
31 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
32 https://docs.mongodb.org/v3.0/tutorial/install-mongodb-enterprise-on-debian/
33 https://www.mongodb.org/static/pgp/server-3.2.asc
Step 4: Install the MongoDB Enterprise packages. You can install either the latest stable version of MongoDB or
a specific version of MongoDB.
Install the latest stable version of MongoDB Enterprise. Issue the following command:
sudo apt-get install -y mongodb-enterprise
Install a specific release of MongoDB Enterprise. To install a specific release, you must specify each component
package individually along with the version number, as in the following example:
sudo apt-get install -y mongodb-enterprise=3.2.4 mongodb-enterprise-server=3.2.4 mongodb-enterprise-s
If you only install mongodb-enterprise=3.2.4 and do not include the component packages, the latest version
of each MongoDB package will be installed regardless of what version you specified.
Pin a specific version of MongoDB Enterprise. Although you can specify any available version of MongoDB,
apt-get will upgrade the packages when a newer version becomes available. To prevent unintended upgrades, pin
the package. To pin the version of MongoDB at the currently installed version, issue the following command sequence:
echo "mongodb-enterprise hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-server hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-shell hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-mongos hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-tools hold" | sudo dpkg --set-selections
Versions of the MongoDB packages before 2.6 use a different repository location. Refer to the version of the
documentation appropriate for your MongoDB version.
Install MongoDB Enterprise From Tarball While you should use the .deb packages as previously described, you
may also manually install MongoDB using the tarballs.
First you must install any dependencies as appropriate:
sudo apt-get install libgssapi-krb5-2 libsasl2-2 libssl1.0.0 libstdc++6 snmp
To perform the installation, see Install MongoDB Enterprise From Tarball (page 49).
Run MongoDB Enterprise The MongoDB instance stores its data files in /var/lib/mongodb and its log files
in /var/log/mongodb by default, and runs using the mongodb user account. You can specify alternate log
and data file directories in /etc/mongod.conf. See systemLog.path and storage.dbPath for additional
information.
If you change the user that runs the MongoDB process, you must modify the access control rights to the
/var/lib/mongodb and /var/log/mongodb directories to give this user access to these directories.
Step 2: Verify that MongoDB has started successfully. Verify that the mongod process has started successfully
by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading
[initandlisten] waiting for connections on port <port>
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:
sudo service mongod stop
Step 5: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, press Control+C in the terminal where the mongod instance is running.
Uninstall MongoDB To completely remove MongoDB from a system, you must remove the MongoDB applications
themselves, the configuration files, and any directories containing data and logs. The following section guides you
through the necessary steps.
Warning: This process will completely remove MongoDB, its configuration, and all databases. This process is
not reversible, so ensure that all of your configuration and data is backed up before proceeding.
Step 1: Stop MongoDB. Stop the mongod process by issuing the following command:
sudo service mongod stop
Step 2: Remove Packages. Remove any MongoDB packages that you had previously installed.
sudo apt-get purge mongodb-enterprise*
Step 3: Remove Data Directories. Remove MongoDB databases and log files.
sudo rm -r /var/log/mongodb
sudo rm -r /var/lib/mongodb
Install MongoDB Enterprise on SUSE
On this page
Overview (page 45)
Considerations (page 45)
Install MongoDB Enterprise (page 45)
Install MongoDB Enterprise From Tarball (page 46)
Run MongoDB Enterprise (page 46)
Uninstall MongoDB (page 47)
Overview Use this tutorial to install MongoDB Enterprise34 on SUSE Linux. MongoDB Enterprise is available on
select platforms and contains support for several features related to security and monitoring.
Platform Support
This installation guide only supports 64-bit systems. See Platform Support (page 944) for details.
MongoDB provides officially supported Enterprise packages in their own repository. This repository contains the
following packages:
mongodb-enterprise          A metapackage that will automatically install the four component packages listed below.
mongodb-enterprise-server   Contains the mongod daemon and associated configuration and init scripts.
mongodb-enterprise-mongos   Contains the mongos daemon.
mongodb-enterprise-shell    Contains the mongo shell.
mongodb-enterprise-tools    Contains the following MongoDB tools: mongoimport, bsondump, mongodump, mongoexport, mongofiles, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.
Considerations MongoDB only provides Enterprise packages for 64-bit builds of SUSE Enterprise Linux versions
11 and 12.
Use the provided distribution packages as described in this page if possible. These packages will automatically install
all of MongoDB's dependencies, and are the recommended installation method.
Note: SUSE Linux Enterprise Server and potentially other SUSE distributions ship with virtual memory address
space limited to 8 GB by default. You must adjust this in order to prevent virtual memory allocation failures as the
database grows.
The SLES packages for MongoDB adjust these limits in the default scripts, but you will need to make this change
manually if you are using custom scripts and/or the tarball release rather than the SLES packages.
Step 1: Configure the package management system (zypper). Add the repository so that you can install
MongoDB using zypper.
Specify the latest stable release of MongoDB using the command appropriate for your version of SUSE:
34 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
35 https://docs.mongodb.org/v3.0/tutorial/install-mongodb-enterprise-on-suse/
SUSE 11
sudo zypper addrepo --no-gpgcheck "https://repo.mongodb.com/zypper/suse/11/mongodb-enterprise/stable/
SUSE 12
sudo zypper addrepo --no-gpgcheck "https://repo.mongodb.com/zypper/suse/12/mongodb-enterprise/stable/
If you'd like to install MongoDB packages from a previous release series (page 1061), such as 2.6, you can specify the
release series in the repository configuration. For example, to restrict your SUSE 11 system to the 2.6 release series,
use the following command:
sudo zypper addrepo --no-gpgcheck https://repo.mongodb.com/zypper/suse/11/mongodb-enterprise/2.6/x86_
Step 2: Install the MongoDB packages and associated tools. When you install the packages, you choose whether
to install the current release or a previous one. This step provides the commands for both.
To install the latest stable version of MongoDB, issue the following command:
sudo zypper -n install mongodb-enterprise
To install a specific release of MongoDB, specify each component package individually and append the version number
to the package name, as in the following example:
sudo zypper install mongodb-enterprise-3.2.4 mongodb-enterprise-server-3.2.4 mongodb-enterprise-shell
You can specify any available version of MongoDB. However, zypper will upgrade the packages when a newer
version becomes available. To prevent unintended upgrades, pin the packages by running the following command:
sudo zypper addlock mongodb-enterprise-3.2.4 mongodb-enterprise-server-3.2.4 mongodb-enterprise-shell
Previous versions of MongoDB packages use a different repository location. Refer to the version of the documentation
appropriate for your MongoDB version.
Install MongoDB Enterprise From Tarball While you should use the .rpm packages as previously described, you
may also manually install MongoDB using the tarballs.
First you must install any dependencies as appropriate:
zypper install cyrus-sasl cyrus-sasl-plain cyrus-sasl-gssapi krb5 \
libopenssl0_9_8 net-snmp libstdc++46 zlib
To perform the installation, see Install MongoDB Enterprise From Tarball (page 49).
Prerequisites The MongoDB instance stores its data files in /var/lib/mongo and its log files in
/var/log/mongodb by default, and runs using the mongod user account. You can specify alternate log and
data file directories in /etc/mongod.conf. See systemLog.path and storage.dbPath for additional
information.
If you change the user that runs the MongoDB process, you must modify the access control rights to the
/var/lib/mongo and /var/log/mongodb directories to give this user access to these directories.
Procedure
Step 1: Start MongoDB. You can start the mongod process by issuing the following command:
sudo service mongod start
Step 2: Verify that MongoDB has started successfully. You can verify that the mongod process has started
successfully by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading
[initandlisten] waiting for connections on port <port>
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:
sudo service mongod stop
Step 4: Restart MongoDB. You can restart the mongod process by issuing the following command:
sudo service mongod restart
You can follow the state of the process for errors or important messages by watching the output in the
/var/log/mongodb/mongod.log file.
Step 5: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, press Control+C in the terminal where the mongod instance is running.
Uninstall MongoDB To completely remove MongoDB from a system, you must remove the MongoDB applications
themselves, the configuration files, and any directories containing data and logs. The following section guides you
through the necessary steps.
Warning: This process will completely remove MongoDB, its configuration, and all databases. This process is
not reversible, so ensure that all of your configuration and data is backed up before proceeding.
Step 1: Stop MongoDB. Stop the mongod process by issuing the following command:
sudo service mongod stop
Step 2: Remove Packages. Remove any MongoDB packages that you had previously installed.
sudo zypper remove $(rpm -qa | grep mongodb-enterprise)
Step 3: Remove Data Directories. Remove MongoDB databases and log files.
sudo rm -r /var/log/mongodb
sudo rm -r /var/lib/mongo
Install MongoDB Enterprise on Amazon Linux AMI
On this page
Overview (page 48)
Prerequisites (page 48)
Install MongoDB Enterprise (page 48)
Run MongoDB Enterprise (page 48)
Overview Use this tutorial to install MongoDB Enterprise36 on Amazon Linux AMI. MongoDB Enterprise is
available on select platforms and contains support for several features related to security and monitoring.
This installation guide only supports 64-bit systems. See Platform Support (page 944) for details.
Note: The Enterprise packages include an example SNMP configuration file named mongod.conf. This file is not
a MongoDB configuration file.
Step 1: Download and install the MongoDB Enterprise packages. After you have installed the required
prerequisite packages, download and install the MongoDB Enterprise packages from https://mongodb.com/download/.
The MongoDB binaries are located in the bin/ directory of the archive. To download and install, use the following
sequence of commands.
curl -O https://downloads.mongodb.com/linux/mongodb-linux-x86_64-enterprise-amzn64-3.2.4.tgz
tar -zxvf mongodb-linux-x86_64-enterprise-amzn64-3.2.4.tgz
cp -R -n mongodb-linux-x86_64-enterprise-amzn64-3.2.4/ mongodb
Step 2: Ensure the location of the MongoDB binaries is included in the PATH variable. Once you have copied
the MongoDB binaries to their target location, ensure that the location is included in your PATH variable. If it is not,
either include it or create symbolic links from the binaries to a directory that is included.
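One way to perform this check is sketched below. The function name path_contains is hypothetical, and $HOME/mongodb/bin is an assumed location for the copied binaries; substitute the directory into which you actually copied the archive.

```shell
# Hypothetical helper: succeed if directory $1 is an entry in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumed install location; adjust to where the archive was copied.
mongodb_bin="$HOME/mongodb/bin"

# Prepend the directory only if it is not already present.
if ! path_contains "$mongodb_bin"; then
    export PATH="$mongodb_bin:$PATH"
fi
```

Wrapping PATH in colons before matching avoids false positives on directories that merely share a prefix. The change lasts only for the current shell session; add the export line to your shell startup file to make it persistent.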
Run MongoDB Enterprise The MongoDB instance stores its data files in /data/db and its log files in
/var/log/mongodb by default, and runs using the mongod user account. You can specify alternate log and
data file directories in /etc/mongod.conf. See systemLog.path and storage.dbPath for additional
information.
36 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
37 https://docs.mongodb.org/v3.0/tutorial/install-mongodb-enterprise-on-amazon/
If you change the user that runs the MongoDB process, you must modify the access control rights to the /data/db
and /var/log/mongodb directories to give this user access to these directories.
Step 1: Create the data directory. Before you start MongoDB for the first time, create the directory to which
the mongod process will write data. By default, the mongod process uses the /data/db directory. If you create a
directory other than this one, you must specify that directory in the dbpath option when starting the mongod process
later in this procedure.
The following example command creates the default /data/db directory:
mkdir -p /data/db
Step 2: Set permissions for the data directory. Before running mongod for the first time, ensure that the user
account running mongod has read and write permissions for the directory.
Step 3: Run MongoDB. To run MongoDB, run the mongod process at the system prompt. If necessary, specify the
path of the mongod or the data directory. See the following examples.
Run without specifying paths If your system PATH variable includes the location of the mongod binary and if you
use the default data directory (i.e., /data/db), simply enter mongod at the system prompt:
mongod
Specify the path of the mongod If your PATH does not include the location of the mongod binary, enter the full
path to the mongod binary at the system prompt:
<path to binary>/mongod
Specify the path of the data directory If you do not use the default data directory (i.e., /data/db), specify the
path to the data directory using the --dbpath option:
mongod --dbpath <path to data directory>
Step 4: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, press Control+C in the terminal where the mongod instance is running.
Overview Compiled versions of MongoDB Enterprise for Linux provide a simple option for installing MongoDB
for other Linux systems without supported packages.
Install MongoDB
Note: To install a version of MongoDB prior to 3.2, please refer to that version's documentation. For example, see
version 3.0 (https://docs.mongodb.org/v3.0/tutorial/install-mongodb-enterprise-on-linux/).
Step 1: Install any missing dependencies. To manually install MongoDB Enterprise, first install any dependencies
as appropriate.
Step 2: Download and install the MongoDB Enterprise packages. After you have installed the required
prerequisite packages, download and install the MongoDB Enterprise packages from https://mongodb.com/download/.
The MongoDB binaries are located in the bin/ directory of the archive. To download and install, use the following
sequence of commands.
Step 3: Ensure the location of the MongoDB binaries is included in the PATH variable. Once you have copied
the MongoDB binaries to their target location, ensure that the location is included in your PATH variable. If it is not,
either include it or create symbolic links from the binaries to a directory that is included.
Run MongoDB
Step 1: Create the data directory. Before you start MongoDB for the first time, create the directory to which
the mongod process will write data. By default, the mongod process uses the /data/db directory. If you create a
directory other than this one, you must specify that directory in the dbpath option when starting the mongod process
later in this procedure.
The following example command creates the default /data/db directory:
mkdir -p /data/db
Step 2: Set permissions for the data directory. Before running mongod for the first time, ensure that the user
account running mongod has read and write permissions for the directory.
Step 3: Run MongoDB. To run MongoDB, run the mongod process at the system prompt. If necessary, specify the
path of the mongod or the data directory. See the following examples.
Run without specifying paths If your system PATH variable includes the location of the mongod binary and if you
use the default data directory (i.e., /data/db), simply enter mongod at the system prompt:
mongod
Specify the path of the mongod If your PATH does not include the location of the mongod binary, enter the full
path to the mongod binary at the system prompt:
<path to binary>/mongod
38 https://docs.mongodb.org/v3.0/tutorial/install-mongodb-enterprise-on-linux/
Specify the path of the data directory If you do not use the default data directory (i.e., /data/db), specify the
path to the data directory using the --dbpath option:
mongod --dbpath <path to data directory>
Step 4: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, press Control+C in the terminal where the mongod instance is running.
Overview Use this tutorial to install MongoDB Enterprise on OS X systems. MongoDB Enterprise is available on
select platforms and contains support for several features related to security and monitoring.
Platform Support
MongoDB only supports OS X versions 10.7 (Lion) and later on Intel x86-64. Versions of MongoDB Enterprise prior
to 3.2 did not support OS X.
Step 2: Extract the files from the downloaded archive. For example, from a system shell, you can extract through
the tar command:
tar -zxvf mongodb-osx-x86_64-enterprise-3.2.4.tgz
Step 3: Copy the extracted archive to the target directory. Copy the extracted folder to the location from which
MongoDB will run.
mkdir -p mongodb
cp -R -n mongodb-osx-x86_64-enterprise-3.2.4/ mongodb
Step 4: Ensure the location of the binaries is in the PATH variable. The MongoDB binaries are in the bin/
directory of the archive. To ensure that the binaries are in your PATH, you can modify your PATH.
For example, you can add the following line to your shell's rc file (e.g. ~/.bashrc):
export PATH=<mongodb-install-directory>/bin:$PATH
Step 1: Create the data directory. Before you start MongoDB for the first time, create the directory to which
the mongod process will write data. By default, the mongod process uses the /data/db directory. If you create a
directory other than this one, you must specify that directory in the dbpath option when starting the mongod process
later in this procedure.
The following example command creates the default /data/db directory:
mkdir -p /data/db
Step 2: Set permissions for the data directory. Before running mongod for the first time, ensure that the user
account running mongod has read and write permissions for the directory.
Step 3: Run MongoDB. To run MongoDB, run the mongod process at the system prompt. If necessary, specify the
path of the mongod or the data directory. See the following examples.
Run without specifying paths If your system PATH variable includes the location of the mongod binary and if you
use the default data directory (i.e., /data/db), simply enter mongod at the system prompt:
mongod
Specify the path of the mongod If your PATH does not include the location of the mongod binary, enter the full
path to the mongod binary at the system prompt:
<path to binary>/mongod
Specify the path of the data directory If you do not use the default data directory (i.e., /data/db), specify the
path to the data directory using the --dbpath option:
mongod --dbpath <path to data directory>
Step 4: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, press Control+C in the terminal where the mongod instance is running.
On this page
Overview (page 53)
Prerequisites (page 53)
Get MongoDB Enterprise (page 53)
Install MongoDB Enterprise (page 53)
Run MongoDB Enterprise (page 54)
Configure a Windows Service for MongoDB Enterprise (page 55)
Manually Create a Windows Service for MongoDB Enterprise (page 56)
Overview Use this tutorial to install MongoDB Enterprise
(https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs) on Windows systems. MongoDB Enterprise is
available on select platforms and contains support for several features related to security and monitoring.
Prerequisites MongoDB Enterprise Server for Windows requires Windows Server 2008 R2 or later. The .msi
installer includes all other software dependencies and will automatically upgrade any older version of MongoDB
installed using an .msi file.
Step 1: Download MongoDB Enterprise for Windows. Download the latest production release of MongoDB
Enterprise (http://www.mongodb.com/products/mongodb-enterprise?jmp=docs).
To find which version of Windows you are running, enter the following commands in the Command Prompt or
PowerShell:
wmic os get caption
wmic os get osarchitecture
Interactive Installation
Step 1: Install MongoDB Enterprise for Windows. In Windows Explorer, locate the downloaded MongoDB .msi
file, which typically is located in the default Downloads folder. Double-click the .msi file. A set of screens will
appear to guide you through the installation process.
You may specify an installation directory if you choose the Custom installation option.
Note: These instructions assume that you have installed MongoDB to C:\mongodb.
MongoDB is self-contained and does not have any other system dependencies. You can run MongoDB from any folder
you choose. You may install MongoDB in any folder (e.g. D:\test\mongodb).
Unattended Installation You may install MongoDB unattended on Windows from the command line using
msiexec.exe.
Step 1: Install MongoDB Enterprise for Windows. Change to the directory containing the .msi installation
binary of your choice and invoke:
msiexec.exe /q /i mongodb-win32-x86_64-2008plus-ssl-3.2.4-signed.msi ^
INSTALLLOCATION="C:\mongodb" ^
ADDLOCAL="all"
41 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
42 https://docs.mongodb.org/v3.0/tutorial/install-mongodb-enterprise-on-windows/
43 http://www.mongodb.com/products/mongodb-enterprise?jmp=docs
You can specify the installation location for the executable by modifying the INSTALLLOCATION value.
By default, this method installs all MongoDB binaries. To install specific MongoDB component sets, you can specify
them in the ADDLOCAL argument using a comma-separated list including one or more of the following component
sets:
Component Set        Binaries
Server               mongod.exe
Router               mongos.exe
Client               mongo.exe
MonitoringTools      mongostat.exe, mongotop.exe
ImportExportTools    mongodump.exe, mongorestore.exe, mongoexport.exe, mongoimport.exe
MiscellaneousTools   bsondump.exe, mongofiles.exe, mongooplog.exe, mongoperf.exe
For instance, to install only the MongoDB utilities, invoke:
msiexec.exe /q /i mongodb-win32-x86_64-2008plus-ssl-3.2.4-signed.msi ^
INSTALLLOCATION="C:\mongodb" ^
ADDLOCAL="MonitoringTools,ImportExportTools,MiscellaneousTools"
Run MongoDB Enterprise
Warning: Do not make mongod.exe visible on public networks without running in "Secure Mode" with the
auth setting. MongoDB is designed to be run in trusted environments, and the database does not enable
"Secure Mode" by default.
Step 1: Set up the MongoDB environment. MongoDB requires a data directory to store all data. MongoDB's
default data directory path is \data\db. Create this folder using the following command from a Command Prompt:
md \data\db
You can specify an alternate path for data files using the --dbpath option to mongod.exe, for example:
C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data
If your path includes spaces, enclose the entire path in double quotes, for example:
C:\mongodb\bin\mongod.exe --dbpath "d:\test\mongo db data"
Step 2: Start MongoDB. To start MongoDB, run mongod.exe. For example, from the Command Prompt:
C:\mongodb\bin\mongod.exe
This starts the main MongoDB database process. The waiting for connections message in the console
output indicates that the mongod.exe process is running successfully.
Depending on the security level of your system, Windows may pop up a Security Alert dialog box about blocking
some features of C:\mongodb\bin\mongod.exe from communicating on networks. All users should select
Private Networks, such as my home or work network and click Allow access. For additional
information on security and MongoDB, please see the Security Documentation (page 315).
Step 3: Connect to MongoDB. To connect to MongoDB through the mongo.exe shell, open another Command
Prompt.
C:\mongodb\bin\mongo.exe
If you want to develop applications using .NET, see the documentation of C# and MongoDB
(https://docs.mongodb.org/ecosystem/drivers/csharp) for more information.
Step 4: Begin using MongoDB. To help you start using MongoDB, MongoDB provides Getting Started Guides in
various driver editions. See getting-started for the available editions.
Before deploying MongoDB in a production environment, consider the Production Notes (page 214) document.
Later, to stop MongoDB, press Control+C in the terminal where the mongod instance is running.
Step 1: Open an Administrator command prompt. Press the Win key, type cmd.exe, and press Ctrl +
Shift + Enter to run the Command Prompt as Administrator.
Execute the remaining steps from the Administrator command prompt.
Step 2: Create directories. Create directories for your database and log files:
mkdir c:\data\db
mkdir c:\data\log
Step 3: Create a configuration file. Create a configuration file. The file must set systemLog.path. Include
additional configuration options as appropriate.
For example, create a file at C:\mongodb\mongod.cfg that specifies both systemLog.path and
storage.dbPath:
systemLog:
    destination: file
    path: c:\data\log\mongod.log
storage:
    dbPath: c:\data\db
Install the MongoDB service by starting mongod.exe with the --install option and the --config option to
specify the previously created configuration file.
"C:\mongodb\bin\mongod.exe" --config "C:\mongodb\mongod.cfg" --install
To use an alternate dbpath, specify the path in the configuration file (e.g. C:\mongodb\mongod.cfg) or on the
command line with the --dbpath option.
If needed, you can install services for multiple instances of mongod.exe or mongos.exe. Install each service with
a unique --serviceName and --serviceDisplayName. Use multiple instances only when sufficient system
resources exist and your system design requires it.
44 https://docs.mongodb.org/ecosystem/drivers/csharp
Step 6: Stop or remove the MongoDB service as needed. To stop the MongoDB service, use the following
command:
net stop MongoDB
Manually Create a Windows Service for MongoDB Enterprise You can set up the MongoDB server as a Windows
Service that starts automatically at boot time.
The following procedure assumes you have installed MongoDB using the .msi installer with the path
C:\mongodb\.
If you have installed in an alternative directory, you will need to adjust the paths as appropriate.
Step 1: Open an Administrator command prompt. Press the Win key, type cmd.exe, and press Ctrl +
Shift + Enter to run the Command Prompt as Administrator.
Execute the remaining steps from the Administrator command prompt.
Step 2: Create directories. Create directories for your database and log files:
mkdir c:\data\db
mkdir c:\data\log
Step 3: Create a configuration file. Create a configuration file. The file must set systemLog.path. Include
additional configuration options as appropriate.
For example, create a file at C:\mongodb\mongod.cfg that specifies both systemLog.path and
storage.dbPath:
systemLog:
    destination: file
    path: c:\data\log\mongod.log
storage:
    dbPath: c:\data\db
sc.exe requires a space between = and the configuration values (e.g. binPath= ), and a \ to escape double
quotes.
If successfully created, the following log message will display:
[SC] CreateService SUCCESS
Step 6: Stop or remove the MongoDB service as needed. To stop the MongoDB service, use the following
command:
net stop MongoDB
To remove the MongoDB service, first stop the service and then run the following command:
sc.exe delete MongoDB
On this page
Overview (page 57)
Procedures (page 57)
Overview
The MongoDB release team digitally signs all software packages to certify that a particular MongoDB package is a
valid and unaltered MongoDB release. Before installing MongoDB, you should validate the package using either the
provided PGP signature or SHA-256 checksum.
PGP signatures provide the strongest guarantees by checking both the authenticity and integrity of a file to prevent
tampering.
Cryptographic checksums only validate file integrity to prevent network transmission errors.
Procedures
Use PGP/GPG MongoDB signs each release branch with a different PGP key. The public key files for each release
branch since MongoDB 2.2 are available for download from the key server in both textual .asc and binary .pub
formats.
Step 1: Download the MongoDB installation file. Download the binaries from
https://www.mongodb.org/downloads based on your environment.
For example, to download the 3.0.5 release for OS X through the shell, type this command:
curl -LO https://fastdl.mongodb.org/osx/mongodb-osx-x86_64-3.0.5.tgz
Step 3: Download then import the key file. If you have not downloaded and imported the MongoDB 3.0 public
key, enter these commands:
curl -LO https://www.mongodb.org/static/pgp/server-3.0.asc
gpg --import server-3.0.asc
If you receive a message such as the following, confirm that you imported the correct public key:
gpg: Signature made Mon 27 Jul 2015 07:51:53 PM EDT using RSA key ID 24F3C978
gpg: Can't check signature: public key not found
gpg will return the following message if the package is properly signed, but you do not currently trust the signing
key in your local trustdb.
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 89AE C6ED 5423 0831 793F 1384 BE0E B6AA 24F3 C978
Use SHA-256
Step 1: Download the MongoDB installation file. Download the binaries from
https://www.mongodb.org/downloads based on your environment.
For example, to download the 3.0.5 release for OS X through the shell, type this command:
curl -LO https://fastdl.mongodb.org/osx/mongodb-osx-x86_64-3.0.5.tgz
Step 3: Use the SHA-256 checksum to verify the MongoDB package file. Compute the checksum of the package
file:
shasum -c mongodb-osx-x86_64-3.0.5.tgz.sha256
which should return the following if the checksum matched the downloaded package:
mongodb-osx-x86_64-3.0.5.tgz: OK
MongoDB provides rich semantics for reading and manipulating data. CRUD stands for create, read, update, and
delete. These terms are the foundation for all interactions with the database.
MongoDB CRUD Introduction (page 61) An introduction to the MongoDB data model as well as queries and data
manipulations.
MongoDB CRUD Concepts (page 63) The core documentation of query and data manipulation.
MongoDB CRUD Tutorials (page 99) Examples of basic query and data modification operations.
MongoDB CRUD Reference (page 140) Reference material for the query and data manipulation interfaces.
On this page
Database Operations (page 62)
MongoDB stores data in the form of documents, which are JSON-like field and value pairs. Documents are analogous
to structures in programming languages that associate keys with values (e.g. dictionaries, hashes, maps, and associative
arrays). Formally, MongoDB documents are BSON documents. BSON is a binary representation of JSON with
additional type information. In the documents, the value of a field can be any of the BSON data types, including other
documents, arrays, and arrays of documents. For more information, see Documents (page 186).
MongoDB stores all documents in collections. A collection is a group of related documents that share a set of
common indexes. A collection is analogous to a table in a relational database.
Query
In MongoDB a query targets a specific collection of documents. Queries specify criteria, or conditions, that identify
the documents that MongoDB returns to the clients. A query may include a projection that specifies the fields from
the matching documents to return. You can optionally modify queries to impose limits, skips, and sort orders.
In the following diagram, the query process specifies a query criteria and a sort modifier:
Data Modification
Data modification refers to operations that create, update, or delete data. In MongoDB, these operations modify the
data of a single collection. For the update and delete operations, you can specify the criteria to select the documents
to update or remove.
In the following diagram, the insert operation adds a new document to the users collection.
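The diagram is not reproduced in this text. As a minimal sketch, an insert into the users collection could look like the following in the mongo shell. The helper name addUser and the field names and values are hypothetical and serve only as an illustration.

```javascript
// Hypothetical helper: inserts one document into the users collection.
// `db` is the database handle that the mongo shell provides.
function addUser(db) {
  // Field names and values are illustrative assumptions, not from the source.
  var newUser = { name: "sue", age: 26, status: "A" };
  // insert() adds the document; an _id field is added if the document has none.
  return db.users.insert(newUser);
}

// In the mongo shell: addUser(db);
```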
The Read Operations (page 64) and Write Operations (page 77) documents introduce the behavior and operations of
read and write operations for MongoDB deployments.
Read Operations (page 64) Queries are the core operations that return data in MongoDB. Introduces queries, their
behavior, and performance.
Cursors (page 67) Queries return iterable objects, called cursors, that hold the full result set.
Query Optimization (page 69) Analyze and improve query performance.
Distributed Queries (page 73) Describes how sharded clusters and replica sets affect the performance of read
operations.
Write Operations (page 77) Write operations insert, update, or remove documents in MongoDB. Introduces data
create and modify operations, their behavior, and performance.
Atomicity and Transactions (page 88) Describes write operation atomicity in MongoDB.
Distributed Write Operations (page 89) Describes how MongoDB directs write operations on sharded clusters
and replica sets and the performance characteristics of these operations.
Continue reading from Write Operations (page 77) for additional background on the behavior of data
modification operations in MongoDB.
On this page
Query Interface (page 64)
Query Behavior (page 65)
Query Statements (page 65)
Projections (page 66)
Read operations, or queries, retrieve data stored in the database. In MongoDB, queries select documents from a single
collection.
Queries specify criteria, or conditions, that identify the documents that MongoDB returns to the clients. A query may
include a projection that specifies the fields from the matching documents to return. The projection limits the amount
of data that MongoDB returns to the client over the network.
Query Interface
For query operations, MongoDB provides a db.collection.find() method. The method accepts both the
query criteria and projections and returns a cursor (page 67) to the matching documents. You can optionally modify
the query to impose limits, skips, and sort orders.
The following diagram highlights the components of a MongoDB query operation:
The next diagram shows the same query in SQL:
Example
This query selects the documents in the users collection that match the condition age is greater than 18. To specify
the greater than condition, the query criteria use the greater than (i.e. $gt) query selection operator. The query returns
at most 5 matching documents (or more precisely, a cursor to those documents). The matching documents will return
with only the _id, name and address fields. See Projections (page 66) for details.
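The diagrams are not reproduced in this text. As a minimal sketch, the query described above could be written in the mongo shell as follows. The helper name findAdults is hypothetical; the collection name, fields, operator, and limit come from the description.

```javascript
// Hypothetical helper wrapping the query described above.
// `db` is the database handle that the mongo shell provides.
function findAdults(db) {
  var criteria = { age: { $gt: 18 } };        // age greater than 18
  var projection = { name: 1, address: 1 };   // _id is included implicitly
  // find() returns a cursor; limit(5) caps it at 5 matching documents.
  return db.users.find(criteria, projection).limit(5);
}

// In the mongo shell: findAdults(db).forEach(function (doc) { printjson(doc); });
```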
See
SQL to MongoDB Mapping Chart (page 145) for additional examples of MongoDB queries and the corresponding
SQL statements.
Query Behavior
Query Statements
Consider the following diagram of the query process that specifies a query criteria and a sort modifier:
In the diagram, the query selects documents from the users collection. Using a query selection operator
to define the conditions for matching documents, the query selects documents that have age greater than (i.e. $gt)
18. Then the sort() modifier sorts the results by age in ascending order.
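As a minimal sketch of the query in the (missing) diagram, the criteria and sort modifier could be combined in the mongo shell as follows. The helper name findAdultsSortedByAge is hypothetical.

```javascript
// Hypothetical helper: selects users older than 18, sorted by age.
// `db` is the database handle that the mongo shell provides.
function findAdultsSortedByAge(db) {
  return db.users
    .find({ age: { $gt: 18 } })   // query criteria: age greater than 18
    .sort({ age: 1 });            // 1 sorts ascending, -1 would sort descending
}

// In the mongo shell: findAdultsSortedByAge(db);
```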
Projections
Queries in MongoDB return all fields in all matching documents by default. To limit the amount of data that MongoDB
sends to applications, include a projection in the queries. By projecting results with a subset of fields, applications
reduce their network overhead and processing requirements.
Projections, which are the second argument to the find() method, may either specify a list of fields to return or list
fields to exclude in the result documents.
Important: Except for excluding the _id field in inclusive projections, you cannot mix exclusive and inclusive
projections.
Consider the following diagram of the query process that specifies a query criteria and a projection:
In the diagram, the query selects from the users collection. The criteria matches the documents that have age equal
to 18. Then the projection specifies that only the name field should return in the matching documents.
Projection Examples
This query selects documents in the records collection that match the condition { "user_id": { $lt: 42
} }, and uses the projection { "history": 0 } to exclude the history field from the documents in the result
set.
This query selects documents in the records collection that match the query { "user_id": { $lt: 42 }
} and uses the projection { "name": 1, "email": 1 } to return just the _id field (implicitly included),
name field, and the email field in the documents in the result set.
This query selects documents in the records collection that match the query { "user_id": { $lt: 42}
}, and only returns the name and email fields in the documents in the result set.
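The three projection examples above can be sketched in the mongo shell roughly as follows. The helper and variable names are hypothetical; the criteria and projection documents follow the descriptions, with the third projection excluding _id so that only name and email are returned.

```javascript
// Shared query criteria: user_id less than 42.
var userIdBelow42 = { "user_id": { $lt: 42 } };

// 1. Exclusive projection: drop the history field, keep everything else.
function withoutHistory(db) {
  return db.records.find(userIdBelow42, { "history": 0 });
}

// 2. Inclusive projection: name and email, plus the implicit _id field.
function nameAndEmail(db) {
  return db.records.find(userIdBelow42, { "name": 1, "email": 1 });
}

// 3. Inclusive projection that also excludes _id: only name and email.
function nameAndEmailOnly(db) {
  return db.records.find(userIdBelow42, { "_id": 0, "name": 1, "email": 1 });
}

// In the mongo shell: withoutHistory(db); nameAndEmail(db); nameAndEmailOnly(db);
```

Excluding _id in the third query is the one permitted mix of exclusion and inclusion in a single projection.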
See
Limit Fields to Return from a Query (page 115) for more examples of queries with projection statements.
Cursors
On this page
Cursor Behaviors (page 68)
Cursor Information (page 69)
In the mongo shell, the primary method for the read operation is the db.collection.find() method. This
method queries a collection and returns a cursor to the returning documents.
To access the documents, you need to iterate the cursor. However, in the mongo shell, if the returned cursor is not
assigned to a variable using the var keyword, then the cursor is automatically iterated up to 20 times to print up to
the first 20 documents in the results.
For example, in the mongo shell, the following read operation queries the inventory collection for documents that
have type equal to food and automatically prints up to the first 20 matching documents:
db.inventory.find( { type: 'food' } );
To manually iterate the cursor to access the documents, see Iterate a Cursor in the mongo Shell (page 120).
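As a minimal sketch of manual iteration, the cursor can be assigned to a variable and walked with hasNext() and next(). The helper name collectFood is hypothetical.

```javascript
// Hypothetical helper: iterates a cursor by hand and collects the documents.
// `db` is the database handle that the mongo shell provides.
function collectFood(db) {
  // Assigning the cursor to a variable prevents automatic iteration in the shell.
  var myCursor = db.inventory.find({ type: "food" });
  var docs = [];
  while (myCursor.hasNext()) {
    docs.push(myCursor.next());   // next() returns the next document
  }
  return docs;
}

// In the mongo shell: collectFood(db).forEach(function (doc) { printjson(doc); });
```

Collecting every document into an array holds the full result set in memory, so this pattern suits small result sets only.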
Cursor Behaviors
Closure of Inactive Cursors By default, the server will automatically close the cursor after 10 minutes of
inactivity, or if the client has exhausted the cursor. To override this behavior in the mongo shell, you can use the
cursor.noCursorTimeout() method:
var myCursor = db.inventory.find().noCursorTimeout();
After setting the noCursorTimeout option, you must either close the cursor manually with cursor.close()
or exhaust the cursor's results.
See your driver documentation for information on setting the noCursorTimeout option.
Cursor Isolation As a cursor returns documents, other operations may interleave with the query. For the MMAPv1
(page 595) storage engine, intervening write operations on a document may result in a cursor that returns a
document more than once if that document has changed. To handle this situation, see the information on snapshot
mode (page 833).
Cursor Batches The MongoDB server returns the query results in batches. Batch size will not exceed the maximum
BSON document size. For most queries, the first batch returns 101 documents or just enough documents to exceed 1
megabyte. Subsequent batch size is 4 megabytes. To override the default size of the batch, see batchSize() and
limit().
For queries that include a sort operation without an index, the server must load all the documents in memory to perform
the sort before returning any results.
As you iterate through the cursor and reach the end of the returned batch, if there are more results, cursor.next()
will perform a getmore operation to retrieve the next batch. To see how many documents remain in the batch
as you iterate the cursor, you can use the objsLeftInBatch() method, as in the following example:
var myCursor = db.inventory.find();
myCursor.objsLeftInBatch();
Cursor Information
The db.serverStatus() method returns a document that includes a metrics field. The metrics field
contains a metrics.cursor field with the following information:
number of timed out cursors since the last server restart
number of open cursors with the option DBQuery.Option.noTimeout set to prevent timeout after a period
of inactivity
number of pinned open cursors
total number of open cursors
Consider the following example which calls the db.serverStatus() method and accesses the metrics field
from the results and then the cursor field from the metrics field:
db.serverStatus().metrics.cursor
See also:
db.serverStatus()
Query Optimization
On this page
Create an Index to Support Read Operations (page 69)
Query Selectivity (page 70)
Covered Query (page 70)
Indexes improve the efficiency of read operations by reducing the amount of data that query operations need to process.
This simplifies the work associated with fulfilling queries within MongoDB.
If your application queries a collection on a particular field or set of fields, then an index on the queried field or a
compound index (page 495) on the set of fields can prevent the query from scanning the whole collection to find and
return the query results. For more information about indexes, see the complete documentation of indexes in MongoDB
(page 492).
Example
An application queries the inventory collection on the type field. The value of the type field is user-driven.
var typeValue = <someUserInput>;
db.inventory.find( { type: typeValue } );
To improve the performance of this query, add an ascending or a descending index to the inventory collection
on the type field (see footnote 2). In the mongo shell, you can create indexes using the
db.collection.createIndex() method:
db.inventory.createIndex( { type: 1 } )
This index can prevent the above query on type from scanning the whole collection to return the results.
To analyze the performance of the query with an index, see Analyze Query Performance (page 121).
In addition to optimizing read operations, indexes can support sort operations and allow for more efficient storage
utilization. See db.collection.createIndex() and Indexing Tutorials (page 531) for more information about
index creation.
Query Selectivity
Query selectivity refers to how well the query predicate excludes or filters out documents in a collection. Query
selectivity can determine whether or not queries can use indexes effectively or even use indexes at all.
More selective queries match a smaller percentage of documents. For instance, an equality match on the unique _id
field is highly selective as it can match at most one document.
Less selective queries match a larger percentage of documents. Less selective queries cannot use indexes effectively
or even at all.
For instance, the inequality operators $nin and $ne are not very selective since they often match a large portion of
the index. As a result, in many cases, a $nin or $ne query with an index may perform no better than a $nin or $ne
query that must scan all documents in a collection.
The selectivity of regular expressions depends on the expressions themselves. For details, see regular
expression and index use.
Covered Query
An index covers (page 70) a query when both of the following apply:
all the fields in the query (page 103) are part of an index, and
all the fields returned in the results are in the same index.
For example, a collection inventory has the following index on the type and item fields:
db.inventory.createIndex( { type: 1, item: 1 } )
This index will cover the following operation which queries on the type and item fields and returns only the item
field:
db.inventory.find(
   { type: "food", item: /^c/ },
   { item: 1, _id: 0 }
)
2 For single-field indexes, the selection between ascending and descending order is immaterial. For compound indexes, the selection is important.
For the specified index to cover the query, the projection document must explicitly specify _id: 0 to exclude the
_id field from the result since the index does not include the _id field.
Performance Because the index contains all fields required by the query, MongoDB can both match the query
conditions (page 103) and return the results using only the index.
Querying only the index can be much faster than querying documents outside of the index. Index keys are typically
smaller than the documents they catalog, and indexes are typically available in RAM or located sequentially on disk.
Limitations
However, the query can use the { "user.login": 1 } index to find matching documents.
Restrictions on Sharded Collection An index cannot cover a query on a sharded collection when run against a
mongos if the index does not contain the shard key, with the following exception for the _id index: If a query on a
sharded collection only specifies a condition on the _id field and returns only the _id field, the _id index can cover
the query when run against a mongos even if the _id field is not the shard key.
Changed in version 3.0: In previous versions, an index could not cover (page 70) a query on a sharded collection when
run against a mongos.
explain To determine whether a query is a covered query, use the db.collection.explain() or the
cursor.explain() method and review the results.
db.collection.explain() provides information on the execution of other operations, such as
db.collection.update(). See db.collection.explain() for details.
For more information see Measure Index Use (page 545).
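One way to review explain results for coverage is to look for an index scan with no FETCH stage and zero documents examined. The sketch below applies that check to a hand-written, abbreviated sample. The field names follow the explain output format of MongoDB 3.0 and later, while the looksCovered() and stageNames() helpers and the sample values are illustrative assumptions.

```javascript
// Walks a winning plan tree and collects stage names (e.g. "IXSCAN", "FETCH").
function stageNames(plan) {
  if (!plan) return [];
  const children = plan.inputStages || (plan.inputStage ? [plan.inputStage] : []);
  return [plan.stage].concat(...children.map(stageNames));
}

// A query looks covered when an index scan is present, no FETCH stage appears,
// and no documents were examined.
function looksCovered(explainOutput) {
  const stages = stageNames(explainOutput.queryPlanner.winningPlan);
  return stages.includes("IXSCAN") &&
         !stages.includes("FETCH") &&
         explainOutput.executionStats.totalDocsExamined === 0;
}

// Hand-written, abbreviated sample of "executionStats" output; values are illustrative.
const sample = {
  queryPlanner: {
    winningPlan: {
      stage: "PROJECTION",
      inputStage: { stage: "IXSCAN", indexName: "type_1_item_1" }
    }
  },
  executionStats: { nReturned: 3, totalKeysExamined: 3, totalDocsExamined: 0 }
};
const result = looksCovered(sample); // true
```

In the mongo shell, the same check could be applied to the document returned by a call such as db.inventory.find( ... ).explain("executionStats").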
3 To index fields in embedded documents, use dot notation.
Query Plans
On this page
Query Optimization (page 72)
Query Plan Revision (page 73)
Cached Query Plan Interface (page 73)
Index Filters (page 73)
The MongoDB query optimizer processes queries and chooses the most efficient query plan for a query given the
available indexes. The query system then uses this query plan each time the query runs.
The query optimizer only caches the plans for those query shapes that can have more than one viable plan.
The query optimizer occasionally reevaluates query plans as the content of the collection changes to ensure optimal
query plans. You can also specify which indexes the optimizer evaluates with Index Filters (page 73).
You can use the db.collection.explain() or the cursor.explain() method to view statistics about the
query plan for a given query. This information can help as you develop indexing strategies (page 573).
db.collection.explain() provides information on the execution of other operations, such as
db.collection.update(). See db.collection.explain() for details.
Query Optimization
As collections change over time, the query optimizer deletes the query plan and re-evaluates after any of the following
events:
The collection receives 1,000 write operations.
The reIndex rebuilds the index.
You add or drop an index.
The mongod process restarts.
Changed in version 2.6: explain() operations no longer read from or write to the query planner cache.
Index Filters
Distributed Queries
On this page
Read Operations to Sharded Clusters (page 74)
Read Operations to Replica Sets (page 77)
Sharded clusters allow you to partition a data set among a cluster of mongod instances in a way that is nearly trans-
parent to the application. For an overview of sharded clusters, see the Sharding (page 725) section of this manual.
For a sharded cluster, applications issue operations to one of the mongos instances associated with the cluster.
Read operations on sharded clusters are most efficient when directed to a specific shard. Queries to sharded collections
should include the collection's shard key (page 739). When a query includes a shard key, the mongos can use cluster
metadata from the config database (page 734) to route the queries to shards.
If a query does not include the shard key, the mongos must direct the query to all shards in the cluster. These scatter
gather queries can be inefficient. On larger clusters, scatter gather queries are unfeasible for routine operations.
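The difference between a targeted query and a scatter gather query can be sketched with a toy routing table. The chunk ranges, shard names, and the targetShards() helper below are hypothetical; the sketch models only equality matches on a single-field shard key and is not how mongos is implemented.

```javascript
// Hypothetical chunk metadata: each chunk covers [min, max) of the shard key
// and lives on one shard.
const chunks = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardC" }
];

// Returns the shards a query must be sent to.
// Only equality on the shard key is modeled.
function targetShards(query, shardKey, chunkList) {
  const allShards = [...new Set(chunkList.map(c => c.shard))];
  // No shard key in the query: the query goes to every shard (scatter gather).
  if (!(shardKey in query)) return allShards;
  const value = query[shardKey];
  return chunkList
    .filter(c => value >= c.min && value < c.max)
    .map(c => c.shard);
}

const targeted = targetShards({ userId: 150 }, "userId", chunks);   // ["shardB"]
const broadcast = targetShards({ name: "sue" }, "userId", chunks);  // ["shardA", "shardB", "shardC"]
```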
For replica set shards, read operations from secondary members of replica sets may not reflect the current state of the
primary. Read preferences that direct read operations to different servers may result in non-monotonic reads.
For more information on read operations in sharded clusters, see the Sharded Cluster Query Routing (page 744) and
Shard Keys (page 739) sections.
By default, clients read from a replica set's primary; however, clients can specify a read preference (page 641) to
direct read operations to other members. For example, clients can configure read preferences to read from secondaries
or from the nearest member to:
reduce latency in multi-data-center deployments,
improve read throughput by distributing high read-volumes (relative to write volume),
perform backup operations, and/or
allow reads until a new primary is elected (page 635).
Read operations from secondary members of replica sets may not reflect the current state of the primary. Read prefer-
ences that direct read operations to different servers may result in non-monotonic reads.
You can configure the read preference on a per-connection or per-operation basis. For more information on read preference
or on the read preference modes, see Read Preference (page 641) and Read Preference Modes (page 721).
Write Operation Performance (page 92) Introduces the performance constraints and factors for writing data to MongoDB
deployments.
Bulk Write Operations (page 93) Provides an overview of MongoDB's bulk write operations.
On this page
Insert (page 78)
Update (page 82)
Delete (page 85)
Additional Methods (page 88)
A write operation is any operation that creates or modifies data in the MongoDB instance. In MongoDB, write
operations target a single collection. All write operations in MongoDB are atomic on the level of a single document.
There are three classes of write operations in MongoDB: insert (page 78), update (page 82), and delete (page 85).
Insert operations add new documents to a collection. Update operations modify existing documents, and delete oper-
ations delete documents from a collection. No insert, update, or delete can affect more than one document atomically.
For the update and remove operations, you can specify criteria, or filters, that identify the documents to update or
remove. These operations use the same query syntax to specify the criteria as read operations (page 64).
MongoDB allows applications to determine the acceptable level of acknowledgement required of write operations.
See Write Concern (page 141) for more information.
Insert
MongoDB provides the following methods for inserting documents into a collection:
db.collection.insertOne()
db.collection.insertMany()
db.collection.insert()
Example
The following operation inserts a new document into the users collection. The new document has three fields name,
age, and status. Since the document does not specify an _id field, MongoDB adds the _id field and a generated
value to the new document. See Insert Behavior (page 81).
db.users.insertOne(
{
name: "sue",
age: 26,
status: "pending"
}
)
Example
The following operation inserts three new documents into the users collection. Each document has three fields
name, age, and status. Since the documents do not specify an _id field, MongoDB adds the _id field and a
generated value to each document. See Insert Behavior (page 81).
db.users.insertMany(
[
{ name: "sue", age: 26, status: "pending" },
{ name: "bob", age: 25, status: "enrolled" },
{ name: "ann", age: 28, status: "enrolled" }
]
)
insert In MongoDB, the db.collection.insert() method adds new documents to a collection. It can take
either a single document or an array of documents to insert.
The following diagram highlights the components of a MongoDB insert operation:
Example
The following operation inserts a new document into the users collection. The new document has three fields name,
age, and status. Since the document does not specify an _id field, MongoDB adds the _id field and a generated
value to the new document. See Insert Behavior (page 81).
db.users.insert(
{
name: "sue",
age: 26,
status: "A"
}
)
Insert Behavior The _id field is required in every MongoDB document. The _id field is like the document's
primary key.
If you add a new document without the _id field, the client library or the mongod instance adds an _id field and
populates the field with a unique ObjectId. If you pass in an _id value that already exists, an exception is thrown.
The _id field is uniquely indexed by default in every collection.
Other Methods to Add Documents The updateOne(), updateMany(), and replaceOne() operations
accept the upsert parameter. When upsert : true, if no document in the collection matches the filter, a new
document is created based on the information passed to the operation. See Update Behavior with the upsert Option
(page 85).
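The effect of upsert : true can be sketched with an in-memory model. The updateOneSketch() helper below is hypothetical: it supports only equality filters and $set-style changes, it does not generate an _id, and it is not the MongoDB implementation.

```javascript
// Simplified in-memory model of updateOne with an equality-only filter and
// $set-style changes; not the MongoDB implementation.
function updateOneSketch(docs, filter, setFields, options = {}) {
  const matches = d => Object.keys(filter).every(k => d[k] === filter[k]);
  const target = docs.find(matches);
  if (target) {
    // A matching document exists: modify the first match in place.
    Object.assign(target, setFields);
    return { matchedCount: 1, upserted: false };
  }
  if (options.upsert) {
    // No match: create a document from the filter's equality fields
    // plus the fields being set.
    docs.push(Object.assign({}, filter, setFields));
    return { matchedCount: 0, upserted: true };
  }
  return { matchedCount: 0, upserted: false };
}

const users = [{ name: "sue", status: "pending" }];
const r1 = updateOneSketch(users, { name: "sue" }, { status: "enrolled" }, { upsert: true });
const r2 = updateOneSketch(users, { name: "joe" }, { status: "pending" }, { upsert: true });
// users now holds sue (status "enrolled") and a newly created joe (status "pending").
```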
Update
Example
This update operation on the users collection sets the status field to reject for the first document that matches
the filter of age less than 18. See Update Behavior (page 85).
db.users.updateOne(
{ age: { $lt: 18 } },
{ $set: { status: "reject" } }
)
Example
This update operation on the users collection sets the status field to reject for all documents that match the
filter of age less than 18. See Update Behavior (page 85).
db.users.updateMany(
{ age: { $lt: 18 } },
{ $set: { status: "reject" } }
)
Example
This replace operation on the users collection replaces the first document that matches the filter of name is sue
with a new document. See Replace Behavior (page 85).
db.users.replaceOne(
{ name: "sue" },
{ name: "amy", age : 25, score: "enrolled" }
)
update In MongoDB, the db.collection.update() method modifies existing documents in a collection. The
db.collection.update() method can accept query criteria to determine which documents to update as well as
an options document that affects its behavior, such as the multi option to update multiple documents.
Operations performed by an update are atomic within a single document. For example, you can safely use the $inc
and $mul operators to modify frequently-changed fields in concurrent applications.
The following diagram highlights the components of a MongoDB update operation:
Example
db.users.update(
{ age: { $gt: 18 } },
{ $set: { status: "A" } },
{ multi: true }
)
This update operation on the users collection sets the status field to A for the documents that match the criteria
of age greater than 18.
Delete
MongoDB provides the following methods for deleting documents from a collection:
db.collection.deleteOne()
db.collection.deleteMany()
db.collection.remove()
Example
This delete operation on the users collection deletes the first document where status is reject. See Delete Behavior
(page 87).
db.users.deleteOne(
{ status: "reject" }
)
Example
This delete operation on the users collection deletes all documents where status is reject. See Delete Behavior
(page 87).
db.users.deleteMany(
{ status: "reject" }
)
remove In MongoDB, the db.collection.remove() method deletes documents from a collection. The
db.collection.remove() method accepts query criteria to determine which documents to remove as well as
an options document that affects its behavior, such as the justOne option to remove only a single document.
The following diagram highlights the components of a MongoDB remove operation:
Example
db.users.remove(
{ status: "D" }
)
This delete operation on the users collection removes all documents that match the criteria of status equal to D.
For more information, see db.collection.remove() method and Remove Documents (page 114).
Delete Behavior deleteOne() will delete the first document that matches the filter.
db.collection.findOneAndDelete() offers sorting of the filter results, allowing a degree of con-
trol over which document is deleted.
Remove Behavior By default, db.collection.remove() method removes all documents that match its query.
If the optional justOne parameter is set to true, remove() will limit the delete operation to a single document.
Additional Methods
The db.collection.save() method can either update an existing document or insert a document if the docu-
ment cannot be found by the _id field. See db.collection.save() for more information and examples.
Bulk Write MongoDB provides the db.collection.bulkWrite() method for executing multiple write op-
erations in a group. Each write operation is still atomic on the level of a single document.
Example
The following bulkWrite() inserts several documents, performs an update, and then deletes several documents.
db.collection.bulkWrite(
[
{ insertOne : { "document" : { name : "sue", age : 26 } } },
{ insertOne : { "document" : { name : "joe", age : 24 } } },
{ insertOne : { "document" : { name : "ann", age : 25 } } },
{ insertOne : { "document" : { name : "bob", age : 27 } } },
{ updateMany: {
"filter" : { age : { $gt : 25} },
"update" : { $set : { "status" : "enrolled" } }
}
},
{ deleteMany : { "filter" : { "status" : { $exists : true } } } }
]
)
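To see what the bulkWrite() example above leaves behind, the following plain JavaScript sketch traces the same operations against an in-memory array. It assumes the collection starts empty and that the operations run in the order listed; the people array is a hypothetical stand-in for the collection.

```javascript
// In-memory trace of the operations above, applied in order;
// not the MongoDB implementation.
let people = [];

// insertOne x4
people.push(
  { name: "sue", age: 26 },
  { name: "joe", age: 24 },
  { name: "ann", age: 25 },
  { name: "bob", age: 27 }
);

// updateMany: filter { age: { $gt: 25 } }, update { $set: { status: "enrolled" } }
people.filter(d => d.age > 25).forEach(d => { d.status = "enrolled"; });
const updatedNames = people.filter(d => "status" in d).map(d => d.name); // ["sue", "bob"]

// deleteMany: filter { status: { $exists: true } }
people = people.filter(d => !("status" in d));
const remainingNames = people.map(d => d.name); // ["joe", "ann"]
```

Under these assumptions, the update gives sue and bob a status field, and the delete then removes exactly those two documents.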
On this page
$isolated Operator (page 88)
Transaction-Like Semantics (page 89)
Concurrency Control (page 89)
In MongoDB, a write operation is atomic on the level of a single document, even if the operation modifies multiple
embedded documents within a single document.
When a single write operation modifies multiple documents, the modification of each document is atomic, but the
operation as a whole is not atomic and other operations may interleave. However, you can isolate a single write
operation that affects multiple documents using the $isolated operator.
$isolated Operator
Using the $isolated operator, a write operation that affects multiple documents can prevent other processes from
interleaving once the write operation modifies the first document. This ensures that no client sees the changes until the
write operation completes or errors out.
$isolated does not work with sharded clusters.
An isolated write operation does not provide all-or-nothing atomicity. That is, an error during the write operation
does not roll back all its changes that preceded the error.
Note: The $isolated operator causes write operations to acquire an exclusive lock on the collection, even for
document-level locking storage engines such as WiredTiger. That is, the $isolated operator will make WiredTiger
single-threaded for the duration of the operation.
Transaction-Like Semantics
Since a single document can contain multiple embedded documents, single-document atomicity is sufficient for many
practical use cases. For cases where a sequence of write operations must operate as if in a single transaction, you can
implement a two-phase commit (page 125) in your application.
However, two-phase commits can only offer transaction-like semantics. Using two-phase commit ensures data consis-
tency, but it is possible for applications to return intermediate data during the two-phase commit or rollback.
For more information on two-phase commit and rollback, see Perform Two Phase Commits (page 125).
Concurrency Control
Concurrency control allows multiple applications to run concurrently without causing data inconsistency or conflicts.
One approach is to create a unique index (page 514) on a field that can only have unique values. This prevents
insertions or updates from creating duplicate data. Create a unique index on multiple fields to force uniqueness on
that combination of field values. For examples of use cases, see update() and Unique Index and findAndModify() and
Unique Index.
Another approach is to specify the expected current value of a field in the query predicate for the write operations. For
an example, see Update if Current (page 132).
The two-phase commit pattern provides a variation where the query predicate includes the application identifier
(page 130) as well as the expected state of the data in the write operation.
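The "expected current value" approach can be sketched as a compare-then-write step. The updateIfCurrent() helper and the product document below are hypothetical and operate on a plain object; in MongoDB the comparison and the modification happen inside a single write operation on one document, which is what makes the pattern safe.

```javascript
// In-memory model of an "update if current" write: the filter carries the
// value the application last read, so the write applies only if nobody
// changed that value in between.
function updateIfCurrent(doc, expected, changes) {
  const unchanged = Object.keys(expected).every(k => doc[k] === expected[k]);
  if (!unchanged) return false;   // another writer got there first; re-read and retry
  Object.assign(doc, changes);
  return true;
}

const product = { _id: 1, qty: 10 };
const seenQty = product.qty;   // the application reads qty = 10
product.qty = 7;               // a concurrent writer changes it

// The stale write is rejected because qty is no longer 10.
const applied = updateIfCurrent(product, { qty: seenQty }, { qty: seenQty - 1 });         // false
// After re-reading the current value, the retry succeeds and qty becomes 6.
const retried = updateIfCurrent(product, { qty: product.qty }, { qty: product.qty - 1 }); // true
```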
See also:
Read Isolation, Consistency, and Recency (page 96)
On this page
Write Operations on Sharded Clusters (page 89)
Write Operations on Replica Sets (page 90)
For sharded collections in a sharded cluster, the mongos directs write operations from applications to the shards that
are responsible for the specific portion of the data set. The mongos uses the cluster metadata from the config database
(page 734) to route the write operations to the appropriate shards.
MongoDB partitions data in a sharded collection into ranges, called chunks, based on the values of the shard key. Then, MongoDB
distributes these chunks to shards. The shard key determines the distribution of chunks to shards. This can affect the
performance of write operations in the cluster.
Important: Update operations that affect a single document must include the shard key or the _id field. Updates
that affect multiple documents are more efficient in some situations if they have the shard key, but can be broadcast to
all shards.
If the value of the shard key increases or decreases with every insert, all insert operations target a single shard. As a
result, the capacity of a single shard becomes the limit for the insert capacity of the sharded cluster.
For more information, see Sharded Cluster Tutorials (page 756) and Bulk Write Operations (page 93).
In replica sets, all write operations go to the set's primary. The primary applies the write operation and records the
operations on the primary's operation log or oplog. The oplog is a reproducible sequence of operations to the data
set. Secondary members of the set continuously replicate the oplog and apply the operations to themselves in an
asynchronous process.
For more information on replica sets and write operations, see Replication Introduction (page 613) and Write Concern
(page 141).
On this page
Indexes (page 92)
Document Growth and the MMAPv1 Storage Engine (page 92)
Storage Performance (page 93)
Additional Resources (page 93)
Indexes
After every insert, update, or delete operation, MongoDB must update every index associated with the collection in
addition to the data itself. Therefore, every index on a collection adds some amount of overhead for the performance
of write operations. 4
In general, the performance gains that indexes provide for read operations are worth the insertion penalty. However,
in order to optimize write performance when possible, be careful when creating new indexes and evaluate the existing
indexes to ensure that your queries actually use these indexes.
For indexes and queries, see Query Optimization (page 69). For more information on indexes, see Indexes (page 487)
and Indexing Strategies (page 573).
Some update operations can increase the size of the document; for instance, if an update adds a new field to the
document.
For the MMAPv1 storage engine, if an update operation causes a document to exceed the currently allocated record
size, MongoDB relocates the document on disk with enough contiguous space to hold the document. Updates that
require relocations take longer than updates that do not, particularly if the collection has indexes. If a collection has
indexes, MongoDB must update all index entries. Thus, for a collection with many indexes, the move will impact the
write throughput.
Changed in version 3.0.0: By default, MongoDB uses Power of 2 Sized Allocations (page 596) to add padding au-
tomatically (page 596) for the MMAPv1 storage engine. The Power of 2 Sized Allocations (page 596) ensures that
MongoDB allocates document space in sizes that are powers of 2, which helps ensure that MongoDB can efficiently
reuse free space created by document deletion or relocation as well as reduce the occurrences of reallocations in many
cases.
Although Power of 2 Sized Allocations (page 596) minimizes the occurrence of re-allocation, it does not eliminate
document re-allocation.
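The padding effect of power of 2 sized allocations can be sketched with simple arithmetic. The powerOf2Allocation() helper below is hypothetical; the 32-byte floor is an assumption made for this sketch, and the function ignores record headers and any special handling of very large documents.

```javascript
// Rounds a document size (in bytes) up to the next power of 2.
// The 32-byte floor is an assumption for this sketch, not a value
// stated in this section.
function powerOf2Allocation(docSizeBytes) {
  let size = 32;
  while (size < docSizeBytes) size *= 2;
  return size;
}

const a = powerOf2Allocation(200); // 256: the document can grow by 56 bytes in place
const b = powerOf2Allocation(256); // 256: an exact fit leaves no padding
const c = powerOf2Allocation(257); // 512
```

A document that grows past its allocation still has to move, which is why the strategy reduces relocations without eliminating them.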
See MMAPv1 Storage Engine (page 595) for more information.
4 For inserts and updates to un-indexed fields, the overhead for sparse indexes (page 519) is less than for non-sparse indexes. Also for non-sparse
indexes, updates that do not change the record size have less indexing overhead.
Storage Performance
Hardware The capability of the storage system creates some important physical limits for the performance of MongoDB's
write operations. Many unique factors related to the storage system of the drive affect write performance,
including random access patterns, disk caches, disk readahead and RAID configurations.
Solid state drives (SSDs) can outperform spinning hard disks (HDDs) by 100 times or more for random workloads.
See Production Notes (page 214) for recommendations regarding additional hardware and configuration options.
Journaling To provide durability in the event of a crash, MongoDB uses write ahead logging to an on-disk journal.
MongoDB writes the in-memory changes first to the on-disk journal files. If MongoDB should terminate or encounter
an error before committing the changes to the data files, MongoDB can use the journal files to apply the write operation
to the data files.
While the durability assurance provided by the journal typically outweighs the performance costs of the additional write
operations, consider the following interactions between the journal and performance:
If the journal and the data file reside on the same block device, the data files and the journal may have to contend
for a finite number of available I/O resources. Moving the journal to a separate device may increase the capacity
for write operations.
If applications specify write concerns (page 141) that include the j option (page 143), mongod will decrease
the duration between journal writes, which can increase the overall write load.
The duration between journal writes is configurable using the commitIntervalMs run-time option. Decreasing
the period between journal commits will increase the number of write operations, which can limit
MongoDB's capacity for write operations. Increasing the amount of time between journal commits may decrease
the total number of write operations, but also increases the chance that the journal will not record a write
operation in the event of a failure.
For additional information on journaling, see Journaling (page 598).
Additional Resources
On this page
Overview (page 94)
Ordered vs Unordered Operations (page 94)
bulkWrite() Methods (page 94)
Strategies for Bulk Inserts to a Sharded Collection (page 96)
5 https://www.mongodb.com/products/consulting?jmp=docs#performance_evaluation
Overview
MongoDB provides clients the ability to perform write operations in bulk. Bulk write operations affect a single
collection. MongoDB allows applications to determine the acceptable level of acknowledgement required for bulk
write operations.
New in version 3.2.
The db.collection.bulkWrite() method provides the ability to perform bulk insert, update, and remove
operations. MongoDB also supports bulk insert through the db.collection.insertMany().
bulkWrite() Methods
try {
   db.characters.bulkWrite(
      [
         { insertOne :
            {
               "document" :
               {
                  "_id" : 4, "char" : "Dithras", "class" : "barbarian", "lvl" : 4
               }
            }
         },
         { insertOne :
            {
               "document" :
               {
                  "_id" : 5, "char" : "Taeln", "class" : "fighter", "lvl" : 3
               }
            }
         },
         { updateOne :
            {
               "filter" : { "char" : "Eldon" },
               "update" : { $set : { "status" : "Critical Injury" } }
            }
         },
         { deleteOne :
            { "filter" : { "char" : "Brisbane" } }
         },
         { replaceOne :
            {
               "filter" : { "char" : "Meldane" },
               "replacement" : { "char" : "Tanys", "class" : "oracle", "lvl" : 4 }
            }
         }
      ]
   );
}
catch (e) {
   print(e);
}
Large bulk insert operations, including initial data inserts or routine data import, can affect sharded cluster perfor-
mance. For bulk inserts, consider the following strategies:
Pre-Split the Collection If the sharded collection is empty, then the collection has only one initial chunk, which
resides on a single shard. MongoDB must then take time to receive data, create splits, and distribute the split chunks
to the available shards. To avoid this performance cost, you can pre-split the collection, as described in Split Chunks
in a Sharded Cluster (page 800).
Unordered Writes to mongos To improve write performance to sharded clusters, use bulkWrite() with the
optional parameter ordered set to false. mongos can attempt to send the writes to multiple shards simultaneously.
For empty collections, first pre-split the collection as described in Split Chunks in a Sharded Cluster (page 800).
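Setting ordered to false also changes how errors are handled. Assuming the usual bulk write semantics, in which an ordered operation stops at the first error while an unordered one continues with the remaining writes, the difference can be sketched in plain JavaScript. The bulkInsertSketch() helper is hypothetical and models only duplicate _id errors.

```javascript
// In-memory model of a bulk insert into a collection with a unique _id;
// not the MongoDB implementation.
function bulkInsertSketch(existingIds, newDocs, ordered) {
  const ids = new Set(existingIds);
  const result = { inserted: [], errors: [] };
  for (const doc of newDocs) {
    if (ids.has(doc._id)) {
      result.errors.push(doc._id);  // duplicate key
      if (ordered) break;           // ordered: stop at the first error
      continue;                     // unordered: keep going with the remaining writes
    }
    ids.add(doc._id);
    result.inserted.push(doc._id);
  }
  return result;
}

const batch = [{ _id: 1 }, { _id: 2 }, { _id: 3 }];
const orderedRun = bulkInsertSketch([2], batch, true);    // inserted [1], errors [2]
const unorderedRun = bulkInsertSketch([2], batch, false); // inserted [1, 3], errors [2]
```

In the mongo shell, the option is passed as the second argument, for example db.collection.bulkWrite( operations, { ordered : false } ).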
Avoid Monotonic Throttling If your shard key increases monotonically during an insert, then all inserted data goes
to the last chunk in the collection, which will always end up on a single shard. Therefore, the insert capacity of the
cluster will never exceed the insert capacity of that single shard.
If your insert volume is larger than what a single shard can process, and if you cannot avoid a monotonically increasing
shard key, then consider the following modifications to your application:
Reverse the binary bits of the shard key. This preserves the information and avoids correlating insertion order
with increasing sequence of values.
Swap the first and last 16-bit words to shuffle the inserts.
Example
The following example, in C++, swaps the leading and trailing 16-bit word of BSON ObjectIds generated so they are
no longer monotonically increasing.
using namespace mongo;
OID make_an_id() {
OID x = OID::gen();
const unsigned char *p = x.getData();
swap( (unsigned short&) p[0], (unsigned short&) p[10] );
return x;
}
void foo() {
// create an object
BSONObj o = BSON( "_id" << make_an_id() << "x" << 3 << "name" << "jane" );
// now we may insert o into a sharded collection
}
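Both modifications can also be sketched in plain JavaScript without a driver. The functions below are hypothetical helpers: swapEndWords() mirrors the C++ example above on a 12-byte array, and reverseBits32() assumes the shard key is an unsigned 32-bit integer.

```javascript
// Swaps the leading and trailing 16-bit words of a 12-byte ObjectId-like
// buffer (bytes 0-1 with bytes 10-11), mirroring the C++ example above.
function swapEndWords(bytes) {
  const out = Uint8Array.from(bytes);
  [out[0], out[10]] = [out[10], out[0]];
  [out[1], out[11]] = [out[11], out[1]];
  return out;
}

// Reverses the bits of an unsigned 32-bit integer key.
function reverseBits32(n) {
  let r = 0;
  for (let i = 0; i < 32; i++) {
    r = (r << 1) | (n & 1);
    n >>>= 1;
  }
  return r >>> 0;  // convert back to an unsigned value
}

// Illustrative bytes only; not a real ObjectId.
const id = Uint8Array.from([0x56, 0xe1, 0, 0, 0, 0, 0, 0, 0, 0, 0xab, 0xcd]);
const shuffled = swapEndWords(id);  // starts with 0xab, 0xcd and ends with 0x56, 0xe1
const r1 = reverseBits32(1);        // 2147483648 (0x80000000)
const r2 = reverseBits32(2);        // 1073741824 (0x40000000)
```

Consecutive keys such as 1 and 2 end up far apart after bit reversal, so sequential inserts no longer land in the same chunk range.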
See also:
Shard Keys (page 739) for information on choosing a shard key. Also see Shard Key Internals (page 739) (in
particular, Choosing a Shard Key (page 763)).
On this page
Isolation Guarantees (page 97)
Consistency Guarantees (page 98)
Recency (page 99)
Isolation Guarantees
Read Uncommitted
In MongoDB, clients can see the results of writes before the writes are durable:
Regardless of write concern (page 141), other clients using "local" (page 144) (i.e. the default) readConcern
can see the result of a write operation before the write operation is acknowledged to the issuing client.
Clients using "local" (page 144) (i.e. the default) readConcern can read data which may be subsequently
rolled back (page 638).
Read uncommitted is the default isolation level and applies to mongod standalone instances as well as to replica sets
and sharded clusters.
Write operations are atomic with respect to a single document; i.e. if a write is updating multiple fields in the document,
a reader will never see the document with only some of the fields updated.
With a single mongod instance, a set of read and write operations to a single document is serializable. With a replica
set, a set of read and write operations to a single document is serializable only in the absence of a rollback.
However, although the readers may not see a partially updated document, read uncommitted means that concurrent
readers may still see the updated document before the changes are durable.
When a single write operation modifies multiple documents, the modification of each document is atomic, but the
operation as a whole is not atomic and other operations may interleave. However, you can isolate a single write
operation that affects multiple documents using the $isolated operator.
Without isolating the multi-document write operations, MongoDB exhibits the following behavior:
1. Non-point-in-time read operations. Suppose a read operation begins at time t1 and starts reading documents. A
write operation then commits an update to one of the documents at some later time t2. The reader may see the
updated version of the document, and therefore does not see a point-in-time snapshot of the data.
2. Non-serializable operations. Suppose a read operation reads a document d1 at time t1 and a write operation
updates d1 at some later time t3. This introduces a read-write dependency such that, if the operations were to be
serialized, the read operation must precede the write operation. But also suppose that the write operation updates
document d2 at time t2 and the read operation subsequently reads d2 at some later time t4. This introduces a
write-read dependency which would instead require the read operation to come after the write operation in a
serializable schedule. There is a dependency cycle which makes serializability impossible.
3. Dropped results for MMAPv1. For MMAPv1, reads may miss matching documents that are updated or deleted
during the course of the read operation. However, data that has not been modified during the operation will
always be visible.
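The dependency cycle in the non-serializable case can be made concrete with a small precedence graph. In the sketch below, an edge from A to B means that A must come before B in any equivalent serial order; the hasCycle() helper and the node names are illustrative and not part of MongoDB.

```javascript
// Precedence graph for the non-serializable schedule described above.
const edges = [
  ["read", "write"],  // d1: read at t1, then written at t3
  ["write", "read"]   // d2: written at t2, then read at t4
];

// Detects a cycle with a depth-first search over the precedence graph.
function hasCycle(edgeList) {
  const graph = new Map();
  for (const [from, to] of edgeList) {
    if (!graph.has(from)) graph.set(from, []);
    graph.get(from).push(to);
  }
  const visiting = new Set();
  const done = new Set();
  const visit = node => {
    if (visiting.has(node)) return true;   // reached a node already on the current path
    if (done.has(node)) return false;
    visiting.add(node);
    const cyclic = (graph.get(node) || []).some(next => visit(next));
    visiting.delete(node);
    done.add(node);
    return cyclic;
  };
  return [...graph.keys()].some(node => visit(node));
}

const serializable = !hasCycle(edges);             // false: the two dependencies form a cycle
const withoutD2 = !hasCycle([["read", "write"]]);  // true: a single dependency can be serialized
```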
Using the $isolated operator, a write operation that affects multiple documents can prevent other processes from
interleaving once the write operation modifies the first document. This ensures that no client sees the changes until the
write operation completes or errors out.
$isolated does not work with sharded clusters.
An isolated write operation does not provide all-or-nothing atomicity. That is, an error during the write operation
does not roll back all its changes that preceded the error.
Note: The $isolated operator causes write operations to acquire an exclusive lock on the collection, even for
document-level locking storage engines such as WiredTiger. That is, the $isolated operator will make WiredTiger
single-threaded for the duration of the operation.
See also:
Atomicity and Transactions (page 88)
Cursor Snapshot
MongoDB cursors can return the same document more than once in some situations. As a cursor returns documents,
other operations may interleave with the query. If some of these operations are updates (page 77) that cause the
document to move (in the case of MMAPv1, caused by document growth) or that change the indexed field on the
index used by the query, then the cursor will return the same document more than once.
In very specific cases, you can isolate the cursor from returning the same document more than once by using the
cursor.snapshot() method. For more information, see How do I isolate cursors from intervening write opera-
tions? (page 833).
Consistency Guarantees
Monotonic Reads
MongoDB provides monotonic reads from a standalone mongod instance. Suppose an application performs a sequence
of operations that consists of a read operation R1 followed later in the sequence by another read operation R2.
If the application performs the sequence on a standalone mongod instance, the later read R2 never returns results that
reflect an earlier state than that returned from R1; i.e. R2 returns data that is monotonically increasing in recency from
R1.
Changed in version 3.2: For replica sets and sharded clusters, MongoDB provides monotonic reads if read operations
specify Read Concern (page 143) "majority" and read preference primary (page 721).
In previous versions, MongoDB could not make monotonic read guarantees from replica sets and sharded clusters.
Monotonic Writes
MongoDB provides monotonic write guarantees for standalone mongod instances, replica sets, and sharded clusters.
Suppose an application performs a sequence of operations that consists of a write operation W1 followed later in the
sequence by a write operation W2. MongoDB guarantees that W1 precedes W2.
Recency
The following tutorials provide instructions for querying and modifying data. For a higher-level overview of these
operations, see MongoDB CRUD Operations (page 61).
Insert Documents (page 99) Insert new documents into a collection.
Query Documents (page 103) Find documents in a collection using search criteria.
Modify Documents (page 110) Modify documents in a collection.
Remove Documents (page 114) Remove documents from a collection.
Limit Fields to Return from a Query (page 115) Limit which fields are returned by a query.
Limit Number of Elements in an Array after an Update (page 118) Use $push with modifiers to sort and maintain
an array of fixed size.
Iterate a Cursor in the mongo Shell (page 120) Access documents returned by a find query by iterating the cursor,
either manually or using the iterator index.
Analyze Query Performance (page 121) Use query introspection (i.e. explain) to analyze the efficiency of queries
and determine how a query uses available indexes.
Perform Two Phase Commits (page 125) Use two-phase commits when writing data to multiple documents.
Update Document if Current (page 132) Update a document only if it has not changed since it was last read.
Create Tailable Cursor (page 133) Create tailable cursors for use in capped collections with high numbers of write
operations for which an index would be too expensive.
Create an Auto-Incrementing Sequence Field (page 134) Describes how to create an incrementing sequence number
for the _id field using a Counters Collection or an Optimistic Loop.
Perform Quorum Reads on Replica Sets (page 138) Perform quorum reads using findAndModify.
will be able to complete writes with { w: "majority" } (page 142) write concern. The node that can complete { w: "majority" }
(page 142) writes is the current primary, and the other node is a former primary that has not yet recognized its demotion, typically due to a network
partition. When this occurs, clients that connect to the former primary may observe stale data despite having requested read preference primary
(page 721), and new writes to the former primary will eventually roll back.
On this page
Insert a Document (page 100)
Insert an Array of Documents (page 101)
Insert Multiple Documents with Bulk (page 102)
Additional Examples and Methods (page 103)
Insert a Document
Insert a document into a collection named inventory. The operation will create the collection if the collection does
not currently exist.
db.inventory.insert(
{
item: "ABC1",
details: {
model: "14Q3",
manufacturer: "XYZ Company"
},
stock: [ { size: "S", qty: 25 }, { size: "M", qty: 50 } ],
category: "clothing"
}
)
The operation returns a WriteResult object with the status of the operation. A successful insert of the document
returns the following object:
WriteResult({ "nInserted" : 1 })
The nInserted field specifies the number of documents inserted. If the operation encounters an error, the
WriteResult object will contain the error information.
If the insert operation is successful, verify the insertion by querying the collection.
db.inventory.find()
The returned document shows that MongoDB added an _id field to the document. If a client inserts a document that
does not contain the _id field, MongoDB adds the field with the value set to a generated ObjectId (see
https://docs.mongodb.org/manual/reference/object-id). The ObjectId values in your documents will differ from the
ones shown.
Insert an Array of Documents
You can pass an array of documents to the db.collection.insert() method to insert multiple documents.
The method returns a BulkWriteResult object with the status of the operation. A successful insert of the documents
returns the following object:
BulkWriteResult({
"writeErrors" : [ ],
"writeConcernErrors" : [ ],
"nInserted" : 3,
"nUpserted" : 0,
"nMatched" : 0,
"nModified" : 0,
"nRemoved" : 0,
"upserted" : [ ]
})
The nInserted field specifies the number of documents inserted. If the operation encounters an error, the
BulkWriteResult object will contain information regarding the error.
The inserted documents will each have an _id field added by MongoDB.
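Insert Multiple Documents with Bulk
Initialize a Bulk operations builder for the inventory collection with the
db.collection.initializeUnorderedBulkOp() method and assign the result to a variable named bulk.
The operation returns an unordered operations builder which maintains a list of operations to perform. With unordered
operations, MongoDB can execute the operations in parallel and in a nondeterministic order. If an error occurs during
the processing of one of the write operations, MongoDB will continue to process the remaining write operations in the
list.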
You can also initialize an ordered operations builder; see db.collection.initializeOrderedBulkOp()
for details.
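The difference in error handling between the two builder types can be sketched in plain JavaScript. This is a simplified in-memory model, not the shell's Bulk API: executeInserts, its ordered flag, and the duplicate _id check are hypothetical names introduced for illustration. It assumes that an ordered list stops at the first error, and it processes operations sequentially even in the unordered case.

```javascript
// Simplified in-memory model of bulk inserts (not the real Bulk API).
// A duplicate _id stands in for any write error.
function executeInserts(collection, docs, ordered) {
  var result = { nInserted: 0, writeErrors: [] };
  for (var i = 0; i < docs.length; i++) {
    var candidate = docs[i];
    var duplicate = collection.some(function (d) {
      return d._id === candidate._id;
    });
    if (duplicate) {
      result.writeErrors.push({ index: i, errmsg: "duplicate _id" });
      if (ordered) { break; }   // ordered: stop at the first error
      continue;                 // unordered: keep processing the list
    }
    collection.push(candidate);
    result.nInserted++;
  }
  return result;
}

// Hypothetical input: the second document repeats the first _id.
var docs = [ { _id: 1 }, { _id: 1 }, { _id: 2 } ];
var orderedResult = executeInserts([], docs, true);     // inserts only the first document
var unorderedResult = executeInserts([], docs, false);  // inserts the first and third documents
```

In this model both runs record one write error, but only the unordered run goes on to insert the document that follows the failure.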
Add two insert operations to the bulk object using the Bulk.insert() method.
bulk.insert(
{
item: "BE10",
details: { model: "14Q2", manufacturer: "XYZ Company" },
stock: [ { size: "L", qty: 5 } ],
category: "clothing"
}
);
bulk.insert(
{
item: "ZYT1",
details: { model: "14Q1", manufacturer: "ABC Company" },
stock: [ { size: "S", qty: 5 }, { size: "M", qty: 5 } ],
category: "houseware"
}
);
Call the execute() method on the bulk object to execute the operations in its list.
bulk.execute();
The method returns a BulkWriteResult object with the status of the operation. A successful insert of the documents
returns the following object:
BulkWriteResult({
"writeErrors" : [ ],
"writeConcernErrors" : [ ],
"nInserted" : 2,
"nUpserted" : 0,
"nMatched" : 0,
"nModified" : 0,
"nRemoved" : 0,
"upserted" : [ ]
})
The nInserted field specifies the number of documents inserted. If the operation encounters an error, the
BulkWriteResult object will contain information regarding the error.
On this page
Select All Documents in a Collection (page 103)
Specify Equality Condition (page 104)
Specify Conditions Using Query Operators (page 104)
Specify AND Conditions (page 104)
Specify OR Conditions (page 104)
Specify AND as well as OR Conditions (page 105)
Embedded Documents (page 105)
Arrays (page 106)
In MongoDB, the db.collection.find() method retrieves documents from a collection. The
db.collection.find() method returns a cursor (page 67) to the retrieved documents.
This tutorial provides examples of read operations using the db.collection.find() method in the mongo
shell. In these examples, the retrieved documents contain all their fields. To restrict the fields to return in the retrieved
documents, see Limit Fields to Return from a Query (page 115).
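Select All Documents in a Collection
An empty query document ({}) selects all documents in the collection. Not specifying a query document to find() is
equivalent to specifying an empty query document. Therefore, the following operation selects all documents in the
inventory collection: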
db.inventory.find()
Note: The db.collection.findOne() method also performs a read operation to return a single document. Internally, the
db.collection.findOne() method is the db.collection.find() method with a limit of 1.
Specify Equality Condition
To specify an equality condition, use the query document { <field>: <value> } to select all documents that
contain the <field> with the specified <value>.
The following example retrieves from the inventory collection all documents where the type field has the value
snacks:
db.inventory.find( { type: "snacks" } )
Specify Conditions Using Query Operators
A query document can use the query operators to specify conditions in a MongoDB query.
The following example selects all documents in the inventory collection where the value of the type field is either
food or snacks:
db.inventory.find( { type: { $in: [ 'food', 'snacks' ] } } )
Although you can express this query using the $or operator, use the $in operator rather than the $or operator when
performing equality checks on the same field.
Refer to the https://docs.mongodb.org/manual/reference/operator/query document for the
complete list of query operators.
Specify AND Conditions
A compound query can specify conditions for more than one field in the collection's documents. Implicitly, a logical
AND conjunction connects the clauses of a compound query so that the query selects the documents in the collection
that match all the conditions.
In the following example, the query document specifies an equality match on the field type and a less than ($lt)
comparison match on the field price:
db.inventory.find( { type: 'food', price: { $lt: 9.95 } } )
This query selects all documents where the type field has the value food and the value of the price field is less
than 9.95. See comparison operators for other comparison operators.
Specify OR Conditions
Using the $or operator, you can specify a compound query that joins each clause with a logical OR conjunction so
that the query selects the documents in the collection that match at least one condition.
In the following example, the query document selects all documents in the collection where the field qty has a value
greater than ($gt) 100 or the value of the price field is less than ($lt) 9.95:
db.inventory.find(
{
$or: [ { qty: { $gt: 100 } }, { price: { $lt: 9.95 } } ]
}
)
Specify AND as well as OR Conditions
With additional clauses, you can specify precise conditions for matching documents.
In the following example, the compound query document selects all documents in the collection where the value of
the type field is food and either the qty has a value greater than ($gt) 100 or the value of the price field is
less than ($lt) 9.95:
db.inventory.find(
{
type: 'food',
$or: [ { qty: { $gt: 100 } }, { price: { $lt: 9.95 } } ]
}
)
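The way the implicit AND combines with the $or clauses can be checked with an equivalent plain JavaScript predicate. This is only a sketch: the four sample documents and the matches function are hypothetical and are not part of the manual's data set.

```javascript
// Hypothetical sample documents, chosen to exercise each branch.
var docs = [
  { item: "a", type: "food",   qty: 150, price: 12.00 },
  { item: "b", type: "food",   qty: 20,  price: 4.50 },
  { item: "c", type: "food",   qty: 20,  price: 15.00 },
  { item: "d", type: "snacks", qty: 500, price: 1.00 }
];

// Mirrors { type: 'food', $or: [ { qty: { $gt: 100 } }, { price: { $lt: 9.95 } } ] }:
// the top-level clauses are ANDed, the clauses inside $or are ORed.
function matches(doc) {
  return doc.type === "food" && (doc.qty > 100 || doc.price < 9.95);
}

var selected = docs.filter(matches).map(function (d) { return d.item; });
// selected holds "a" (qty clause) and "b" (price clause)
```

Document "c" fails both $or clauses, and document "d" satisfies them but fails the type condition, so neither is selected.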
Embedded Documents
When the field holds an embedded document, a query can either specify an exact match on the embedded document
or specify a match by individual fields in the embedded document using the dot notation.
To specify an equality match on the whole embedded document, use the query document { <field>: <value>
} where <value> is the document to match. Equality matches on an embedded document require an exact match of
the specified <value>, including the field order.
In the following example, the query matches all documents where the value of the field producer is an embedded
document that contains only the field company with the value ABC123 and the field address with the value
123 Street, in the exact order:
db.inventory.find(
{
producer:
{
company: 'ABC123',
address: '123 Street'
}
}
)
Use the dot notation to match by specific fields in an embedded document. Equality matches for specific fields in
an embedded document will select documents in the collection where the embedded document contains the specified
fields with the specified values. The embedded document can contain additional fields.
In the following example, the query uses the dot notation to match all documents where the value of the field
producer is an embedded document that contains a field company with the value ABC123 and may contain
other fields:
db.inventory.find( { 'producer.company': 'ABC123' } )
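The contrast between an exact match and a dot-notation match can be modeled in plain JavaScript. The sketch below is a simplified stand-in for the server's comparison, limited to one level of nesting and primitive values; exactMatch and fieldMatch are hypothetical helper names, and the phone field is invented for illustration.

```javascript
// Exact match: same fields, same order, same values.
function exactMatch(stored, query) {
  var storedKeys = Object.keys(stored);
  var queryKeys = Object.keys(query);
  if (storedKeys.length !== queryKeys.length) { return false; }
  for (var i = 0; i < queryKeys.length; i++) {
    if (storedKeys[i] !== queryKeys[i]) { return false; }            // field order matters
    if (stored[storedKeys[i]] !== query[queryKeys[i]]) { return false; }
  }
  return true;
}

// Dot-notation match: only the named field is compared.
function fieldMatch(stored, field, value) {
  return stored[field] === value;
}

var query = { company: 'ABC123', address: '123 Street' };
var sameOrder  = { company: 'ABC123', address: '123 Street' };
var reversed   = { address: '123 Street', company: 'ABC123' };
var extraField = { company: 'ABC123', address: '123 Street', phone: '555-0100' };

var exactResults = [ sameOrder, reversed, extraField ].map(function (p) {
  return exactMatch(p, query);
});
var dotResults = [ sameOrder, reversed, extraField ].map(function (p) {
  return fieldMatch(p, 'company', 'ABC123');
});
```

Here exactResults is true only for the first producer, while dotResults is true for all three, because the dot-notation form ignores field order and additional fields.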
Arrays
When the field holds an array, you can query for an exact array match or for specific values in the array. If the array
holds embedded documents, you can query for specific fields in the embedded documents using dot notation.
If you specify multiple conditions using the $elemMatch operator, the array must contain at least one element that
satisfies all the conditions. See Single Element Satisfies the Criteria (page 107).
If you specify multiple conditions without using the $elemMatch operator, then some combination of the array
elements, not necessarily a single element, must satisfy all the conditions; i.e. different elements in the array can
satisfy different parts of the conditions. See Combination of Elements Satisfies the Criteria (page 107).
Consider an inventory collection that contains the following documents:
{ _id: 5, type: "food", item: "aaa", ratings: [ 5, 8, 9 ] }
{ _id: 6, type: "food", item: "bbb", ratings: [ 5, 9 ] }
{ _id: 7, type: "food", item: "ccc", ratings: [ 9, 5, 8 ] }
To specify an equality match on an array, use the query document { <field>: <value> } where <value> is
the array to match. Equality matches on the array require that the array field match exactly the specified <value>,
including the element order.
The following example queries for all documents where the field ratings is an array that holds exactly three elements,
5, 8, and 9, in this order:
db.inventory.find( { ratings: [ 5, 8, 9 ] } )
Equality matches can specify a single element in the array to match. These specifications match if the array contains
at least one element with the specified value.
The following example queries for all documents where ratings is an array that contains 5 as one of its elements:
db.inventory.find( { ratings: 5 } )
Equality matches can also target the element at a particular index or position of the array using the
dot notation.
In the following example, the query uses the dot notation to match all documents where the ratings array contains
5 as the first element:
db.inventory.find( { 'ratings.0': 5 } )
Single Element Satisfies the Criteria
Use the $elemMatch operator to specify multiple criteria on the elements of
an array such that at least one array element satisfies all the specified criteria.
The following example queries for documents where the ratings array contains at least one element that is greater
than ($gt) 5 and less than ($lt) 9:
db.inventory.find( { ratings: { $elemMatch: { $gt: 5, $lt: 9 } } } )
The operation returns the following documents, whose ratings array contains the element 8, which meets the
criteria:
{ "_id" : 5, "type" : "food", "item" : "aaa", "ratings" : [ 5, 8, 9 ] }
{ "_id" : 7, "type" : "food", "item" : "ccc", "ratings" : [ 9, 5, 8 ] }
Combination of Elements Satisfies the Criteria
The following example queries for documents where the
ratings array contains elements that in some combination satisfy the query conditions; e.g., one element can satisfy
the greater than 5 condition and another element can satisfy the less than 9 condition, or a single element can satisfy
both:
db.inventory.find( { ratings: { $gt: 5, $lt: 9 } } )
The document with the "ratings" : [ 5, 9 ] matches the query since the element 9 is greater than 5 (the
first condition) and the element 5 is less than 9 (the second condition).
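The two matching rules can be compared side by side in plain JavaScript, using the ratings arrays of the three documents above. This is a sketch of the semantics only; singleElement, combination, and ids are hypothetical helper names.

```javascript
var docs = [
  { _id: 5, ratings: [ 5, 8, 9 ] },
  { _id: 6, ratings: [ 5, 9 ] },
  { _id: 7, ratings: [ 9, 5, 8 ] }
];

// Models { ratings: { $elemMatch: { $gt: 5, $lt: 9 } } }:
// one element must satisfy both conditions.
function singleElement(ratings) {
  return ratings.some(function (r) { return r > 5 && r < 9; });
}

// Models { ratings: { $gt: 5, $lt: 9 } }:
// each condition may be satisfied by a different element.
function combination(ratings) {
  return ratings.some(function (r) { return r > 5; }) &&
         ratings.some(function (r) { return r < 9; });
}

function ids(predicate) {
  return docs
    .filter(function (d) { return predicate(d.ratings); })
    .map(function (d) { return d._id; });
}

var elemMatchIds = ids(singleElement);   // documents 5 and 7
var combinationIds = ids(combination);   // documents 5, 6, and 7
```

Document 6 appears only in the second result: no single element of [ 5, 9 ] lies strictly between 5 and 9, but 9 satisfies the first condition and 5 satisfies the second.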
Consider an inventory collection that contains a document such as the following:
{
_id: 101,
type: "fruit",
item: "jkl",
qty: 10,
price: 4.25,
ratings: [ 5, 9 ],
memos: [ { memo: "on time", by: "payment" }, { memo: "delayed", by: "shipping" } ]
}
Match a Field in the Embedded Document Using the Array Index
If you know the array index of the embedded
document, you can specify the document using the embedded document's position with the dot notation.
The following example selects all documents where the memos field contains an array whose first element (i.e. index is 0)
is a document that contains the field by whose value is shipping:
db.inventory.find( { 'memos.0.by': 'shipping' } )
Match a Field Without Specifying Array Index
If you do not know the index position of the document in the array,
concatenate the name of the field that contains the array, with a dot (.) and the name of the field in the embedded
document.
The following example selects all documents where the memos field contains an array that contains at least one
embedded document that contains the field by with the value shipping:
db.inventory.find( { 'memos.by': 'shipping' } )
Single Element Satisfies the Criteria
Use the $elemMatch operator to specify multiple criteria on an array of embedded
documents such that at least one embedded document satisfies all the specified criteria.
The following example queries for documents where the memos array has at least one embedded document that
contains both the field memo equal to on time and the field by equal to shipping:
db.inventory.find(
{
memos:
{
$elemMatch:
{
memo: 'on time',
by: 'shipping'
}
}
}
)
Combination of Elements Satisfies the Criteria
The following example queries for documents where the memos
array contains elements that in some combination satisfy the query conditions; e.g. one element satisfies the field
memo equal to on time condition and another element satisfies the field by equal to shipping condition, or
a single element can satisfy both criteria:
db.inventory.find(
{
'memos.memo': 'on time',
'memos.by': 'shipping'
}
)
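For example, the query matches the following document, because one element of memos satisfies the memo condition
and a different element satisfies the by condition:
{
_id: 101,
type: "fruit",
item: "jkl",
qty: 10,
price: 4.25,
ratings: [ 5, 9 ],
memos: [ { memo: "on time", by: "payment" }, { memo: "delayed", by: "shipping" } ]
}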
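The same distinction for arrays of embedded documents can be checked in plain JavaScript against the memos array of the document with _id 101. This is a sketch of the matching rules, not server code; the variable names are hypothetical.

```javascript
var memos = [
  { memo: "on time", by: "payment" },
  { memo: "delayed", by: "shipping" }
];

// Models { memos: { $elemMatch: { memo: 'on time', by: 'shipping' } } }:
// one embedded document must satisfy both conditions.
var singleElementMatch = memos.some(function (m) {
  return m.memo === "on time" && m.by === "shipping";
});

// Models { 'memos.memo': 'on time', 'memos.by': 'shipping' }:
// the conditions may be satisfied by different embedded documents.
var combinationMatch =
  memos.some(function (m) { return m.memo === "on time"; }) &&
  memos.some(function (m) { return m.by === "shipping"; });
```

For this document singleElementMatch is false and combinationMatch is true, which is why the $elemMatch query skips it while the dot-notation query returns it.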
See also:
Limit Fields to Return from a Query (page 115)
On this page
Update Specific Fields in a Document (page 110)
Replace the Document (page 112)
upsert Option (page 112)
Additional Examples and Methods (page 114)
MongoDB provides the update() method to update the documents of a collection. The method accepts as its
parameters:
an update conditions document to match the documents to update,
an update operations document to specify the modification to perform, and
an options document.
To specify the update condition, use the same structure and syntax as the query conditions.
By default, update() updates a single document. To update multiple documents, use the multi option.
Update Specific Fields in a Document
To change a field value, MongoDB provides update operators
(https://docs.mongodb.org/manual/reference/operator/update), such as $set, to modify values.
Some update operators, such as $set, will create the field if the field does not exist. See the individual update
operator reference.
For the document with item equal to "MNO2", use the $set operator to update the category field and the
details field to the specified values and the $currentDate operator to update the field lastModified with
the current date.
db.inventory.update(
  { item: "MNO2" },
  {
    $set: {
      category: "apparel",
      details: { model: "14Q3", manufacturer: "XYZ Company" } // illustrative value; the source listing is truncated here
    },
    $currentDate: { lastModified: true }
  }
)
The update operation returns a WriteResult object which contains the status of the operation. A successful update
of the document returns the following object:
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
The nMatched field specifies the number of existing documents matched for the update, and nModified specifies
the number of existing documents modified.
To update a field within an embedded document, use the dot notation. When using the dot notation, enclose the whole
dotted field name in quotes.
The following updates the model field within the embedded details document.
db.inventory.update(
{ item: "ABC1" },
{ $set: { "details.model": "14Q2" } }
)
The update operation returns a WriteResult object which contains the status of the operation. A successful update
of the document returns the following object:
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
By default, the update() method updates a single document. To update multiple documents, use the multi option
in the update() method.
Update the category field to "apparel" and update the lastModified field to the current date for all documents
that have the category field equal to "clothing".
db.inventory.update(
{ category: "clothing" },
{
$set: { category: "apparel" },
$currentDate: { lastModified: true }
},
{ multi: true }
)
The update operation returns a WriteResult object which contains the status of the operation. A successful update
of the document returns the following object:
WriteResult({ "nMatched" : 3, "nUpserted" : 0, "nModified" : 3 })
Replace the Document
To replace the entire content of a document except for the _id field, pass an entirely new document as the second
argument to update().
The replacement document can have different fields from the original document. In the replacement document, you
can omit the _id field since the _id field is immutable. If you do include the _id field, it must be the same value as
the existing value.
The following operation replaces the document with item equal to "BE10". The newly replaced document will only
contain the _id field and the fields in the replacement document.
db.inventory.update(
{ item: "BE10" },
{
item: "BE05",
stock: [ { size: "S", qty: 20 }, { size: "M", qty: 5 } ],
category: "apparel"
}
)
The update operation returns a WriteResult object which contains the status of the operation. A successful update
of the document returns the following object:
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
upsert Option
By default, if no document matches the update query, the update() method does nothing.
However, by specifying upsert: true, the update() method either updates the matching document or documents, or
inserts a new document using the update specification if no matching document exists.
When you specify upsert: true for an update operation to replace a document and no matching documents
are found, MongoDB creates a new document using the equality conditions in the update conditions document, and
replaces this document, except for the _id field if specified, with the update document.
The following operation either updates a matching document by replacing it with a new document or adds a new
document if no matching document exists.
db.inventory.update(
{ item: "TBD1" },
{
item: "TBD1",
details: { "model" : "14Q4", "manufacturer" : "ABC Company" },
stock: [ { "size" : "S", "qty" : 25 } ],
category: "houseware"
},
{ upsert: true }
)
The update operation returns a WriteResult object which contains the status of the operation, including whether
the db.collection.update() method modified an existing document or added a new document.
WriteResult({
"nMatched" : 0,
"nUpserted" : 1,
"nModified" : 0,
"_id" : ObjectId("53dbd684babeaec6342ed6c7")
})
Step 2: Specify upsert: true for the update specific fields operation.
When you specify upsert: true for an update operation that modifies specific fields and no matching documents
are found, MongoDB creates a new document using the equality conditions in the update conditions document, and
applies the modification as specified in the update document.
The following update operation either updates specific fields of a matching document or adds a new document if no
matching document exists.
db.inventory.update(
{ item: "TBD2" },
{
$set: {
details: { "model" : "14Q3", "manufacturer" : "IJK Co." },
category: "houseware"
}
},
{ upsert: true }
)
The update operation returns a WriteResult object which contains the status of the operation, including whether
the db.collection.update() method modified an existing document or added a new document.
WriteResult({
"nMatched" : 0,
"nUpserted" : 1,
"nModified" : 0,
"_id" : ObjectId("53dbd7c8babeaec6342ed6c8")
})
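If the operation adds a new document, the new document resembles the following (the _id value will differ):
{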
"_id" : ObjectId("56a12ec8242ae5d73c07b15e"),
"item" : "TBD2",
"details" : {
"model" : "14Q3",
"manufacturer" : "IJK Co."
},
"category" : "houseware"
}
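The upsert behavior for an update that uses $set can be modeled with a small in-memory function. This is a minimal sketch under stated assumptions: it handles only equality conditions on top-level fields, it updates one document (no multi option), and upsertSet, nextId, and the "kitchen" value are hypothetical names and data introduced for illustration. An integer counter stands in for a generated ObjectId.

```javascript
var nextId = 1; // hypothetical stand-in for a generated ObjectId

// Models update(query, { $set: setFields }, { upsert: true }).
function upsertSet(collection, query, setFields) {
  var queryKeys = Object.keys(query);
  var matches = collection.filter(function (doc) {
    return queryKeys.every(function (k) { return doc[k] === query[k]; });
  });
  if (matches.length > 0) {
    var target = matches[0];
    var before = JSON.stringify(target);
    Object.keys(setFields).forEach(function (k) { target[k] = setFields[k]; });
    var changed = JSON.stringify(target) !== before;
    return { nMatched: 1, nUpserted: 0, nModified: changed ? 1 : 0 };
  }
  // No match: build the new document from the equality conditions,
  // then apply the $set fields.
  var created = { _id: nextId++ };
  queryKeys.forEach(function (k) { created[k] = query[k]; });
  Object.keys(setFields).forEach(function (k) { created[k] = setFields[k]; });
  collection.push(created);
  return { nMatched: 0, nUpserted: 1, nModified: 0 };
}

var inventory = [];
var first = upsertSet(inventory, { item: "TBD2" }, {
  details: { model: "14Q3", manufacturer: "IJK Co." },
  category: "houseware"
});
var second = upsertSet(inventory, { item: "TBD2" }, { category: "kitchen" });
```

The first call finds no match and inserts a document whose fields are _id, item, details, and category, in that order; the second call matches that document and modifies it in place, so the collection still holds one document.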
Additional Examples and Methods
For more examples, see Update examples in the db.collection.update() reference page.
The db.collection.findAndModify() and db.collection.save() methods can also modify existing
documents or insert a new one. See the individual reference pages for the methods for more information and
examples.
On this page
Remove All Documents (page 114)
Remove Documents that Match a Condition (page 114)
Remove a Single Document that Matches a Condition (page 115)
In MongoDB, the db.collection.remove() method removes documents from a collection. You can remove
all documents from a collection, remove all documents that match a condition, or limit the operation to remove just a
single document.
This tutorial provides examples of remove operations using the db.collection.remove() method in the mongo
shell.
Remove All Documents
To remove all documents from a collection, pass an empty query document {} to the remove() method. The
remove() method does not remove the indexes.
The following example removes all documents from the inventory collection:
db.inventory.remove({})
To remove all documents from a collection, it may be more efficient to use the drop() method to drop the entire
collection, including the indexes, and then recreate the collection and rebuild the indexes.
Remove Documents that Match a Condition
To remove the documents that match deletion criteria, call the remove() method with the <query> parameter.
The following example removes all documents from the inventory collection where the type field equals food:
db.inventory.remove( { type : "food" } )
For large deletion operations, it may be more efficient to copy the documents that you want to keep to a new collection
and then use drop() on the original collection.
Remove a Single Document that Matches a Condition
To remove a single document, call the remove() method with the justOne parameter set to true or 1.
The following example removes one document from the inventory collection where the type field equals food:
db.inventory.remove( { type : "food" }, 1 )
To delete a single document sorted by some specified order, use the findAndModify() method.
On this page
Return All Fields in Matching Documents (page 115)
Return the Specified Fields and the _id Field Only (page 116)
Return Specified Fields Only (page 116)
Return All But the Excluded Field (page 116)
Return Specific Fields in Embedded Documents (page 116)
Suppress Specific Fields in Embedded Documents (page 117)
Projection for Array Fields (page 118)
The projection document limits the fields to return for all matching documents. The projection document can specify
the inclusion of fields or the exclusion of fields.
The specifications have the following forms:
Syntax                   Description
<field>: <1 or true>     Specify the inclusion of a field.
<field>: <0 or false>    Specify the suppression of the field.
Important: The _id field is, by default, included in the result set. To suppress the _id field from the result set,
specify _id: 0 in the projection document.
You cannot combine inclusion and exclusion semantics in a single projection with the exception of the _id field.
This tutorial offers various query examples that limit the fields to return for all matching documents. The examples in
this tutorial use a collection inventory and use the db.collection.find() method in the mongo shell. The
db.collection.find() method returns a cursor (page 67) to the retrieved documents. For examples on query
selection criteria, see Query Documents (page 103).
If you specify no projection, the find() method returns all fields of all documents that match the query.
db.inventory.find( { type: 'food' } )
This operation will return all documents in the inventory collection where the value of the type field is food.
The returned documents contain all fields.
A projection can explicitly include several fields. In the following operation, the find() method returns all documents
that match the query. In the result set, the matching documents contain only the item and qty fields and, by
default, the _id field.
db.inventory.find( { type: 'food' }, { item: 1, qty: 1 } )
You can remove the _id field from the results by specifying its exclusion in the projection, as in the following
example:
db.inventory.find( { type: 'food' }, { item: 1, qty: 1, _id:0 } )
This operation returns all documents that match the query. In the result set, only the item and qty fields return in
the matching documents.
To exclude a single field or group of fields you can use a projection in the following form:
db.inventory.find( { type: 'food' }, { type:0 } )
This operation returns all documents where the value of the type field is food. In the result set, the type field does
not return in the matching documents.
With the exception of the _id field you cannot combine inclusion and exclusion statements in projection documents.
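The inclusion and exclusion rules can be summarized with a small plain JavaScript model. This is a sketch for top-level fields only, not the server's implementation: project is a hypothetical function, its error message is invented, and the sample document is an assumption.

```javascript
// Models a projection document applied to one document (top-level fields only).
function project(doc, spec) {
  var fields = Object.keys(spec).filter(function (k) { return k !== "_id"; });
  var included = fields.filter(function (k) { return Boolean(spec[k]); });
  var excluded = fields.filter(function (k) { return !spec[k]; });
  if (included.length > 0 && excluded.length > 0) {
    throw new Error("cannot mix inclusion and exclusion (except for _id)");
  }
  // _id is returned unless the projection explicitly suppresses it.
  var keepId = !("_id" in spec) || Boolean(spec._id);
  // A projection of only { _id: 1 } is also an inclusion projection.
  var inclusionMode = included.length > 0 ||
    (fields.length === 0 && "_id" in spec && Boolean(spec._id));
  var result = {};
  Object.keys(doc).forEach(function (k) {
    if (k === "_id") {
      if (keepId) { result._id = doc._id; }
      return;
    }
    var keep = inclusionMode ? included.indexOf(k) !== -1
                             : excluded.indexOf(k) === -1;
    if (keep) { result[k] = doc[k]; }
  });
  return result;
}

var doc = { _id: 1, type: "food", item: "aaa", qty: 10 }; // hypothetical document
var withId = project(doc, { item: 1, qty: 1 });           // _id, item, qty
var withoutId = project(doc, { item: 1, qty: 1, _id: 0 }); // item, qty
var allButType = project(doc, { type: 0 });               // _id, item, qty
var mixedRejected = false;
try { project(doc, { item: 1, type: 0 }); } catch (e) { mixedRejected = true; }
```

The last call mixes an inclusion with an exclusion on a field other than _id, so the model rejects it, mirroring the restriction described above.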
Use the dot notation (page 189) to return specific fields inside an embedded document. For example, the inventory
collection contains the following document:
{
"_id" : 3,
"type" : "food",
"item" : "aaa",
"classification": { dept: "grocery", category: "chocolate" }
}
The following operation returns all documents that match the query. The specified projection returns only
the category field in the classification document. The returned category field remains inside the
classification document.
db.inventory.find(
{ type: 'food', _id: 3 },
{ "classification.category": 1, _id: 0 }
)
Use dot notation (page 189) to suppress specific fields inside an embedded document using a 0 instead of 1. For
example, the inventory collection contains the following document:
{
"_id" : 3,
"type" : "food",
"item" : "Super Dark Chocolate",
"classification" : { "dept" : "grocery", "category" : "chocolate"},
"vendor" : {
"primary" : {
"name" : "Marsupial Vending Co",
"address" : "Wallaby Rd",
"delivery" : ["M","W","F"]
},
"secondary":{
"name" : "Intl. Chocolatiers",
"address" : "Cocoa Plaza",
"delivery" : ["Sa"]
}
}
}
The following operation returns all documents where the value of the type field is food and the _id field is 3. The
projection suppresses only the category field in the classification document. The dept field remains inside
the classification document.
db.inventory.find(
{ type: 'food', _id: 3 },
{ "classification.category": 0}
)
You can suppress nested subdocuments at any depth using dot notation (page 189). The following specifies a projection
to suppress the delivery array only for the secondary document.
db.inventory.find(
{ "type" : "food" },
{ "vendor.secondary.delivery" : 0 }
)
This returns all matching documents, with the delivery array in the secondary document suppressed:
{
"_id" : 3,
"type" : "food",
"item" : "Super Dark Chocolate",
"classification" : { "dept" : "grocery", "category" : "chocolate"},
"vendor" : {
"primary" : {
"name" : "Marsupial Vending Co",
"address" : "Wallaby Rd",
"delivery" : ["M","W","F"]
},
"secondary":{
"name" : "Intl. Chocolatiers",
"address" : "Cocoa Plaza"
}
}
}
For fields that contain arrays, MongoDB provides the following projection operators: $elemMatch, $slice, and
$.
For example, the inventory collection contains the following document:
{ "_id" : 5, "type" : "food", "item" : "aaa", "ratings" : [ 5, 8, 9 ] }
Then the following operation uses the $slice projection operator to return just the first two elements in the ratings
array.
db.inventory.find( { _id: 5 }, { ratings: { $slice: 2 } } )
$elemMatch, $slice, and $ are the only ways to project portions of an array. For instance, you cannot project a
portion of an array using the array index; e.g. the { "ratings.0": 1 } projection will not project the array with
only the first element.
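The single-number form of the $slice projection behaves much like JavaScript's Array.prototype.slice. The sketch below models it with a hypothetical helper, sliceProjection; it assumes the documented behavior that a negative count returns elements from the end of the array, and it does not cover the [ skip, limit ] form.

```javascript
// Models { ratings: { $slice: n } } for a single numeric argument.
function sliceProjection(array, n) {
  return n >= 0 ? array.slice(0, n) : array.slice(n);
}

var ratings = [ 5, 8, 9 ];
var firstTwo = sliceProjection(ratings, 2);   // the first two elements: 5 and 8
var lastTwo = sliceProjection(ratings, -2);   // the last two elements: 8 and 9
```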
See also:
Query Documents (page 103)
On this page
Synopsis (page 119)
Pattern (page 119)
Synopsis
Consider an application where users may submit many scores (e.g. for a test), but the application only needs to track
the top three test scores.
This pattern uses the $push operator with the $each, $sort, and $slice modifiers to sort and maintain an array
of fixed size.
Pattern
Note: When using the $sort modifier on the array element, access the field in the embedded document element
directly instead of using the dot notation on the array field.
After the operation, the document contains only the top 3 scores in the scores array:
{
"_id" : 1,
"scores" : [
{ "attempt" : 3, "score" : 7 },
{ "attempt" : 2, "score" : 8 },
{ "attempt" : 1, "score" : 10 }
]
}
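The effect of combining $push with $each, $sort, and $slice can be traced step by step in plain JavaScript. This is a sketch, not the update itself: pushEachSortSlice is a hypothetical helper, and the starting scores and the two pushed attempts are assumptions chosen to be consistent with the result shown above, using an ascending sort on score and a slice of -3.

```javascript
// Models { $push: { scores: { $each: [...], $sort: { <field>: 1 }, $slice: n } } }.
function pushEachSortSlice(array, each, sortField, slice) {
  var result = array.concat(each);                  // $each: append every new element
  result.sort(function (a, b) {                     // $sort: ascending by the given field
    return a[sortField] - b[sortField];
  });
  return slice < 0 ? result.slice(slice)            // negative $slice: keep the last n
                   : result.slice(0, slice);        // non-negative $slice: keep the first n
}

// Assumed starting array and new attempts (not taken from the source).
var scores = [ { attempt: 1, score: 10 }, { attempt: 2, score: 8 } ];
var newAttempts = [ { attempt: 3, score: 7 }, { attempt: 4, score: 4 } ];

var updated = pushEachSortSlice(scores, newAttempts, "score", -3);
var keptAttempts = updated.map(function (s) { return s.attempt; });
```

After sorting, the scores are 4, 7, 8, and 10; keeping the last three drops the lowest score, so keptAttempts lists attempts 3, 2, and 1, matching the order of the scores array in the document above.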
See also:
$push operator,
$each modifier,
On this page
Manually Iterate the Cursor (page 120)
Iterator Index (page 121)
The db.collection.find() method returns a cursor. To access the documents, you need to iterate the cursor.
However, in the mongo shell, if the returned cursor is not assigned to a variable using the var keyword, then the
cursor is automatically iterated up to 20 times to print up to the first 20 documents in the results. The following
describes ways to manually iterate the cursor to access the documents or to use the iterator index.
In the mongo shell, when you assign the cursor returned from the find() method to a variable using the var
keyword, the cursor does not automatically iterate.
You can call the cursor variable in the shell to iterate up to 20 times and print the matching documents, as in the
following example:
var myCursor = db.inventory.find( { type: 'food' } );
myCursor
You can also use the cursor method next() to access the documents, as in the following example:
var myCursor = db.inventory.find( { type: 'food' } );
while (myCursor.hasNext()) {
print(tojson(myCursor.next()));
}
As an alternative print operation, consider the printjson() helper method to replace print(tojson()):
var myCursor = db.inventory.find( { type: 'food' } );
while (myCursor.hasNext()) {
printjson(myCursor.next());
}
You can use the cursor method forEach() to iterate the cursor and access the documents, as in the following
example:
var myCursor = db.inventory.find( { type: 'food' } );
myCursor.forEach(printjson);
See JavaScript cursor methods and your driver documentation for more information on cursor methods.
Note: You can use DBQuery.shellBatchSize to change the number of iterations from the default value of 20. See mongo-shell-executing-
Iterator Index
In the mongo shell, you can use the toArray() method to iterate the cursor and return the documents in an array,
as in the following:
var myCursor = db.inventory.find( { type: 'food' } );
var documentArray = myCursor.toArray();
var myDocument = documentArray[3];
The toArray() method loads into RAM all documents returned by the cursor; the toArray() method exhausts
the cursor.
Additionally, some drivers provide access to the documents by using an index on the cursor (i.e.
cursor[index]). This is a shortcut for first calling the toArray() method and then using an index on the
resulting array.
Consider the following example:
var myCursor = db.inventory.find( { type: 'food' } );
var myDocument = myCursor[3];
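Why toArray() exhausts the cursor can be seen in a minimal array-backed model. ArrayCursor is a hypothetical class written for illustration; it is not the shell's cursor implementation, and the four sample documents are assumptions.

```javascript
// Minimal array-backed cursor with the three methods discussed above.
function ArrayCursor(docs) {
  this._docs = docs;
  this._pos = 0;
}
ArrayCursor.prototype.hasNext = function () {
  return this._pos < this._docs.length;
};
ArrayCursor.prototype.next = function () {
  return this._docs[this._pos++];
};
ArrayCursor.prototype.toArray = function () {
  var all = [];
  while (this.hasNext()) {      // drains every remaining document into memory
    all.push(this.next());
  }
  return all;
};

var myCursor = new ArrayCursor([
  { item: "a" }, { item: "b" }, { item: "c" }, { item: "d" }
]);
var documentArray = myCursor.toArray();
var myDocument = documentArray[3];        // the fourth document
var exhausted = !myCursor.hasNext();      // nothing is left to iterate
```

Because toArray() is built on hasNext() and next(), every document ends up in the returned array and the cursor has nothing further to return.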
On this page
Evaluate the Performance of a Query (page 121)
Compare Performance of Indexes (page 123)
Additional Resources (page 125)
The following query retrieves documents where the quantity field has a value between 100 and 200, inclusive:
db.inventory.find( { quantity: { $gte: 100, $lte: 200 } } )
To support the query on the quantity field, add an index on the quantity field:
db.inventory.createIndex( { quantity: 1 } )
To manually compare the performance of a query using more than one index, you can use the hint() method in
conjunction with the explain() method.
Consider the following query:
db.inventory.find( { quantity: { $gte: 100, $lte: 300 }, type: "food" } )
To support the query, add a compound index (page 495). With compound indexes (page 495), the order of the fields
matters.
For example, add the following two compound indexes. The first index orders by quantity field first, and then the
type field. The second index orders by type first, and then the quantity field.
db.inventory.createIndex( { quantity: 1, type: 1 } )
db.inventory.createIndex( { type: 1, quantity: 1 } )
To evaluate the effect of the second index on the query, force its use with hint() and request execution statistics
with explain():
db.inventory.find(
{ quantity: { $gte: 100, $lte: 300 }, type: "food" }
).hint({ type: 1, quantity: 1 }).explain("executionStats")
Additional Resources
On this page
Synopsis (page 126)
Background (page 126)
Pattern (page 126)
Recovering from Failure Scenarios (page 129)
Multiple Applications (page 131)
Using Two-Phase Commits in Production Applications (page 132)
Synopsis
This document provides a pattern for doing multi-document updates or multi-document transactions using a two-
phase commit approach for writing data to multiple documents. Additionally, you can extend this process to provide
a rollback-like (page 130) functionality.
Background
Operations on a single document are always atomic with MongoDB databases; however, operations that involve multi-
ple documents, which are often referred to as multi-document transactions, are not atomic. Since documents can be
fairly complex and contain multiple nested documents, single-document atomicity provides the necessary support
for many practical use cases.
Despite the power of single-document atomic operations, there are cases that require multi-document transactions.
When executing a transaction composed of sequential operations, certain issues arise, such as:
Atomicity: if one operation fails, the previous operations within the transaction must roll back to the previous
state (i.e. the "nothing" in "all or nothing").
Consistency: if a major failure (e.g. network, hardware) interrupts the transaction, the database must be able to
recover a consistent state.
For situations that require multi-document transactions, you can implement two-phase commit in your application to
provide support for these kinds of multi-document updates. Using two-phase commit ensures that data is consistent
and, in case of an error, the state that preceded the transaction is recoverable (page 130). During the procedure,
however, documents can represent pending data and states.
Note: Because only single-document operations are atomic with MongoDB, two-phase commits can only offer
transaction-like semantics. It is possible for applications to return intermediate data at intermediate points during the
two-phase commit or rollback.
Pattern
Overview
Consider a scenario where you want to transfer funds from account A to account B. In a relational database system,
you can subtract the funds from A and add the funds to B in a single multi-statement transaction. In MongoDB, you
can emulate a two-phase commit to achieve a comparable result.
The examples in this tutorial use the following two collections:
1. A collection named accounts to store account information.
2. A collection named transactions to store information on the fund transfer transactions.
Insert into the accounts collection a document for account A and a document for account B.
db.accounts.insert(
[
{ _id: "A", balance: 1000, pendingTransactions: [] },
{ _id: "B", balance: 1000, pendingTransactions: [] }
]
)
The operation returns a BulkWriteResult() object with the status of the operation. Upon successful insert, the
BulkWriteResult() has nInserted set to 2.
For each fund transfer to perform, insert into the transactions collection a document with the transfer information.
The document contains the following fields:
source and destination fields, which refer to the _id fields from the accounts collection,
value field, which specifies the amount of transfer affecting the balance of the source and
destination accounts,
state field, which reflects the current state of the transfer. The state field can have one of the values initial,
pending, applied, done, canceling, or canceled.
lastModified field, which reflects the last modification date.
To initialize the transfer of 100 from account A to account B, insert into the transactions collection a document
with the transfer information, the transaction state of "initial", and the lastModified field set to the current
date:
db.transactions.insert(
{ _id: 1, source: "A", destination: "B", value: 100, state: "initial", lastModified: new Date() }
)
The operation returns a WriteResult() object with the status of the operation. Upon successful insert, the
WriteResult() object has nInserted set to 1.
Step 1: Retrieve the transaction to start. From the transactions collection, find a transaction in the initial
state. Currently the transactions collection has only one document, namely the one added in the Initialize
Transfer Record (page 127) step. If the collection contains additional documents, the query will return any transaction
with an initial state unless you specify additional query conditions.
var t = db.transactions.findOne( { state: "initial" } )
Type the variable t in the mongo shell to print the contents of the variable. The operation should print a document
similar to the following, except that the lastModified field should reflect the date of your insert operation:
{ "_id" : 1, "source" : "A", "destination" : "B", "value" : 100, "state" : "initial", "lastModified"
Step 2: Update transaction state to pending. Set the transaction state from initial to pending and use the
$currentDate operator to set the lastModified field to the current date.
db.transactions.update(
{ _id: t._id, state: "initial" },
{
$set: { state: "pending" },
$currentDate: { lastModified: true }
}
)
The operation returns a WriteResult() object with the status of the operation. Upon successful update,
nMatched and nModified are both 1.
In the update statement, the state: "initial" condition ensures that no other process has already updated this
record. If nMatched and nModified are 0, go back to the first step to get a different transaction and restart the
procedure.
Step 3: Apply the transaction to both accounts. Apply the transaction t to both accounts using the update()
method if the transaction has not been applied to the accounts. In the update condition, include the condition
pendingTransactions: { $ne: t._id } in order to avoid re-applying the transaction if the step is run
more than once.
To apply the transaction to the account, update both the balance field and the pendingTransactions field.
Update the source account, subtracting from its balance the transaction value and adding to its
pendingTransactions array the transaction _id.
db.accounts.update(
{ _id: t.source, pendingTransactions: { $ne: t._id } },
{ $inc: { balance: -t.value }, $push: { pendingTransactions: t._id } }
)
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.
Update the destination account, adding to its balance the transaction value and adding to its
pendingTransactions array the transaction _id.
db.accounts.update(
{ _id: t.destination, pendingTransactions: { $ne: t._id } },
{ $inc: { balance: t.value }, $push: { pendingTransactions: t._id } }
)
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.
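The effect of the pendingTransactions: { $ne: t._id } guard can be sketched in plain JavaScript. The applyOnce() function is a hypothetical in-memory stand-in for the update() calls above, operating on an ordinary object rather than a collection; it is meant only to show why running the step twice changes the balance once.

```javascript
// Hypothetical in-memory analogue of:
//   db.accounts.update(
//     { _id: ..., pendingTransactions: { $ne: txId } },
//     { $inc: { balance: delta }, $push: { pendingTransactions: txId } } )
function applyOnce(account, txId, delta) {
  // Guard: skip accounts that already list this transaction as pending.
  if (account.pendingTransactions.indexOf(txId) !== -1) {
    return { nMatched: 0, nModified: 0 };
  }
  account.balance += delta;               // corresponds to $inc
  account.pendingTransactions.push(txId); // corresponds to $push
  return { nMatched: 1, nModified: 1 };
}

var accountA = { _id: "A", balance: 1000, pendingTransactions: [] };
var first = applyOnce(accountA, 1, -100);  // applies the debit
var second = applyOnce(accountA, 1, -100); // no-op: the guard no longer matches
```

In the real update, the guard and the modification are evaluated together as one atomic single-document operation, which is what makes the step safe to repeat after a failure.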
Step 4: Update transaction state to applied. Use the following update() operation to set the transaction's
state to applied and update the lastModified field:
db.transactions.update(
{ _id: t._id, state: "pending" },
{
$set: { state: "applied" },
$currentDate: { lastModified: true }
}
)
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.
Step 5: Update both accounts' list of pending transactions. Remove the applied transaction _id from the
pendingTransactions array for both accounts.
Update the source account.
db.accounts.update(
{ _id: t.source, pendingTransactions: t._id },
{ $pull: { pendingTransactions: t._id } }
)
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.
Update the destination account.
db.accounts.update(
{ _id: t.destination, pendingTransactions: t._id },
{ $pull: { pendingTransactions: t._id } }
)
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.
Step 6: Update transaction state to done. Complete the transaction by setting the state of the transaction to
done and updating the lastModified field:
db.transactions.update(
{ _id: t._id, state: "applied" },
{
$set: { state: "done" },
$currentDate: { lastModified: true }
}
)
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.
Recovering from Failure Scenarios
The most important part of the transaction procedure is not the prototypical example above, but rather the possibility
for recovering from the various failure scenarios when transactions do not complete successfully. This section presents
an overview of possible failures and provides steps to recover from these kinds of events.
Recovery Operations
The two-phase commit pattern allows applications running the sequence to resume the transaction and arrive at a
consistent state. Run the recovery operations at application startup, and possibly at regular intervals, to catch any
unfinished transactions.
The time required to reach a consistent state depends on how long the application needs to recover each transaction.
The following recovery procedures use the lastModified date as an indicator of whether the pending transaction
requires recovery; specifically, if the pending or applied transaction has not been updated in the last 30 minutes,
the procedures determine that these transactions require recovery. You can use different conditions to make this
determination.
Transactions in Pending State To recover from failures that occur after the Update transaction state to pending
step but before the Update transaction state to applied step, retrieve from the transactions
collection a pending transaction for recovery:
var dateThreshold = new Date();
dateThreshold.setMinutes(dateThreshold.getMinutes() - 30);
var t = db.transactions.findOne( { state: "pending", lastModified: { $lt: dateThreshold } } );
And resume from the Apply the transaction to both accounts step.
Transactions in Applied State To recover from failures that occur after the Update transaction state to applied
step but before the Update transaction state to done step, retrieve from the transactions collection
an applied transaction for recovery:
var dateThreshold = new Date();
dateThreshold.setMinutes(dateThreshold.getMinutes() - 30);
var t = db.transactions.findOne( { state: "applied", lastModified: { $lt: dateThreshold } } );
And resume from the Update both accounts' list of pending transactions step.
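The 30-minute staleness test used by both recovery procedures can be sketched in plain JavaScript. The findStale() helper and the sample dates below are hypothetical; the helper filters an in-memory array instead of querying the transactions collection, and takes the current time as a parameter so that the result is reproducible.

```javascript
// Hypothetical in-memory analogue of querying
//   { state: <state>, lastModified: { $lt: dateThreshold } }
function findStale(transactions, state, now, maxAgeMinutes) {
  var dateThreshold = new Date(now.getTime() - maxAgeMinutes * 60 * 1000);
  return transactions.filter(function (t) {
    return t.state === state && t.lastModified < dateThreshold;
  });
}

var now = new Date("2016-01-01T12:00:00Z");
var transactions = [
  { _id: 1, state: "pending", lastModified: new Date("2016-01-01T11:00:00Z") }, // 60 minutes old
  { _id: 2, state: "pending", lastModified: new Date("2016-01-01T11:50:00Z") }, // 10 minutes old
  { _id: 3, state: "applied", lastModified: new Date("2016-01-01T10:00:00Z") }  // 120 minutes old
];

var stalePending = findStale(transactions, "pending", now, 30); // only _id 1
var staleApplied = findStale(transactions, "applied", now, 30); // only _id 3
```

The transaction that was modified 10 minutes ago is left alone, on the assumption that another process may still be working on it.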
Rollback Operations
In some cases, you may need to roll back or undo a transaction; e.g., if the application needs to cancel the
transaction or if one of the accounts does not exist or stops existing during the transaction.
Transactions in Applied State After the Update transaction state to applied step, you should not
roll back the transaction. Instead, complete that transaction and create a new transaction (page 127) to reverse the
transaction by switching the values in the source and the destination fields.
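As a minimal sketch of such a reversing transaction, the hypothetical makeReversal() helper below builds a new transfer document from an applied one. The helper name and the caller-supplied newId parameter are assumptions for illustration; the resulting document would then go through the same two-phase commit steps as any other transfer.

```javascript
// Builds a new transaction document that undoes an already applied transfer
// by swapping source and destination. newId is a caller-supplied _id (assumption).
function makeReversal(t, newId) {
  return {
    _id: newId,
    source: t.destination,
    destination: t.source,
    value: t.value,
    state: "initial",
    lastModified: new Date()
  };
}

var appliedTransfer = { _id: 1, source: "A", destination: "B", value: 100, state: "applied" };
var reversal = makeReversal(appliedTransfer, 2);
// reversal moves 100 from B back to A and starts in the "initial" state
```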
Transactions in Pending State After the Update transaction state to pending step, but before the
Update transaction state to applied step, you can roll back the transaction using the following procedure:
Step 1: Update transaction state to canceling. Update the transaction state from pending to canceling.
db.transactions.update(
{ _id: t._id, state: "pending" },
{
$set: { state: "canceling" },
$currentDate: { lastModified: true }
}
)
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.
Step 2: Undo the transaction on both accounts. To undo the transaction on both accounts, reverse the transaction
t if the transaction has been applied. In the update condition, include the condition pendingTransactions:
t._id in order to update the account only if the pending transaction has been applied.
Update the destination account, subtracting from its balance the transaction value and removing the transaction
_id from the pendingTransactions array.
db.accounts.update(
{ _id: t.destination, pendingTransactions: t._id },
{
$inc: { balance: -t.value },
$pull: { pendingTransactions: t._id }
}
)
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to
1. If the pending transaction has not been previously applied to this account, no document will match the update
condition and nMatched and nModified will be 0.
Update the source account, adding to its balance the transaction value and removing the transaction _id from
the pendingTransactions array.
db.accounts.update(
{ _id: t.source, pendingTransactions: t._id },
{
$inc: { balance: t.value},
$pull: { pendingTransactions: t._id }
}
)
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to
1. If the pending transaction has not been previously applied to this account, no document will match the update
condition and nMatched and nModified will be 0.
Step 3: Update transaction state to canceled. To finish the rollback, update the transaction state from
canceling to canceled.
db.transactions.update(
{ _id: t._id, state: "canceling" },
{
$set: { state: "canceled" },
$currentDate: { lastModified: true }
}
)
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.
Multiple Applications
Transactions exist, in part, so that multiple applications can create and run operations concurrently without causing
data inconsistency or conflicts. In our procedure, to update or retrieve the transaction document, the update conditions
include a condition on the state field to prevent reapplication of the transaction by multiple applications.
For example, applications App1 and App2 both grab the same transaction, which is in the initial state. App1
applies the whole transaction before App2 starts. When App2 attempts to perform the Update transaction state to
pending step, the update condition, which includes the state: "initial" criterion, will not match
any document, and nMatched and nModified will be 0. This should signal to App2 to go back to the first step
to restart the procedure with a different transaction.
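The App1 and App2 scenario can be sketched in plain JavaScript. The claimTransaction() helper is a hypothetical in-memory stand-in for the state-conditioned update; it changes the state only when the document is still in the initial state, and reports the outcome in the same nMatched and nModified terms.

```javascript
// Hypothetical in-memory analogue of the conditional update:
//   db.transactions.update( { _id: id, state: "initial" }, { $set: { state: "pending" } } )
function claimTransaction(transactions, id) {
  for (var i = 0; i < transactions.length; i++) {
    var t = transactions[i];
    if (t._id === id && t.state === "initial") {
      t.state = "pending";
      return { nMatched: 1, nModified: 1 };
    }
  }
  return { nMatched: 0, nModified: 0 };
}

var transactions = [
  { _id: 1, source: "A", destination: "B", value: 100, state: "initial" }
];

var app1Result = claimTransaction(transactions, 1); // App1 wins: state becomes "pending"
var app2Result = claimTransaction(transactions, 1); // App2 matches nothing and must pick another transaction
```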
When multiple applications are running, it is crucial that only one application can handle a given transaction at any
point in time. As such, in addition to including the expected state of the transaction in the update condition, you can
also create a marker in the transaction document itself to identify the application that is handling the transaction. Use
the findAndModify() method to modify the transaction and get it back in one step:
t = db.transactions.findAndModify(
{
query: { state: "initial", application: { $exists: false } },
update:
{
$set: { state: "pending", application: "App1" },
$currentDate: { lastModified: true }
},
new: true
}
)
Amend the transaction operations to ensure that only applications that match the identifier in the application field
apply the transaction.
If the application App1 fails during transaction execution, you can use the recovery procedures (page 129), but
applications should ensure that they own the transaction before applying the transaction. For example, to find and
resume the pending job, use a query that resembles the following:
var dateThreshold = new Date();
dateThreshold.setMinutes(dateThreshold.getMinutes() - 30);
db.transactions.find(
{
application: "App1",
state: "pending",
lastModified: { $lt: dateThreshold }
}
)
Using Two-Phase Commits in Production Applications
The example transaction above is intentionally simple. For example, it assumes that it is always possible to roll back
operations to an account and that account balances can hold negative values.
Production implementations would likely be more complex. Typically, accounts need information about current bal-
ance, pending credits, and pending debits.
For all transactions, ensure that you use the appropriate level of write concern (page 141) for your deployment.
On this page
Overview (page 132)
Pattern (page 133)
Example (page 133)
Modifications to the Pattern (page 133)
Overview
The Update if Current pattern is an approach to concurrency control (page 89) when multiple applications have access
to the data.
Pattern
The pattern queries for the document to update. Then, for each field to modify, the pattern includes the field and its
value in the returned document in the query predicate for the update operation. This way, the update only modifies the
document fields if the fields have not changed since the query.
Example
Consider the following example in the mongo shell. The example updates the quantity and the reordered fields
of a document only if the fields have not changed since the query.
Changed in version 2.6: The db.collection.update() method now returns a WriteResult() object that
contains the status of the operation. Previous versions required an extra db.getLastErrorObj() method call.
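var myDocument = db.products.findOne( { sku: "abc123" } );
if ( myDocument ) {
    // Remember the values that the update decision is based on.
    var oldQuantity = myDocument.quantity;
    var oldReordered = myDocument.reordered;
    // Update only if quantity and reordered still hold the values just read.
    // The $inc and $set amounts are illustrative.
    var results = db.products.update(
        {
            _id: myDocument._id,
            quantity: oldQuantity,
            reordered: oldReordered
        },
        {
            $inc: { quantity: 50 },
            $set: { reordered: true }
        }
    );
    if ( results.hasWriteError() ) {
        print( "unexpected error updating document: " + tojson(results) );
    }
    else if ( results.nMatched === 0 ) {
        print( "No matching document for " +
               "{ _id: "+ myDocument._id.toString() +
               ", quantity: " + oldQuantity +
               ", reordered: " + oldReordered
               + " } "
        );
    }
}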
Another approach is to add a version field to the documents. Applications increment this field upon each update
operation to the documents. You must be able to ensure that all clients that connect to your database include the
version field in the query predicate. To associate increasing numbers with documents in a collection, you can use
one of the methods described in Create an Auto-Incrementing Sequence Field (page 134).
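The version field approach can be sketched in plain JavaScript. The updateIfVersion() helper and the field name version are assumptions for illustration; the helper works on an in-memory array and stands in for an update whose query predicate includes the version value that the client read earlier.

```javascript
// Hypothetical in-memory analogue of an update whose query predicate includes
// the version read earlier, e.g. { _id: ..., version: expectedVersion }.
function updateIfVersion(docs, id, expectedVersion, changes) {
  for (var i = 0; i < docs.length; i++) {
    var doc = docs[i];
    if (doc._id === id && doc.version === expectedVersion) {
      for (var key in changes) {
        if (Object.prototype.hasOwnProperty.call(changes, key)) {
          doc[key] = changes[key];
        }
      }
      doc.version += 1; // every successful update increments the version
      return { nMatched: 1, nModified: 1 };
    }
  }
  return { nMatched: 0, nModified: 0 }; // document changed since it was read
}

var products = [ { _id: 1, sku: "abc123", quantity: 10, version: 1 } ];
var readVersion = products[0].version; // a client reads version 1

var firstUpdate = updateIfVersion(products, 1, readVersion, { quantity: 60 }); // succeeds
var staleUpdate = updateIfVersion(products, 1, readVersion, { quantity: 5 });  // rejected: version is now 2
```

Compared with listing every modified field in the predicate, a single version field keeps the predicate short, at the cost of requiring every writer to increment it.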
For more approaches, see Concurrency Control (page 89).
On this page
Overview (page 134)
Overview
By default, MongoDB will automatically close a cursor when the client has exhausted all results in the cursor.
However, for capped collections (page 228) you may use a Tailable Cursor that remains open after the client exhausts
the results in the initial cursor. Tailable cursors are conceptually equivalent to the tail Unix command with the -f
option (i.e. with follow mode). After clients insert additional documents into a capped collection, the tailable
cursor will continue to retrieve documents.
Use tailable cursors on capped collections that have high write volumes where indexes aren't practical. For instance,
MongoDB replication (page 613) uses tailable cursors to tail the primary's oplog.
Note: If your query is on an indexed field, do not use tailable cursors, but instead, use a regular cursor. Keep track of
the last value of the indexed field returned by the query. To retrieve the newly added documents, query the collection
again using the last value of the indexed field in the query criteria, as in the following example:
db.<collection>.find( { indexedField: { $gt: <lastvalue> } } )
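The polling approach in the note can be sketched in plain JavaScript. The fetchNewer() helper is a hypothetical in-memory stand-in for the $gt query above, and the sample array is assumed to be in ascending order of indexedField, so the last element of each batch carries the highest value seen so far.

```javascript
// Hypothetical in-memory analogue of
//   db.<collection>.find( { indexedField: { $gt: lastValue } } )
function fetchNewer(docs, lastValue) {
  return docs.filter(function (d) { return d.indexedField > lastValue; });
}

var collection = [ { indexedField: 1 }, { indexedField: 2 } ];
var lastValue = 0;

var batch = fetchNewer(collection, lastValue);     // first poll returns both documents
lastValue = batch[batch.length - 1].indexedField;  // remember the last value seen

collection.push({ indexedField: 3 });              // another client inserts a document
var nextBatch = fetchNewer(collection, lastValue); // second poll returns only the new document
```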
On this page
Synopsis (page 134)
Considerations (page 135)
Procedures (page 135)
Synopsis
MongoDB reserves the _id field in the top level of all documents as a primary key. _id must be unique, and always
has an index with a unique constraint (page 514). However, except for the unique constraint, you can use any value for
the _id field in your collections. This tutorial describes two methods for creating an incrementing sequence number
for the _id field:
Counter Collection Implementation
Optimistic Loop
Considerations
Generally in MongoDB, you would not use an auto-increment pattern for the _id field, or any field, because it does
not scale for databases with large numbers of documents. Typically, the default ObjectId value is a better choice for
the _id.
Procedures
Counter Collection Implementation Use a separate counters collection to track the last number sequence used.
The _id field contains the sequence name and the seq field contains the last value of the sequence.
1. Insert into the counters collection the initial value for the userid:
db.counters.insert(
{
_id: "userid",
seq: 0
}
)
2. Create a getNextSequence function that accepts a name of the sequence. The function uses the
findAndModify() method to atomically increment the seq value and return this new value:
function getNextSequence(name) {
var ret = db.counters.findAndModify(
{
query: { _id: name },
update: { $inc: { seq: 1 } },
new: true
}
);
return ret.seq;
}
3. Use the getNextSequence() function when inserting documents:
db.users.insert(
{
_id: getNextSequence("userid"),
name: "Bob D."
}
)
findAndModify Behavior When findAndModify() includes the upsert: true option and the query
field(s) is not uniquely indexed, the method could insert a document multiple times in certain circumstances. For
instance, if multiple clients each invoke the method with the same query condition and these methods complete the
find phase before any of the methods performs the modify phase, these methods could insert the same document.
In the counters collection example, the query field is the _id field, which always has a unique index. Consider
that the findAndModify() includes the upsert: true option, as in the following modified example:
function getNextSequence(name) {
var ret = db.counters.findAndModify(
{
query: { _id: name },
update: { $inc: { seq: 1 } },
new: true,
upsert: true
}
);
return ret.seq;
}
If multiple clients were to invoke the getNextSequence() method with the same name parameter, then the
methods would observe one of the following behaviors:
Exactly one findAndModify() would successfully insert a new document.
Zero or more findAndModify() methods would update the newly inserted document.
Zero or more findAndModify() methods would fail when they attempted to insert a duplicate.
If the method fails due to a unique index constraint violation, retry the method. Absent a delete of the document, the
retry should not fail.
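The retry advice can be sketched as a small wrapper. This is a minimal sketch under assumptions: the retryOnDuplicateKey() helper and its maxAttempts limit are hypothetical, and the failure is assumed to surface as a thrown error carrying a code property of 11000 (the duplicate key code); how a given shell or driver actually reports the failure may differ. The flakyGetNextSequence() function only simulates losing the upsert race once.

```javascript
// Retries an operation while it fails with a duplicate key error (code 11000).
// maxAttempts is an illustrative safety limit, not part of the original procedure.
function retryOnDuplicateKey(operation, maxAttempts) {
  var lastError;
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return operation();
    } catch (e) {
      if (e.code !== 11000) { throw e; } // unrelated errors are not retried
      lastError = e;
    }
  }
  throw lastError;
}

// Simulated getNextSequence() that loses the upsert race once, then succeeds.
var calls = 0;
function flakyGetNextSequence() {
  calls++;
  if (calls === 1) {
    var err = new Error("E11000 duplicate key error");
    err.code = 11000;
    throw err;
  }
  return 1;
}

var seq = retryOnDuplicateKey(flakyGetNextSequence, 3);
```

The retry succeeds because, once some client has inserted the counter document, later attempts take the update path instead of the insert path.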
Optimistic Loop
In this pattern, an Optimistic Loop calculates the incremented _id value and attempts to insert a document with the
calculated _id value. If the insert is successful, the loop ends. Otherwise, the loop will iterate through possible _id
values until the insert is successful.
1. Create a function named insertDocument that performs the "insert if not present" loop. The function wraps
the insert() method and takes doc and targetCollection arguments.
Changed in version 2.6: The db.collection.insert() method now returns a writeresults-insert object
that contains the status of the operation. Previous versions required an extra db.getLastErrorObj()
method call.
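function insertDocument(doc, targetCollection) {
    while (1) {
        // Find the current highest _id and propose the next value.
        var cursor = targetCollection.find( {}, { _id: 1 } ).sort( { _id: -1 } ).limit(1);
        var seq = cursor.hasNext() ? cursor.next()._id + 1 : 1;
        doc._id = seq;
        var results = targetCollection.insert(doc);
        if( results.hasWriteError() ) {
            if( results.writeError.code == 11000 /* dup key */ )
                continue; // another client took this _id; try the next one
            else
                print( "unexpected error inserting data: " + tojson( results ) );
        }
        break;
    }
}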
2. Use the following sample code to perform the insert:
var myCollection = db.users2; // example target collection
insertDocument(
{
name: "Grace H."
},
myCollection
);
insertDocument(
{
name: "Ted R."
},
myCollection
)
The while loop may iterate many times in collections with larger insert volumes.
Overview
When reading from the primary of a replica set, it is possible to read data that is stale or not durable, depending
on the read concern used. 14 With a read concern level of "local" (page 144), a client can read data before it is
durable; that is, before the data has propagated to enough replica set members to avoid a rollback. A read concern
level of "majority" (page 144) guarantees durable reads but may return stale data that has been overwritten by another
write operation.
This tutorial outlines a procedure that uses db.collection.findAndModify() to read data that is not stale and
cannot be rolled back. To do so, the procedure uses the findAndModify() method with a write concern (page 141)
to modify a dummy field in a document. Specifically, the procedure requires that:
db.collection.findAndModify() use an exact match query, and a unique index (page 514) must exist
to satisfy the query.
findAndModify() must actually modify a document; i.e. result in a change to the document.
findAndModify() must use the write concern { w: "majority" } (page 142).
Important: The quorum read procedure has a substantial cost over simply using a read concern of "majority"
(page 144) because it incurs write latency rather than read latency. This technique should only be used if staleness is
absolutely intolerable.
Prerequisites
This tutorial reads from a collection named products. Initialize the collection using the following operation.
14 In some circumstances (page 722), two nodes in a replica set may transiently believe that they are the primary, but at most, one of them
will be able to complete writes with { w: "majority" } (page 142) write concern. The node that can complete { w: "majority" }
(page 142) writes is the current primary, and the other node is a former primary that has not yet recognized its demotion, typically due to a network
partition. When this occurs, clients that connect to the former primary may observe stale data despite having requested read preference primary
(page 721), and new writes to the former primary will eventually roll back.
db.products.insert( [
{
_id: 1,
sku: "xyz123",
description: "hats",
available: [ { quantity: 25, size: "S" }, { quantity: 50, size: "M" } ],
_dummy_field: 0
},
{
_id: 2,
sku: "abc123",
description: "socks",
available: [ { quantity: 10, size: "L" } ],
_dummy_field: 0
},
{
_id: 3,
sku: "ijk123",
description: "t-shirts",
available: [ { quantity: 30, size: "M" }, { quantity: 5, size: "L" } ],
_dummy_field: 0
}
] )
The documents in this collection contain a dummy field named _dummy_field that will be incre-
mented by the db.collection.findAndModify() in the tutorial. If the field does not exist, the
db.collection.findAndModify() operation will add the field to the document. The purpose of the field
is to ensure that the db.collection.findAndModify() results in a modification to the document.
Procedure
Create a unique index on the fields that will be used to specify an exact match in the
db.collection.findAndModify() operation.
This tutorial will use an exact match on the sku field. As such, create a unique index on the sku field.
db.products.createIndex( { sku: 1 }, { unique: true } )
Use the db.collection.findAndModify() method to make a trivial update to the document you want to read
and return the modified document. A write concern of { w: "majority" } (page 142) is required. To specify
the document to read, you must use an exact match query that is supported by a unique index.
The following findAndModify() operation specifies an exact match on the uniquely indexed field sku and incre-
ments the field named _dummy_field in the matching document. While not necessary, the write concern for this
command also includes a wtimeout (page 143) value of 5000 milliseconds to prevent the operation from blocking
forever if the write cannot propagate to a majority of voting members.
var updatedDocument = db.products.findAndModify(
{
query: { sku: "abc123" },
update: { $inc: { _dummy_field: 1 } },
new: true,
writeConcern: { w: "majority", wtimeout: 5000 }
}
);
Even in situations where two nodes in the replica set believe that they are the primary, only one will be able to complete
the write with w: "majority" (page 142). As such, the findAndModify() method with "majority"
(page 142) write concern will be successful only when the client has connected to the true primary to perform the
operation.
Since the quorum read procedure only increments a dummy field in the document, you can safely repeat invocations
of findAndModify(), adjusting the wtimeout (page 143) as necessary.
On this page
Query Cursor Methods (page 140)
Query and Data Manipulation Collection Methods (page 140)
MongoDB CRUD Reference Documentation (page 141)
Name              Description
cursor.count()    Modifies the cursor to return the number of documents in the result set rather than the documents themselves.
cursor.explain()  Reports on the query execution plan for a cursor.
cursor.hint()     Forces MongoDB to use a specific index for a query.
cursor.limit()    Constrains the size of a cursor's result set.
cursor.next()     Returns the next document in a cursor.
cursor.skip()     Returns a cursor that begins returning results only after passing or skipping a number of documents.
cursor.sort()     Returns results ordered according to a sort specification.
cursor.toArray()  Returns an array that contains all documents returned by the cursor.
Name                      Description
db.collection.count()     Wraps count to return a count of the number of documents in a collection or matching a query.
db.collection.distinct()  Returns an array of documents that have distinct values for the specified field.
db.collection.find()      Performs a query on a collection and returns a cursor object.
db.collection.findOne()   Performs a query and returns a single document.
db.collection.insert()    Creates a new document in a collection.
db.collection.remove()    Deletes documents from a collection.
db.collection.save()      Provides a wrapper around an insert() and update() to insert new documents.
db.collection.update()    Modifies a document in a collection.
Write Concern (page 141) Description of the write operation acknowledgements returned by MongoDB.
Read Concern (page 143) Description of the readConcern option.
SQL to MongoDB Mapping Chart (page 145) An overview of common database operations showing both the
MongoDB operations and SQL statements.
The bios Example Collection (page 151) Sample data for experimenting with MongoDB. insert(), update()
and find() pages use the data for some of their examples.
Write Concern
On this page
Write Concern Specification (page 141)
Write concern describes the level of acknowledgement requested from MongoDB for write operations to a standalone
mongod or to replica sets (page 613) or to sharded clusters (page 725). In sharded clusters, mongos instances will
pass the write concern on to the shards.
Changed in version 3.2: For replica sets using protocolVersion: 1 (page 711) and running with the journal
enabled:
w: "majority" (page 142) implies j: true (page 142).
Secondary members acknowledge replicated write operations after the secondary members have written to their
respective on-disk journals, regardless of the j (page 142) option used for the write on the primary.
Changed in version 2.6: A new protocol for write operations (page 986) integrates write concerns with the write oper-
ations and eliminates the need to call the getLastError command. Previous versions required a getLastError
command immediately after a write operation to specify the write concern.
Write Concern Specification
Write concern can include the following fields:
{ w: <value>, j: <boolean>, wtimeout: <number> }
the w (page 141) option to request acknowledgment that the write operation has propagated to a specified number
of mongod instances or to mongod instances with specified tags.
the j (page 142) option to request acknowledgement that the write operation has been written to the journal, and
the wtimeout (page 143) option to specify a time limit to prevent write operations from blocking indefinitely.
w Option The w option requests acknowledgement that the write operation has propagated to a specified number of
mongod instances or to mongod instances with specified tags.
Using the w option, the following w: <value> write concerns are available:
Note: Standalone mongod instances and primaries of replica sets acknowledge write operations after applying the
write in memory, unless j:true (page 142).
Changed in version 3.2: For replica sets using protocolVersion: 1 (page 711), secondaries acknowledge write
operations after the secondary members have written to their respective on-disk journals (page 598), regardless of the
j (page 142) option.
Value          Description
<number>       Requests acknowledgement that the write operation has propagated to the specified number of
               mongod instances. For example:
               w: 1  Requests acknowledgement that the write operation has propagated to the standalone
                     mongod or the primary in a replica set. w: 1 is the default write concern for MongoDB.
               w: 0  Requests no acknowledgement of the write operation. However, w: 0 may return
                     information about socket exceptions and networking errors to the application.
                     If you specify w: 0 but include j: true (page 142), the j: true (page 142) prevails to
                     request acknowledgement from the standalone mongod or the primary of a replica set.
               Numbers greater than 1 are valid only for replica sets, to request acknowledgement from the
               specified number of members, including the primary.
"majority"     Changed in version 3.2.
               Requests acknowledgement that write operations have propagated to the majority of voting
               nodes, including the primary, and have been written to the on-disk journal (page 598) for
               these nodes.
               For replica sets using protocolVersion: 1 (page 711), w: "majority" (page 142) implies
               j: true (page 142). So, unlike w: <number>, with w: "majority" (page 142), the primary also
               writes to the on-disk journal before acknowledging the write.
               After the write operation returns with a w: "majority" (page 142) acknowledgement to the
               client, the client can read the result of that write with a "majority" (page 144) readConcern.
<tag set>      Requests acknowledgement that the write operations have propagated to a replica set member
               with the specified tag (page 691).
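To make the "majority of voting nodes" wording concrete, the following sketch computes how many voting members must acknowledge a w: "majority" write. It assumes only that a majority means more than half of the voting members; majorityCount is a hypothetical helper name.

```javascript
// Number of voting members that must acknowledge a w: "majority" write,
// assuming a majority is "more than half" of the voting members.
function majorityCount(votingMembers) {
  if (!Number.isInteger(votingMembers) || votingMembers < 1) {
    throw new Error("votingMembers must be a positive integer");
  }
  return Math.floor(votingMembers / 2) + 1;
}

for (const n of [1, 3, 4, 5, 7]) {
  console.log(n + " voting members -> majority of " + majorityCount(n));
}
// 1 -> 1, 3 -> 2, 4 -> 3, 5 -> 3, 7 -> 4
```

Note that adding a fourth voting member to a three-member set raises the majority from 2 to 3 without tolerating any additional failures.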
j Option The j (page 142) option requests acknowledgement from MongoDB that the write operation has been
written to the journal (page 598).
wtimeout This option specifies a time limit, in milliseconds, for the write concern. wtimeout is only applicable
for w values greater than 1.
wtimeout causes write operations to return with an error after the specified limit, even if the required write concern
will eventually succeed. When these write operations return, MongoDB does not undo successful data modifications
performed before the write concern exceeded the wtimeout time limit.
If you do not specify the wtimeout option and the level of write concern is unachievable, the write operation will
block indefinitely. Specifying a wtimeout value of 0 is equivalent to a write concern without the wtimeout option.
Read Concern
On this page
Storage Engine and Drivers Support (page 143)
Read Concern Levels (page 144)
readConcern Option (page 144)
For the WiredTiger storage engine (page 587), the readConcern option allows clients to choose a level of isolation
for their reads. You can specify a read concern of "majority" to read data that has been written to a majority of
replica set members and thus cannot be rolled back.
With the MMAPv1 storage engine (page 595), you can only specify a readConcern option of "local".
Tip
By default, MongoDB uses a readConcern of "local" which does not guarantee that the read data would not be
rolled back.
You can specify a readConcern of "majority" to read data that has been written to a majority of replica set
members and thus cannot be rolled back.
level          Description
"local"        Default. The query returns the instance's most recent copy of data. Provides no guarantee that
               the data has been written to a majority of the replica set members.
"majority"     The query returns the instance's most recent copy of data confirmed as written to a majority
               of members in the replica set.
               To use a read concern level of "majority" (page 144), you must use the WiredTiger storage
               engine and start the mongod instances with the --enableMajorityReadConcern command line
               option (or the replication.enableMajorityReadConcern setting if using a configuration file).
               Only replica sets using protocol version 1 (page 711) support "majority" (page 144) read
               concern. Replica sets running protocol version 0 do not support "majority" (page 144) read
               concern.
               To ensure that a single thread can read its own writes, use "majority" (page 144) read
               concern and "majority" (page 142) write concern against the primary of the replica set.
Regardless of the read concern level, the most recent data on a node may not reflect the most recent version of the data
in the system.
readConcern Option
For the level field, specify either the string "majority" or "local".
The readConcern option is available for the following operations:
find command
aggregate command and the db.collection.aggregate() method
distinct command
count command
parallelCollectionScan command
geoNear command
geoSearch command
To specify the read concern for the mongo shell method db.collection.find(), use the
cursor.readConcern() method.
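As a rough illustration of where the readConcern option sits in a command, the sketch below builds a find command document in plain JavaScript. findWithReadConcern is a hypothetical helper, and the restaurants collection and rating field are assumed names.

```javascript
// The two levels the level field accepts, as listed above.
const READ_CONCERN_LEVELS = ["local", "majority"];

// Hypothetical helper: builds a find command document that carries a readConcern.
function findWithReadConcern(collection, filter, level) {
  if (!READ_CONCERN_LEVELS.includes(level)) {
    throw new Error('level must be "local" or "majority"');
  }
  return { find: collection, filter: filter, readConcern: { level: level } };
}

const command = findWithReadConcern("restaurants", { rating: { $lt: 5 } }, "majority");
console.log(JSON.stringify(command));

// In the mongo shell, the same read could be issued either as
//   db.runCommand(command)
// or with the cursor method
//   db.restaurants.find({ rating: { $lt: 5 } }).readConcern("majority")
```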
On this page
Terminology and Concepts (page 145)
Executables (page 145)
Examples (page 146)
Additional Resources (page 150)
In addition to the charts that follow, you might want to consider the Frequently Asked Questions (page 823) section for
a selection of common questions about MongoDB.
The following table presents the various SQL terminology and concepts and the corresponding MongoDB terminology
and concepts.
SQL Terms/Concepts               MongoDB Terms/Concepts
database                         database
table                            collection
row                              document or BSON document
column                           field
index                            index
table joins                      embedded documents and linking
primary key                      primary key
  Specify any unique column        In MongoDB, the primary key is
  or column combination as         automatically set to the _id field.
  primary key.
aggregation (e.g. group by)      aggregation pipeline
                                 See the SQL to Aggregation Mapping Chart (page 482).
Executables
The following table presents some database executables and the corresponding MongoDB executables. This table is
not meant to be exhaustive.
                   MongoDB    MySQL     Oracle     Informix     DB2
Database Server    mongod     mysqld    oracle     IDS          DB2 Server
Database Client    mongo      mysql     sqlplus    DB-Access    DB2 Client
Examples
The following table presents the various SQL statements and the corresponding MongoDB statements. The examples
in the table assume the following conditions:
The SQL examples assume a table named users.
The MongoDB examples assume a collection named users that contains documents of the following prototype:
{
_id: ObjectId("509a8fb2f3f4948bd2f983a0"),
user_id: "abc123",
age: 55,
status: 'A'
}
Create and Alter The following table presents the various SQL statements related to table-level actions and the
corresponding MongoDB statements.
Insert The following table presents the various SQL statements related to inserting records into tables and the cor-
responding MongoDB statements.
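As one illustration of an insert mapping, the sketch below turns the column list and value list of a SQL INSERT into the document that the MongoDB insert would carry. rowToDocument is a hypothetical helper and the sample values are assumptions that follow the users prototype above.

```javascript
// Hypothetical helper: pairs SQL column names with row values to form a document
// (column -> field, row -> document).
function rowToDocument(columns, values) {
  if (columns.length !== values.length) {
    throw new Error("columns and values must have the same length");
  }
  const doc = {};
  columns.forEach(function (column, i) {
    doc[column] = values[i];
  });
  return doc;
}

// SQL:      INSERT INTO users (user_id, age, status) VALUES ('bcd001', 45, 'A')
// MongoDB:  db.users.insert(newUser)
const newUser = rowToDocument(["user_id", "age", "status"], ["bcd001", 45, "A"]);
console.log(JSON.stringify(newUser)); // {"user_id":"bcd001","age":45,"status":"A"}
```

Unlike the SQL statement, the MongoDB insert needs no prior table definition; if the document omits _id, MongoDB adds one automatically.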
Select The following table presents the various SQL statements related to reading records from tables and the corre-
sponding MongoDB statements.
SQL SELECT Statements        MongoDB find() Statements
SELECT *                     db.users.find(
FROM users                       { status: "A" }
WHERE status = "A"           )
SELECT *                     db.users.find(
FROM users                       { status: { $ne: "A" } }
WHERE status != "A"          )
SELECT *                     db.users.find(
FROM users                       { status: "A",
WHERE status = "A"                 age: 50 }
AND age = 50                 )
SELECT *                     db.users.find(
FROM users                       { $or: [ { status: "A" } ,
WHERE status = "A"                        { age: 50 } ] }
OR age = 50                  )
SELECT *                     db.users.find(
FROM users                       { age: { $gt: 25 } }
WHERE age > 25               )
SELECT *                     db.users.find(
FROM users                       { age: { $lt: 25 } }
WHERE age < 25               )
SELECT *                     db.users.find(
FROM users                       { age: { $gt: 25, $lte: 50 } }
WHERE age > 25               )
AND age <= 50
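To show how the query documents in these rows read as predicates, here is a deliberately simplified in-memory matcher in plain JavaScript. It is a sketch, not how the server evaluates queries: it handles only scalar fields, implicit AND, $or, $gt, $lt, $lte and $ne, and the sample documents are assumptions.

```javascript
// Evaluate one field condition: either a literal (equality) or an operator document.
function matchesCondition(value, condition) {
  const isOperatorDoc =
    condition !== null &&
    typeof condition === "object" &&
    !Array.isArray(condition) &&
    Object.keys(condition).some(function (k) { return k.startsWith("$"); });
  if (!isOperatorDoc) {
    return value === condition;
  }
  // All operators in the same document must hold (e.g. { $gt: 25, $lte: 50 }).
  return Object.entries(condition).every(function ([op, operand]) {
    switch (op) {
      case "$gt": return value > operand;
      case "$lt": return value < operand;
      case "$lte": return value <= operand;
      case "$ne": return value !== operand;
      default: throw new Error("unsupported operator: " + op);
    }
  });
}

// Top-level keys are combined with AND; $or takes an array of sub-queries.
function matches(doc, query) {
  return Object.entries(query).every(function ([key, condition]) {
    if (key === "$or") {
      return condition.some(function (sub) { return matches(doc, sub); });
    }
    return matchesCondition(doc[key], condition);
  });
}

const users = [
  { user_id: "abc123", age: 55, status: "A" },
  { user_id: "bcd001", age: 30, status: "B" },
  { user_id: "cde002", age: 20, status: "A" },
  { user_id: "def003", age: 50, status: "D" }
];

function find(query) {
  return users.filter(function (u) { return matches(u, query); })
              .map(function (u) { return u.user_id; });
}

console.log(find({ status: "A" }));                          // [ 'abc123', 'cde002' ]
console.log(find({ age: { $gt: 25, $lte: 50 } }));           // [ 'bcd001', 'def003' ]
console.log(find({ $or: [ { status: "A" }, { age: 50 } ] })); // [ 'abc123', 'cde002', 'def003' ]
```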
Update Records The following table presents the various SQL statements related to updating existing records in
tables and the corresponding MongoDB statements.
SQL Update Statements MongoDB update() Statements
UPDATE users db.users.update(
SET status = "C" { age: { $gt: 25 } },
WHERE age > 25 { $set: { status: "C" } },
{ multi: true }
)
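The multi: true option matters because update() modifies only a single matching document by default. The sketch below imitates that behavior on an in-memory array; applySet is a hypothetical helper and the sample documents are assumptions.

```javascript
// Hypothetical in-memory imitation of update() with a $set modifier.
// Without multi, only the first matching document is modified.
function applySet(docs, predicate, fields, options) {
  const multi = Boolean(options && options.multi);
  let modified = 0;
  for (const doc of docs) {
    if (!predicate(doc)) continue;
    Object.assign(doc, fields); // plays the role of { $set: fields }
    modified += 1;
    if (!multi) break;
  }
  return modified;
}

const people = [
  { user_id: "abc123", age: 55, status: "A" },
  { user_id: "bcd001", age: 30, status: "A" },
  { user_id: "cde002", age: 20, status: "A" }
];
const olderThan25 = function (doc) { return doc.age > 25; };

// UPDATE users SET status = "C" WHERE age > 25
const changed = applySet(people, olderThan25, { status: "C" }, { multi: true });
console.log(changed); // 2
```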
Delete Records The following table presents the various SQL statements related to deleting records from tables and
the corresponding MongoDB statements.
SQL Delete Statements MongoDB remove() Statements
DELETE FROM users db.users.remove( { status: "D" } )
WHERE status = "D"
Additional Resources
The bios collection provides example data for experimenting with MongoDB. Many of this guide's examples on
insert, update and read operations create or query data from the bios collection.
The following documents comprise the bios collection. In the examples, the data might be different, as the examples
themselves make changes to the data.
{
"_id" : 1,
"name" : {
"first" : "John",
"last" : "Backus"
},
"birth" : ISODate("1924-12-03T05:00:00Z"),
"death" : ISODate("2007-03-17T04:00:00Z"),
"contribs" : [
"Fortran",
"ALGOL",
"Backus-Naur Form",
"FP"
],
"awards" : [
{
"award" : "W.W. McDowell Award",
"year" : 1967,
"by" : "IEEE Computer Society"
},
{
"award" : "National Medal of Science",
"year" : 1975,
"by" : "National Science Foundation"
},
{
"award" : "Turing Award",
"year" : 1977,
"by" : "ACM"
},
{
"award" : "Draper Prize",
"year" : 1993,
"by" : "National Academy of Engineering"
}
]
}
{
"_id" : ObjectId("51df07b094c6acd67e492f41"),
"name" : {
"first" : "John",
"last" : "McCarthy"
},
"birth" : ISODate("1927-09-04T04:00:00Z"),
"death" : ISODate("2011-12-24T05:00:00Z"),
"contribs" : [
"Lisp",
"Artificial Intelligence",
"ALGOL"
],
"awards" : [
{
"award" : "Turing Award",
"year" : 1971,
"by" : "ACM"
},
{
"award" : "Kyoto Prize",
"year" : 1988,
"by" : "Inamori Foundation"
},
{
"award" : "National Medal of Science",
"year" : 1990,
"by" : "National Science Foundation"
}
]
}
{
"_id" : 3,
"name" : {
"first" : "Grace",
"last" : "Hopper"
},
"title" : "Rear Admiral",
"birth" : ISODate("1906-12-09T05:00:00Z"),
"death" : ISODate("1992-01-01T05:00:00Z"),
"contribs" : [
"UNIVAC",
"compiler",
"FLOW-MATIC",
"COBOL"
],
"awards" : [
{
"award" : "Computer Sciences Man of the Year",
"year" : 1969,
"by" : "Data Processing Management Association"
},
{
"award" : "Distinguished Fellow",
"year" : 1973,
"by" : "British Computer Society"
},
{
"award" : "W. W. McDowell Award",
"year" : 1976,
"by" : "IEEE Computer Society"
},
...
]
}
{
"_id" : 4,
"name" : {
"first" : "Kristen",
"last" : "Nygaard"
},
"birth" : ISODate("1926-08-27T04:00:00Z"),
"death" : ISODate("2002-08-10T04:00:00Z"),
"contribs" : [
"OOP",
"Simula"
],
"awards" : [
{
"award" : "Rosing Prize",
"year" : 1999,
"by" : "Norwegian Data Association"
},
{
"award" : "Turing Award",
"year" : 2001,
"by" : "ACM"
},
{
"award" : "IEEE John von Neumann Medal",
"year" : 2001,
"by" : "IEEE"
}
]
}
{
"_id" : 5,
"name" : {
"first" : "Ole-Johan",
"last" : "Dahl"
},
"birth" : ISODate("1931-10-12T04:00:00Z"),
"death" : ISODate("2002-06-29T04:00:00Z"),
"contribs" : [
"OOP",
"Simula"
],
"awards" : [
{
"award" : "Rosing Prize",
"year" : 1999,
"by" : "Norwegian Data Association"
},
{
"award" : "Turing Award",
"year" : 2001,
"by" : "ACM"
},
{
"award" : "IEEE John von Neumann Medal",
"year" : 2001,
"by" : "IEEE"
}
]
}
{
"_id" : 6,
"name" : {
"first" : "Guido",
"last" : "van Rossum"
},
"birth" : ISODate("1956-01-31T05:00:00Z"),
"contribs" : [
"Python"
],
"awards" : [
{
"award" : "Award for the Advancement of Free Software",
"year" : 2001,
"by" : "Free Software Foundation"
},
{
"award" : "NLUUG Award",
"year" : 2003,
"by" : "NLUUG"
}
]
}
{
"_id" : ObjectId("51e062189c6ae665454e301d"),
"name" : {
"first" : "Dennis",
"last" : "Ritchie"
},
"birth" : ISODate("1941-09-09T04:00:00Z"),
"death" : ISODate("2011-10-12T04:00:00Z"),
"contribs" : [
"UNIX",
"C"
],
"awards" : [
{
"award" : "Turing Award",
"year" : 1983,
"by" : "ACM"
},
{
"award" : "National Medal of Technology",
"year" : 1998,
"by" : "United States"
},
{
"award" : "Japan Prize",
"year" : 2011,
"by" : "The Japan Prize Foundation"
}
]
}
{
"_id" : 8,
"name" : {
"first" : "Yukihiro",
"aka" : "Matz",
"last" : "Matsumoto"
},
"birth" : ISODate("1965-04-14T04:00:00Z"),
"contribs" : [
"Ruby"
],
"awards" : [
{
"award" : "Award for the Advancement of Free Software",
"year" : "2011",
"by" : "Free Software Foundation"
}
]
}
{
"_id" : 9,
"name" : {
"first" : "James",
"last" : "Gosling"
},
"birth" : ISODate("1955-05-19T04:00:00Z"),
"contribs" : [
"Java"
],
"awards" : [
{
"award" : "The Economist Innovation Award",
"year" : 2002,
"by" : "The Economist"
},
{
"award" : "Officer of the Order of Canada",
"year" : 2007,
"by" : "Canada"
}
]
}
{
"_id" : 10,
"name" : {
"first" : "Martin",
"last" : "Odersky"
},
"contribs" : [
"Scala"
]
}
Data Models
Data in MongoDB has a flexible schema. Collections do not enforce document structure. This flexibility gives you
data-modeling choices to match your application and its performance requirements.
Data Modeling Introduction (page 157) An introduction to data modeling in MongoDB.
Document Validation (page 160) MongoDB provides the capability to validate documents during updates and inser-
tions.
Data Modeling Concepts (page 162) The core documentation detailing the decisions you must make when determin-
ing a data model, and discussing considerations that should be taken into account.
Data Model Examples and Patterns (page 167) Examples of possible data models that you can use to structure your
MongoDB documents.
Data Model Reference (page 185) Reference material for data modeling for developers of MongoDB applications.
On this page
Document Structure (page 157)
Atomicity of Write Operations (page 158)
Document Growth (page 159)
Data Use and Performance (page 159)
Additional Resources (page 159)
Data in MongoDB has a flexible schema. Unlike SQL databases, where you must determine and declare a table's
schema before inserting data, MongoDB's collections do not enforce document structure. This flexibility facilitates
the mapping of documents to an entity or an object. Each document can match the data fields of the represented entity,
even if the data has substantial variation. In practice, however, the documents in a collection share a similar structure.
The key challenge in data modeling is balancing the needs of the application, the performance characteristics of the
database engine, and the data retrieval patterns. When designing data models, always consider the application usage
of the data (i.e. queries, updates, and processing of the data) as well as the inherent structure of the data itself.
The key decision in designing data models for MongoDB applications revolves around the structure of documents and
how the application represents relationships between data. There are two tools that allow applications to represent
these relationships: references and embedded documents.
References
References store the relationships between data by including links or references from one document to another. Appli-
cations can resolve these references (page 189) to access the related data. Broadly, these are normalized data models.
See Normalized Data Models (page 164) for the strengths and weaknesses of using references.
Embedded Data
Embedded documents capture relationships between data by storing related data in a single document structure. Mon-
goDB documents make it possible to embed document structures in a field or array within a document. These denor-
malized data models allow applications to retrieve and manipulate related data in a single database operation.
See Embedded Data Models (page 163) for the strengths and weaknesses of embedding documents.
In MongoDB, write operations are atomic at the document level, and no single write operation can atomically affect
more than one document or more than one collection. A denormalized data model with embedded data combines
all related data for a represented entity in a single document. This facilitates atomic write operations since a single
write operation can insert or update the data for an entity. Normalizing the data would split the data across multiple
collections and would require multiple write operations that are not atomic collectively.
However, schemas that facilitate atomic writes may limit ways that applications can use the data or may limit ways to
modify applications. The Atomicity Considerations (page 165) documentation describes the challenge of designing a
schema that balances flexibility and atomicity.
Some updates, such as pushing elements to an array or adding new fields, increase a document's size.
For the MMAPv1 storage engine, if the document size exceeds the allocated space for that document, MongoDB
relocates the document on disk. When using the MMAPv1 storage engine, growth consideration can affect the decision
to normalize or denormalize data. See Document Growth Considerations (page 165) for more about planning for and
managing document growth for MMAPv1.
When designing a data model, consider how applications will use your database. For instance, if your application only
uses recently inserted documents, consider using Capped Collections (page 228). Or, if your application mainly
performs read operations on a collection, adding indexes to support common queries can improve performance.
See Operational Factors and Data Models (page 165) for more information on these and other operational considera-
tions that affect data model designs.
On this page
Behavior (page 160)
Restrictions (page 162)
Bypass Document Validation (page 162)
Additional Information (page 162)
MongoDB also provides the validationLevel option, which determines how strictly MongoDB applies valida-
tion rules to existing documents during an update, and the validationAction option, which determines whether
MongoDB should error and reject documents that violate the validation rules or warn about the violations in the
log but allow invalid documents.
4.2.1 Behavior
Validation occurs during updates and inserts. When you add validation to a collection, existing documents do not
undergo validation checks until modification.
Existing Documents
You can control how MongoDB handles existing documents using the validationLevel option.
By default, validationLevel is strict and MongoDB applies validation rules to all inserts and updates. Setting
validationLevel to moderate applies validation rules to inserts and to updates to existing documents that fulfill
the validation criteria. With the moderate level, updates to existing documents that do not fulfill the validation
criteria are not checked for validity.
Example
Consider the following documents in a contacts collection:
{
"_id": "125876",
"name": "Anne",
"phone": "+1 555 123 456",
"city": "London",
"status": "Complete"
},
{
"_id": "860000",
"name": "Ivan",
"city": "Vancouver"
}
The contacts collection now has a validator with the moderate validationLevel. If you attempted to update the
document with _id of 125876, MongoDB would apply validation rules since the existing document matches the
criteria. In contrast, MongoDB will not apply validation rules to updates to the document with _id of 860000 as it
does not meet the validation rules.
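The rule for which writes are checked can be written as a small decision function. This is a sketch of the behavior described above, not server code; validatorApplies is a hypothetical name, and only the strict and moderate levels named in this section are handled.

```javascript
// Decide whether validation rules are applied to a write.
//   operation:           "insert" or "update"
//   existingDocIsValid:  for updates, whether the stored document
//                        currently fulfills the validation criteria
function validatorApplies(validationLevel, operation, existingDocIsValid) {
  if (validationLevel === "strict") {
    return true; // all inserts and updates are validated
  }
  if (validationLevel === "moderate") {
    if (operation === "insert") return true;
    return existingDocIsValid; // updates to non-conforming documents are not checked
  }
  throw new Error("unhandled validationLevel: " + validationLevel);
}

// The two contacts documents above, under the moderate level:
console.log(validatorApplies("moderate", "update", true));  // _id 125876 -> true
console.log(validatorApplies("moderate", "update", false)); // _id 860000 -> false
```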
The validationAction option determines how MongoDB handles documents that violate the validation rules.
By default, validationAction is error and MongoDB rejects any insertion or update that violates the validation
criteria. When validationAction is set to warn, MongoDB logs any violations but allows the insertion or update
to proceed.
Example
The following example creates a contacts collection with a validator that specifies that inserted or updated
documents should match at least one of the following three conditions:
the phone field is a string
the email field matches the regular expression
the status field is either Unknown or Incomplete.
db.createCollection( "contacts",
{
validator: { $or:
[
{ phone: { $type: "string" } },
{ email: { $regex: /@mongodb\.com$/ } },
{ status: { $in: [ "Unknown", "Incomplete" ] } }
]
},
validationAction: "warn"
}
)
With the validator in place, the following insert operation fails the validation rules, but since the
validationAction is warn, the write operation logs the failure and succeeds.
db.contacts.insert( { name: "Amanda", status: "Updated" } )
The log includes the full namespace of the collection and the document that failed the validation rules, as well as the
time of the operation:
2015-10-15T11:20:44.260-0400 W STORAGE [conn3] Document would fail validation collection: example.co
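To see why the insert above fails the rules yet still succeeds, the validator can be restated as an ordinary predicate. This is an approximation in plain JavaScript, not the server's evaluation; passesContactsValidator and applyWrite are hypothetical names.

```javascript
// Approximation of the $or validator: a document passes if at least one clause holds.
function passesContactsValidator(doc) {
  const phoneIsString = typeof doc.phone === "string";            // { phone: { $type: "string" } }
  const emailMatches = typeof doc.email === "string" &&
                       /@mongodb\.com$/.test(doc.email);          // { email: { $regex: ... } }
  const statusAllowed = ["Unknown", "Incomplete"].includes(doc.status); // { status: { $in: [...] } }
  return phoneIsString || emailMatches || statusAllowed;
}

// Outcome of a write under each validationAction described above.
function applyWrite(doc, validationAction) {
  const valid = passesContactsValidator(doc);
  if (valid) return { accepted: true, logged: false };
  if (validationAction === "warn") return { accepted: true, logged: true };
  return { accepted: false, logged: false }; // "error" rejects the write
}

const amanda = { name: "Amanda", status: "Updated" };
console.log(passesContactsValidator(amanda)); // false
console.log(applyWrite(amanda, "warn"));      // { accepted: true, logged: true }
console.log(applyWrite(amanda, "error"));     // { accepted: false, logged: false }
```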
4.2.2 Restrictions
You cannot specify a validator for collections in the admin, local, and config databases.
You cannot specify a validator for system.* collections.
Users can bypass document validation using the bypassDocumentValidation option. For a list of commands
that support the bypassDocumentValidation option, see Document Validation (page 883).
For deployments that have enabled access control, to bypass document validation, the authenticated user must have
bypassDocumentValidation (page 429) action. The built-in roles dbAdmin (page 416) and restore
(page 420) provide this action.
See also:
collMod, db.createCollection(), db.getCollectionInfos().
On this page
Embedded Data Models (page 163)
Normalized Data Models (page 164)
Additional Resources (page 164)
Effective data models support your application needs. The key consideration for the structure of your documents is
the decision to embed (page 163) or to use references (page 164).
With MongoDB, you may embed related data in a single structure or document. These schemas are generally known
as denormalized models, and take advantage of MongoDB's rich documents.
Embedded data models allow applications to store related pieces of information in the same database record. As a
result, applications may need to issue fewer queries and updates to complete common operations.
In general, use embedded data models when:
you have "contains" relationships between entities. See Model One-to-One Relationships with Embedded
Documents (page 168).
you have one-to-many relationships between entities. In these relationships the "many" or child documents
always appear with or are viewed in the context of the "one" or parent documents. See Model One-to-Many
Relationships with Embedded Documents (page 169).
In general, embedding provides better performance for read operations, as well as the ability to request and retrieve
related data in a single database operation. Embedded data models make it possible to update related data in a single
atomic write operation.
However, embedding related data in documents may lead to situations where documents grow after creation. With the
MMAPv1 storage engine, document growth can impact write performance and lead to data fragmentation.
As of version 3.0.0, MongoDB uses Power of 2 Sized Allocations (page 596) as the default allocation strategy for
MMAPv1 in order to account for document growth, minimizing the likelihood of data fragmentation. See Power of 2
Sized Allocations (page 596) for details. Furthermore, documents in MongoDB must be smaller than the maximum
BSON document size. For bulk binary data, consider GridFS (page 603).
To interact with embedded documents, use dot notation to reach into embedded documents. See query for data in
arrays (page 106) and query data in embedded documents (page 105) for more examples on accessing data in arrays
and embedded documents.
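Dot notation names a path through embedded documents and array positions. The sketch below resolves such a path against an in-memory document; getPath is a hypothetical helper, and the sample document is abbreviated from the bios collection.

```javascript
// Resolve a dotted path such as "name.last" or "contribs.0" against a document.
// Returns undefined when any step of the path is missing.
function getPath(doc, path) {
  return path.split(".").reduce(function (current, key) {
    if (current === undefined || current === null) return undefined;
    return current[key];
  }, doc);
}

const bio = {
  _id: 1,
  name: { first: "John", last: "Backus" },
  contribs: ["Fortran", "ALGOL", "Backus-Naur Form", "FP"]
};

console.log(getPath(bio, "name.last"));   // Backus
console.log(getPath(bio, "contribs.0"));  // Fortran
console.log(getPath(bio, "name.middle")); // undefined

// Corresponding mongo shell queries would use the same paths as field names:
//   db.bios.find({ "name.last": "Backus" })
//   db.bios.find({ "contribs.0": "Fortran" })
```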
Normalized data models describe relationships using references (page 189) between documents.
Additional Resources
On this page
Document Growth (page 165)
Atomicity (page 165)
Sharding (page 166)
Indexes (page 166)
Large Number of Collections (page 166)
Data Lifecycle Management (page 167)
Modeling application data for MongoDB depends on both the data itself and the characteristics of MongoDB.
For example, different data models may allow applications to use more efficient queries, increase the throughput
of insert and update operations, or distribute activity to a sharded cluster more effectively.
These factors are operational or address requirements that arise outside of the application but impact the performance
of MongoDB-based applications. When developing a data model, analyze all of your application's read operations
(page 64) and write operations (page 77) in conjunction with the following considerations.
Document Growth
Atomicity
In MongoDB, operations are atomic at the document level. No single write operation can change more than one
document. Operations that modify more than a single document in a collection still operate on one document at a time.
Ensure that your application stores all fields with atomic dependency requirements in the same document. If the
application can tolerate non-atomic updates for two pieces of data, you can store these data in separate documents.
A data model that embeds related data in a single document facilitates these kinds of atomic operations. For data mod-
els that store references between related pieces of data, the application must issue separate read and write operations
to retrieve and modify these related pieces of data.
See Model Data for Atomic Operations (page 181) for an example data model that provides atomic updates for a single
document.
Sharding
MongoDB uses sharding to provide horizontal scaling. These clusters support deployments with large data sets and
high-throughput operations. Sharding allows users to partition a collection within a database to distribute the
collection's documents across a number of mongod instances or shards.
To distribute data and application traffic in a sharded collection, MongoDB uses the shard key (page 739). Selecting
the proper shard key (page 739) has significant implications for performance, and can enable or prevent query isolation
and increased write capacity. It is important to consider carefully the field or fields to use as the shard key.
See Sharding Introduction (page 725) and Shard Keys (page 739) for more information.
Indexes
Use indexes to improve performance for common queries. Build indexes on fields that appear often in queries and for
all operations that return sorted results. MongoDB automatically creates a unique index on the _id field.
As you create indexes, consider the following behaviors of indexes:
Each index requires at least 8 kB of data space.
Adding an index has some negative performance impact for write operations. For collections with high write-
to-read ratio, indexes are expensive since each insert must also update any indexes.
Collections with high read-to-write ratio often benefit from additional indexes. Indexes do not affect un-indexed
read operations.
When active, each index consumes disk space and memory. This usage can be significant and should be tracked
for capacity planning, especially for concerns over working set size.
See Indexing Strategies (page 573) for more information on indexes as well as Analyze Query Performance (page 121).
Additionally, the MongoDB database profiler (page 249) may help identify inefficient queries.
In certain situations, you might choose to store related information in several collections rather than in a single collec-
tion.
Consider a sample collection logs that stores log documents for various environments and applications. The logs
collection contains documents of the following form:
{ log: "dev", ts: ..., info: ... }
{ log: "debug", ts: ..., info: ...}
8 Document-level atomic operations include all operations within a single MongoDB document record: operations that affect multiple embedded
If the total number of documents is low, you may group documents into collections by type. For logs, consider main-
taining distinct log collections, such as logs_dev and logs_debug. The logs_dev collection would contain
only the documents related to the dev environment.
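The routing from a log document to its per-environment collection can be sketched as follows. collectionForLog and groupByCollection are hypothetical helpers, and the ts and info values are placeholders.

```javascript
// Derive the target collection name from the document's log field,
// e.g. { log: "dev", ... } -> "logs_dev".
function collectionForLog(doc) {
  return "logs_" + doc.log;
}

// Group a batch of log documents by target collection name.
function groupByCollection(docs) {
  const groups = {};
  for (const doc of docs) {
    const name = collectionForLog(doc);
    if (!groups[name]) groups[name] = [];
    groups[name].push(doc);
  }
  return groups;
}

const batch = [
  { log: "dev", ts: 1, info: "started" },
  { log: "debug", ts: 2, info: "cache miss" },
  { log: "dev", ts: 3, info: "stopped" }
];
const grouped = groupByCollection(batch);
console.log(Object.keys(grouped));     // [ 'logs_dev', 'logs_debug' ]
console.log(grouped.logs_dev.length);  // 2

// In the mongo shell, each group could then be written with, for example:
//   db.getCollection("logs_dev").insert(grouped.logs_dev)
```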
Generally, having a large number of collections has no significant performance penalty and results in very good
performance. Distinct collections are very important for high-throughput batch processing.
When using models that have a large number of collections, consider the following behaviors:
Each collection has a certain minimum overhead of a few kilobytes.
Each index, including the index on _id, requires at least 8 kB of data space.
For each database, a single namespace file (i.e. <database>.ns) stores all meta-data for that database, and
each index and collection has its own entry in the namespace file. MongoDB places limits on the size
of namespace files.
MongoDB using the MMAPv1 storage engine has limits on the number of namespaces. You may
wish to know the current number of namespaces in order to determine how many additional namespaces the
database can support. To get the current number of namespaces, run the following in the mongo shell:
db.system.namespaces.count()
The limit on the number of namespaces depends on the <database>.ns size. The namespace file defaults to
16 MB.
To change the size of the new namespace file, start the server with the option --nssize <new size MB>.
For existing databases, after starting up the server with --nssize, run the db.repairDatabase() com-
mand from the mongo shell. For impacts and considerations on running db.repairDatabase(), see
repairDatabase.
Data modeling decisions should take data lifecycle management into consideration.
The Time to Live or TTL feature (page 231) of collections expires documents after a period of time. Consider using
the TTL feature if your application requires some data to persist in the database for a limited period of time.
Additionally, if your application only uses recently inserted documents, consider Capped Collections (page 228).
Capped collections provide first-in-first-out (FIFO) management of inserted documents and efficiently support opera-
tions that insert and read documents based on insertion order.
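As an illustration of the expiry rule behind TTL collections, the sketch below checks whether a document has outlived its time limit. isExpired is a hypothetical helper; the createdAt field, the eventlog collection, and the one-hour limit are assumptions.

```javascript
// A document expires once the date in the indexed field plus the
// time limit has passed. Documents without a Date in that field never expire.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  if (!(value instanceof Date)) return false;
  return now.getTime() >= value.getTime() + expireAfterSeconds * 1000;
}

const now = new Date("2016-03-01T12:00:00Z");
const oldEvent = { createdAt: new Date("2016-03-01T10:00:00Z") };
const recentEvent = { createdAt: new Date("2016-03-01T11:30:00Z") };

console.log(isExpired(oldEvent, "createdAt", 3600, now));    // true
console.log(isExpired(recentEvent, "createdAt", 3600, now)); // false

// In the mongo shell, a TTL index with the same limit would be created with:
//   db.eventlog.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 })
```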
The following documents provide overviews of various data modeling patterns and common schema design consider-
ations:
Model Relationships Between Documents (page 168) Examples for modeling relationships between documents.
Model One-to-One Relationships with Embedded Documents (page 168) Presents a data model that uses em-
bedded documents (page 163) to describe one-to-one relationships between connected data.
Model One-to-Many Relationships with Embedded Documents (page 169) Presents a data model that uses
embedded documents (page 163) to describe one-to-many relationships between connected data.
Model One-to-Many Relationships with Document References (page 170) Presents a data model that uses
references (page 164) to describe one-to-many relationships between documents.
Model Tree Structures (page 172) Examples for modeling tree structures.
Model Tree Structures with Parent References (page 173) Presents a data model that organizes documents in
a tree-like structure by storing references (page 164) to parent nodes in child nodes.
Model Tree Structures with Child References (page 175) Presents a data model that organizes documents in a
tree-like structure by storing references (page 164) to child nodes in parent nodes.
See Model Tree Structures (page 172) for additional examples of data models for tree structures.
Model Specific Application Contexts (page 181) Examples for models for specific application contexts.
Model Data for Atomic Operations (page 181) Illustrates how embedding fields related to an atomic update
within the same document ensures that the fields are in sync.
Model Data to Support Keyword Search (page 182) Describes one method for supporting keyword search by
storing keywords in an array in the same document as the text field. Combined with a multi-key index, this
pattern can support an application's keyword search operations.
Model One-to-One Relationships with Embedded Documents
Overview
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 162)
for a high-level overview of data modeling in MongoDB.
This document describes a data model that uses embedded (page 163) documents to describe relationships between
connected data.
Pattern
Consider the following example that maps patron and address relationships. The example illustrates the advantage of
embedding over referencing if you need to view one data entity in context of the other. In this one-to-one relationship
between patron and address data, the address belongs to the patron.
In the normalized data model, the address document contains a reference to the patron document.
{
_id: "joe",
name: "Joe Bookreader"
}
{
patron_id: "joe",
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
}
If the address data is frequently retrieved with the name information, then with referencing, your application needs
to issue multiple queries to resolve the reference. The better data model would be to embed the address data in the
patron data, as in the following document:
{
_id: "joe",
name: "Joe Bookreader",
address: {
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
}
}
With the embedded data model, your application can retrieve the complete patron information with one query.
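The difference in the number of lookups can be sketched in plain JavaScript. This is only an illustration under stated assumptions: the in-memory arrays and the findOne helper below are hypothetical stand-ins for collections and queries, not MongoDB API.

```javascript
// Hypothetical in-memory stand-ins for the normalized collections.
const patrons = [ { _id: "joe", name: "Joe Bookreader" } ];
const addresses = [
  { patron_id: "joe", street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" }
];

let queryCount = 0;
// Hypothetical helper: returns the first matching document and counts one "query".
function findOne(collection, predicate) {
  queryCount += 1;
  return collection.find(predicate) || null;
}

// Normalized model: two lookups are needed to assemble the full patron record.
const patron = findOne(patrons, p => p._id === "joe");
const address = findOne(addresses, a => a.patron_id === patron._id);
const normalizedQueries = queryCount;

// Embedded model: one lookup returns the patron together with the address.
const embeddedPatrons = [
  { _id: "joe", name: "Joe Bookreader",
    address: { street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" } }
];
queryCount = 0;
const embedded = findOne(embeddedPatrons, p => p._id === "joe");
const embeddedQueries = queryCount;
```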
Model One-to-Many Relationships with Embedded Documents
Overview
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 162)
for a high-level overview of data modeling in MongoDB.
This document describes a data model that uses embedded (page 163) documents to describe relationships between
connected data.
Pattern
Consider the following example that maps patron and multiple address relationships. The example illustrates the
advantage of embedding over referencing if you need to view many data entities in context of another. In this one-to-
many relationship between patron and address data, the patron has multiple address entities.
In the normalized data model, the address documents contain a reference to the patron document.
{
_id: "joe",
name: "Joe Bookreader"
}
{
patron_id: "joe",
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
}
{
patron_id: "joe",
street: "1 Some Other Street",
city: "Boston",
state: "MA",
zip: "12345"
}
If your application frequently retrieves the address data with the name information, then with referencing your
application needs to issue multiple queries to resolve the references. A better schema would be to embed the address
data entities in the patron data, as in the following document:
{
_id: "joe",
name: "Joe Bookreader",
addresses: [
{
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
},
{
street: "1 Some Other Street",
city: "Boston",
state: "MA",
zip: "12345"
}
]
}
With the embedded data model, your application can retrieve the complete patron information with one query.
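As a rough sketch of how the normalized documents map onto the embedded form, the following plain JavaScript folds address documents into their patron. The embedAddresses helper is hypothetical and runs on in-memory arrays; it is not part of MongoDB.

```javascript
// Normalized documents, as in the example above.
const patrons = [ { _id: "joe", name: "Joe Bookreader" } ];
const addresses = [
  { patron_id: "joe", street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" },
  { patron_id: "joe", street: "1 Some Other Street", city: "Boston", state: "MA", zip: "12345" }
];

// Hypothetical helper: embeds each patron's addresses as an array in the patron document.
function embedAddresses(patronDocs, addressDocs) {
  return patronDocs.map(patron => ({
    ...patron,
    addresses: addressDocs
      .filter(a => a.patron_id === patron._id)
      .map(({ patron_id, ...rest }) => rest) // drop the now-redundant reference
  }));
}

const embedded = embedAddresses(patrons, addresses);
```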
Model One-to-Many Relationships with Document References
Overview
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 162)
for a high-level overview of data modeling in MongoDB.
This document describes a data model that uses references (page 164) between documents to describe relationships
between connected data.
Pattern
Consider the following example that maps publisher and book relationships. The example illustrates the advantage of
referencing over embedding to avoid repetition of the publisher information.
Embedding the publisher document inside the book document would lead to repetition of the publisher data, as the
following documents show:
{
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English",
publisher: {
name: "O'Reilly Media",
founded: 1980,
location: "CA"
}
}
{
title: "50 Tips and Tricks for MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English",
publisher: {
name: "O'Reilly Media",
founded: 1980,
location: "CA"
}
}
To avoid repetition of the publisher data, use references and keep the publisher information in a separate collection
from the book collection.
When using references, the growth of the relationships determines where to store the reference. If the number of books
per publisher is small with limited growth, storing the book reference inside the publisher document may sometimes
be useful. Otherwise, if the number of books per publisher is unbounded, this data model would lead to mutable,
growing arrays, as in the following example:
{
name: "O'Reilly Media",
founded: 1980,
location: "CA",
books: [123456789, 234567890, ...]
}
{
_id: 123456789,
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English"
}
{
_id: 234567890,
title: "50 Tips and Tricks for MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English"
}
To avoid mutable, growing arrays, store the publisher reference inside the book document:
{
_id: "oreilly",
name: "O'Reilly Media",
founded: 1980,
location: "CA"
}
{
_id: 123456789,
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English",
publisher_id: "oreilly"
}
{
_id: 234567890,
title: "50 Tips and Tricks for MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English",
publisher_id: "oreilly"
}
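With the reference stored in the book document, both directions of the relationship remain available through lookups on publisher_id. The plain JavaScript below is a sketch on in-memory arrays; publisherOf and booksOf are hypothetical helper names, and the book documents are abbreviated.

```javascript
const publishers = [
  { _id: "oreilly", name: "O'Reilly Media", founded: 1980, location: "CA" }
];
const books = [
  { _id: 123456789, title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" },
  { _id: 234567890, title: "50 Tips and Tricks for MongoDB Developer", publisher_id: "oreilly" }
];

// Resolve a book's publisher: a second lookup keyed on publisher_id.
function publisherOf(book) {
  return publishers.find(p => p._id === book.publisher_id) || null;
}

// List a publisher's books without keeping a growing array in the publisher document.
function booksOf(publisherId) {
  return books.filter(b => b.publisher_id === publisherId);
}
```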
MongoDB allows various ways to use tree data structures to model large hierarchical or nested data relationships.
Model Tree Structures with Parent References (page 173) Presents a data model that organizes documents in a tree-
like structure by storing references (page 164) to parent nodes in child nodes.
Model Tree Structures with Child References (page 175) Presents a data model that organizes documents in a tree-
like structure by storing references (page 164) to child nodes in parent nodes.
Model Tree Structures with an Array of Ancestors (page 176) Presents a data model that organizes documents in a
tree-like structure by storing references (page 164) to parent nodes and an array that stores all ancestors.
Model Tree Structures with Materialized Paths (page 178) Presents a data model that organizes documents in a tree-
like structure by storing full relationship paths between documents. In addition to the tree node, each document
stores the _id values of the node's ancestors, or path, as a string.
Model Tree Structures with Nested Sets (page 179) Presents a data model that organizes documents in a tree-like
structure using the Nested Sets pattern. This optimizes discovering subtrees at the expense of tree mutability.
Model Tree Structures with Parent References
Overview
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 162)
for a high-level overview of data modeling in MongoDB.
This document describes a data model that represents a tree-like structure in MongoDB documents by storing references
(page 164) to parent nodes in child nodes.
Pattern
The Parent References pattern stores each tree node in a document; in addition to the tree node, the document stores
the id of the node's parent.
Consider the following hierarchy of categories:
The following example models the tree using Parent References, storing the reference to the parent category in the
field parent:
db.categories.insert( { _id: "MongoDB", parent: "Databases" } )
db.categories.insert( { _id: "dbm", parent: "Databases" } )
db.categories.insert( { _id: "Databases", parent: "Programming" } )
db.categories.insert( { _id: "Languages", parent: "Programming" } )
db.categories.insert( { _id: "Programming", parent: "Books" } )
db.categories.insert( { _id: "Books", parent: null } )
You can create an index on the field parent to enable fast search by the parent node:
db.categories.createIndex( { parent: 1 } )
You can query by the parent field to find a node's immediate child nodes:
db.categories.find( { parent: "Databases" } )
The Parent References pattern provides a simple solution to tree storage but requires multiple queries to retrieve subtrees.
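The cost of retrieving a subtree can be sketched in plain JavaScript. In this illustration the childrenOf function is a hypothetical stand-in for a query on the parent field, and the queries counter only exists to show that one lookup is issued per visited node.

```javascript
const categories = [
  { _id: "MongoDB", parent: "Databases" },
  { _id: "dbm", parent: "Databases" },
  { _id: "Databases", parent: "Programming" },
  { _id: "Languages", parent: "Programming" },
  { _id: "Programming", parent: "Books" },
  { _id: "Books", parent: null }
];

let queries = 0;
// Stand-in for a query that matches documents by their parent field.
function childrenOf(id) {
  queries += 1;
  return categories.filter(c => c.parent === id).map(c => c._id);
}

// Collecting a whole subtree takes one lookup for every node visited.
function descendantsOf(id) {
  const result = [];
  const pending = [id];
  while (pending.length > 0) {
    const current = pending.shift();
    for (const child of childrenOf(current)) {
      result.push(child);
      pending.push(child);
    }
  }
  return result;
}

const subtree = descendantsOf("Programming");
```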
Model Tree Structures with Child References
Overview
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 162)
for a high-level overview of data modeling in MongoDB.
This document describes a data model that represents a tree-like structure in MongoDB documents by storing references
(page 164) to child nodes in parent nodes.
Pattern
The Child References pattern stores each tree node in a document; in addition to the tree node, the document stores
the ids of the node's children in an array.
Consider the following hierarchy of categories:
The following example models the tree using Child References, storing the references to the node's children in the field
children:
db.categories.insert( { _id: "MongoDB", children: [] } )
db.categories.insert( { _id: "dbm", children: [] } )
db.categories.insert( { _id: "Databases", children: [ "MongoDB", "dbm" ] } )
db.categories.insert( { _id: "Languages", children: [] } )
db.categories.insert( { _id: "Programming", children: [ "Databases", "Languages" ] } )
db.categories.insert( { _id: "Books", children: [ "Programming" ] } )
The query to retrieve the immediate children of a node is fast and straightforward:
db.categories.findOne( { _id: "Databases" } ).children
You can create an index on the field children to enable fast search by the child nodes:
db.categories.createIndex( { children: 1 } )
You can query for a node in the children field to find its parent node as well as its siblings:
db.categories.find( { children: "MongoDB" } )
The Child References pattern provides a suitable solution to tree storage as long as no operations on subtrees are
necessary. This pattern may also provide a suitable solution for storing graphs where a node may have multiple
parents.
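Finding a node's parent and siblings from the children arrays can be sketched in plain JavaScript. The parentsOf function is a hypothetical stand-in for a query that matches documents whose children array contains a given id.

```javascript
const categories = [
  { _id: "MongoDB", children: [] },
  { _id: "dbm", children: [] },
  { _id: "Databases", children: [ "MongoDB", "dbm" ] },
  { _id: "Languages", children: [] },
  { _id: "Programming", children: [ "Databases", "Languages" ] },
  { _id: "Books", children: [ "Programming" ] }
];

// Stand-in for a query on the children field: documents whose array contains id.
function parentsOf(id) {
  return categories.filter(c => c.children.includes(id));
}

// The matching document is the parent; its other children are the siblings.
const parent = parentsOf("MongoDB")[0];
const siblings = parent.children.filter(id => id !== "MongoDB");
```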
Model Tree Structures with an Array of Ancestors
Overview
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 162)
for a high-level overview of data modeling in MongoDB.
This document describes a data model that represents a tree-like structure in MongoDB documents using references
(page 164) to parent nodes and an array that stores all ancestors.
Pattern
The Array of Ancestors pattern stores each tree node in a document; in addition to the tree node, the document stores
the ids of the node's ancestors, or path, in an array.
Consider the following hierarchy of categories:
The following example models the tree using Array of Ancestors. In addition to the ancestors field, these
documents also store the reference to the immediate parent category in the parent field:
db.categories.insert( { _id: "MongoDB", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" } )
db.categories.insert( { _id: "dbm", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" } )
db.categories.insert( { _id: "Databases", ancestors: [ "Books", "Programming" ], parent: "Programming" } )
db.categories.insert( { _id: "Languages", ancestors: [ "Books", "Programming" ], parent: "Programming" } )
db.categories.insert( { _id: "Programming", ancestors: [ "Books" ], parent: "Books" } )
db.categories.insert( { _id: "Books", ancestors: [ ], parent: null } )
The query to retrieve the ancestors or path of a node is fast and straightforward:
db.categories.findOne( { _id: "MongoDB" } ).ancestors
You can create an index on the field ancestors to enable fast search by the ancestor nodes:
db.categories.createIndex( { ancestors: 1 } )
You can query by the field ancestors to find all descendants of a node.
The Array of Ancestors pattern provides a fast and efficient solution to find the descendants and the ancestors of a node
by creating an index on the elements of the ancestors field. This makes Array of Ancestors a good choice for working
with subtrees.
The Array of Ancestors pattern is slightly slower than the Materialized Paths (page 178) pattern but is more straight-
forward to use.
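Both lookups can be sketched in plain JavaScript on in-memory documents. The descendantsOf function is a hypothetical stand-in for a query that matches documents whose ancestors array contains a given id; it is not MongoDB API.

```javascript
const categories = [
  { _id: "MongoDB", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" },
  { _id: "dbm", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" },
  { _id: "Databases", ancestors: [ "Books", "Programming" ], parent: "Programming" },
  { _id: "Languages", ancestors: [ "Books", "Programming" ], parent: "Programming" },
  { _id: "Programming", ancestors: [ "Books" ], parent: "Books" },
  { _id: "Books", ancestors: [], parent: null }
];

// Ancestors: read the array stored in the node's own document.
const path = categories.find(c => c._id === "MongoDB").ancestors;

// Descendants: every document whose ancestors array contains the node's id.
function descendantsOf(id) {
  return categories.filter(c => c.ancestors.includes(id)).map(c => c._id);
}
```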
Model Tree Structures with Materialized Paths
Overview
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 162)
for a high-level overview of data modeling in MongoDB.
This document describes a data model that represents a tree-like structure in MongoDB documents by storing full
relationship paths between documents.
Pattern
The Materialized Paths pattern stores each tree node in a document; in addition to the tree node, the document stores
the ids of the node's ancestors, or path, as a string. Although the Materialized Paths pattern requires additional steps of
working with strings and regular expressions, the pattern also provides more flexibility in working with the path, such
as finding nodes by partial paths.
Consider the following hierarchy of categories:
The following example models the tree using Materialized Paths, storing the path in the field path; the path string
uses the comma , as a delimiter:
db.categories.insert( { _id: "Books", path: null } )
db.categories.insert( { _id: "Programming", path: ",Books," } )
db.categories.insert( { _id: "Databases", path: ",Books,Programming," } )
db.categories.insert( { _id: "Languages", path: ",Books,Programming," } )
db.categories.insert( { _id: "MongoDB", path: ",Books,Programming,Databases," } )
db.categories.insert( { _id: "dbm", path: ",Books,Programming,Databases," } )
You can query to retrieve the whole tree, sorting by the field path:
db.categories.find().sort( { path: 1 } )
You can use regular expressions on the path field to find the descendants of Programming:
db.categories.find( { path: /,Programming,/ } )
You can also retrieve the descendants of Books where Books is at the topmost level of the hierarchy by anchoring the
regular expression to the start of the path.
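The difference between an unanchored pattern and one anchored to the start of the path can be sketched in plain JavaScript. The matchingIds helper is hypothetical and filters in-memory documents; the anchored pattern /^,Books,/ is one way to require that Books is the first element of the path.

```javascript
const categories = [
  { _id: "Books", path: null },
  { _id: "Programming", path: ",Books," },
  { _id: "Databases", path: ",Books,Programming," },
  { _id: "Languages", path: ",Books,Programming," },
  { _id: "MongoDB", path: ",Books,Programming,Databases," },
  { _id: "dbm", path: ",Books,Programming,Databases," }
];

// Hypothetical helper: ids of documents whose path string matches the pattern.
function matchingIds(pattern) {
  return categories
    .filter(c => typeof c.path === "string" && pattern.test(c.path))
    .map(c => c._id);
}

// Descendants of Programming, wherever Programming appears in the path.
const underProgramming = matchingIds(/,Programming,/);

// Descendants of Books only when Books is the first element of the path.
const underRootBooks = matchingIds(/^,Books,/);
```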
Model Tree Structures with Nested Sets
Overview
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 162)
for a high-level overview of data modeling in MongoDB.
This document describes a data model that represents a tree-like structure and optimizes discovering subtrees at the
expense of tree mutability.
Pattern
The Nested Sets pattern identifies each node in the tree as stops in a round-trip traversal of the tree. The application
visits each node in the tree twice: first during the initial trip, and second during the return trip. The Nested Sets pattern
stores each tree node in a document; in addition to the tree node, the document stores the id of the node's parent, the
node's initial stop in the left field, and its return stop in the right field.
Consider the following hierarchy of categories:
The Nested Sets pattern provides a fast and efficient solution for finding subtrees but is inefficient for modifying the
tree structure. As such, this pattern is best for static trees that do not change.
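The round-trip numbering and the subtree lookup can be sketched in plain JavaScript. This is a minimal illustration under stated assumptions: the children map, numberTree, and descendantsOf are hypothetical names, the root's parent is stored as null here, and the numbering is computed in memory rather than taken from the manual.

```javascript
// Hypothetical child lists for the category hierarchy used in these examples.
const children = {
  Books: [ "Programming" ],
  Programming: [ "Languages", "Databases" ],
  Languages: [],
  Databases: [ "MongoDB", "dbm" ],
  MongoDB: [],
  dbm: []
};

// Round-trip traversal: assign left on the way down and right on the way back.
function numberTree(rootId) {
  const docs = [];
  let stop = 0;
  function visit(id, parent) {
    const doc = { _id: id, parent: parent, left: ++stop, right: 0 };
    docs.push(doc);
    for (const child of children[id]) visit(child, id);
    doc.right = ++stop;
  }
  visit(rootId, null);
  return docs;
}

const categories = numberTree("Books");

// A node's descendants are the documents whose stops fall between its own.
function descendantsOf(id) {
  const node = categories.find(c => c._id === id);
  return categories
    .filter(c => c.left > node.left && c.right < node.right)
    .map(c => c._id);
}
```

Inserting or moving a node changes the left and right values of many other documents, which is why the pattern suits trees that rarely change.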
Model Data for Atomic Operations (page 181) Illustrates how embedding fields related to an atomic update within
the same document ensures that the fields are in sync.
Model Data to Support Keyword Search (page 182) Describes one method for supporting keyword search by storing
keywords in an array in the same document as the text field. Combined with a multi-key index, this pattern can
support an application's keyword search operations.
Model Monetary Data (page 183) Describes two methods to model monetary data in MongoDB.
Model Time Data (page 185) Describes how to deal with local time in MongoDB.
Model Data for Atomic Operations
Pattern
Then to update with new checkout information, you can use the db.collection.update() method to atomically
update both the available field and the checkout field:
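db.books.update (
{ _id: 123456789, available: { $gt: 0 } },
{
$inc: { available: -1 },
$push: { checkout: { by: "abc", date: new Date() } } // assumed completion: this line is cut off in the source; the pushed values are illustrative
}
)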
The operation returns a WriteResult() object that contains information on the status of the operation:
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
The nMatched field shows that 1 document matched the update condition, and nModified shows that the operation
updated 1 document.
If no document matched the update condition, then nMatched and nModified would be 0 and would indicate that
you could not check out the book.
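The effect of the conditional update can be sketched in plain JavaScript. This is an illustration only: checkOut is a hypothetical function operating on an in-memory object, and the returned fields merely mirror the nMatched and nModified counts described above. It does not reproduce the server's atomicity guarantees.

```javascript
// Hypothetical in-memory book document; available and checkout live in the same document.
const book = { _id: 123456789, available: 1, checkout: [] };

// Apply both changes together, and only if a copy is still available.
function checkOut(doc, patron) {
  if (doc.available <= 0) {
    return { nMatched: 0, nModified: 0 }; // condition not met: nothing changes
  }
  doc.available -= 1;
  doc.checkout.push({ by: patron, date: new Date() });
  return { nMatched: 1, nModified: 1 };
}

const first = checkOut(book, "abc");  // succeeds: one copy was available
const second = checkOut(book, "xyz"); // fails: no copies remain
```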
Model Data to Support Keyword Search
Note: Keyword search is not the same as text search or full text search, and does not provide stemming or other
text-processing features. See the Limitations of Keyword Indexes (page 183) section for more information.
In 2.4, MongoDB provides a text search feature. See Text Indexes (page 508) for more information.
If your application needs to perform queries on the content of a field that holds text, you can perform exact matches
on the text or use $regex for regular expression pattern matches. However, for many operations on text, these
methods do not satisfy application requirements.
This pattern describes one method for supporting keyword search in MongoDB: store keywords in an array in the
same document as the text field. Combined with a multi-key index (page 497), this pattern can support an
application's keyword search operations.
Pattern
To add structures to your document to support keyword-based queries, create an array field in your documents and add
the keywords as strings in the array. You can then create a multi-key index (page 497) on the array and create queries
that select values from the array.
Example
Suppose you have a collection of library volumes for which you want to provide topic-based search. For each volume,
you add the array topics, and you add as many keywords as needed for a given volume.
For the Moby-Dick volume you might have the following document:
{ title : "Moby-Dick" ,
author : "Herman Melville" ,
published : 1851 ,
ISBN : "0451526996" ,
topics : [ "whaling" , "allegory" , "revenge" , "American" ,
"novel" , "nautical" , "voyage" , "Cape Cod" ]
}
The multi-key index creates separate index entries for each keyword in the topics array. For example the index
contains one entry for whaling and another for allegory.
You then query based on the keywords. For example:
db.volumes.findOne( { topics : "voyage" }, { title: 1 } )
Note: An array with a large number of elements, such as one with several hundred or several thousand keywords, will
incur greater indexing costs on insertion.
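The idea of one index entry per array element can be sketched in plain JavaScript with an in-memory map. This is an illustration, not how MongoDB implements multi-key indexes; buildKeywordIndex and findByTopic are hypothetical helpers, and the second volume is invented for the example.

```javascript
const volumes = [
  { title: "Moby-Dick",
    topics: [ "whaling", "allegory", "revenge", "American", "novel", "nautical", "voyage", "Cape Cod" ] },
  // Hypothetical second volume, added only to show a keyword shared by two documents.
  { title: "Hypothetical Sea Stories", topics: [ "voyage", "memoir" ] }
];

// Build one entry per keyword, pointing at every document that lists it.
function buildKeywordIndex(docs) {
  const index = new Map();
  docs.forEach((doc, position) => {
    for (const topic of doc.topics) {
      if (!index.has(topic)) index.set(topic, []);
      index.get(topic).push(position);
    }
  });
  return index;
}

const index = buildKeywordIndex(volumes);

// Exact keyword lookup only: there is no stemming, so "voyages" does not match "voyage".
function findByTopic(topic) {
  return (index.get(topic) || []).map(position => volumes[position].title);
}
```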
Limitations of Keyword Indexes
MongoDB can support keyword searches using specific data models and multi-key indexes (page 497); however, these
keyword indexes are not sufficient or comparable to full-text products in the following respects:
Stemming. Keyword queries in MongoDB cannot parse keywords for root or related words.
Synonyms. Keyword-based search features must provide support for synonym or related queries in the application
layer.
Ranking. The keyword lookups described in this document do not provide a way to weight results.
Asynchronous Indexing. MongoDB builds indexes synchronously, which means that the indexes used for keyword
search are always current and can operate in real time. However, asynchronous bulk indexes may be
more efficient for some kinds of content and workloads.
Model Monetary Data
Overview
MongoDB stores numeric data as either IEEE 754 standard 64-bit floating point numbers or as 32-bit or 64-bit signed
integers. Applications that handle monetary data often require capturing fractional units of currency. However, arith-
metic on floating point numbers, as implemented in modern hardware, often does not conform to requirements for
monetary arithmetic. In addition, some fractional numeric quantities, such as one third and one tenth, have no exact
representation in binary floating point numbers.
Note: Arithmetic mentioned on this page refers to server-side arithmetic performed by mongod or mongos, and not
to client-side arithmetic.
This document describes two ways to model monetary data in MongoDB:
Exact Precision (page 184) which multiplies the monetary value by a power of 10.
Arbitrary Precision (page 184) which uses two fields for the value: one field to store the exact monetary value
as a non-numeric value and another field to store a floating point approximation of the value.
Use Cases for Exact Precision Model
If you regularly need to perform server-side arithmetic on monetary data, the exact precision model may be appropriate.
For instance:
If you need to query the database for exact, mathematically valid matches, use Exact Precision (page 184).
If you need to be able to do server-side arithmetic, e.g., $inc, $mul, and aggregation framework
arithmetic, use Exact Precision (page 184).
Use Cases for Arbitrary Precision Model
If there is no need to perform server-side arithmetic on monetary data, modeling monetary data using the arbitrary
precision model may be suitable. For instance:
If you need to handle an arbitrary or unforeseen number of digits of precision, see Arbitrary Precision (page 184).
If server-side approximations are sufficient, possibly with client-side post-processing, see Arbitrary Precision
(page 184).
Exact Precision
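Multiplying by a power of 10 can be sketched in plain JavaScript as follows. This is a minimal illustration under stated assumptions: the scale factor of 1000, the helper names toScaled and fromScaled, and the document layout are all illustrative choices, and the helpers only handle non-negative amounts written as plain decimal strings.

```javascript
// Assumed scale: three decimal places, so amounts are stored as thousandths.
const DIGITS = 3;
const SCALE = 10 ** DIGITS;

// Convert a decimal string such as "9.99" to an integer number of scaled units.
function toScaled(amount) {
  const [whole, fraction = ""] = amount.split(".");
  const padded = (fraction + "0".repeat(DIGITS)).slice(0, DIGITS);
  return Number(whole) * SCALE + Number(padded);
}

// Convert scaled integer units back to a decimal string for display.
function fromScaled(units) {
  const whole = Math.floor(units / SCALE);
  const fraction = String(units % SCALE).padStart(DIGITS, "0");
  return `${whole}.${fraction}`;
}

// Hypothetical document layout: the price is stored as an integer.
const item = { price: toScaled("9.99"), currency: "USD" };
const fee = toScaled("0.25");

// Integer arithmetic on the scaled values is exact.
const total = item.price + fee;

// By contrast, binary floating point cannot represent some decimal fractions exactly.
const floatSumIsExact = (0.1 + 0.2 === 0.3);
```

The application must apply the same scale factor whenever it reads or writes the stored value.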
Arbitrary Precision
To model monetary data using the arbitrary precision model, store the value in two fields:
1. In one field, encode the exact monetary value as a non-numeric data type; e.g., BinData or a string.
2. In the second field, store a double-precision floating point approximation of the exact value.
The following example uses the arbitrary precision model to store 9.99 USD for the price and 0.25 USD for the
fee:
{
price: { display: "9.99", approx: 9.9900000000000002, currency: "USD" },
fee: { display: "0.25", approx: 0.2499999999999999, currency: "USD" }
}
With some care, applications can perform range and sort queries on the field with the numeric approximation. How-
ever, the use of the approximation field for the query and sort operations requires that applications perform client-side
post-processing to decode the non-numeric representation of the exact value and then filter out the returned documents
based on the exact monetary value.
For use cases of this model, see Use Cases for Arbitrary Precision Model (page 184).
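The two-step query described above, a range filter on the approximation followed by client-side filtering on the exact value, might look like the following plain JavaScript sketch. The documents, the MARGIN value, and the toMills helper are assumptions made for illustration; the helper only handles non-negative decimal strings with up to three fraction digits.

```javascript
// Hypothetical documents following the two-field layout.
const items = [
  { _id: 1, price: { display: "9.99", approx: 9.99, currency: "USD" } },
  { _id: 2, price: { display: "10.00", approx: 10.0, currency: "USD" } },
  { _id: 3, price: { display: "10.001", approx: 10.001, currency: "USD" } },
  { _id: 4, price: { display: "10.50", approx: 10.5, currency: "USD" } }
];

// Decode the exact string value into an integer number of thousandths.
function toMills(display) {
  const [whole, fraction = ""] = display.split(".");
  return Number(whole) * 1000 + Number((fraction + "000").slice(0, 3));
}

// Step 1: coarse range filter on the approximation, widened by a small margin.
const MARGIN = 0.005;
const candidates = items.filter(i => i.price.approx <= 10.0 + MARGIN);

// Step 2: exact filter on the decoded value; keeps only prices of at most 10.00.
const matches = candidates
  .filter(i => toMills(i.price.display) <= toMills("10.00"))
  .map(i => i._id);
```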
Model Time Data
Overview
MongoDB stores times in UTC (page 197) by default, and will convert any local time representations into this form.
Applications that must operate or report on some unmodified local time value may store the time zone alongside the
UTC timestamp, and compute the original local time in their application logic.
Example
In the MongoDB shell, you can store both the current date and the current client's offset from UTC.
var now = new Date();
db.data.save( { date: now,
offset: now.getTimezoneOffset() } );
You can reconstruct the original local time by applying the saved offset:
var record = db.data.findOne();
var localNow = new Date( record.date.getTime() - ( record.offset * 60000 ) );
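The offset arithmetic can be checked with fixed values in plain JavaScript. In this sketch the stored record is hypothetical: it assumes a client whose getTimezoneOffset() returned 300, that is, a local clock five hours behind UTC. The shifted Date is read with UTC getters so that the result does not depend on the time zone of the machine running the code.

```javascript
// Hypothetical stored record: a UTC instant plus the client's offset in minutes.
const stored = {
  date: new Date(Date.UTC(2016, 0, 15, 17, 30)), // 17:30 UTC
  offset: 300                                     // assumed client offset
};

// Shift the instant by the saved offset (minutes converted to milliseconds).
const shifted = new Date(stored.date.getTime() - stored.offset * 60000);

// The UTC fields of the shifted value are the client's original wall-clock fields.
const localHour = shifted.getUTCHours();     // 12
const localMinute = shifted.getUTCMinutes(); // 30
```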
Documents (page 186) MongoDB stores all data in documents, which are JSON-style data structures composed of
field-and-value pairs.
Database References (page 189) Discusses manual references and DBRefs, which MongoDB can use to represent
relationships between documents.
ObjectId (page 192) A 12-byte BSON type that MongoDB uses as the default value for a document's _id field if
the _id field is not specified.
BSON Types (page 194) Outlines the unique BSON types used by MongoDB. See BSONspec.org9 for the complete
BSON specification.
4.5.1 Documents
MongoDB stores all data in documents, which are JSON-style data structures composed of field-and-value pairs:
{ "item": "pencil", "qty": 500, "type": "no.2" }
Document Format
MongoDB stores documents on disk in the BSON serialization format. BSON is a binary representation of JSON
documents, though it contains more data types than JSON. For the BSON spec, see bsonspec.org10 . See also BSON
Types (page 194).
The mongo JavaScript shell and the MongoDB language drivers translate between BSON and the language-
specific document representation.
Document Structure
MongoDB documents are composed of field-and-value pairs and have the following structure:
{
field1: value1,
field2: value2,
field3: value3,
...
fieldN: valueN
}
9 http://bsonspec.org/
10 http://bsonspec.org/
The value of a field can be any of the BSON data types (page 194), including other documents, arrays, and arrays of
documents. The following document contains values of varying types:
var mydoc = {
_id: ObjectId("5099803df3f4948bd2f98391"),
name: { first: "Alan", last: "Turing" },
birth: new Date('Jun 23, 1912'),
death: new Date('Jun 07, 1954'),
contribs: [ "Turing machine", "Turing test", "Turingery" ],
views : NumberLong(1250000)
}
Field Names
Field Value Limit
For indexed collections (page 487), the values for the indexed fields have a Maximum Index Key Length limit.
See Maximum Index Key Length for details.
Document Limitations
MongoDB preserves the order of the document fields following write operations except for the following cases:
The _id field is always the first field in the document.
Updates that include renaming of field names may result in the reordering of fields in the document.
Changed in version 2.6: Starting in version 2.6, MongoDB actively attempts to preserve the field order in a document.
Before version 2.6, MongoDB did not actively preserve the order of the fields in a document.
Warning: To ensure functioning replication, do not store values that are of the BSON regular expression
type in the _id field.
The following are common options for storing values for _id:
Use an ObjectId (page 192).
Use a natural unique identifier, if available. This saves space and avoids an additional index.
Generate an auto-incrementing number. See Create an Auto-Incrementing Sequence Field (page 134).
Generate a UUID in your application code. For a more efficient storage of the UUID values in the collection
and in the _id index, store the UUID as a value of the BSON BinData type.
Index keys that are of the BinData type are more efficiently stored in the index if:
the binary subtype value is in the range of 0-7 or 128-135, and
the length of the byte array is: 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, or 32.
Use your driver's BSON UUID facility to generate UUIDs. Be aware that driver implementations may implement
UUID serialization and deserialization logic differently, which may not be fully compatible with other
drivers. See your driver documentation11 for information concerning UUID interoperability.
Note: Most MongoDB driver clients will include the _id field and generate an ObjectId before sending the insert
operation to MongoDB; however, if the client sends a document without an _id field, the mongod will add the _id
field and generate the ObjectId.
11 https://api.mongodb.org/
Dot Notation
MongoDB uses the dot notation to access the elements of an array and to access the fields of an embedded document.
To access an element of an array by the zero-based index position, concatenate the array name with the dot (.) and
zero-based index position, and enclose in quotes:
'<array>.<index>'
See also $ positional operator for update operations and $ projection operator when array index position is unknown.
To access a field of an embedded document with dot-notation, concatenate the embedded document name with the dot
(.) and the field name, and enclose in quotes:
'<embedded document>.<field>'
See also:
Embedded Documents (page 105) for dot notation examples with embedded documents.
Arrays (page 106) for dot notation examples with arrays.
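How a dotted path selects a value can be sketched in plain JavaScript. The resolvePath helper below is hypothetical and runs on an in-memory object; it only illustrates the two forms above and does not model how queries match fields inside arrays of embedded documents.

```javascript
// Hypothetical helper: walks a dotted path through nested objects and arrays.
function resolvePath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value === null || value === undefined ? undefined : value[key]),
    doc
  );
}

const doc = {
  name: { first: "Alan", last: "Turing" },
  contribs: [ "Turing machine", "Turing test", "Turingery" ]
};

// '<embedded document>.<field>'
const last = resolvePath(doc, "name.last");

// '<array>.<index>' with a zero-based index position
const third = resolvePath(doc, "contribs.2");

// A path that does not exist resolves to undefined.
const missing = resolvePath(doc, "name.middle.initial");
```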
Additional Resources
Database References
MongoDB does not support joins. In MongoDB some data is denormalized, or stored with related data in documents to
remove the need for joins. However, in some cases it makes sense to store related information in separate documents,
typically in different collections or databases.
MongoDB applications use one of two methods for relating documents:
Manual references (page 190) where you save the _id field of one document in another document as a reference.
Then your application can run a second query to return the related data. These references are simple and
sufficient for most use cases.
DBRefs (page 190) are references from one document to another using the value of the first document's _id
field, collection name, and, optionally, its database name. By including these names, DBRefs allow documents
located in multiple collections to be more easily linked with documents from a single collection.
To resolve DBRefs, your application must perform additional queries to return the referenced documents. Many
drivers have helper methods that form the query for the DBRef automatically. The drivers 13 do not
automatically resolve DBRefs into documents.
DBRefs provide a common format and type to represent relationships among documents. The DBRef format
also provides common semantics for representing links between documents if your database must interact with
multiple frameworks and tools.
Unless you have a compelling reason to use DBRefs, use manual references instead.
12 https://www.mongodb.com/blog/post/thinking-documents-part-1?jmp=docs
13 Some community supported drivers may have alternate behavior and may resolve a DBRef into a document automatically.
Manual References
Background
Using manual references is the practice of including one document's _id field in another document. The application
can then issue a second query to resolve the referenced fields as needed.
Process
Consider the following operation to insert two documents, using the _id field of the first document as a reference in
the second document:
original_id = ObjectId()
db.places.insert({
"_id": original_id,
"name": "Broadway Center",
"url": "bc.example.net"
})
db.people.insert({
"name": "Erin",
"places_id": original_id,
"url": "bc.example.net/Erin"
})
Then, when a query returns the document from the people collection you can, if needed, make a second query for
the document referenced by the places_id field in the places collection.
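The two-step lookup can be sketched in plain JavaScript on in-memory arrays. This is an illustration only: the string "place-1" is a hypothetical stand-in for the generated ObjectId, and the array lookups stand in for queries against the people and places collections.

```javascript
// Hypothetical in-memory stand-ins for the two collections.
const places = [ { _id: "place-1", name: "Broadway Center", url: "bc.example.net" } ];
const people = [ { name: "Erin", places_id: "place-1", url: "bc.example.net/Erin" } ];

// First query: fetch the document from the people collection.
const person = people.find(p => p.name === "Erin");

// Second query: follow the stored _id value into the places collection.
const place = places.find(p => p._id === person.places_id) || null;
```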
Use
For nearly every case where you want to store a relationship between two documents, use manual references
(page 190). The references are simple to create and your application can resolve references as needed.
The only limitation of manual linking is that these references do not convey the database and collection names. If you
have documents in a single collection that relate to documents in more than one collection, you may need to consider
using DBRefs.
DBRefs
Background
DBRefs are a convention for representing a document, rather than a specific reference type. They include the name of
the collection, and in some cases the database name, in addition to the value from the _id field.
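Format
DBRefs have the following fields:
$ref
The $ref field holds the name of the collection where the referenced document resides.
$id
The $id field contains the value of the _id field in the referenced document.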
$db
Optional.
Contains the name of the database where the referenced document resides.
Only some drivers support $db references.
Example
DBRef documents resemble the following document:
{ "$ref" : <value>, "$id" : <value>, "$db" : <value> }
Consider the following DBRef:
{ "$ref" : "creators", "$id" : ObjectId("5126bc054aed4daf9e2ab772"), "$db" : "users" }
The DBRef in this example points to a document in the creators collection of the users database that has
ObjectId("5126bc054aed4daf9e2ab772") in its _id field.
Note: The order of fields in the DBRef matters, and you must use the above sequence when using a DBRef.
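Resolving a DBRef by hand amounts to choosing the database and collection that the reference names and then matching on the _id value. The following is a minimal sketch under stated assumptions: nested plain objects stand in for databases and collections, a string stands in for the ObjectId, and resolveDBRef and its defaultDb parameter are hypothetical names, not part of any driver.

```javascript
// Nested objects stand in for databases and their collections (assumption).
const databases = {
  users: {
    creators: [{ _id: "5126bc054aed4daf9e2ab772", name: "example creator" }]
  }
};

// Hypothetical resolver: pick the database and collection named by the DBRef,
// then match the referenced _id. $db is optional, so fall back to defaultDb.
function resolveDBRef(ref, defaultDb) {
  const dbName = ref.$db || defaultDb;
  const collection = (databases[dbName] || {})[ref.$ref] || [];
  return collection.find((doc) => doc._id === ref.$id) || null;
}

// Fields appear in the required order: $ref, then $id, then $db.
const ref = { $ref: "creators", $id: "5126bc054aed4daf9e2ab772", $db: "users" };
const creator = resolveDBRef(ref, "test");

console.log(creator.name);
```

Drivers that support DBRefs wrap this kind of lookup in their own helper methods; the sketch only illustrates the extra query that resolution requires.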
Driver      DBRef Support
C           The C driver contains no support for DBRefs. You can traverse references manually.
C++         The C++ driver contains no support for DBRefs. You can traverse references manually.
C#          The C# driver supports DBRefs using the MongoDBRef14 class and FetchDBRef and FetchDBRefAs methods.
Haskell     The Haskell driver contains no support for DBRefs. You can traverse references manually.
Java        The DBRef15 class provides support for DBRefs from Java.
JavaScript  The mongo shell's JavaScript interface provides a DBRef.
Node.js     The Node.js driver supports DBRefs using the DBRef16 class and the dereference17 method.
Perl        The Perl driver supports DBRefs using the MongoDB::DBRef18 class. You can traverse references manually.
PHP         The PHP driver supports DBRefs, including the optional $db reference, using the MongoDBRef19 class.
Python      The Python driver supports DBRefs using the DBRef20 class and the dereference21 method.
Ruby        The Ruby driver supports DBRefs using the DBRef22 class and the dereference23 method.
Scala       The Scala driver contains no support for DBRefs. You can traverse references manually.
14 https://api.mongodb.org/csharp/current/html/T_MongoDB_Driver_MongoDBRef.htm
15 https://api.mongodb.org/java/current/com/mongodb/DBRef.html
16 http://mongodb.github.io/node-mongodb-native/api-bson-generated/db_ref.html
17 http://mongodb.github.io/node-mongodb-native/api-generated/db.html#dereference
18 https://metacpan.org/pod/MongoDB::DBRef
19 http://www.php.net/manual/en/class.mongodbref.php/
Use
In most cases you should use the manual reference (page 190) method for connecting two or more related documents.
However, if you need to reference documents from multiple collections, consider using DBRefs.
4.5.3 ObjectId
On this page
Overview (page 192)
ObjectId() (page 192)
Examples (page 193)
Overview
Important: The relationship between the order of ObjectId values and generation time is not strict within a
single second. If multiple systems, or multiple processes or threads on a single system, generate values within a
single second, ObjectId values do not represent a strict insertion order. Clock skew between clients can also
result in non-strict ordering, because client drivers generate ObjectId values.
Also consider the Documents (page 186) section for related information on MongoDB's document orientation.
ObjectId()
The mongo shell provides the ObjectId() wrapper class to generate a new ObjectId, and to provide the following
helper attribute and methods:
20 https://api.mongodb.org/python/current/api/bson/dbref.html
21 https://api.mongodb.org/python/current/api/pymongo/database.html#pymongo.database.Database.dereference
22 https://api.mongodb.org/ruby/current/BSON/DBRef.html
23 https://api.mongodb.org/ruby/current/Mongo/DB.html#dereference-instance_method
str
The hexadecimal string representation of the object.
getTimestamp()
Returns the timestamp portion of the object as a Date.
toString()
Returns the JavaScript representation in the form of a string literal ObjectId(...).
Changed in version 2.2: In previous versions toString() returns the hexadecimal string representation,
which as of version 2.2 can be retrieved by the str property.
valueOf()
Returns the representation of the object as a hexadecimal string. The returned string is the str attribute.
Changed in version 2.2: In previous versions, valueOf() returns the object.
Examples
To generate a new ObjectId from a unique hexadecimal string, pass the string to the ObjectId() constructor:
y = ObjectId("507f191e810c19729de860ea")
To return the timestamp of an ObjectId() object, use the getTimestamp() method as follows:
ObjectId("507f191e810c19729de860ea").getTimestamp()
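Because the first four bytes of an ObjectId hold its creation time in seconds since the Unix epoch, the same timestamp can be recovered from the hexadecimal string alone. The following sketch uses plain JavaScript rather than the shell's ObjectId() wrapper; objectIdTimestamp is a hypothetical helper name.

```javascript
// Decode the creation time embedded in an ObjectId's hexadecimal string.
// The first 4 bytes (8 hex characters) hold seconds since the Unix epoch.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hexadecimal string");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

console.log(objectIdTimestamp("507f191e810c19729de860ea").toISOString());
// prints "2012-10-17T20:46:22.000Z"
```

The result has one-second resolution, which is why ObjectId values generated within the same second cannot be ordered by timestamp alone.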
To return the hexadecimal string representation of an ObjectId(), use the valueOf() method as follows:
ObjectId("507f191e810c19729de860ea").valueOf()
To return the string representation of an ObjectId() object (in the form of a string literal ObjectId(...)), use
the toString() method as follows:
ObjectId("507f191e810c19729de860ea").toString()
On this page
Comparison/Sort Order (page 195)
ObjectId (page 196)
String (page 196)
Timestamps (page 196)
Date (page 197)
BSON is a binary serialization format used to store documents and make remote procedure calls in MongoDB. The
BSON specification is located at bsonspec.org24 .
BSON supports the following data types as values in documents. Each data type has a corresponding number and
string alias that can be used with the $type operator to query documents by BSON type.
24 http://bsonspec.org/
Comparison/Sort Order
When comparing values of different BSON types, MongoDB uses the following comparison order, from lowest to
highest:
1. MinKey (internal type)
2. Null
3. Numbers (ints, longs, doubles)
4. Symbol, String
5. Object
6. Array
7. BinData
8. ObjectId
9. Boolean
10. Date
11. Timestamp
12. Regular Expression
13. MaxKey (internal type)
MongoDB treats some types as equivalent for comparison purposes. For instance, numeric types undergo conversion
before comparison.
Changed in version 3.0.0: Date objects sort before Timestamp objects. Previously Date and Timestamp objects sorted
together.
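The cross-type ordering listed above can be sketched as a rank lookup. This is a simplified illustration, not the server's comparison code: JavaScript values stand in for BSON values, bsonTypeRank and compareAcrossTypes are hypothetical names, and types without a plain JavaScript counterpart (BinData, ObjectId, Timestamp, MinKey, MaxKey) are left out. Values of the same type compare as equal here.

```javascript
// Rank of each BSON type group in the cross-type comparison order.
// JavaScript values stand in for BSON values (assumption).
function bsonTypeRank(value) {
  if (value === null) return 2;
  if (typeof value === "number") return 3;
  if (typeof value === "string") return 4;
  if (Array.isArray(value)) return 6;
  if (typeof value === "boolean") return 9;
  if (value instanceof Date) return 10;
  if (value instanceof RegExp) return 12;
  if (typeof value === "object") return 5;
  throw new Error("unsupported value");
}

// Order values of different types by rank only.
function compareAcrossTypes(a, b) {
  return bsonTypeRank(a) - bsonTypeRank(b);
}

const mixed = [true, "abc", null, new Date(0), 42, { a: 1 }];
const sorted = mixed.slice().sort(compareAcrossTypes);

console.log(sorted.map(bsonTypeRank)); // ranks in order: 2, 3, 4, 5, 9, 10
```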
The comparison treats a non-existent field as it would an empty BSON Object. As such, a sort on the a field in
documents { } and { a: null } would treat the documents as equivalent in sort order.
With arrays, a less-than comparison or an ascending sort compares the smallest element of arrays, and a greater-than
comparison or a descending sort compares the largest element of the arrays. As such, when comparing a field whose
value is a single-element array (e.g. [ 1 ]) with non-array fields (e.g. 2), the comparison is between 1 and 2. A
comparison of an empty array (e.g. [ ]) treats the empty array as less than null or a missing field.
MongoDB sorts BinData in the following order:
1. First, the length or size of the data.
2. Then, by the BSON one-byte subtype.
3. Finally, by the data, performing a byte-by-byte comparison.
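The three-step BinData ordering can be written as a comparator. The sketch below assumes a hypothetical { subtype, data } shape, with data held in a Uint8Array, as a stand-in for a BSON BinData value.

```javascript
// Compare two BinData-like values: by length, then subtype, then bytes.
// The { subtype, data } shape is a hypothetical stand-in for BSON BinData.
function compareBinData(a, b) {
  if (a.data.length !== b.data.length) return a.data.length - b.data.length;
  if (a.subtype !== b.subtype) return a.subtype - b.subtype;
  for (let i = 0; i < a.data.length; i++) {
    if (a.data[i] !== b.data[i]) return a.data[i] - b.data[i];
  }
  return 0;
}

const oneByte = { subtype: 0, data: Uint8Array.from([0xff]) };
const twoBytes = { subtype: 0, data: Uint8Array.from([0x00, 0x00]) };

// Length is compared first, so the shorter value sorts lower even though
// its single byte is larger.
console.log(compareBinData(oneByte, twoBytes) < 0); // prints true
```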
The following sections describe special considerations for particular BSON types.
ObjectId
ObjectIds are small, likely unique, fast to generate, and ordered. These values consist of 12 bytes, where the first
four bytes are a timestamp that reflects the ObjectId's creation. Refer to the ObjectId (page 192) documentation for
more information.
String
BSON strings are UTF-8. In general, drivers for each programming language convert from the language's string format
to UTF-8 when serializing and deserializing BSON. This makes it possible to store most international characters in
BSON strings with ease. 25 In addition, MongoDB $regex queries support UTF-8 in the regex string.
Timestamps
BSON has a special timestamp type for internal MongoDB use that is not associated with the regular Date (page 197)
type. Timestamp values are 64-bit values where:
- the first 32 bits are a time_t value (seconds since the Unix epoch)
- the second 32 bits are an incrementing ordinal for operations within a given second.
Within a single mongod instance, timestamp values are always unique.
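The 64-bit layout can be illustrated with BigInt arithmetic. This sketch reads "first 32 bits" as the most significant half of the value, which is an assumption about the wording; packTimestamp and unpackTimestamp are hypothetical helper names and are not part of the shell's Timestamp type.

```javascript
// Pack and unpack a 64-bit timestamp: the high 32 bits are seconds since the
// Unix epoch, the low 32 bits are the ordinal within that second.
function packTimestamp(seconds, ordinal) {
  return (BigInt(seconds) << 32n) | BigInt(ordinal);
}

function unpackTimestamp(value) {
  return {
    seconds: Number(value >> 32n),
    ordinal: Number(value & 0xffffffffn)
  };
}

const packed = packTimestamp(1412180887, 1);
const unpacked = unpackTimestamp(packed);

console.log(unpacked.seconds, unpacked.ordinal); // prints 1412180887 1
```

With the seconds in the high half, comparing two packed values orders them first by time and then by ordinal, which matches how operations within one second are distinguished.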
In replication, the oplog has a ts field. The values in this field reflect the operation time, which uses a BSON
timestamp value.
Note: The BSON timestamp type is for internal MongoDB use. For most cases, in application development, you will
want to use the BSON date type. See Date (page 197) for more information.
If you insert a document containing an empty BSON timestamp in a top-level field, the MongoDB server will replace
that empty timestamp with the current timestamp value. For example, if you create and insert a document with an
empty timestamp value, as in the following operation:
var a = new Timestamp();
db.test.insert( { ts: a } );
25 Given strings using UTF-8 character sets, using sort() on strings will be reasonably correct. However, because internally sort() uses the
C++ strcmp api, the sort order may handle some characters incorrectly.
Then, the db.test.find() operation will return a document that resembles the following:
{ "_id" : ObjectId("542c2b97bac0595474108b48"), "ts" : Timestamp(1412180887, 1) }
If ts were a field in an embedded document, the server would have left it as an empty timestamp value.
Changed in version 2.6: Previously, the server would only replace empty timestamp values in the first two fields,
including _id, of an inserted document. Now MongoDB will replace any top-level field.
Date
BSON Date is a 64-bit integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). This
results in a representable date range of about 290 million years into the past and future.
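The "about 290 million years" figure follows from the size of a signed 64-bit integer. The arithmetic below is a sketch that assumes an average year of 365.25 days; the variable names are illustrative only.

```javascript
// Largest millisecond offset a signed 64-bit integer can hold.
const maxMillis = 2n ** 63n - 1n;

// Milliseconds in an average year of 365.25 days (an approximation).
const millisPerYear = 1000n * 60n * 60n * 24n * 36525n / 100n;

// Whole years representable on each side of the Unix epoch.
const yearsEachWay = maxMillis / millisPerYear;

console.log(yearsEachWay / 1000000n); // whole millions of years: 292n
```

Note that this is the range of the BSON type itself. A JavaScript Date object covers a narrower range, so the shell cannot represent every BSON Date value as a native Date.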
The official BSON specification26 refers to the BSON Date type as the UTC datetime.
Changed in version 2.0: BSON Date type is signed. 27 Negative values represent dates before 1970.
Example
Construct a Date using the new Date() constructor in the mongo shell:
var mydate1 = new Date()
Example
Construct a Date using the ISODate() constructor in the mongo shell:
var mydate2 = ISODate()
Example
Return the Date value as string:
mydate1.toString()
Example
Return the month portion of the Date value; months are zero-indexed, so that January is month 0:
mydate1.getMonth()
26 http://bsonspec.org/#/specification
27 Prior to version 2.0, Date values were incorrectly interpreted as unsigned integers, which affected sorts, range queries, and indexes on Date
fields. Because indexes are not recreated when upgrading, please re-index if you created an index on Date values with an earlier version, and dates
before 1970 are relevant to your application.
Administration
The administration documentation addresses the ongoing operation and maintenance of MongoDB instances and
deployments. This documentation includes both high level overviews of these concerns as well as tutorials that cover
specific procedures and processes for operating MongoDB.
Administration Concepts (page 199) Core conceptual documentation of operational practices for managing
MongoDB deployments and systems.
MongoDB Backup Methods (page 200) Describes approaches and considerations for backing up a MongoDB
database.
Monitoring for MongoDB (page 203) An overview of monitoring tools, diagnostic strategies, and approaches
to monitoring replica sets and sharded clusters.
Production Notes (page 214) A collection of notes that describe best practices and considerations for the
operations of MongoDB instances and deployments.
Continue reading from Administration Concepts (page 199) for additional documentation of MongoDB
administration.
Administration Tutorials (page 240) Tutorials that describe common administrative procedures and practices for
operations for MongoDB instances and deployments.
Configuration, Maintenance, and Analysis (page 241) Describes routine management operations, including
configuration and performance analysis.
Backup and Recovery (page 266) Outlines procedures for data backup and restoration with mongod instances
and deployments.
Continue reading from Administration Tutorials (page 240) for more tutorials of common MongoDB
maintenance operations.
Administration Reference (page 295) Reference and documentation of internal mechanics of administrative features,
systems, functions, and operations.
See also:
The MongoDB Manual contains administrative documentation and tutorials throughout several sections. See Replica
Set Tutorials (page 655) and Sharded Cluster Tutorials (page 756) for additional tutorials and information.
The core administration documents address strategies and practices used in the operation of MongoDB systems and
deployments.
Operational Strategies (page 200) Higher level documentation of key concepts for the operation and maintenance of
MongoDB deployments.
MongoDB Backup Methods (page 200) Describes approaches and considerations for backing up a MongoDB
database.
Monitoring for MongoDB (page 203) An overview of monitoring tools, diagnostic strategies, and approaches
to monitoring replica sets and sharded clusters.
Run-time Database Configuration (page 209) Outlines common MongoDB configurations and examples of
best-practice configurations for common use cases.
Continue reading from Operational Strategies (page 200) for additional documentation.
Data Management (page 226) Core documentation that addresses issues in data management, organization,
maintenance, and lifecycle management.
Data Center Awareness (page 226) Presents the MongoDB features that allow application developers and
database administrators to configure their deployments to be more data center aware or allow operational
and location-based separation.
Capped Collections (page 228) Capped collections provide a special type of size-constrained collections that
preserve insertion order and can support high volume inserts.
Expire Data from Collections by Setting TTL (page 231) TTL collections make it possible to automatically
remove data from a collection based on the value of a timestamp and are useful for managing data like
machine generated event data that are only useful for a limited period of time.
Optimization Strategies for MongoDB (page 232) Techniques for optimizing application performance with
MongoDB.
Continue reading from Optimization Strategies for MongoDB (page 232) for additional documentation.
These documents address higher level strategies for common administrative tasks and requirements with respect to
MongoDB deployments.
MongoDB Backup Methods (page 200) Describes approaches and considerations for backing up a MongoDB
database.
Monitoring for MongoDB (page 203) An overview of monitoring tools, diagnostic strategies, and approaches to
monitoring replica sets and sharded clusters.
Run-time Database Configuration (page 209) Outlines common MongoDB configurations and examples of best-
practice configurations for common use cases.
Production Notes (page 214) A collection of notes that describe best practices and considerations for the operations
of MongoDB instances and deployments.
On this page
Backup by Copying Underlying Data Files (page 201)
Backup with mongodump (page 201)
MongoDB Cloud Manager Backup (page 202)
Ops Manager Backup Software (page 203)
Further Reading (page 203)
Additional Resources (page 203)
When deploying MongoDB in production, you should have a strategy for capturing and restoring backups in the case
of data loss events. There are several ways to back up MongoDB clusters:
Backup by Copying Underlying Data Files (page 201)
Backup a Database with mongodump (page 273)
MongoDB Cloud Manager Backup (page 202)
Ops Manager Backup Software (page 203)
The mongodump tool reads data from a MongoDB database and creates high fidelity BSON files. The
mongorestore tool can populate a MongoDB database with the data from these BSON files.
1 https://docs.mongodb.org/ecosystem/tutorial/backup-and-restore-mongodb-on-amazon-ec2
Use Cases mongodump and mongorestore are simple and efficient for backing up small MongoDB
deployments, for partial backup and restores based on a query, syncing from production to staging or development
environments, or changing the storage engine of a standalone.
However, these tools can be problematic for capturing backups of larger systems, sharded clusters, or replica sets. For
alternatives, see MongoDB Cloud Manager Backup (page 202) or Ops Manager Backup Software (page 203).
Data Exclusion mongodump excludes the content of the local database in its output.
mongodump only captures the documents in the database in its backup data and does not include index data.
mongorestore or mongod must then rebuild the indexes after restoring data.
Data Compression Handling When run against a mongod instance that uses the WiredTiger (page 587) storage
engine, mongodump outputs uncompressed data.
Performance mongodump can adversely affect the performance of the mongod. If your data is larger than system
memory, the mongodump will push the working set out of memory.
If applications modify data while mongodump is creating a backup, mongodump will compete for resources with
those applications.
To mitigate the impact of mongodump on the performance of the replica set, use mongodump to capture backups
from a secondary (page 618) member of a replica set.
Applications can continue to modify data while mongodump captures the output. For replica sets, mongodump
provides the --oplog option to include in its output oplog entries that occur during the mongodump operation.
This allows the corresponding mongorestore operation to replay the captured oplog. To restore a backup created
with --oplog, use mongorestore with the --oplogReplay option.
However, for replica sets, consider MongoDB Cloud Manager Backup (page 202) or Ops Manager Backup Software
(page 203).
See Back Up and Restore with MongoDB Tools (page 272), Backup a Small Sharded Cluster with mongodump
(page 278), and Backup a Sharded Cluster with Database Dumps (page 281) for more information.
The MongoDB Cloud Manager2 supports the backing up and restoring of MongoDB deployments.
MongoDB Cloud Manager continually backs up MongoDB replica sets and sharded clusters by reading the oplog data
from your MongoDB deployment.
MongoDB Cloud Manager Backup offers point in time recovery of MongoDB replica sets and a consistent snapshot
of sharded clusters.
MongoDB Cloud Manager achieves point in time recovery by storing oplog data so that it can create a restore for
any moment in time in the last 24 hours for a particular replica set or sharded cluster. Sharded cluster snapshots are
difficult to achieve with other MongoDB backup methods.
To restore a MongoDB deployment from a MongoDB Cloud Manager Backup snapshot, you download a compressed
archive of your MongoDB data files and distribute those files before restarting the mongod processes.
To get started with MongoDB Cloud Manager Backup, sign up for MongoDB Cloud Manager3 . For documentation
on MongoDB Cloud Manager, see the MongoDB Cloud Manager documentation4 .
2 https://cloud.mongodb.com/?jmp=docs
3 https://cloud.mongodb.com/?jmp=docs
4 https://docs.cloud.mongodb.com/
MongoDB Subscribers can install and run the same core software that powers MongoDB Cloud Manager Backup
(page 202) on their own infrastructure. Ops Manager, an on-premise solution, has similar functionality to the cloud
version and is available with Enterprise Advanced subscriptions.
For more information about Ops Manager, see the MongoDB Enterprise Advanced5 page and the Ops Manager
Manual6.
Further Reading
Backup and Restore with Filesystem Snapshots (page 266) An outline of procedures for creating MongoDB data set
backups using a system-level file snapshot tool, such as LVM or a native storage appliance tool.
Restore a Replica Set from MongoDB Backups (page 270) Describes the procedure for restoring a replica set from an
archived backup such as a mongodump or MongoDB Cloud Manager7 Backup file.
Back Up and Restore with MongoDB Tools (page 272) Describes a procedure for exporting the contents of a
database to either a binary dump or a textual exchange format, and for importing these files into a database.
Backup and Restore Sharded Clusters (page 277) Detailed procedures and considerations for backing up sharded
clusters and single shards.
Recover Data after an Unexpected Shutdown (page 289) Recover data from MongoDB data files that were not
properly closed or have an invalid state.
Additional Resources
On this page
Monitoring Strategies (page 204)
MongoDB Reporting Tools (page 204)
Process Logging (page 207)
Diagnosing Performance Issues (page 208)
Replication and Monitoring (page 208)
Sharding and Monitoring (page 208)
Additional Resources (page 209)
5 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
6 https://docs.opsmanager.mongodb.com/current/
7 https://cloud.mongodb.com/?jmp=docs
8 https://www.mongodb.com/lp/white-paper/backup-disaster-recovery?jmp=docs
9 http://www.mongodb.com/blog/post/backup-vs-replication-why-do-you-need-both?jmp=docs
10 https://www.mongodb.com/products/consulting?jmp=docs#s_product_readiness
Monitoring is a critical component of all database administration. A firm grasp of MongoDB's reporting will allow you
to assess the state of your database and maintain your deployment without crisis. Additionally, a sense of MongoDB's
normal operational parameters will allow you to diagnose problems before they escalate to failures.
This document presents an overview of the available monitoring utilities and the reporting statistics available in
MongoDB. It also introduces diagnostic strategies and suggestions for monitoring replica sets and sharded clusters.
Note: MongoDB Cloud Manager11 , a hosted service, and Ops Manager12 , an on-premise solution, provide
monitoring, backup, and automation of MongoDB instances. See the MongoDB Cloud Manager documentation13 and Ops
Manager documentation14 for more information.
Monitoring Strategies
There are three methods for collecting data about the state of a running MongoDB instance:
First, there is a set of utilities distributed with MongoDB that provides real-time reporting of database activities.
Second, database commands return statistics regarding the current database state with greater fidelity.
Third, MongoDB Cloud Manager15 , a hosted service, and Ops Manager, an on-premise solution available in
MongoDB Enterprise Advanced16 , provide monitoring to collect data from running MongoDB deployments as
well as providing visualization and alerts based on that data.
Each strategy can help answer different questions and is useful in different contexts. These methods are
complementary.
This section provides an overview of the reporting methods distributed with MongoDB. It also offers examples of the
kinds of questions that each method is best suited to help you address.
Utilities The MongoDB distribution includes a number of utilities that quickly return statistics about an instance's
performance and activity. Typically, these are most useful for diagnosing issues and assessing normal operation.
mongostat mongostat captures and returns the counts of database operations by type (e.g. insert, query, update,
delete, etc.). These counts report on the load distribution on the server.
Use mongostat to understand the distribution of operation types and to inform capacity planning. See the
mongostat manual for details.
mongotop mongotop tracks and reports the current read and write activity of a MongoDB instance, and reports
these statistics on a per collection basis.
Use mongotop to check if your database activity and use match your expectations. See the mongotop manual
for details.
11 https://cloud.mongodb.com/?jmp=docs
12 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
13 https://docs.cloud.mongodb.com/
14 https://docs.opsmanager.mongodb.com?jmp=docs
15 https://cloud.mongodb.com/?jmp=docs
16 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
HTTP Console Deprecated since version 3.2: HTTP interface for MongoDB
MongoDB provides a web interface that exposes diagnostic and monitoring information in a simple web page. The
web interface is accessible at localhost:<port>, where the <port> number is 1000 more than the mongod
port.
For example, if a locally running mongod is using the default port 27017, access the HTTP console at
http://localhost:28017.
Commands MongoDB includes a number of commands that report on the state of the database.
These data may provide a finer level of granularity than the utilities discussed above. Consider using their output
in scripts and programs to develop custom alerts, or to modify the behavior of your application in response to the
activity of your instance. The db.currentOp method is another useful tool for identifying the database instance's
in-progress operations.
serverStatus The serverStatus command, or db.serverStatus() from the shell, returns a general
overview of the status of the database, detailing disk usage, memory use, connection, journaling, and index access.
The command returns quickly and does not impact MongoDB performance.
serverStatus outputs an account of the state of a MongoDB instance. This command is rarely run directly. In
most cases, the data is more meaningful when aggregated, as one would see with monitoring tools including MongoDB
Cloud Manager17 and Ops Manager18 . Nevertheless, all administrators should be familiar with the data provided by
serverStatus.
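As one illustration of using this output in a script to build a custom alert, the sketch below checks the connections section of a serverStatus document. The sample document and its numbers are made up, and connectionAlerts and its minAvailable threshold are hypothetical names. In the mongo shell, the document would come from db.serverStatus().

```javascript
// Hypothetical excerpt of a serverStatus document; the numbers are made up.
const status = {
  connections: { current: 812, available: 7 }
};

// Return alert messages when few connections remain available.
function connectionAlerts(serverStatus, minAvailable) {
  const alerts = [];
  const { current, available } = serverStatus.connections;
  if (available < minAvailable) {
    alerts.push(
      "only " + available + " connections available (" + current + " in use)"
    );
  }
  return alerts;
}

console.log(connectionAlerts(status, 50));
```

A real check would usually track these values over time rather than react to a single sample, which is the kind of aggregation the monitoring services above provide.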
dbStats The dbStats command, or db.stats() from the shell, returns a document that addresses storage use
and data volumes. The dbStats reflect the amount of storage used, the quantity of data contained in the database,
and object, collection, and index counters.
Use this data to monitor the state and storage capacity of a specific database. This output also allows you to compare
use between databases and to determine the average document size in a database.
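The average document size can be derived from two dbStats fields, dataSize and objects. The sample document below is made up for illustration and averageDocumentSize is a hypothetical helper. In the mongo shell, the document would come from db.stats(), which also reports this value directly in its avgObjSize field.

```javascript
// Hypothetical excerpt of a dbStats document; the numbers are made up.
const stats = { db: "test", objects: 2000, dataSize: 512000 };

// Average document size in bytes: total data size divided by document count.
function averageDocumentSize(dbStats) {
  if (dbStats.objects === 0) return 0;
  return dbStats.dataSize / dbStats.objects;
}

console.log(averageDocumentSize(stats)); // prints 256
```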
collStats The collStats command, or db.collection.stats() from the shell, provides statistics that resemble
dbStats at the collection level, including a count of the objects in the collection, the size of the collection, the
amount of disk space used by the collection, and information about its indexes.
Third Party Tools A number of third party monitoring tools have support for MongoDB, either directly, or through
their own plugins.
17 https://cloud.mongodb.com/?jmp=docs
18 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
Self Hosted Monitoring Tools These are monitoring tools that you must install, configure and maintain on your
own servers. Most are open source.
Tool      Plugin                       Description
Ganglia19 mongodb-ganglia20            Python script to report operations per second, memory usage, btree statistics,
                                       master/slave status and current connections.
Ganglia   gmond_python_modules21       Parses output from the serverStatus and replSetGetStatus commands.
Motop22   None                         Realtime monitoring tool for MongoDB servers. Shows current operations
                                       ordered by durations every second.
mtop23    None                         A top like tool.
Munin24   mongo-munin25                Retrieves server statistics.
Munin     mongomon26                   Retrieves collection statistics (sizes, index sizes, and each (configured)
                                       collection count for one DB).
Munin     munin-plugins Ubuntu PPA27   Some additional munin plugins not in the main distribution.
Nagios28  nagios-plugin-mongodb29      A simple Nagios check script, written in Python.
Also consider dex30 , an index and query analyzing tool for MongoDB that compares MongoDB log files and indexes
to make indexing recommendations.
See also:
Ops Manager, an on-premise solution available in MongoDB Enterprise Advanced31 .
Hosted (SaaS) Monitoring Tools These are monitoring tools provided as a hosted service, usually through a paid
subscription.
19 http://sourceforge.net/apps/trac/ganglia/wiki
20 https://github.com/quiiver/mongodb-ganglia
21 https://github.com/ganglia/gmond_python_modules
22 https://github.com/tart/motop
23 https://github.com/beaufour/mtop
24 http://munin-monitoring.org/
25 https://github.com/erh/mongo-munin
26 https://github.com/pcdummy/mongomon
27 https://launchpad.net/~chris-lea/+archive/munin-plugins
28 http://www.nagios.org/
29 https://github.com/mzupan/nagios-plugin-mongodb
30 https://github.com/mongolab/dex
31 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
Name                                    Notes
MongoDB Cloud Manager32                 MongoDB Cloud Manager is a cloud-based suite of services for managing
                                        MongoDB deployments. MongoDB Cloud Manager provides monitoring,
                                        backup, and automation functionality. For an on-premise solution, see also Ops
                                        Manager, available in MongoDB Enterprise Advanced33 .
Scout34                                 Several plugins, including MongoDB Monitoring35 , MongoDB Slow
                                        Queries36 , and MongoDB Replica Set Monitoring37 .
Server Density38                        Dashboard for MongoDB39 , MongoDB specific alerts, replication failover
                                        timeline and iPhone, iPad and Android mobile apps.
Application Performance Management40    IBM has an Application Performance Management SaaS offering that includes
                                        monitor for MongoDB and other applications and middleware.
New Relic41                             New Relic offers full support for application performance management. In
                                        addition, New Relic Plugins and Insights enable you to view monitoring
                                        metrics from Cloud Manager in New Relic.
Datadog42                               Infrastructure monitoring43 to visualize the performance of your MongoDB
                                        deployments.
Process Logging
During normal operation, mongod and mongos instances report a live account of all server activity and operations to
either standard output or a log file. The following runtime settings control these options.
quiet. Limits the amount of information written to the log or output.
verbosity. Increases the amount of information written to the log or output. You can also modify the logging
verbosity during runtime with the logLevel parameter or the db.setLogLevel() method in the shell.
path. Enables logging to a file, rather than the standard output. You must specify the full path to the log file
when adjusting this setting.
logAppend. Adds information to a log file instead of overwriting the file.
Note: You can specify these configuration options as command line arguments to mongod or mongos.
For example:
mongod -v --logpath /var/log/mongodb/server1.log --logappend
Starts a mongod instance in verbose mode, appending data to the log file at
/var/log/mongodb/server1.log.
As you develop and operate applications with MongoDB, you may want to analyze the performance of the application
and its database. Analyzing MongoDB Performance (page 232) discusses some of the operational factors that can
influence performance.
Beyond the basic monitoring requirements for any MongoDB instance, for replica sets, administrators must monitor
replication lag. Replication lag refers to the amount of time that it takes to copy (i.e. replicate) a write operation
on the primary to a secondary. Some small delay period may be acceptable, but two significant problems emerge as
replication lag grows:
First, operations that occurred during the period of lag are not replicated to one or more secondaries. If you're
using replication to ensure data persistence, exceptionally long delays may impact the integrity of your data set.
Second, if the replication lag exceeds the length of the operation log (oplog) then MongoDB will have to perform
an initial sync on the secondary, copying all data from the primary and rebuilding all indexes. This is uncommon
under normal circumstances, but if you configure the oplog to be smaller than the default, the issue can arise.
Note: The size of the oplog is only configurable during the first run using the --oplogSize argument to the
mongod command, or preferably, the oplogSizeMB setting in the MongoDB configuration file. If you do not
specify this on the command line before running with the --replSet option, mongod will create a default
sized oplog.
By default, the oplog is 5 percent of total available disk space on 64-bit systems. For more information about
changing the oplog size, see Change the Size of the Oplog (page 684).
The replSetGetStatus reference provides a more in-depth overview of this output. In general, watch the
value of optimeDate, and pay particular attention to the time difference between the primary and the secondary
members.
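The comparison of optimeDate values described above can be sketched in a few lines of Python. This is a minimal illustration, not an official tool: it assumes the status document has already been fetched and is shaped like replSetGetStatus output (a "members" list with "name", "stateStr", and "optimeDate" fields), and the host names and timestamps are hypothetical sample data.

```python
from datetime import datetime, timezone


def replication_lag_seconds(status):
    """Return {member name: lag in seconds} for every secondary.

    `status` is assumed to be a dict shaped like replSetGetStatus output.
    Raises StopIteration if no member reports the PRIMARY state.
    """
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hypothetical sample data: host names and timestamps are made up.
sample_status = {
    "members": [
        {"name": "db0.example.net:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2016, 3, 1, 12, 0, 30, tzinfo=timezone.utc)},
        {"name": "db1.example.net:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2016, 3, 1, 12, 0, 28, tzinfo=timezone.utc)},
        {"name": "db2.example.net:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2016, 3, 1, 12, 0, 0, tzinfo=timezone.utc)},
    ]
}

lag = replication_lag_seconds(sample_status)
for name, seconds in sorted(lag.items()):
    print(name, seconds)
```

A monitoring job could compare each value against a threshold chosen for the deployment; the threshold itself is a judgment call that this sketch does not attempt to make.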
In most cases, the components of sharded clusters benefit from the same monitoring and analysis as all other MongoDB
instances. In addition, clusters require further monitoring to ensure that data is effectively distributed among nodes
and that sharding operations are functioning appropriately.
See also:
See the Sharding Concepts (page 731) documentation for more information.
Config Servers The config database maintains a map identifying which documents are on which shards. The cluster
updates this map as chunks move between shards. When a configuration server becomes inaccessible, certain sharding
operations become unavailable, such as moving chunks and starting mongos instances. However, clusters remain
accessible from already-running mongos instances.
Because inaccessible configuration servers can seriously impact the availability of a sharded cluster, you should monitor
your configuration servers to ensure that the cluster remains well balanced and that mongos instances can restart.
MongoDB Cloud Manager44 and Ops Manager45 monitor config servers and can create notifications if a config server
becomes inaccessible. See the MongoDB Cloud Manager documentation46 and Ops Manager documentation47 for
more information.
Balancing and Chunk Distribution The most effective sharded cluster deployments evenly balance chunks among
the shards. To facilitate this, MongoDB has a background balancer process that distributes data to ensure that chunks
are always optimally distributed among the shards.
Issue the db.printShardingStatus() or sh.status() command to the mongos by way of the mongo
shell. This returns an overview of the entire cluster including the database name, and a list of the chunks.
Stale Locks In nearly every case, all locks used by the balancer are automatically released when they become stale.
However, because any long-lasting lock can block future balancing, it's important to ensure that all locks are legitimate.
To check the lock status of the database, connect to a mongos instance using the mongo shell. Issue the following
command sequence to switch to the config database and display all outstanding locks on the shard database:
use config
db.locks.find()
For active deployments, the above query can provide insights. The balancing process, which originates on a randomly
selected mongos, takes a special balancer lock that prevents other balancing activity from transpiring. Use the
following command, also to the config database, to check the status of the balancer lock.
db.locks.find( { _id : "balancer" } )
If this lock exists, make sure that the balancer process is actively using this lock.
Additional Resources
On this page
Configure the Database (page 210)
Security Considerations (page 211)
Replication and Sharding Configuration (page 212)
Run Multiple Database Instances on the Same System (page 213)
Diagnostic Configurations (page 214)
The command line and configuration file interfaces provide MongoDB administrators with a large number
of options and settings for controlling the operation of the database system. This document provides an overview
of common configurations and examples of best-practice configurations for common use cases.
44 https://cloud.mongodb.com/?jmp=docs
45 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
46 https://docs.cloud.mongodb.com/
47 https://docs.opsmanager.mongodb.com/current/application
48 https://www.mongodb.com/products/consulting?jmp=docs#s_product_readiness
While both interfaces provide access to the same collection of options and settings, this document primarily uses
the configuration file interface. If you run MongoDB using an init script or if you installed from a package for your
operating system, you likely already have a configuration file located at /etc/mongod.conf. Confirm this by
checking the contents of the /etc/init.d/mongod or /etc/rc.d/mongod script to ensure that the init scripts
start the mongod with the appropriate configuration file.
To start a MongoDB instance using this configuration file, issue a command in one of the following forms:
mongod --config /etc/mongod.conf
mongod -f /etc/mongod.conf
Modify the values in the /etc/mongod.conf file on your system to control the configuration of your database
instance.
Consider the following basic configuration which uses the YAML format:
processManagement:
   fork: true
net:
   bindIp: 127.0.0.1
   port: 27017
storage:
   dbPath: /srv/mongodb
   journal:
      enabled: true
systemLog:
   destination: file
   path: "/var/log/mongodb/mongod.log"
   logAppend: true
For most standalone servers, this is a sufficient base configuration. It makes several assumptions, but consider the
following explanation:
fork is true, which enables a daemon mode for mongod, which detaches (i.e. forks) the MongoDB process from
the current session and allows you to run the database as a conventional server.
bindIp is 127.0.0.1, which forces the server to only listen for requests on the localhost IP. Only bind to
secure interfaces that the application-level systems can access with access control provided by system network
filtering (i.e. firewall).
New in version 2.6: mongod installed from official .deb (page 20) and .rpm (page 7) packages have the
bind_ip configuration set to 127.0.0.1 by default.
port is 27017, which is the default MongoDB port for database instances. MongoDB can bind to any port.
You can also filter access based on port using network filtering tools.
Note: UNIX-like systems require superuser privileges to attach processes to ports lower than 1024.
systemLog.quiet is not set in the example above. When true, it disables all but the most critical entries in the
output/log file, and is not recommended for production systems. If you do set this option, you can use setParameter
to modify this setting during run time.
dbPath is /srv/mongodb, which specifies where MongoDB will store its data files. /srv/mongodb and
/var/lib/mongodb are popular locations. The user account that mongod runs under will need read and
write access to this directory.
systemLog.path is /var/log/mongodb/mongod.log which is where mongod will write its output.
If you do not set this value, mongod writes all output to standard output (i.e. stdout).
logAppend is true, which ensures that mongod does not overwrite an existing log file following the server
start operation.
storage.journal.enabled is true, which enables journaling. Journaling ensures single instance write-
durability. 64-bit builds of mongod enable journaling by default. Thus, this setting may be redundant.
Given the default configuration, some of these values may be redundant. However, in many situations explicitly stating
the configuration increases overall system intelligibility.
Security Considerations
The following configuration options are useful for limiting access to a mongod instance. Consider the
following settings:
In YAML format
security:
   authorization: enabled
net:
   bindIp: 127.0.0.1,10.8.0.10,192.168.4.24
Replication Configuration Replica set configuration is straightforward, and only requires that the replSetName
have a value that is consistent among all members of the set. Consider the following:
In YAML format
replication:
   replSetName: set0
Use descriptive names for sets. Once configured, use the mongo shell to add hosts to the replica set.
See also:
Replica set reconfiguration.
To enable authentication for the replica set, add the following keyFile option:
In YAML format
security:
   keyFile: /srv/mongodb/keyfile
Setting keyFile enables authentication and specifies a key file that the replica set members use when authenticating
to each other. The content of the key file is arbitrary, but must be the same on all members of the replica set and
mongos instances that connect to the set. The key file must be less than one kilobyte in size, may only contain
characters in the base64 set, and must not have group or world permissions on UNIX systems.
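As an illustration of the constraints above, the following Python sketch writes a key file that is under one kilobyte, contains only base64 characters, and has no group or world permissions. The function name write_keyfile and the choice of 741 random bytes (which encode to 988 base64 characters) are assumptions made for this example, not values taken from this document.

```python
import base64
import os


def write_keyfile(path, random_bytes=741):
    """Write a random base64 key to `path` with owner-only permissions.

    741 random bytes is an assumed default; it encodes to 988 characters,
    which stays below the one-kilobyte limit described in the text.
    """
    key = base64.b64encode(os.urandom(random_bytes))
    if len(key) >= 1024:
        raise ValueError("key file content must be under one kilobyte")
    # Create the file with mode 0600 so it never has group or world access.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(key)
    # Also tighten permissions if the file already existed with a wider mode.
    os.chmod(path, 0o600)
    return key
```

Copy the resulting file unchanged to every member of the replica set, since the content must be identical on all of them.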
See also:
The Replica Set Security (page 329) section for information on configuring authentication with replica sets.
The Replication (page 613) document for more information on replication in MongoDB and replica set configuration
in general.
Sharding Configuration Sharding requires a number of mongod instances with different configurations. The config
servers store the cluster's metadata, while the cluster distributes data among one or more shard servers.
To set up one or three config server instances, configure them as normal (page 210) mongod instances, and then add
the following configuration options:
In YAML format
sharding:
   clusterRole: configsvr
net:
   bindIp: 10.8.0.12
   port: 27001
In the older configuration file format:
configsvr = true
bind_ip = 10.8.0.12
port = 27001
This creates a config server running on the private IP address 10.8.0.12 on port 27001. Make sure that there are
no port conflicts, and that your config server is accessible from all of your mongos and mongod instances.
To set up shards, configure two or more mongod instances using your base configuration (page 210), with the
shardsvr value for the sharding.clusterRole setting:
sharding:
   clusterRole: shardsvr
Finally, to establish the cluster, configure at least one mongos process with the following settings:
In YAML format:
sharding:
   configDB: 10.8.0.12:27001
   chunkSize: 64
You can specify multiple configDB instances by specifying hostnames and ports in the form of a comma-separated
list.
In general, avoid modifying the chunkSize from the default value of 64 (see footnote 55), and ensure this setting is
consistent among all mongos instances.
See also:
The Sharding (page 725) section of the manual for more information on sharding and cluster configuration.
In many cases running multiple instances of mongod on a single system is not recommended. On some types of
deployments (see footnote 56) and for testing purposes you may need to run more than one mongod on a single system.
In these cases, use a base configuration (page 210) for each instance, but consider the following configuration values:
In YAML format:
55 Chunk size is 64 megabytes by default, which provides the ideal balance between the most even distribution of data, for which smaller chunk
sizes are best, and minimizing chunk migration, for which larger chunk sizes are optimal.
56 Single-tenant systems with SSD or other high performance disks may provide acceptable performance levels for multiple mongod instances.
Additionally, you may find that multiple databases with small working sets may function acceptably on a single system.
storage:
   dbPath: /srv/mongodb/db0/
processManagement:
   pidFilePath: /srv/mongodb/db0.pid
The dbPath value controls the location of the mongod instance's data directory. Ensure that each database has a
distinct and well-labeled data directory. The pidFilePath setting controls where the mongod process places its process ID
file. Because this file tracks a specific mongod process, it is crucial that the file be unique and well labeled to make it
easy to start and stop these processes.
Create additional init scripts and/or adjust your existing MongoDB configuration and init script as needed to control
these processes.
Diagnostic Configurations
The following configuration options control various mongod behaviors for diagnostic purposes:
operationProfiling.mode sets the database profiler (page 234) level. The profiler is not active by
default because of the possible impact of the profiler itself on performance. Unless this setting is on, queries
are not profiled.
operationProfiling.slowOpThresholdMs configures the threshold which determines whether a
query is slow for the purpose of the logging system and the profiler (page 234). The default value is 100
milliseconds. Set a lower value if the database profiler does not return useful results or a higher value to only
log the longest running queries.
systemLog.verbosity controls the amount of logging output that mongod writes to the log. Only use this
option if you are experiencing an issue that is not reflected in the normal logging level.
Changed in version 3.0: You can also specify verbosity level for specific components using the
systemLog.component.<name>.verbosity setting. For the available components, see component
verbosity settings.
For more information, see also Database Profiling (page 234) and Analyzing MongoDB Performance (page 232).
Production Notes
On this page
MongoDB Binaries (page 215)
MongoDB dbPath (page 216)
Concurrency (page 216)
Data Consistency (page 217)
Networking (page 217)
Hardware Considerations (page 218)
Architecture (page 221)
Compression (page 221)
Platform Specific Considerations (page 221)
Performance Monitoring (page 225)
Backups (page 225)
Additional Resources (page 225)
This page details system configurations that affect MongoDB, especially in production.
Note: MongoDB Cloud Manager58, a hosted service, and Ops Manager59, an on-premise solution, provide monitoring,
backup, and automation of MongoDB instances. See the MongoDB Cloud Manager documentation60 and Ops
Manager documentation61 for more information.
MongoDB Binaries
Recommended Platforms We recommend the following operating systems for production use:
Amazon Linux
Debian 7.1
Red Hat / CentOS 6.2+
SLES 11+
Ubuntu LTS 12.04
58 https://cloud.mongodb.com/?jmp=docs
59 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
60 https://docs.cloud.mongodb.com/
61 https://docs.opsmanager.mongodb.com?jmp=docs
Use the Latest Stable Packages Be sure you have the latest stable release.
All releases are available on the Downloads62 page. The Downloads63 page is a good place to verify the current stable
release, even if you are installing via a package manager.
Note: Starting in MongoDB 3.2, 32-bit binaries are deprecated and will be unavailable in future releases.
MongoDB dbPath
Changed in version 3.2: As of MongoDB 3.2, MongoDB uses the WiredTiger (page 587) storage engine by default.
Changed in version 3.0: MongoDB includes support for two storage engines: MMAPv1 (page 595), the storage engine
available in previous versions of MongoDB, and WiredTiger (page 587).
The files in the dbPath directory must correspond to the configured storage engine. mongod will not start if dbPath
contains data files created by a storage engine other than the one specified by --storageEngine.
Concurrency
MMAPv1 Changed in version 3.0: Beginning with MongoDB 3.0, MMAPv1 (page 595) provides collection-level
locking: All collections have a unique readers-writer lock that allows multiple clients to modify documents in different
collections at the same time.
For MongoDB versions 2.2 through 2.6, each database has a readers-writer lock that allows concurrent read access
to a database, but gives exclusive access to a single write operation per database. See the Concurrency (page 835)
page for more information. In earlier versions of MongoDB, all write operations contended for a single readers-writer
lock for the entire mongod instance.
WiredTiger WiredTiger (page 587) supports concurrent access by readers and writers to the documents in a collection.
Clients can read documents while write operations are in progress, and multiple threads can modify different
documents in a collection at the same time.
See also:
Allocate Sufficient RAM and CPU (page 218)
62 http://www.mongodb.org/downloads
63 http://www.mongodb.org/downloads
Data Consistency
Journaling MongoDB uses write-ahead logging to an on-disk journal. Journaling guarantees that MongoDB can
quickly recover write operations (page 77) that were written to the journal but not written to data files in cases where
mongod terminated as a result of a crash or other serious failure.
Leave journaling enabled in order to ensure that mongod will be able to recover its data files and keep the data files
in a valid state following a crash. See Journaling (page 598) for more information.
Write Concern Write concern (page 141) describes the level of acknowledgement requested from MongoDB for
write operations. The level of the write concern affects how quickly the write operation returns. When write operations
have a weak write concern, they return quickly. With stronger write concerns, clients must wait after sending
a write operation until MongoDB confirms the write operation at the requested write concern level. With insufficient
write concerns, write operations may appear to a client to have succeeded, but may not persist in some cases of server
failure.
See the Write Concern (page 141) document for more information about choosing an appropriate write concern level
for your deployment.
Networking
Use Trusted Networking Environments Always run MongoDB in a trusted environment, with network rules that
prevent access from all unknown machines, systems, and networks. As with any sensitive system that is dependent on
network access, your MongoDB deployment should only be accessible to specific systems that require access, such as
application servers, monitoring services, and other MongoDB components.
Note: By default, authorization (page 331) is not enabled, and mongod assumes a trusted environment. Enable
authorization mode as needed. For more information on authentication mechanisms supported in MongoDB as
well as authorization in MongoDB, see Authentication (page 317) and Role-Based Access Control (page 331).
For additional information and considerations on security, refer to the documents in the Security Section (page 315),
specifically:
Security Checklist (page 315)
MongoDB Configuration Hardening (page 341)
Hardening Network Infrastructure (page 343)
Network Security Tutorials (page 382)
For Windows users, consider the Windows Server Technet Article on TCP Configuration64 when deploying MongoDB
on Windows.
64 http://technet.microsoft.com/en-us/library/dd349797.aspx
Disable HTTP Interface MongoDB provides an HTTP interface to check the status of the server and, optionally,
run queries. The HTTP interface is disabled by default. Do not enable the HTTP interface in production environments.
Deprecated since version 3.2: HTTP interface for MongoDB
See HTTP Status Interface (page 342).
Manage Connection Pool Sizes To avoid overloading the connection resources of a single mongod or mongos
instance, ensure that clients maintain reasonable connection pool sizes. Adjust the connection pool size to suit your
use case, beginning at 110-115% of the typical number of concurrent database requests.
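The starting range above is simple arithmetic, sketched here in Python. The function name pool_size_range is hypothetical, and rounding up to whole connections is an assumption of this example, since the text does not say how to round.

```python
def pool_size_range(typical_concurrent_requests):
    """Return (low, high): 110% and 115% of the typical load, rounded up."""
    if typical_concurrent_requests < 0:
        raise ValueError("request count must not be negative")
    # Integer arithmetic avoids floating-point rounding surprises.
    low = (typical_concurrent_requests * 110 + 99) // 100
    high = (typical_concurrent_requests * 115 + 99) // 100
    return low, high


low, high = pool_size_range(200)
print("start with a pool of", low, "to", high, "connections")
```

Treat the result as a starting point only and adjust it after observing the deployment under real load.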
The connPoolStats command returns information regarding the number of open connections to the current
database for mongos and mongod instances in sharded clusters.
See also Allocate Sufficient RAM and CPU (page 218).
Hardware Considerations
MongoDB is designed specifically with commodity hardware in mind and has few hardware requirements or limitations.
MongoDB's core components run on little-endian hardware, primarily x86/x86_64 processors. Client libraries
(i.e. drivers) can run on big or little endian systems.
MMAPv1 Due to its concurrency model, the MMAPv1 storage engine does not require many CPU cores. As such,
increasing the number of cores can help but does not provide significant return.
Increasing the amount of RAM accessible to MongoDB may help reduce the frequency of page faults.
WiredTiger The WiredTiger storage engine is multithreaded and can take advantage of many CPU cores. Specifically,
the total number of active threads (i.e. concurrent operations) relative to the number of CPUs can impact
performance:
Throughput increases as the number of concurrent active operations increases up to the number of CPUs.
Throughput decreases as the number of concurrent active operations exceeds the number of CPUs by some
threshold amount.
The threshold amount depends on your application. You can determine the optimum number of concurrent active
operations for your application by experimenting and measuring throughput. The output from mongostat provides
statistics on the number of active reads/writes in the (ar|aw) column.
With WiredTiger, MongoDB utilizes both the WiredTiger cache and the filesystem cache.
Changed in version 3.2: Starting in MongoDB 3.2, the WiredTiger cache, by default, will use the larger of either:
60% of RAM minus 1 GB, or
1 GB.
For systems with up to 10 GB of RAM, the new default setting is less than or equal to the 3.0 default setting (For
MongoDB 3.0, the WiredTiger cache uses either 1 GB or half of the installed physical RAM, whichever is larger).
For systems with more than 10 GB of RAM, the new default setting is greater than the 3.0 setting.
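To make the comparison between the 3.2 and 3.0 defaults concrete, the following Python sketch evaluates both rules as stated above for a few RAM sizes. The function names are hypothetical, and the sketch only restates the arithmetic in the text; it does not query a running mongod.

```python
def default_cache_gb_3_2(ram_gb):
    """MongoDB 3.2 rule: the larger of (60% of RAM minus 1 GB) and 1 GB."""
    return max(ram_gb * 60 / 100 - 1, 1)


def default_cache_gb_3_0(ram_gb):
    """MongoDB 3.0 rule: the larger of half of RAM and 1 GB."""
    return max(ram_gb / 2, 1)


# Print RAM size, 3.2 default, and 3.0 default side by side.
for ram_gb in (2, 4, 10, 16, 64):
    print(ram_gb, default_cache_gb_3_2(ram_gb), default_cache_gb_3_0(ram_gb))
```

The two rules give the same value at 10 GB of RAM, which is why that size marks the crossover described above.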
Via the filesystem cache, MongoDB automatically uses all free memory that is not used by the WiredTiger cache or
by other processes. Data in the filesystem cache is compressed.
The default WiredTiger cache size value assumes that there is a single mongod instance per node. If a single node
contains multiple instances, then you should decrease the setting to accommodate the other mongod instances.
If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM
available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less
than the amount of RAM available in the container. The exact amount depends on the other processes running in the
container.
To view statistics on the cache and eviction rate, see the wiredTiger.cache field returned from the
serverStatus command.
See also:
Concurrency (page 216)
Use Solid State Disks (SSDs) MongoDB has good results and a good price-performance ratio with SATA SSD
(Solid State Disk).
Use SSD if available and economical. Spinning disks can be performant, but SSDs' capacity for random I/O operations
works well with the update model of MMAPv1.
Commodity (SATA) spinning drives are often a good option, as the random I/O performance increase with more
expensive spinning drives is not that dramatic (only on the order of 2x). Using SSDs or increasing RAM may be more
effective in increasing I/O throughput.
MongoDB and NUMA Hardware Running MongoDB on a system with Non-Uniform Memory Access (NUMA)
can cause a number of operational problems, including slow performance for periods of time and high system process
usage.
When running MongoDB servers and clients on NUMA hardware, you should configure a memory interleave policy so
that the host behaves in a non-NUMA fashion. MongoDB checks NUMA settings on start up when deployed on Linux
(since version 2.0) and Windows (since version 2.6) machines. If the NUMA configuration may degrade performance,
MongoDB prints a warning.
See also:
The MySQL swap insanity problem and the effects of NUMA65 post, which describes the effects of NUMA
on databases. The post introduces NUMA and its goals, and illustrates how these goals are not compatible
with production databases. Although the blog post addresses the impact of NUMA for MySQL, the issues for
MongoDB are similar.
NUMA: An Overview66 .
65 http://jcole.us/blog/archives/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/
66 https://queue.acm.org/detail.cfm?id=2513149
Configuring NUMA on Windows On Windows, memory interleaving must be enabled through the machine's
BIOS. Please consult your system documentation for details.
Configuring NUMA on Linux When running MongoDB on Linux, you may instead use the numactl command
and start the MongoDB programs (mongod, including the config servers (page 734); mongos; or clients) in the
following manner:
numactl --interleave=all <path>
where <path> is the path to the program you are starting. Then, disable zone reclaim in the proc settings using the
following command:
echo 0 > /proc/sys/vm/zone_reclaim_mode
To fully disable NUMA behavior, you must perform both operations. For more information, see the Documentation
for /proc/sys/vm/*67 .
Swap Assign swap space for your systems. Allocating swap space can avoid issues with memory contention and
can prevent the OOM Killer on Linux systems from killing mongod.
For the MMAPv1 storage engine, the method mongod uses to map files to memory ensures that the operating system
will never store MongoDB data in swap space. On Windows systems, using MMAPv1 requires extra swap space due
to commitment limits. For details, see MongoDB on Windows (page 223).
For the WiredTiger storage engine, given sufficient memory pressure, WiredTiger may store data in swap space.
Remote Filesystems With the MMAPv1 storage engine, the Network File System protocol (NFS) is not recommended
as you may see performance problems when both the data files and the journal files are hosted on NFS. You
may experience better performance if you place the journal on local or iSCSI volumes.
With the WiredTiger storage engine, WiredTiger objects may be stored on remote file systems if the remote file system
conforms to ISO/IEC 9945-1:1996 (POSIX.1). Because remote file systems are often slower than local file systems,
using a remote file system for storage may degrade performance.
If you decide to use NFS, add the following NFS options to your /etc/fstab file: bg, nolock, and noatime.
Separate Components onto Different Storage Devices For improved performance, consider separating your
database's data, journal, and logs onto different storage devices, based on your application's access and write
pattern.
For the WiredTiger storage engine, you can also store the indexes on a different storage device. See
storage.wiredTiger.engineConfig.directoryForIndexes.
Note: Using different storage devices will affect your ability to create snapshot-style backups of your data, since the
67 http://www.kernel.org/doc/Documentation/sysctl/vm.txt
Scheduling for Virtual Devices Local block devices attached to virtual machine instances via the hypervisor should
use a noop scheduler for best performance. The noop scheduler allows the operating system to defer I/O scheduling to
the underlying hypervisor.
Architecture
Replica Sets See the Replica Set Architectures (page 626) document for an overview of architectural considerations
for replica set deployments.
Sharded Clusters See the Sharded Cluster Production Architecture (page 737) document for an overview of rec-
ommended sharded cluster architectures for production deployments.
See also:
Design Notes (page 238)
Compression
WiredTiger can compress collection data using either the snappy or the zlib compression library. snappy provides a lower
compression rate but has little performance cost, whereas zlib provides a better compression rate but has a higher
performance cost.
By default, WiredTiger uses the snappy compression library. To change the compression setting, see
storage.wiredTiger.collectionConfig.blockCompressor.
WiredTiger uses prefix compression on all indexes by default.
Note: MongoDB uses the GNU C Library68 (glibc) if available on a system. MongoDB requires at least version
glibc-2.12-1.2.el6 to avoid a known bug with earlier versions. For best results use at least version 2.13.
MongoDB on Linux
Kernel and File Systems When running MongoDB in production on Linux, it is recommended that you use Linux
kernel version 2.6.36 or later.
With the MMAPv1 storage engine, MongoDB preallocates its database files before using them and often creates large
files. As such, you should use the XFS or EXT4 file systems. If possible, use XFS as it generally performs better
with MongoDB.
With the WiredTiger storage engine, use of XFS is strongly recommended to avoid performance issues that have
been observed when using EXT4 with WiredTiger.
In general, if you use the XFS file system, use at least version 2.6.25 of the Linux Kernel.
In general, if you use the EXT4 file system, use at least version 2.6.23 of the Linux Kernel.
68 http://www.gnu.org/software/libc/
Some Linux distributions require different versions of the kernel to support using XFS and/or EXT4:
Linux Distribution                 Filesystem   Kernel Version
CentOS 5.5                         ext4, xfs    2.6.18-194.el5
CentOS 5.6                         ext4, xfs    2.6.18-3.0.el5
CentOS 5.8                         ext4, xfs    2.6.18-308.8.2.el5
CentOS 6.1                         ext4, xfs    2.6.32-131.0.15.el6.x86_64
RHEL 5.6                           ext4         2.6.18-3.0
RHEL 6.0                           xfs          2.6.32-71
Ubuntu 10.04.4 LTS                 ext4, xfs    2.6.32-38-server
Amazon Linux AMI release 2012.03   ext4         3.2.12-3.2.4.amzn1.x86_64
fsync() on Directories
Important: MongoDB requires a filesystem that supports fsync() on directories. For example, HGFS and
VirtualBox's shared folders do not support this operation.
Recommended Configuration For both the MMAPv1 and WiredTiger storage engines, consider the
following recommendations:
Turn off atime for the storage volume containing the database files.
Set the file descriptor limit, -n, and the user process limit (ulimit), -u, above 20,000, according to the
suggestions in the ulimit (page 295) document. A low ulimit will affect MongoDB when under heavy use and can
produce errors and lead to failed connections to MongoDB processes and loss of service.
Disable Transparent Huge Pages, as MongoDB performs better with normal (4096 bytes) virtual memory pages.
See Transparent Huge Pages Settings (page 241).
Disable NUMA in your BIOS. If that is not possible, see MongoDB on NUMA Hardware (page 219).
Configure SELinux on Red Hat. For more information, see Configure SELinux for MongoDB (page 9) and
Configure SELinux for MongoDB Enterprise (page 36).
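As a quick way to compare the limits in the list above against the 20,000 figure, the following Python sketch reads the soft limits behind ulimit -n and ulimit -u using the standard resource module (available on UNIX-like systems only). It reports the limits of the process running the script, which is an assumption worth noting: run it as the same user and from the same environment that starts mongod for the result to be meaningful. The function names are hypothetical.

```python
import resource

RECOMMENDED_MINIMUM = 20000  # threshold quoted in the recommendations above


def meets_recommendation(soft_limit, minimum=RECOMMENDED_MINIMUM):
    """True if a soft limit is unlimited or above the recommended minimum."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit > minimum


def check_limits():
    """Report the soft limits behind `ulimit -n` and `ulimit -u`."""
    report = {}
    for label, which in (("open files (-n)", resource.RLIMIT_NOFILE),
                         ("user processes (-u)", resource.RLIMIT_NPROC)):
        soft, _hard = resource.getrlimit(which)
        report[label] = (soft, meets_recommendation(soft))
    return report


if __name__ == "__main__":
    for label, (soft, ok) in check_limits().items():
        print(label, soft, "ok" if ok else "below recommendation")
```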
For the MMAPv1 storage engine:
Ensure that readahead settings for the block devices that store the database files are appropriate. For random
access use patterns, set low readahead values. A readahead of 32 (16 kB) often works well.
For a standard block device, you can run sudo blockdev --report to get the readahead settings and
sudo blockdev --setra <value> <device> to change the readahead settings. Refer to your specific
operating system manual for more information.
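The pairing of "32" with "16 kB" above follows from blockdev expressing readahead in 512-byte sectors. A small Python sketch of the conversion, with hypothetical helper names, may help when choosing a value for --setra:

```python
SECTOR_BYTES = 512  # blockdev --setra counts 512-byte sectors


def readahead_kib(sectors):
    """Convert a readahead value in sectors to kibibytes."""
    return sectors * SECTOR_BYTES / 1024


def sectors_for_kib(kib):
    """Convert a desired readahead in kibibytes to a --setra value."""
    return kib * 1024 // SECTOR_BYTES


print(readahead_kib(32))    # readahead of 32 sectors, in KiB
print(sectors_for_kib(16))  # sectors needed for 16 KiB of readahead
```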
For all MongoDB deployments:
Use the Network Time Protocol (NTP) to synchronize time among your hosts. This is especially important in
sharded clusters.
MongoDB and TLS/SSL Libraries On Linux platforms, you may observe one of the following statements in the
MongoDB log:
<path to SSL libs>/libssl.so.<version>: no version information available (required by /usr/bin/mongod
<path to SSL libs>/libcrypto.so.<version>: no version information available (required by /usr/bin/mon
These warnings indicate that the system's TLS/SSL libraries are different from the TLS/SSL libraries that the mongod
was compiled against. Typically these messages do not require intervention; however, you can use the following
operations to determine the symbol versions that mongod expects:
These operations will return output that resembles one of the following lines:
0000000000000000 DF *UND* 0000000000000000 libssl.so.10 SSL_write
0000000000000000 DF *UND* 0000000000000000 OPENSSL_1.0.0 SSL_write
The last two strings in this output are the symbol version and symbol name. Compare these values with the values
returned by the following operations to detect symbol version mismatches:
objdump -T <path to TLS/SSL libs>/libssl.so.1*
objdump -T <path to TLS/SSL libs>/libcrypto.so.1*
This procedure is neither exact nor exhaustive: many symbols used by mongod from the libcrypto library do not
begin with CRYPTO_.
MongoDB on Windows
Install Hotfix for MongoDB 2.6.6 and Later Microsoft has released a hotfix for Windows 7 and Windows Server
2008 R2, KB2731284 (see footnote 69), that repairs a bug in these operating systems' use of memory-mapped files that
adversely affects the performance of MongoDB using the MMAPv1 storage engine.
Install this hotfix to obtain significant performance improvements on MongoDB 2.6.6 and later releases in the 2.6
series, which use MMAPv1 exclusively, and on 3.0 and later when using MMAPv1 as the storage engine.
Configure Windows Page File For MMAPv1 Configure the page file such that the minimum and maximum page
file size are equal and at least 32 GB. Use a multiple of this size if, during peak usage, you expect concurrent writes to
many databases or collections. However, the page file size does not need to exceed the maximum size of the database.
A large page file is needed as Windows requires enough space to accommodate all regions of memory mapped files
made writable during peak usage, regardless of whether writes actually occur.
The page file is not used for database storage and will not receive writes during normal MongoDB operation. As such,
the page file will not affect performance, but it must exist and be large enough to accommodate Windows commitment
rules during peak database use.
Note: Dynamic page file sizing is too slow to accommodate the rapidly fluctuating commit charge of an active
MongoDB deployment. This can result in transient overcommitment situations that may lead to abrupt server shutdown
with a VirtualProtect error 1455.
MongoDB 3.0 Using WiredTiger For MongoDB instances using the WiredTiger storage engine, performance on
Windows is comparable to performance on Linux.
MongoDB on Virtual Environments This section describes considerations when running MongoDB in some of the
more common virtual environments.
For all platforms, consider Scheduling for Virtual Devices (page 221).
69 http://support.microsoft.com/kb/2731284
EC2 MongoDB is compatible with EC2. MongoDB Cloud Manager [70] provides integration with Amazon Web
Services (AWS) and lets you deploy new EC2 instances directly from MongoDB Cloud Manager. See Configure AWS
Integration [71] for more details.
Azure For all MongoDB deployments using Azure, you must mount the volume that hosts the mongod instance's
dbPath with the Host Cache Preference READ/WRITE.
This applies to all Azure deployments, using any guest operating system.
If your volumes have inappropriate cache settings, MongoDB may eventually shut down with the following error:
[DataFileSync] FlushViewOfFile for <data file> failed with error 1 ...
[DataFileSync] Fatal Assertion 16387
These shutdowns do not produce data loss when storage.journal.enabled is set to true. You can safely
restart mongod at any time following this event.
The performance characteristics of MongoDB may change with READ/WRITE caching enabled.
The TCP keepalive on the Azure load balancer is 240 seconds by default, which can cause it to silently drop connec-
tions if the TCP keepalive on your Azure systems is greater than this value. You should set tcp_keepalive_time
to 120 to ameliorate this problem.
On Linux systems:
To view the keep alive setting, you can use one of the following commands:
sysctl net.ipv4.tcp_keepalive_time
Or:
cat /proc/sys/net/ipv4/tcp_keepalive_time
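The value is measured in seconds. To change the tcp_keepalive_time value, you can use one of the following
commands:
sudo sysctl -w net.ipv4.tcp_keepalive_time=<value>
Or: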
echo <value> | sudo tee /proc/sys/net/ipv4/tcp_keepalive_time
These operations do not persist across system reboots. To persist the setting, add the following line to
/etc/sysctl.conf:
net.ipv4.tcp_keepalive_time = <value>
On Linux, mongod and mongos processes limit the keepalive to a maximum of 300 seconds (5 minutes) on
their own sockets by overriding keepalive values greater than 5 minutes.
For Windows systems:
To view the keep alive setting, issue the following command:
reg query HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v KeepAliveTime
The registry value is not present by default. The system default, used if the value is absent, is 7200000 millisec-
onds or 0x6ddd00 in hexadecimal.
70 https://cloud.mongodb.com/?jmp=docs
71 https://docs.cloud.mongodb.com/tutorial/configure-aws-settings/
To change the KeepAliveTime value, use the following command in an Administrator Command Prompt,
where <value> is expressed in hexadecimal (e.g. 0x01d4c0 is 120000):
reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\ /v KeepAliveTime /d <value>
Windows users should consider the Windows Server Technet Article on KeepAliveTime [72] for more information
on setting keepalive for MongoDB deployments on Windows systems.
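As a quick check on the hexadecimal form that reg add expects, the conversion from a keepalive interval in seconds to hexadecimal milliseconds can be sketched in JavaScript. The helper name keepAliveHex is hypothetical and is not part of any MongoDB or Windows tool.

```javascript
// Hypothetical helper: convert a keepalive interval in seconds to the
// hexadecimal millisecond string used for the KeepAliveTime registry value.
function keepAliveHex(seconds) {
  var milliseconds = seconds * 1000;
  return "0x" + milliseconds.toString(16);
}

var recommended = keepAliveHex(120);    // "0x1d4c0"  (120000 ms)
var systemDefault = keepAliveHex(7200); // "0x6ddd00" (7200000 ms)
```

The second value matches the system default quoted above, which is a convenient sanity check before writing a new value to the registry.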
Performance Monitoring
iostat On Linux, use the iostat command to check if disk I/O is a bottleneck for your database. Specify a number
of seconds when running iostat to avoid displaying stats covering the time since server boot.
For example, the following command will display extended statistics and the time for each displayed report, with
traffic in MB/s, at one second intervals:
iostat -xmt 1
bwm-ng bwm-ng [73] is a command-line tool for monitoring network use. If you suspect a network-based bottleneck,
you may use bwm-ng to begin your diagnostic process.
Backups
To make backups of your MongoDB database, please refer to MongoDB Backup Methods Overview (page 200).
Additional Resources
Blog Post: Capacity Planning and Hardware Provisioning for MongoDB In Ten Minutes [74]
Whitepaper: MongoDB Multi-Data Center Deployments [75]
72 https://technet.microsoft.com/en-us/library/cc957549.aspx
73 http://www.gropp.org/?id=projects&sub=bwm-ng
74 https://www.mongodb.com/blog/post/capacity-planning-and-hardware-provisioning-mongodb-ten-minutes?jmp=docs
75 http://www.mongodb.com/lp/white-paper/multi-dc?jmp=docs
These documents introduce data management practices and strategies for MongoDB deployments, including strategies
for managing multi-data center deployments, managing larger file stores, and data lifecycle tools.
Data Center Awareness (page 226) Presents the MongoDB features that allow application developers and database
administrators to configure their deployments to be more data center aware or allow operational and location-
based separation.
Capped Collections (page 228) Capped collections provide a special type of size-constrained collections that preserve
insertion order and can support high volume inserts.
Expire Data from Collections by Setting TTL (page 231) TTL collections make it possible to automatically remove
data from a collection based on the value of a timestamp and are useful for managing data like machine generated
event data that are only useful for a limited period of time.
On this page
Further Reading (page 228)
Additional Resource (page 228)
MongoDB provides a number of features that allow application developers and database administrators to customize
the behavior of a sharded cluster or replica set deployment so that MongoDB may be more data center aware, or
allow operational and location-based separation.
MongoDB also supports segregation based on functional parameters, to ensure that certain mongod instances are
only used for reporting workloads or that certain high-frequency portions of a sharded collection only exist on specific
shards.
The following documents, found either in this section or other sections of this manual, provide information on cus-
tomizing a deployment for operation- and location-based separation:
Operational Segregation in MongoDB Deployments (page 227) MongoDB lets you specify that certain application
operations use certain mongod instances.
Tag Aware Sharding (page 748) Tags associate specific ranges of shard key values with specific shards for use in
managing deployment patterns.
Manage Shard Tags (page 808) Use tags to associate specific ranges of shard key values with specific shards.
76 https://www.mongodb.com/lp/white-paper/mongodb-security-architecture?jmp=docs
77 https://www.mongodb.com/lp/whitepaper/architecture-guide?jmp=docs
78 http://www.mongodb.com/presentations/webinar-mongodb-administration-101?jmp=docs
79 https://www.mongodb.com/products/consulting?jmp=docs#s_product_readiness
On this page
Operational Overview (page 227)
Additional Resource (page 227)
Operational Overview MongoDB includes a number of features that allow database administrators and developers
to segregate application operations to MongoDB deployments by functional or geographical groupings.
This capability provides data center awareness, which allows applications to target MongoDB deployments with
consideration of the physical location of the mongod instances. MongoDB supports segmentation of operations
across different dimensions, which may include multiple data centers and geographical regions in multi-data center
deployments, racks, networks, or power circuits in single data center deployments.
MongoDB also supports segregation of database operations based on functional or operational parameters, to ensure
that certain mongod instances are only used for reporting workloads or that certain high-frequency portions of a
sharded collection only exist on specific shards.
Specifically, with MongoDB, you can:
ensure write operations propagate to specific members of a replica set, or to specific members of several replica sets.
ensure that specific members of a replica set respond to queries.
ensure that specific ranges of your shard key balance onto and reside on specific shards.
combine the above features in a single distributed deployment, on a per-operation (for read and write operations)
and per-collection (for chunk distribution in sharded clusters) basis.
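A minimal sketch of how these capabilities can fit together in a replica set: members are tagged by data center, a custom write concern mode requires acknowledgement from two distinct data centers, and reads can then target a tag. The function name tagByDataCenter, the tag key dc, the mode name multiDC, and the member layout are all assumptions chosen for illustration. The function overwrites any existing tags and getLastErrorModes settings.

```javascript
// Sketch: tag replica set members by data center and define a custom write
// concern mode. Tag names, the mode name "multiDC", and the member order are
// assumptions made for illustration.
function tagByDataCenter(conf, dataCenters) {
  // dataCenters[i] names the data center of conf.members[i].
  conf.members.forEach(function (member, i) {
    member.tags = { dc: dataCenters[i] };
  });
  conf.settings = conf.settings || {};
  // Require acknowledgement from members covering two distinct "dc" values.
  conf.settings.getLastErrorModes = { multiDC: { dc: 2 } };
  return conf;
}

// In the mongo shell, usage might resemble:
//   var conf = tagByDataCenter(rs.conf(), ["east", "east", "west"]);
//   rs.reconfig(conf);
//   db.events.insert({ status: "A" }, { writeConcern: { w: "multiDC" } });
//   db.events.find().readPref("secondary", [ { dc: "east" } ]);
```

The events collection in the usage comments is hypothetical. See the Replica Set Tags documentation for the authoritative procedure.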
For full documentation of these features, see the following documentation in the MongoDB Manual:
Read Preferences (page 641), which control how drivers help applications target read operations to members
of a replica set.
Write Concerns (page 141), which control how MongoDB ensures that write operations propagate to members
of a replica set.
Replica Set Tags (page 691), which control how applications create and interact with custom groupings of replica
set members to create custom application-specific read preferences and write concerns.
Tag Aware Sharding (page 748), which allows MongoDB administrators to define an application-specific bal-
ancing policy, to control how documents belonging to specific ranges of a shard key distribute to shards in the
sharded cluster.
See also:
Before adding operational segregation features to your application and MongoDB deployment, become familiar with
all documentation of replication (page 613), and sharding (page 725).
Additional Resource
Whitepaper: MongoDB Multi-Data Center Deployments [80]
Webinar: Multi-Data Center Deployment [81]
80 http://www.mongodb.com/lp/white-paper/multi-dc?jmp=docs
81 https://www.mongodb.com/presentations/webinar-multi-data-center-deployment?jmp=docs
Further Reading
The Write Concern (page 141) and Read Preference (page 641) documents, which address capabilities related
to data center awareness.
Deploy a Geographically Redundant Replica Set (page 662).
Additional Resource
Capped Collections
On this page
Overview (page 228)
Behavior (page 228)
Restrictions and Recommendations (page 229)
Procedures (page 229)
Overview
Capped collections are fixed-size collections that support high-throughput operations that insert and retrieve docu-
ments based on insertion order. Capped collections work in a way similar to circular buffers: once a collection fills its
allocated space, it makes room for new documents by overwriting the oldest documents in the collection.
See createCollection() or create for more information on creating capped collections.
Behavior
Insertion Order Capped collections guarantee preservation of the insertion order. As a result, queries do not need
an index to return documents in insertion order. Without this indexing overhead, capped collections can support higher
insertion throughput.
Automatic Removal of Oldest Documents To make room for new documents, capped collections automatically
remove the oldest documents in the collection without requiring scripts or explicit remove operations.
For example, the oplog.rs collection that stores a log of the operations in a replica set uses a capped collection.
Consider the following potential use cases for capped collections:
Store log information generated by high-volume systems. Inserting documents in a capped collection without
an index is close to the speed of writing log information directly to a file system. Furthermore, the built-in
first-in-first-out property maintains the order of events, while managing storage use.
Cache small amounts of data in a capped collection. Since caches are read-heavy rather than write-heavy, you would
either need to ensure that this collection always remains in the working set (i.e. in RAM) or accept some write
penalty for the required index or indexes.
82 http://www.mongodb.com/lp/white-paper/multi-dc?jmp=docs
83 https://www.mongodb.com/presentations/webinar-multi-data-center-deployment?jmp=docs
Updates If you plan to update documents in a capped collection, create an index so that these update operations do
not require a table scan.
With MMAPv1, you can only make in-place updates of documents. If the update operation causes a document to grow
beyond the document's original size, the update operation will fail.
Replica Sets with MMAPv1 Secondaries If you update a document in a capped collection to a size smaller than its
original size and a secondary resyncs from the primary, the secondary will replicate and allocate space based on the
current smaller document size.
If the primary then receives an update which increases the document back to its original size, the primary will accept the
update. However, for MMAPv1, the secondary will fail with the following error message:
failing update: objects in a capped ns cannot grow
To prevent this error, create your secondary from a snapshot of one of the other up-to-date members of the replica
set. Follow the tutorial on filesystem snapshots to seed your new secondary.
Seeding the secondary with a filesystem snapshot is the only way to guarantee the primary and secondary binary files
are compatible. MongoDB Cloud Manager Backup snapshots are insufficient in this situation since you need more
than the content of the secondary to match the primary.
Document Deletion You cannot delete documents from a capped collection. To remove all documents from a
collection, use the drop() method to drop the collection and recreate the capped collection.
Query Efficiency Use natural ordering to retrieve the most recently inserted elements from the collection efficiently.
This is (somewhat) analogous to tail on a log file.
Aggregation $out The aggregation pipeline operator $out cannot write results to a capped collection.
Procedures
Create a Capped Collection You must create capped collections explicitly using the createCollection()
method, which is a helper in the mongo shell for the create command. When creating a capped collection you must
specify the maximum size of the collection in bytes, which MongoDB will pre-allocate for the collection. The size of
the capped collection includes a small amount of space for internal overhead.
db.createCollection( "log", { capped: true, size: 100000 } )
If the size field is less than or equal to 4096, then the collection will have a cap of 4096 bytes. Otherwise, MongoDB
will raise the provided size to make it an integer multiple of 256.
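The size rule above can be sketched as a small function. This models only the rule as stated in this section; the name effectiveCappedSize is hypothetical and the function does not query the server.

```javascript
// Hypothetical helper modelling the capped collection size rule:
// at most 4096 bytes requested -> 4096, otherwise round up to a
// multiple of 256.
function effectiveCappedSize(requestedBytes) {
  if (requestedBytes <= 4096) {
    return 4096;
  }
  return Math.ceil(requestedBytes / 256) * 256;
}

var small = effectiveCappedSize(1000);     // 4096
var rounded = effectiveCappedSize(100000); // 100096 (391 * 256)
var exact = effectiveCappedSize(5120);     // 5120, already a multiple of 256
```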
Additionally, you may also specify a maximum number of documents for the collection using the max field, as in the
following example:
db.createCollection( "log", { capped: true, size: 5242880, max: 5000 } )
Important: The size argument is always required, even when you specify max number of documents. MongoDB
will remove older documents if a collection reaches the maximum size limit before it reaches the maximum document
count.
See
createCollection() and create.
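To make the interaction between size and max concrete, the following is a small in-memory model of capped-collection eviction. It is an illustration only, not MongoDB code: the CappedModel name is hypothetical, and document size is approximated by the length of the JSON text, whereas MongoDB measures BSON size plus internal overhead.

```javascript
// In-memory model of capped-collection eviction (illustration only).
function CappedModel(maxBytes, maxDocs) {
  this.maxBytes = maxBytes;
  this.maxDocs = maxDocs; // may be undefined: size is then the only limit
  this.docs = [];
  this.bytes = 0;
}

CappedModel.prototype.insert = function (doc) {
  var size = JSON.stringify(doc).length; // assumed stand-in for BSON size
  this.docs.push({ doc: doc, size: size });
  this.bytes += size;
  // Evict the oldest documents until both limits hold again.
  while (this.bytes > this.maxBytes ||
         (this.maxDocs !== undefined && this.docs.length > this.maxDocs)) {
    this.bytes -= this.docs.shift().size;
  }
};

CappedModel.prototype.find = function () {
  // Natural order is insertion order.
  return this.docs.map(function (entry) { return entry.doc; });
};

var log = new CappedModel(1000, 3);
for (var i = 1; i <= 5; i++) {
  log.insert({ n: i });
}
// log.find() now holds the three newest documents: n = 3, 4, 5
```

Whichever limit is reached first triggers eviction, which is why the size argument still matters when max is given.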
Query a Capped Collection If you perform a find() on a capped collection with no ordering specified, MongoDB
guarantees that the ordering of results is the same as the insertion order.
To retrieve documents in reverse insertion order, issue find() along with the sort() method with the $natural
parameter set to -1, as shown in the following example:
db.cappedCollection.find().sort( { $natural: -1 } )
Check if a Collection is Capped Use the isCapped() method to determine if a collection is capped, as follows:
db.collection.isCapped()
Convert a Collection to Capped You can convert a non-capped collection to a capped collection with the
convertToCapped command:
db.runCommand({"convertToCapped": "mycoll", size: 100000});
The size parameter specifies the size of the capped collection in bytes.
Warning: This command obtains a global write lock and will block other operations until it has completed.
Changed in version 2.2: Before 2.2, capped collections did not have an index on _id unless you specified
autoIndexId to the create command. After 2.2, this became the default.
Automatically Remove Data After a Specified Period of Time For additional flexibility when expiring data, consider
MongoDB's TTL indexes, as described in Expire Data from Collections by Setting TTL (page 231). These indexes
allow you to expire and remove data from normal collections based on the value of a date-typed
field and a TTL value for the index.
TTL Collections (page 231) are not compatible with capped collections.
Tailable Cursor You can use a tailable cursor with capped collections. Similar to the Unix tail -f command,
the tailable cursor tails the end of a capped collection. As new documents are inserted into the capped collection,
you can use the tailable cursor to continue retrieving documents.
See Create Tailable Cursor (page 133) for information on creating a tailable cursor.
On this page
Procedures (page 231)
Procedures
To create a TTL index (page 512), use the db.collection.createIndex() method with the
expireAfterSeconds option on a field whose value is either a date (page 197) or an array that contains date
values (page 197).
Note: The TTL index is a single field index. Compound indexes do not support the TTL property. For more
information on TTL indexes, see TTL Indexes (page 512).
Expire Documents after a Specified Number of Seconds To expire data after a specified number of seconds has
passed since the indexed field, create a TTL index on a field that holds values of BSON date type or an array of BSON
date-typed objects and specify a positive non-zero value in the expireAfterSeconds field. A document will
expire when the number of seconds in the expireAfterSeconds field has passed since the time specified in its
indexed field. [84]
For example, the following operation creates an index on the log_events collection's createdAt field and
specifies the expireAfterSeconds value of 3600 to set the expiration time to be one hour after the time specified by
createdAt.
db.log_events.createIndex( { "createdAt": 1 }, { expireAfterSeconds: 3600 } )
When adding documents to the log_events collection, set the createdAt field to the current time:
db.log_events.insert( {
"createdAt": new Date(),
"logEvent": 2,
"logMessage": "Success!"
} )
MongoDB will automatically delete documents from the log_events collection when the document's createdAt
value [84] is older than the number of seconds specified in expireAfterSeconds.
See also:
$currentDate operator
84 If the field contains an array of BSON date-typed objects, data expires if at least one of the BSON date-typed objects
is older than the number of seconds specified in expireAfterSeconds.
Expire Documents at a Specific Clock Time To expire documents at a specific clock time, begin by creating a
TTL index on a field that holds values of BSON date type or an array of BSON date-typed objects and specify an
expireAfterSeconds value of 0. For each document in the collection, set the indexed date field to a value
corresponding to the time the document should expire. If the indexed date field contains a date in the past, MongoDB
considers the document expired.
For example, the following operation creates an index on the log_events collection's expireAt field and specifies
the expireAfterSeconds value of 0:
db.log_events.createIndex( { "expireAt": 1 }, { expireAfterSeconds: 0 } )
For each document, set the value of expireAt to correspond to the time the document should expire. For instance,
the following insert() operation adds a document that should expire at July 22, 2013 14:00:00.
db.log_events.insert( {
"expireAt": new Date('July 22, 2013 14:00:00'),
"logEvent": 2,
"logMessage": "Success!"
} )
MongoDB will automatically delete documents from the log_events collection when the document's expireAt
value is older than the number of seconds specified in expireAfterSeconds, i.e. 0 seconds older in this case. As
such, the data expires at the specified expireAt value.
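Both expiry styles follow one rule: a document is eligible for removal once the indexed date plus expireAfterSeconds lies in the past. The sketch below models that rule in plain JavaScript. The function name isExpired is hypothetical, the treatment of the exact boundary instant is an assumption, and this is not how the server implements TTL deletion.

```javascript
// Sketch of the TTL expiry rule (illustration only).
function isExpired(doc, field, expireAfterSeconds, now) {
  var value = doc[field];
  var dates = Array.isArray(value) ? value : [value];
  // Non-date values are ignored. With an array, one sufficiently old
  // date is enough to expire the document.
  return dates.some(function (d) {
    return d instanceof Date &&
      now.getTime() - d.getTime() >= expireAfterSeconds * 1000;
  });
}

var now = new Date("2013-07-22T15:00:00Z");

// Expire one hour after createdAt (expireAfterSeconds: 3600).
var twoHoursOld = isExpired(
  { createdAt: new Date("2013-07-22T13:00:00Z") }, "createdAt", 3600, now); // true

// Expire at a clock time (expireAfterSeconds: 0).
var notYetDue = isExpired(
  { expireAt: new Date("2013-07-22T16:00:00Z") }, "expireAt", 0, now); // false
```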
There are many factors that can affect database performance and responsiveness including index use, query structure,
data models and application design, as well as operational factors such as architecture and system configuration.
This section describes techniques for optimizing application performance with MongoDB.
Analyzing MongoDB Performance (page 232) Discusses some of the factors that can influence MongoDB's
performance.
Evaluate Performance of Current Operations (page 235) MongoDB provides introspection tools that describe the
query execution process, to allow users to test queries and build more efficient queries.
Optimize Query Performance (page 236) Introduces the use of projections (page 66) to reduce the amount of data
MongoDB sends to clients.
Design Notes (page 238) A collection of notes related to the architecture, design, and administration of MongoDB-
based applications.
On this page
Locking Performance (page 233)
Memory and the MMAPv1 Storage Engine (page 233)
Number of Connections (page 234)
Database Profiling (page 234)
Additional Resources (page 235)
As you develop and operate applications with MongoDB, you may need to analyze the performance of the application
and its database. When you encounter degraded performance, it is often a function of database access strategies,
hardware availability, and the number of open database connections.
Some users may experience performance limitations as a result of inadequate or inappropriate indexing strategies, or
as a consequence of poor schema design patterns. Locking Performance (page 233) discusses how these can impact
MongoDB's internal locking.
Performance issues may indicate that the database is operating at capacity and that it is time to add additional capacity
to the database. In particular, the application's working set should fit in the available physical memory. See Memory
and the MMAPv1 Storage Engine (page 233) for more information on the working set.
In some cases performance issues may be temporary and related to abnormal traffic load. As discussed in Number of
Connections (page 234), scaling can help relax excessive traffic.
Database Profiling (page 234) can help you to understand what operations are causing degradation.
Locking Performance
MongoDB uses a locking system to ensure data set consistency. If certain operations are long-running or a queue
forms, performance will degrade as requests and operations wait for the lock.
Lock-related slowdowns can be intermittent. To see if the lock has been affecting your performance, refer to the
server-status-locks section and the globalLock section of the serverStatus output.
Dividing locks.timeAcquiringMicros by locks.acquireWaitCount can give an approximate average
wait time for a particular lock mode.
locks.deadlockCount provides the number of times the lock acquisitions encountered deadlocks.
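The division described above can be sketched as follows. The input is assumed to be the statistics for one lock resource (for example the Global entry of the locks section), already converted to plain JavaScript numbers. The function name and the sample figures are invented for illustration.

```javascript
// Sketch: approximate average wait, in microseconds, per lock mode.
function averageLockWaitMicros(lockStats) {
  var waits = lockStats.acquireWaitCount || {};
  var times = lockStats.timeAcquiringMicros || {};
  var averages = {};
  Object.keys(waits).forEach(function (mode) {
    // Skip modes that never waited, to avoid dividing by zero.
    if (waits[mode] > 0 && times[mode] !== undefined) {
      averages[mode] = times[mode] / waits[mode];
    }
  });
  return averages;
}

// Invented sample figures, not real server output.
var sample = {
  acquireCount: { r: 5000, w: 1200 },
  acquireWaitCount: { r: 4, w: 10 },
  timeAcquiringMicros: { r: 2000, w: 45000 }
};
var averages = averageLockWaitMicros(sample); // { r: 500, w: 4500 }
```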
If globalLock.currentQueue.total is consistently high, then there is a chance that a large number of re-
quests are waiting for a lock. This indicates a possible concurrency issue that may be affecting performance.
If globalLock.totalTime is high relative to uptime, the database has existed in a lock state for a significant
amount of time.
Long queries can result from ineffective use of indexes; non-optimal schema design; poor query structure; system
architecture issues; or insufficient RAM resulting in page faults (page 234) and disk reads.
Memory Use With the MMAPv1 (page 595) storage engine, MongoDB uses memory-mapped files to store data.
Given a data set of sufficient size, the mongod process will allocate all available memory on the system for its use.
While this is intentional and aids performance, the memory mapped files make it difficult to determine if the amount
of RAM is sufficient for the data set.
The memory usage metrics of the serverStatus output can provide insight into MongoDB's memory use.
The mem.resident field provides the amount of resident memory in use. If this exceeds the amount of system
memory and there is a significant amount of data on disk that isn't in RAM, you may have exceeded the capacity of
your system.
You can inspect mem.mapped to check the amount of mapped memory that mongod is using. If this value is greater
than the amount of system memory, some operations will require page faults to read data from disk.
Page Faults With the MMAPv1 storage engine, page faults can occur as MongoDB reads from or writes data to parts
of its data files that are not currently located in physical memory. In contrast, operating system page faults happen
when physical memory is exhausted and pages of physical memory are swapped to disk.
MongoDB reports its triggered page faults as the total number of page faults in one second. To check for page faults,
see the extra_info.page_faults value in the serverStatus output.
Rapid increases in the MongoDB page fault counter may indicate that the server has too little physical memory. Page
faults also can occur while accessing large data sets or scanning an entire collection.
A single page fault completes quickly and is not problematic. However, in aggregate, large volumes of page faults
typically indicate that MongoDB is reading too much data from disk.
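One way to watch for a rapid increase is to compare two serverStatus samples taken some time apart. The sketch below assumes each sample carries uptime in seconds and extra_info.page_faults as a cumulative counter. The function name and sample figures are hypothetical.

```javascript
// Sketch: page faults per second between two serverStatus samples.
function pageFaultsPerSecond(earlier, later) {
  var seconds = later.uptime - earlier.uptime;
  if (seconds <= 0) {
    return null; // server restarted, or samples are out of order
  }
  var faults = later.extra_info.page_faults - earlier.extra_info.page_faults;
  return faults / seconds;
}

// Invented sample figures, not real server output.
var first = { uptime: 100, extra_info: { page_faults: 500 } };
var second = { uptime: 160, extra_info: { page_faults: 1100 } };
var rate = pageFaultsPerSecond(first, second); // 10 faults per second
```

What counts as a high rate depends on the storage hardware and workload, so the trend over time is usually more informative than a single value.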
MongoDB can often yield read locks after a page fault, allowing other database processes to read while mongod
loads the next page into memory. Yielding the read lock following a page fault improves concurrency, and also
improves overall throughput in high volume systems.
Increasing the amount of RAM accessible to MongoDB may help reduce the frequency of page faults. If this is not
possible, you may want to consider deploying a sharded cluster or adding shards to your deployment to distribute load
among mongod instances.
See What are page faults? (page 854) for more information.
Number of Connections
In some cases, the number of connections between the applications and the database can overwhelm the ability of the
server to handle requests. The following fields in the serverStatus document can provide insight:
globalLock.activeClients contains a counter of the total number of clients with active operations in
progress or queued.
connections is a container for the following two fields:
connections.current the total number of current clients that connect to the database instance.
connections.available the total number of unused connections available for new clients.
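As a rough illustration, the two connections fields can be combined into a utilization ratio. The function name and sample figures below are hypothetical, and the input is assumed to be the connections sub-document with plain numeric values.

```javascript
// Sketch: fraction of the connection capacity currently in use.
function connectionUtilization(connections) {
  var total = connections.current + connections.available;
  return total > 0 ? connections.current / total : 0;
}

// Invented sample figures, not real server output.
var utilization = connectionUtilization({ current: 200, available: 600 }); // 0.25
```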
If there are numerous concurrent application requests, the database may have trouble keeping up with demand. If this
is the case, then you will need to increase the capacity of your deployment.
For read-heavy applications, increase the size of your replica set and distribute read operations to secondary members.
For write-heavy applications, deploy sharding and add one or more shards to a sharded cluster to distribute load
among mongod instances.
Spikes in the number of connections can also be the result of application or driver errors. All of the officially supported
MongoDB drivers implement connection pooling, which allows clients to use and reuse connections more efficiently.
An extremely high number of connections, particularly without a corresponding workload, is often indicative of a driver or
other configuration error.
Unless constrained by system-wide limits, MongoDB has no limit on incoming connections. On Unix-based systems,
you can modify system limits using the ulimit command, or by editing your system's /etc/sysctl.conf file. See
UNIX ulimit Settings (page 295) for more information.
Database Profiling
MongoDB's Profiler is a database profiling system that can help identify inefficient queries and operations.
The following profiling levels are available:
Level Setting
0 Off. No profiling
1 On. Only includes slow operations
2 On. Includes all operations
Enable the profiler by setting the profile value using the following command in the mongo shell:
db.setProfilingLevel(1)
The slowOpThresholdMs setting defines what constitutes a slow operation. To set the threshold above
which the profiler considers operations slow (and thus, included in the level 1 profiling data), you can configure
slowOpThresholdMs at runtime as an argument to the db.setProfilingLevel() operation.
See the documentation of db.setProfilingLevel() for more information.
By default, mongod records all slow queries to its log, as defined by slowOpThresholdMs.
Note: Because the database profiler can negatively impact performance, only enable profiling for strategic intervals
and as minimally as possible on production systems.
You may enable profiling on a per-mongod basis. This setting will not propagate across a replica set or sharded
cluster.
You can view the output of the profiler in the system.profile collection of your database by issuing the show
profile command in the mongo shell, or with the following operation:
db.system.profile.find( { millis : { $gt : 100 } } )
This returns all operations that lasted longer than 100 milliseconds. Ensure that the value specified here (100, in this
example) is above the slowOpThresholdMs threshold.
You must use the $query operator to access the query field of documents within system.profile.
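The steps above can be combined in a small helper that enables level 1 profiling with a threshold and returns a matching filter for system.profile. This is a sketch: profileSlowOperations is a hypothetical name, db is assumed to be a mongo shell database handle, and the threshold values are examples only.

```javascript
// Sketch: enable level 1 profiling with a slow-operation threshold, then
// build a filter document for querying system.profile.
function profileSlowOperations(db, slowMs, reportMs) {
  // Operations faster than slowMs are not recorded at level 1, so a
  // lower reporting threshold would give a misleading picture.
  if (reportMs < slowMs) {
    throw new Error("reportMs must not be below the slowOpThresholdMs value");
  }
  db.setProfilingLevel(1, slowMs);
  return { millis: { $gt: reportMs } };
}

// In the mongo shell, usage might resemble:
//   var filter = profileSlowOperations(db, 50, 100);
//   db.system.profile.find(filter).sort({ ts: -1 }).limit(5);
```

Remember to set the profiling level back to 0 once you have collected enough data.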
Additional Resources
On this page
Use the Database Profiler to Evaluate Operations Against the Database (page 236)
Use db.currentOp() to Evaluate mongod Operations (page 236)
Use explain to Evaluate Query Performance (page 236)
Additional Resources (page 236)
MongoDB provides a database profiler that shows performance characteristics of each operation against the database.
Use the profiler to locate any queries or write operations that are running slowly. You can use this information, for
example, to determine what indexes to create.
For more information, see Database Profiling (page 234).
Example
To use cursor.explain() on a query for documents matching the expression { a: 1 }, in the collection
named records, use an operation that resembles the following in the mongo shell:
db.records.find( { a: 1 } ).explain("executionStats")
Additional Resources
On this page
Create Indexes to Support Queries (page 236)
Limit the Number of Query Results to Reduce Network Demand (page 237)
Use Projections to Return Only Necessary Data (page 237)
Use $hint to Select a Particular Index (page 238)
Use the Increment Operator to Perform Operations Server-Side (page 238)
Additional Resources (page 238)
For commonly issued queries, create indexes (page 487). If a query searches multiple fields, create a compound index
(page 495). Scanning an index is much faster than scanning a collection. Index structures are smaller than the
documents they reference, and store references in order.
86 https://www.mongodb.com/products/consulting?jmp=docs#performance_evaluation
Example
If you have a posts collection containing blog posts, and if you regularly issue a query that sorts on the
author_name field, then you can optimize the query by creating an index on the author_name field:
db.posts.createIndex( { author_name : 1 } )
Indexes also improve efficiency on queries that routinely sort on a given field.
Example
If you regularly issue a query that sorts on the timestamp field, then you can optimize the query by creating an
index on the timestamp field:
db.posts.createIndex( { timestamp : 1 } )
Because MongoDB can read indexes in both ascending and descending order, the direction of a single-key index does
not matter.
Indexes support queries, update operations, and some phases of the aggregation pipeline (page 448).
Index keys that are of the BinData type are more efficiently stored in the index if:
the binary subtype value is in the range of 0-7 or 128-135, and
the length of the byte array is: 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, or 32.
MongoDB cursors return results in groups of multiple documents. If you know the number of results you want, you
can reduce the demand on network resources by using the limit() method.
This is typically used in conjunction with sort operations. For example, if you need only 10 results from your query to
the posts collection, you would issue the following command:
db.posts.find().sort( { timestamp : -1 } ).limit(10)
When you need only a subset of fields from documents, you can achieve better performance by returning only the
fields you need.
For example, if in your query to the posts collection, you need only the timestamp, title, author, and
abstract fields, you would issue the following command:
db.posts.find( {}, { timestamp : 1 , title : 1 , author : 1 , abstract : 1} ).sort( { timestamp : -1 } )
For more information on using projections, see Limit Fields to Return from a Query (page 115).
In most cases the query optimizer (page 72) selects the optimal index for a specific operation; however, you can force
MongoDB to use a specific index using the hint() method. Use hint() to support performance testing, or on
some queries where you must select a field or fields included in several indexes.
Use MongoDB's $inc operator to increment or decrement values in documents. The operator increments the value
of the field on the server side, as an alternative to selecting a document, making simple modifications in the client,
and then writing the entire document to the server. The $inc operator can also help avoid race conditions, which
would result when two application instances query for a document, manually increment a field, and save the
entire document back at the same time.
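The race described above can be illustrated without a database. The sketch below is plain JavaScript that uses an in-memory object as a stand-in for a stored document. The names readThenWrite and serverSideInc and the views field are hypothetical, and the code does not talk to MongoDB.

```javascript
// In-memory stand-in for one stored document (not a real collection).
const store = { views: 0 };

// Client-side pattern: each client reads the value, increments it locally,
// and writes the result back later.
function readThenWrite(target) {
  const seen = target.views;                 // read
  return () => { target.views = seen + 1; }; // deferred write-back
}

// Server-side pattern: the increment is applied where the data lives,
// which is the role the $inc operator plays.
function serverSideInc(target, amount) {
  target.views += amount;
}

// Two clients read before either writes, so one increment is lost.
const writeA = readThenWrite(store);
const writeB = readThenWrite(store);
writeA();
writeB();
const afterClientSide = store.views;

// Two server-side increments are both applied.
store.views = 0;
serverSideInc(store, 1);
serverSideInc(store, 1);
const afterServerSide = store.views;

console.log(afterClientSide, afterServerSide); // prints: 1 2
```

In the mongo shell, the server-side form would resemble db.posts.update( { _id: 1 }, { $inc: { views: 1 } } ), where the _id value and the views field are again hypothetical.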
Additional Resources
Design Notes
On this page
Schema Considerations (page 238)
General Considerations (page 239)
Replica Set Considerations (page 239)
Sharding Considerations (page 240)
Analyze Performance (page 240)
Additional Resources (page 240)
This page details features of MongoDB that may be important to keep in mind when developing applications.
Schema Considerations
Dynamic Schema
Data in MongoDB has a dynamic schema. Collections do not enforce document structure. This
facilitates iterative development and polymorphism. Nevertheless, collections often hold documents with highly
homogeneous structures. See Data Modeling Concepts (page 162) for more information.
Some operational considerations include:
the exact set of collections to be used;
the indexes to be used: with the exception of the _id index, all indexes must be created explicitly;
shard key declarations: choosing a good shard key is very important as the shard key cannot be changed once
set.
Avoid importing unmodified data directly from a relational database. In general, you will want to roll up certain
data into richer documents that take advantage of MongoDB's support for embedded documents and nested arrays.
Case Sensitive Strings
MongoDB strings are case sensitive. A search for "joe" will not find "Joe".
Consider:
storing data in a normalized case format, or
using regular expressions ending with the i option, and/or
using $toLower or $toUpper in the aggregation framework (page 447).
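As a sketch of the first two options, the plain JavaScript below builds a normalized value for storage and a query document that uses $regex with the i option. The helper names and the name and name_lower fields are hypothetical; only $regex and $options are MongoDB query operators.

```javascript
// Escape regular expression metacharacters so the input matches literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Option 1: store a normalized copy of the value alongside the original.
function withNormalizedName(doc) {
  return Object.assign({}, doc, { name_lower: doc.name.toLowerCase() });
}

// Option 2: build a query document for a case-insensitive exact match.
function caseInsensitiveQuery(field, value) {
  return { [field]: { $regex: "^" + escapeRegex(value) + "$", $options: "i" } };
}

const stored = withNormalizedName({ name: "Joe" });
const query = caseInsensitiveQuery("name", "joe");

// Check both approaches locally against the value "Joe".
console.log(stored.name_lower === "joe");
console.log(new RegExp(query.name.$regex, query.name.$options).test("Joe"));
```

With the first option, a query such as db.mycollection.find( { name_lower: "joe" } ) is an ordinary equality match. With the second, the query document can be passed to find() directly.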
Type Sensitive Fields
MongoDB data is stored in the BSON format, a binary-encoded serialization of JSON-like
documents. BSON encodes additional type information. See the BSON specification
(http://bsonspec.org/#/specification) for more information.
Consider the following document which has a field x with the string value "123":
{ x : "123" }
Then the following query which looks for a number value 123 will not return that document:
db.mycollection.find( { x : 123 } )
General Considerations
By Default, Updates Affect One Document
To update multiple documents that meet your query criteria, set the
update multi option to true or 1. See Update Multiple Documents (page 85).
Prior to MongoDB 2.2, you would specify the upsert and multi options in the update method as positional
boolean options. See the update method reference documentation.
BSON Document Size Limit
The BSON document size limit is currently 16 MB per document. If you
require larger documents, use GridFS (page 603).
No Fully Generalized Transactions
MongoDB does not have fully generalized transactions (page 88). If you
model your data using rich documents that closely resemble your application's objects, each logical object will be in
one MongoDB document. MongoDB allows you to modify a document in a single atomic operation. This kind of
data modification pattern covers most common uses of transactions in other systems.
Use an Odd Number of Replica Set Members
Replica sets (page 613) perform consensus elections. To ensure
that elections will proceed successfully, either use an odd number of members, typically three, or else use an arbiter
to ensure an odd number of votes.
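The arithmetic behind this advice can be sketched as follows, assuming that an election requires a strict majority of the voting members. The function names are hypothetical and the code is plain JavaScript, not a MongoDB API.

```javascript
// Votes needed to win an election: a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Members that can be unavailable while a majority can still be formed.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

// Print: voting members, votes needed, members that may be unavailable.
for (const n of [3, 4, 5]) {
  console.log(n, majority(n), faultTolerance(n));
}
```

Under this assumption, a set with four voting members tolerates the loss of only one member, the same as a set with three, so the fourth vote adds no resilience.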
Keep Replica Set Members Up-to-Date
MongoDB replica sets support automatic failover (page 635). It is important
for your secondaries to be up-to-date. There are various strategies for assessing consistency:
1. Use monitoring tools to alert you to lag events. See Monitoring for MongoDB (page 203) for a detailed
discussion of MongoDB's monitoring options.
2. Specify an appropriate write concern.
3. If your application requires manual fail over, you can configure your secondaries as priority 0 (page 621).
Priority 0 secondaries require manual action for a failover. This may be practical for a small replica set, but
large deployments should fail over automatically.
See also:
replica set rollbacks (page 638).
Sharding Considerations
Pick your shard keys carefully. You cannot choose a new shard key for a collection that is already sharded.
Shard key values are immutable.
When enabling sharding on an existing collection, MongoDB imposes a maximum size on that
collection to ensure that it is possible to create chunks. For a detailed explanation of this limit, see:
<sharding-existing-collection-data-size>.
To shard large amounts of data, create a new empty sharded collection, and ingest the data from the source
collection using an application level import operation.
Unique indexes are not enforced across shards except for the shard key itself. See Enforce Unique Keys for
Sharded Collections (page 810).
Consider pre-splitting (page 800) an empty sharded collection before a massive bulk import.
Analyze Performance
As you develop and operate applications with MongoDB, you may want to analyze the performance of the application
and its database. Analyzing MongoDB Performance (page 232) discusses some of the operational factors that can
influence performance.
Additional Resources
The administration tutorials provide specific step-by-step instructions for performing common MongoDB setup,
maintenance, and configuration operations.
Configuration, Maintenance, and Analysis (page 241) Describes routine management operations, including
configuration and performance analysis.
Manage mongod Processes (page 245) Start, configure, and manage running mongod processes.
Rotate Log Files (page 253) Archive the current log files and start new ones.
Continue reading from Configuration, Maintenance, and Analysis (page 241) for additional tutorials on
fundamental MongoDB maintenance procedures.
Backup and Recovery (page 266) Outlines procedures for data backup and restoration with mongod instances and
deployments.
Backup and Restore with Filesystem Snapshots (page 266) An outline of procedures for creating MongoDB
data set backups using a system-level file snapshot tool, such as LVM or native storage appliance tools.
Backup and Restore Sharded Clusters (page 277) Detailed procedures and considerations for backing up
sharded clusters and single shards.
Recover Data after an Unexpected Shutdown (page 289) Recover data from MongoDB data files that were not
properly closed or have an invalid state.
Continue reading from Backup and Recovery (page 266) for additional tutorials on MongoDB backup and
recovery procedures.
MongoDB Tutorials (page 292) A complete list of tutorials in the MongoDB Manual that address MongoDB
operation and use.
The following tutorials describe routine management operations, including configuration and performance analysis:
Disable Transparent Huge Pages (THP) (page 241) Describes Transparent Huge Pages (THP) and provides detailed
instructions on disabling them.
Use Database Commands (page 244) The process for running database commands that provide basic database
operations.
Manage mongod Processes (page 245) Start, configure, and manage running mongod processes.
Terminate Running Operations (page 248) Stop in-progress MongoDB client operations using db.killOp() and
maxTimeMS().
Analyze Performance of Database Operations (page 249) Collect data that introspects the performance of query and
update operations on a mongod instance.
Rotate Log Files (page 253) Archive the current log files and start new ones.
Manage Journaling (page 255) Describes the procedures for configuring and managing MongoDB's journaling
system, which allows MongoDB to provide crash resiliency and durability.
Store a JavaScript Function on the Server (page 257) Describes how to store JavaScript functions on a MongoDB
server.
Upgrade to the Latest Revision of MongoDB (page 258) Introduces the basic process for upgrading a MongoDB
deployment between different minor release versions.
Monitor MongoDB With SNMP on Linux (page 261) The SNMP extension, available in MongoDB Enterprise,
allows MongoDB to provide database metrics via SNMP.
Monitor MongoDB Windows with SNMP (page 263) The SNMP extension, available in the Windows build of
MongoDB Enterprise, allows MongoDB to provide database metrics via SNMP.
Troubleshoot SNMP (page 265) Outlines common errors and diagnostic processes useful for deploying MongoDB
Enterprise with SNMP support.
On this page
Init Script (page 242)
Using tuned and ktune (page 243)
Test Your Changes (page 244)
Transparent Huge Pages (THP) is a Linux memory management system that reduces the overhead of Translation
Lookaside Buffer (TLB) lookups on machines with large amounts of memory by using larger memory pages.
However, database workloads often perform poorly with THP, because they tend to have sparse rather than contiguous
memory access patterns. You should disable THP on Linux machines to ensure best performance with MongoDB.
Init Script
Important: If you are using tuned or ktune (for example, if you are running Red Hat or CentOS 6+), you must
additionally configure them so that THP is not re-enabled. See Using tuned and ktune (page 243).
Step 1: Create the init.d script. Create the following file at /etc/init.d/disable-transparent-hugepages:
#!/bin/sh
### BEGIN INIT INFO
# Provides:          disable-transparent-hugepages
# Required-Start:    $local_fs
# Required-Stop:
# X-Start-Before:    mongod mongodb-mms-automation-agent
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Disable Linux transparent huge pages
# Description:       Disable Linux transparent huge pages, to improve
#                    database performance.
### END INIT INFO

case $1 in
  start)
    # Locate the THP control directory; Red Hat kernels use a different name.
    if [ -d /sys/kernel/mm/transparent_hugepage ]; then
      thp_path=/sys/kernel/mm/transparent_hugepage
    elif [ -d /sys/kernel/mm/redhat_transparent_hugepage ]; then
      thp_path=/sys/kernel/mm/redhat_transparent_hugepage
    else
      # THP is not available on this kernel, so there is nothing to do.
      exit 0
    fi

    # Disable THP and THP defragmentation.
    echo 'never' > ${thp_path}/enabled
    echo 'never' > ${thp_path}/defrag

    unset thp_path
    ;;
esac
Step 2: Make it executable. Run the following command to ensure that the init script can be used:
sudo chmod 755 /etc/init.d/disable-transparent-hugepages
Step 3: Configure your operating system to run it on boot. Use the appropriate command to configure the new
init script on your Linux distribution.
Distribution         Command
Ubuntu and Debian    sudo update-rc.d disable-transparent-hugepages defaults
SUSE                 sudo insserv /etc/init.d/disable-transparent-hugepages
Step 4: Override tuned and ktune, if applicable. If you are using tuned or ktune (for example, if you are
running Red Hat or CentOS 6+), you must now configure them to preserve the above settings.
Important: If using tuned or ktune, you must perform this step in addition to installing the init script.
tuned and ktune are dynamic kernel tuning tools available on Red Hat and CentOS that can disable transparent
huge pages.
To disable transparent huge pages in tuned or ktune, you need to edit or create a new profile that sets THP to
never.
Red Hat/CentOS 6
Step 1: Create a new profile. Create a new profile from an existing default profile by copying the relevant directory.
In the example we use the default profile as the base and call our new profile no-thp.
sudo cp -r /etc/tune-profiles/default /etc/tune-profiles/no-thp
Step 3: Enable the new profile. Finally, enable the new profile by issuing:
sudo tuned-adm profile no-thp
Red Hat/CentOS 7
Step 2: Edit tuned.conf. Create and edit /etc/tuned/no-thp/tuned.conf so that it contains the
following:
[main]
include=virtual-guest
[vm]
transparent_hugepages=never
Step 3: Enable the new profile. Finally, enable the new profile by issuing:
sudo tuned-adm profile no-thp
You can check the status of THP support by issuing the following commands:
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag
On Red Hat Enterprise Linux, CentOS, and potentially other Red Hat-based derivatives, you may instead need to use
the following:
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
cat /sys/kernel/mm/redhat_transparent_hugepage/defrag
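The commands above print every available setting and mark the active one with square brackets, for example always madvise [never]. The following plain JavaScript is a minimal sketch of how a monitoring script might interpret that text. The function names are hypothetical, and the sample strings are written by hand rather than captured from a real machine.

```javascript
// Extract the active setting (the bracketed word) from the text printed by
// the `cat` commands above for the enabled or defrag file.
function activeThpSetting(text) {
  const match = /\[([^\]]+)\]/.exec(text);
  return match ? match[1] : null;
}

// THP counts as disabled only when the active setting is "never".
function thpDisabled(text) {
  return activeThpSetting(text) === "never";
}

console.log(thpDisabled("always madvise [never]")); // active setting: never
console.log(thpDisabled("[always] madvise never")); // active setting: always
```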
On this page
Database Command Form (page 244)
Issue Commands (page 245)
admin Database Commands (page 245)
Command Responses (page 245)
The MongoDB command interface provides access to all non-CRUD database operations. Fetching server stats,
initializing a replica set, and running a map-reduce job are all accomplished with commands.
See https://docs.mongodb.org/manual/reference/command for a list of all commands sorted by
function.
You specify a command first by constructing a standard BSON document whose first key is the name of the command.
For example, specify the isMaster command using the following BSON document:
{ isMaster: 1 }
Issue Commands
The mongo shell provides a helper method for running commands called db.runCommand(). The following
operation in mongo runs the above command:
db.runCommand( { isMaster: 1 } )
Many drivers provide an equivalent for the db.runCommand() method. Internally, running commands with
db.runCommand() is equivalent to a special query against the $cmd collection.
Many common commands have their own shell helpers or wrappers in the mongo shell and drivers, such as the
db.isMaster() method in the mongo JavaScript shell.
You can use the maxTimeMS option to specify a time limit for the execution of a command. See Terminate a Command
(page 248) for more information on operation termination.
You must run some commands on the admin database. Normally, these operations resemble the following:
use admin
db.runCommand( {buildInfo: 1} )
However, there's also a command helper that automatically runs the command in the context of the admin database:
db._adminCommand( {buildInfo: 1} )
Command Responses
All commands return, at minimum, a document with an ok field indicating whether the command has succeeded:
{ 'ok': 1 }
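Because every response carries an ok field, a script can check it before using the rest of the document. The following is a minimal sketch in plain JavaScript. The helper name requireOk is hypothetical, and the sample responses are written by hand for illustration, not captured from a server.

```javascript
// Hypothetical helper: throw unless a command response reports success.
function requireOk(response) {
  // The ok field may arrive as a number (1 or 0) or as a boolean.
  if (!response || Number(response.ok) !== 1) {
    const detail = response && response.errmsg ? response.errmsg : "no errmsg in response";
    throw new Error("command failed: " + detail);
  }
  return response;
}

// Hand-written sample responses.
const success = { ok: 1 };
const failure = { ok: 0, errmsg: "example failure" };

console.log(requireOk(success).ok);
try {
  requireOk(failure);
} catch (err) {
  console.log(err.message);
}
```

In the mongo shell, the helper could wrap a call such as requireOk( db.runCommand( { isMaster: 1 } ) ).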
On this page
Start mongod Processes (page 246)
Stop mongod Processes (page 246)
Stop a Replica Set (page 247)
MongoDB runs as a standard program. You can start MongoDB from a command line by issuing the mongod
command and specifying options. For a list of options, see the mongod reference. MongoDB can also run as a Windows
service. For details, see Configure a Windows Service for MongoDB Community Edition (page 31). To install
MongoDB, see Install MongoDB (page 5).
The following examples assume the directory containing the mongod process is in your system path. The mongod
process is the primary database process that runs on an individual server. mongos provides a coherent MongoDB
interface equivalent to a mongod from the perspective of a client. The mongo binary provides the administrative
shell.
This document discusses the mongod process; however, some portions of this document may be applicable to mongos
instances.
By default, MongoDB stores data in the /data/db directory. On Windows, MongoDB stores data in C:\data\db.
On all platforms, MongoDB listens for connections from clients on port 27017.
To start MongoDB using all defaults, issue the following command at the system shell:
mongod
Specify a Data Directory
If you want mongod to store data files at a path other than /data/db you can specify
a dbPath. The dbPath must exist before you start mongod. If it does not exist, create the directory and set the
permissions so that mongod can read and write data to this path. For more information on permissions, see the
security operations documentation (page 315).
To specify a dbPath for mongod to use as a data directory, use the --dbpath option. The following invocation
will start a mongod instance and store data in the /srv/mongodb path:
mongod --dbpath /srv/mongodb/
Specify a TCP Port
Only a single process can listen for connections on a given port of a network interface at a time. If you run
multiple mongod processes on a single machine, or have other processes that must use this port, you must assign each
a different port to listen on for client connections.
To specify a port to mongod, use the --port option on the command line. The following command starts mongod
listening on port 12345:
mongod --port 12345
Start mongod as a Daemon
To run a mongod process as a daemon (i.e. fork), and write its output to a log file,
use the --fork and --logpath options. You must create the log directory; however, mongod will create the log
file if it does not exist.
The following command starts mongod as a daemon and records log output to /var/log/mongodb.log.
mongod --fork --logpath /var/log/mongodb.log
Additional Configuration Options
For an overview of common configurations and deployments for common use
cases, see Run-time Database Configuration (page 209).
In a clean shutdown, a mongod completes all pending operations, flushes all data to data files, and closes all data files.
Other shutdowns are unclean and can compromise the validity of the data files.
To ensure a clean shutdown, always shut down mongod instances using one of the following methods:
Use shutdownServer()
Shut down the mongod from the mongo shell using the db.shutdownServer()
method as follows:
use admin
db.shutdownServer()
Calling the same method from an init script accomplishes the same result.
For systems with authorization enabled, users may only issue db.shutdownServer() when authenticated
to the admin database. On systems without authentication enabled, users may issue it via the localhost interface.
Use --shutdown
From the Linux command line, shut down the mongod using the --shutdown option in the
following command:
mongod --shutdown
Use CTRL-C
When running the mongod instance in interactive mode (i.e. without --fork), issue Control-C
to perform a clean shutdown.
Use kill
From the Linux command line, shut down a specific mongod instance using the following command:
kill <mongod process ID>
Procedure
If the mongod is the primary in a replica set, the shutdown process for this mongod instance has the
following steps:
1. Check how up-to-date the secondaries are.
2. If no secondary is within 10 seconds of the primary, mongod will return a message that it will not shut down.
You can pass the shutdown command a timeoutSecs argument to wait for a secondary to catch up.
3. If there is a secondary within 10 seconds of the primary, the primary will step down and wait for the secondary
to catch up.
4. After 60 seconds or once the secondary has caught up, the primary will shut down.
Force Replica Set Shutdown
If there is no up-to-date secondary and you want the primary to shut down, issue the
shutdown command with the force argument, as in the following mongo shell operation:
db.adminCommand({shutdown : 1, force : true})
To keep checking the secondaries for a specified number of seconds if none are immediately up-to-date, issue
shutdown with the timeoutSecs argument. If any of the secondaries catch up within the allotted time, the
primary will shut down. If no secondaries catch up, it will not shut down.
The following command issues shutdown with timeoutSecs set to 5:
db.adminCommand({shutdown : 1, timeoutSecs : 5})
Alternately you can use the timeoutSecs argument with the db.shutdownServer() method:
db.shutdownServer({timeoutSecs : 5})
On this page
Overview (page 248)
Available Procedures (page 248)
Overview
MongoDB provides two facilities to terminate running operations: maxTimeMS() and db.killOp(). Use these
operations as needed to control the behavior of operations in a MongoDB deployment.
Available Procedures
Terminate a Query
From the mongo shell, use the following method to set a time limit of 30 milliseconds for this
query:
db.location.find( { "town": { "$regex": "(Pine Lumber)",
"$options": 'i' } } ).maxTimeMS(30)
Terminate a Command
Consider a potentially long-running operation that uses distinct to return each distinct value of the
city key in the collection named collection:
db.runCommand( { distinct: "collection",
key: "city" } )
You can add the maxTimeMS field to the command document to set a time limit of 45 milliseconds for the operation:
db.runCommand( { distinct: "collection",
key: "city",
maxTimeMS: 45 } )
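When several commands need the same limit, the maxTimeMS field can be added programmatically. The following plain JavaScript is a minimal sketch. The helper name withTimeLimit is hypothetical, and rejecting non-positive values is a choice made for this example rather than a rule stated in this section.

```javascript
// Hypothetical helper: return a copy of a command document with a time limit.
// The command name must remain the first key, so the original keys are copied
// first and a new maxTimeMS key is appended after them.
function withTimeLimit(command, milliseconds) {
  if (!Number.isInteger(milliseconds) || milliseconds <= 0) {
    throw new RangeError("milliseconds must be a positive integer");
  }
  return Object.assign({}, command, { maxTimeMS: milliseconds });
}

const limited = withTimeLimit({ distinct: "collection", key: "city" }, 45);
console.log(JSON.stringify(limited));
// prints: {"distinct":"collection","key":"city","maxTimeMS":45}
```

In the mongo shell, the result can be passed straight to db.runCommand( limited ).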
killOp
The db.killOp() method interrupts a running operation at the next interrupt point. db.killOp()
identifies the target operation by operation ID.
db.killOp(<opId>)
Warning: Terminate running operations with extreme caution. Only use db.killOp() to terminate operations
initiated by clients and do not terminate internal database operations.
Related
To return a list of running operations see db.currentOp().
On this page
Profiling Levels (page 249)
Enable Database Profiling and Set the Profiling Level (page 249)
View Profiler Data (page 251)
Profiler Overhead (page 252)
Additional Resources (page 253)
The database profiler collects fine-grained data about MongoDB write operations, cursors, and database commands on
a running mongod instance. You can enable profiling on a per-database or per-instance basis. The profiling level
(page 249) is also configurable when enabling profiling.
The database profiler writes all the data it collects to the system.profile (page 300) collection, which is a capped
collection (page 228). See Database Profiler Output (page 300) for an overview of the data in the system.profile
(page 300) documents created by the profiler.
This document outlines a number of key administration options for the database profiler. For additional related
information, consider the following resources:
Database Profiler Output (page 300)
Profile Command
db.currentOp()
Profiling Levels
You can enable database profiling from the mongo shell or through a driver using the profile command. This
section will describe how to do so from the mongo shell. See your driver documentation if you want to
control the profiler from within your application.
When you enable profiling, you also set the profiling level (page 249). The profiler records data in the
system.profile (page 300) collection. MongoDB creates the system.profile (page 300) collection in a
database after you enable profiling for that database.
To enable profiling and set the profiling level, use the db.setProfilingLevel() helper in the mongo shell,
passing the profiling level as a parameter. For example, to enable profiling for all database operations, consider the
following operation in the mongo shell:
db.setProfilingLevel(2)
The shell returns a document showing the previous level of profiling. The "ok" : 1 key-value pair indicates the
operation succeeded:
{ "was" : 0, "slowms" : 100, "ok" : 1 }
To verify the new setting, see the Check Profiling Level (page 250) section.
Specify the Threshold for Slow Operations
The threshold for slow operations applies to the entire mongod
instance. When you change the threshold, you change it for all databases on the instance.
Important: Changing the slow operation threshold for the database profiler also affects the profiling subsystem's
slow operation threshold for the entire mongod instance. Always set the threshold to the highest useful value.
By default the slow operation threshold is 100 milliseconds. Databases with a profiling level of 1 will log operations
slower than 100 milliseconds.
To change the threshold, pass two parameters to the db.setProfilingLevel() helper in the mongo shell. The
first parameter sets the profiling level for the current database, and the second sets the default slow operation threshold
for the entire mongod instance.
For example, the following command sets the profiling level for the current database to 0, which disables profiling,
and sets the slow-operation threshold for the mongod instance to 20 milliseconds. Any database on the instance with
a profiling level of 1 will use this threshold:
db.setProfilingLevel(0,20)
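The interaction between the profiling level and the threshold can be summarized in a few lines. The following plain JavaScript is a sketch based on the levels used in this section (0 disables profiling, 1 records slow operations, 2 records all operations). The function name isRecorded is hypothetical, and treating an operation that takes exactly the threshold as not slow is an assumption.

```javascript
// Sketch of which operations the profiler records for a given database.
//   level 0: profiling disabled
//   level 1: only operations slower than the threshold (slowms)
//   level 2: all operations
function isRecorded(level, slowms, operationMillis) {
  if (level === 2) return true;
  if (level === 1) return operationMillis > slowms;
  return false;
}

console.log(isRecorded(1, 100, 50)); // 50 ms operation, default 100 ms threshold
console.log(isRecorded(1, 20, 50));  // same operation after lowering the threshold to 20 ms
console.log(isRecorded(0, 20, 50));  // profiling disabled for this database
```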
Check Profiling Level
To view the profiling level (page 249), issue the following from the mongo shell:
db.getProfilingStatus()
Disable Profiling
To disable profiling, use the following helper in the mongo shell:
db.setProfilingLevel(0)
Enable Profiling for an Entire mongod Instance
For development purposes in testing environments, you can
enable database profiling for an entire mongod instance. The profiling level applies to all databases provided by the
mongod instance.
To enable profiling for a mongod instance, pass the following parameters to mongod at startup or within the
configuration file:
mongod --profile=1 --slowms=15
This sets the profiling level to 1, which collects profiling data for slow operations only, and defines slow operations as
those that last longer than 15 milliseconds.
See also:
mode and slowOpThresholdMs.
Database Profiling and Sharding
You cannot enable profiling on a mongos instance. To enable profiling in a
sharded cluster, you must enable profiling for each mongod instance in the cluster.
The database profiler logs information about database operations in the system.profile (page 300) collection.
To view profiling information, query the system.profile (page 300) collection. You can use $comment to add
data to the query document to make it easier to analyze data from the profiler. To view example queries, see Profiler
Overhead (page 252).
For an explanation of the output data, see Database Profiler Output (page 300).
Example Profiler Data Queries
This section displays example queries to the system.profile (page 300)
collection. For an explanation of the query output, see Database Profiler Output (page 300).
To return the most recent 10 log entries in the system.profile (page 300) collection, run a query similar to the
following:
db.system.profile.find().limit(10).sort( { ts : -1 } ).pretty()
To return all operations except command operations ($cmd), run a query similar to the following:
db.system.profile.find( { op: { $ne : 'command' } } ).pretty()
To return operations for a particular collection, run a query similar to the following. This example returns operations
in the mydb database's test collection:
db.system.profile.find( { ns : 'mydb.test' } ).pretty()
To return operations slower than 5 milliseconds, run a query similar to the following:
db.system.profile.find( { millis : { $gt : 5 } } ).pretty()
To return information from a certain time range, run a query similar to the following:
db.system.profile.find(
{
ts : {
$gt : new ISODate("2012-12-09T03:00:00Z") ,
$lt : new ISODate("2012-12-09T03:40:00Z")
}
}
).pretty()
The following example looks at the time range, suppresses the user field from the output to make it easier to read,
and sorts the results by how long each operation took to run:
db.system.profile.find(
{
ts : {
$gt : new ISODate("2011-07-12T03:00:00Z") ,
$lt : new ISODate("2011-07-12T03:40:00Z")
}
},
{ user : 0 }
).sort( { millis : -1 } )
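The time-range and duration filters shown above can be combined into a single query document. The following plain JavaScript is a minimal sketch. The helper name profileFilter and its parameters are hypothetical; ts and millis are the system.profile fields used in the examples above.

```javascript
// Hypothetical helper: build a filter for system.profile entries inside a
// time window, optionally restricted to operations slower than minMillis.
function profileFilter(start, end, minMillis) {
  const filter = { ts: { $gt: start, $lt: end } };
  if (minMillis !== undefined) {
    filter.millis = { $gt: minMillis };
  }
  return filter;
}

const filter = profileFilter(
  new Date("2012-12-09T03:00:00Z"),
  new Date("2012-12-09T03:40:00Z"),
  5
);
console.log(JSON.stringify(filter));
```

In the mongo shell, the resulting document could be used as db.system.profile.find( filter ).sort( { millis : -1 } ).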
Show the Five Most Recent Events
On a database that has profiling enabled, the show profile helper in the
mongo shell displays the 5 most recent operations that took at least 1 millisecond to execute. Issue show profile
from the mongo shell, as follows:
show profile
Profiler Overhead
When enabled, profiling has a minor effect on performance. The system.profile (page 300) collection is a
capped collection with a default size of 1 megabyte. A collection of this size can typically store several thousand
profile documents, but some applications may use more or less profiling data per operation.
Change Size of system.profile Collection on the Primary
To change the size of the system.profile
(page 300) collection, you must:
1. Disable profiling.
2. Drop the system.profile (page 300) collection.
3. Create a new system.profile (page 300) collection.
4. Re-enable profiling.
For example, to create a new system.profile (page 300) collection that's 4000000 bytes, use the following
sequence of operations in the mongo shell:
db.setProfilingLevel(0)
db.system.profile.drop()
db.createCollection( "system.profile", { capped: true, size: 4000000 } )
db.setProfilingLevel(1)
Change Size of system.profile Collection on a Secondary
To change the size of the system.profile
(page 300) collection on a secondary, you must stop the secondary, run it as a standalone, and then perform the
steps above. When done, restart the standalone as a member of the replica set. For more information, see Perform
Maintenance on Replica Set Members (page 686).
Additional Resources
On this page
Overview (page 253)
Default Log Rotation Behavior (page 253)
Log Rotation with --logRotate reopen (page 254)
Syslog Log Rotation (page 255)
Forcing a Log Rotation with SIGUSR1 (page 255)
Overview
When used with the --logpath option or systemLog.path setting, mongod and mongos instances report a
live account of all activity and operations to a log file. When reporting activity data to a log file, by default, MongoDB
only rotates logs in response to the logRotate command, or when the mongod or mongos process receives a
SIGUSR1 signal from the operating system.
MongoDB's standard log rotation approach archives the current log file and starts a new one. To do this, the mongod
or mongos instance renames the current log file by appending a UTC timestamp to the filename, in ISODate format.
It then opens a new log file, closes the old log file, and sends all new log entries to the new log file.
You can also configure MongoDB to support the Linux/Unix logrotate utility by setting systemLog.logRotate
or --logRotate to reopen. With reopen, mongod or mongos closes the log file, and then reopens a log file
with the same name, expecting that another process renamed the file prior to rotation.
Finally, you can configure mongod to send log data to the syslog using the --syslog option. In this case, you
can take advantage of alternate log rotation tools.
See also:
For information on logging, see the Process Logging (page 207) section.
By default, MongoDB uses the --logRotate rename behavior. With rename, mongod or mongos renames
the current log file by appending a UTC timestamp to the filename, opens a new log file, closes the old log file, and
sends all new log entries to the new log file.
Step 2: List the log files. In a separate terminal, list the matching files:
ls /var/log/mongodb/server1.log*
Step 3: Rotate the log file. Rotate the log file by issuing the logRotate command from the admin database in a
mongo shell:
use admin
db.runCommand( { logRotate : 1 } )
Step 4: View the new log files. List the new log files to view the newly-created log:
ls /var/log/mongodb/server1.log*
There should be two log files listed: server1.log, which is the log file that mongod or mongos made when it
reopened the log file, and server1.log.<timestamp>, the renamed original log file.
Rotating log files does not modify the old rotated log files. When you rotate a log, you rename the server1.log
file to include the timestamp, and a new, empty server1.log file receives all new log input.
Step 2: List the log files. In a separate terminal, list the matching files:
ls /var/log/mongodb/server1.log*
Step 3: Rotate the log file. Rotate the log file by issuing the logRotate command from the admin database in a
mongo shell:
use admin
db.runCommand( { logRotate : 1 } )
You should rename the log file using an external process, following the typical Linux/Unix log rotate behavior.
Do not include --logpath. Since --syslog tells mongod to send log data to the syslog, specifying a
--logpath will cause an error.
To specify the facility level used when logging messages to the syslog, use the --syslogFacility option or
systemLog.syslogFacility configuration setting.
Step 2: Rotate the log. Store and rotate the log output using your system's default log rotation mechanism.
For Linux and Unix-based systems, you can use the SIGUSR1 signal to rotate the logs for a single process, as in the
following:
kill -SIGUSR1 <mongod process id>
Manage Journaling
On this page
Procedures (page 255)
MongoDB uses write ahead logging to an on-disk journal to guarantee write operation (page 77) durability. The
MMAPv1 storage engine also requires the journal in order to provide crash resiliency.
The WiredTiger storage engine does not require journaling to guarantee a consistent state after a crash. The database
will be restored to the last consistent checkpoint (page 588) during recovery. However, if MongoDB exits unexpectedly
in between checkpoints, journaling is required to recover writes that occurred after the last checkpoint.
With journaling enabled, if mongod stops unexpectedly, the program can recover everything written to the journal.
MongoDB will re-apply the write operations on restart and maintain a consistent state. By default, the greatest extent
of lost writes, i.e., those not made to the journal, are those made in the last 100 milliseconds, plus the time it takes to
perform the actual journal writes. See commitIntervalMs for more information on the default.
Procedures
Enable Journaling Changed in version 2.0: For 64-bit builds of mongod, journaling is enabled by default.
To enable journaling, start mongod with the --journal command line option.
Disable Journaling
Warning: Do not disable journaling on production systems. When using the MMAPv1 storage engine without a
journal, if your mongod instance stops without shutting down cleanly for any reason (e.g. power
failure), then you must recover from an unaffected replica set member or
backup, as described in repair (page 289).
To disable journaling, start mongod with the --nojournal command line option.
Get Commit Acknowledgment You can get commit acknowledgment with the Write Concern (page 141) and the
j (page 143) option. For details, see Write Concern (page 141).
Avoid Preallocation Lag for MMAPv1 With the MMAPv1 storage engine (page 595), MongoDB may preallocate
journal files if the mongod process determines that it is more efficient to preallocate journal files than create new
journal files as needed.
Depending on your filesystem, you might experience a preallocation lag the first time you start a mongod instance
with journaling enabled. The amount of time required to pre-allocate files might last several minutes; during this
time, you will not be able to connect to the database. This is a one-time preallocation and does not occur with future
invocations.
To avoid preallocation lag (page 600), you can preallocate files in the journal directory by copying them from another
instance of mongod.
Preallocated files do not contain data. It is safe to later remove them. But if you restart mongod with journaling,
mongod will create them again.
Example
The following sequence preallocates journal files for an instance of mongod running on port 27017 with a database
path of /data/db.
For demonstration purposes, the sequence starts by creating a set of journal files in the usual way.
1. Create a temporary directory into which to create a set of journal files:
mkdir ~/tmpDbpath
2. Create a set of journal files by starting a mongod instance that uses the temporary directory:
mongod --port 10000 --dbpath ~/tmpDbpath --journal
3. When you see the following log output, indicating mongod has the files, press CONTROL+C to stop the
mongod instance:
[initandlisten] waiting for connections on port 10000
4. Preallocate journal files for the new instance of mongod by moving the journal files from the data directory of
the existing instance to the data directory of the new instance:
mv ~/tmpDbpath/journal /data/db/
Monitor Journal Status Use the following commands and methods to monitor journal status:
serverStatus
The serverStatus command returns database status information that is useful for assessing performance.
journalLatencyTest
Use journalLatencyTest to measure how long it takes on your volume to write to the disk in an append-
only fashion. You can run this command on an idle system to get a baseline sync time for journaling. You can
also run this command on a busy system to see the sync time on a busy system, which may be higher if the
journal directory is on the same volume as the data files.
The journalLatencyTest command also provides a way to check if your disk drive is buffering writes in
its local cache. If the number is very low (i.e., less than 2 milliseconds) and the drive is non-SSD, the drive
is probably buffering writes. In that case, enable cache write-through for the device in your operating system,
unless you have a disk controller card with battery backed RAM.
Change the Group Commit Interval for MMAPv1 For the MMAPv1 storage engine (page 595), you can set the
group commit interval using the --journalCommitInterval command line option. The allowed range is 2 to
300 milliseconds.
Lower values increase the durability of the journal at the expense of disk performance.
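To make the trade-off concrete, here is a small Python sketch that checks a proposed interval against the 2 to 300 millisecond range and estimates the window of writes at risk (commit interval plus journal write time, as described earlier in this section). The function names are hypothetical, and the journal write time is a figure you would measure yourself, for example with journalLatencyTest.

```python
def journal_commit_args(interval_ms):
    """Return mongod arguments for an MMAPv1 group commit interval.

    Rejects values outside the 2 to 300 millisecond range given in the text.
    """
    if not 2 <= interval_ms <= 300:
        raise ValueError("journalCommitInterval must be between 2 and 300 ms")
    return ["--journal", "--journalCommitInterval", str(interval_ms)]


def worst_case_loss_window_ms(interval_ms, journal_write_ms):
    """Upper bound on unjournaled writes: commit interval plus journal write time."""
    return interval_ms + journal_write_ms
```

With the default 100 ms interval and a measured journal write time of 15 ms, the window is 115 ms. Lowering the interval to 20 ms shrinks it to 35 ms, at the cost of more frequent journal writes.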
Recover Data After Unexpected Shutdown On a restart after a crash, MongoDB replays all journal files in the
journal directory before the server becomes available. If MongoDB must replay journal files, mongod notes these
events in the log output.
There is no reason to run repairDatabase in these situations.
Note: Do not store application logic in the database. There are performance limitations to running JavaScript inside
of MongoDB. Application code also is typically most effective when it shares version control with the application
itself.
There is a special system collection named system.js that can store JavaScript functions for reuse.
To store a function, you can use db.collection.save(), as in the following examples:
db.system.js.save(
{
_id: "echoFunction",
value : function(x) { return x; }
}
)
db.system.js.save(
{
_id : "myAddFunction" ,
value : function (x, y){ return x + y; }
}
);
The _id field holds the name of the function and is unique per database.
The value field holds the function definition.
Once you save a function in the system.js collection, you can use the function from any JavaScript context; e.g.
$where operator, mapReduce command or db.collection.mapReduce().
In the mongo shell, you can use db.loadServerScripts() to load all the scripts saved in the system.js
collection for the current database. Once loaded, you can invoke the functions directly in the shell, as in the following
example:
db.loadServerScripts();
echoFunction(3);
myAddFunction(3, 5);
On this page
Before Upgrading (page 258)
Upgrade Procedure (page 258)
Upgrade a MongoDB Instance (page 259)
Replace the Existing Binaries (page 259)
Upgrade Sharded Clusters (page 259)
Upgrade Replica Sets (page 260)
Additional Resources (page 261)
Revisions provide security patches, bug fixes, and new or changed features that do not contain any backward breaking
changes. Always upgrade to the latest revision in your release series. The third number in the MongoDB version
number (page 1061) indicates the revision.
Before Upgrading
Ensure you have an up-to-date backup of your data set. See MongoDB Backup Methods (page 200).
Consult the following documents for any special considerations or compatibility issues specific to your
MongoDB release:
The release notes, located at Release Notes (page 865).
The documentation for your driver. See Drivers91 and Driver Compatibility92 pages for more information.
If your installation includes replica sets, plan the upgrade during a predefined maintenance window.
Before you upgrade a production environment, use the procedures in this document to upgrade a staging
environment that reproduces your production environment, to ensure that your production configuration is compatible
with all changes.
Upgrade Procedure
91 https://docs.mongodb.org/ecosystem/drivers
92 https://docs.mongodb.org/ecosystem/drivers/driver-compatibility-reference
Upgrade each mongod and mongos binary separately, using the procedure described here. When upgrading a binary,
use the procedure Upgrade a MongoDB Instance (page 259).
Follow this upgrade procedure:
1. For deployments that use authentication, first upgrade all of your MongoDB drivers. To upgrade, see the
documentation for your driver as well as the Driver Compatibility93 page.
2. Upgrade sharded clusters, as described in Upgrade Sharded Clusters (page 259).
3. Upgrade any standalone instances. See Upgrade a MongoDB Instance (page 259).
4. Upgrade any replica sets that are not part of a sharded cluster, as described in Upgrade Replica Sets (page 260).
This section describes how to upgrade MongoDB by replacing the existing binaries. The preferred approach to an
upgrade is to use the operating system's package management tool and the official MongoDB packages, as described
in Install MongoDB (page 5).
To upgrade a mongod or mongos instance by replacing the existing binaries:
1. Download the binaries for the latest MongoDB revision from the MongoDB Download Page94 and store the
binaries in a temporary location. The binaries download as compressed files that uncompress to the directory
structure used by the MongoDB installation.
2. Shut down the instance.
3. Replace the existing MongoDB binaries with the downloaded binaries.
4. Restart the instance.
3. Upgrade each mongod config server (page 734) individually starting with the last config server listed in your
mongos --configdb string and working backward. To keep the cluster online, make sure at least one config
server is always running. For each config server upgrade, follow the instructions below in Upgrade a MongoDB
Instance (page 259).
Example
Given the following config string:
mongos --configdb cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019
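Working backward through a --configdb string like the one above gives the config server upgrade order. A minimal Python sketch, assuming the plain comma-separated host list shown here (the function name is hypothetical):

```python
def config_server_upgrade_order(configdb):
    """Return config server hosts in upgrade order: last listed first."""
    hosts = [h.strip() for h in configdb.split(",") if h.strip()]
    return list(reversed(hosts))


order = config_server_upgrade_order(
    "cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019"
)
# order starts with cfg2.example.net:27019 and ends with cfg0.example.net:27019
```

Upgrading one host at a time in this order keeps the other config servers running throughout.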
To upgrade a replica set, upgrade each member individually, starting with the secondaries and finishing with the
primary. Plan the upgrade during a predefined maintenance window.
Note: Stepping down the primary is preferable to directly shutting down the primary. Stepping down expedites
the failover procedure.
2. Once the primary has stepped down, call the rs.status() method from the mongo shell until you see that
another member has assumed the PRIMARY state.
3. Shut down the original primary and upgrade its instance by following the instructions below in Upgrade a
MongoDB Instance (page 259).
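The ordering rule for replica sets (secondaries first, primary last) can be sketched in Python. The input mirrors the name and stateStr fields of the members array reported by rs.status(). The function name is hypothetical, and the sketch treats every non-primary member the same way, which is a simplification.

```python
def replica_set_upgrade_order(members):
    """Order member names so that every non-primary member precedes the primary.

    `members` is a list of dicts with "name" and "stateStr" keys.
    """
    others = [m["name"] for m in members if m["stateStr"] != "PRIMARY"]
    primary = [m["name"] for m in members if m["stateStr"] == "PRIMARY"]
    return others + primary
```

The primary still needs to be stepped down before it is shut down. The sketch only decides the order in which to visit members.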
Additional Resources
On this page
Overview (page 261)
Considerations (page 261)
Configuration Files (page 261)
Procedure (page 262)
Optional: Run MongoDB as SNMP Master (page 262)
Enterprise Feature
SNMP is only available in MongoDB Enterprise96 .
Overview
MongoDB Enterprise can provide database metrics via SNMP, in support of centralized data collection and
aggregation. This procedure explains the setup and configuration of a mongod instance as an SNMP subagent, as well as
initializing and testing of SNMP support with MongoDB Enterprise.
See also:
Troubleshoot SNMP (page 265) and Monitor MongoDB Windows with SNMP (page 263) for complete instructions on
using MongoDB with SNMP on Windows systems.
Considerations
Only mongod instances provide SNMP support. mongos and the other MongoDB binaries do not support SNMP.
Configuration Files
MONGOD-MIB.txt:
The management information base (MIB) file that defines MongoDB's SNMP output.
mongod.conf.subagent:
The configuration file to run mongod as the SNMP subagent. This file sets SNMP run-time configuration
options, including the AgentX socket to connect to the SNMP master.
mongod.conf.master:
The configuration file to run mongod as the SNMP master. This file sets SNMP run-time configuration options.
Procedure
Step 1: Copy configuration files. Use the following sequence of commands to move the SNMP configuration files
to the SNMP service configuration directory.
First, create the SNMP configuration directory if needed and then, from the installation directory, copy the
configuration files to the SNMP service configuration directory:
mkdir -p /etc/snmp/
cp MONGOD-MIB.txt /usr/share/snmp/mibs/MONGOD-MIB.txt
cp mongod.conf.subagent /etc/snmp/mongod.conf
The configuration filename is tool-dependent. For example, when using net-snmp the configuration file is
snmpd.conf.
By default, SNMP uses UNIX domain sockets for communication between the agent (i.e. snmpd or the master) and sub-agent
(i.e. MongoDB).
Ensure that the agentXAddress specified in the SNMP configuration file for MongoDB matches the
agentXAddress in the SNMP master configuration file.
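One way to check that the two files agree is to extract the agentXAddress value from each and compare them. The Python sketch below is illustrative only: it assumes a simple "directive value" line format with # comments, and the function names are hypothetical.

```python
def agentx_address(config_text):
    """Return the value of the first agentXAddress line, or None if absent."""
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and padding
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].lower() == "agentxaddress":
            return parts[1].strip()
    return None


def agentx_addresses_match(subagent_text, master_text):
    """True when both files define the same agentXAddress."""
    sub = agentx_address(subagent_text)
    return sub is not None and sub == agentx_address(master_text)
```

Read each configuration file into a string and pass both to agentx_addresses_match. A False result points to a mismatch or a missing directive.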
Step 2: Start MongoDB. Start mongod with the snmp-subagent option to send data to the SNMP master.
mongod --snmp-subagent
Step 3: Confirm SNMP data retrieval. Use snmpwalk to collect data from mongod:
Connect an SNMP client to verify the ability to collect SNMP data from MongoDB.
Install the net-snmp97 package to access the snmpwalk client. net-snmp provides the snmpwalk SNMP client.
snmpwalk -m /usr/share/snmp/mibs/MONGOD-MIB.txt -v 2c -c mongodb 127.0.0.1:<port> 1.3.6.1.4.1.34601
<port> refers to the port defined by the SNMP master, not the primary port used by mongod for client communi-
cation.
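If you script this check, it can help to build the snmpwalk argument list in one place so the SNMP master port is passed explicitly. A minimal Python sketch using the values from the command above (the function name and the example port are hypothetical):

```python
def snmpwalk_args(port, mib="/usr/share/snmp/mibs/MONGOD-MIB.txt",
                  host="127.0.0.1", community="mongodb"):
    """Build the snmpwalk command line shown above as an argument list."""
    return ["snmpwalk", "-m", mib, "-v", "2c", "-c", community,
            "{}:{}".format(host, port), "1.3.6.1.4.1.34601"]


args = snmpwalk_args(1161)  # 1161 stands in for your SNMP master port
```

The resulting list can be handed to subprocess.run on a host where net-snmp is installed.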
You can run mongod with the snmp-master option for testing purposes. To do this, use the SNMP master
configuration file instead of the subagent configuration file. From the directory containing the unpacked MongoDB installation
files:
97 http://www.net-snmp.org/
cp mongod.conf.master /etc/snmp/mongod.conf
On this page
Overview (page 263)
Considerations (page 263)
Configuration Files (page 263)
Procedure (page 264)
Optional: Run MongoDB as SNMP Master (page 264)
Enterprise Feature
SNMP is only available in MongoDB Enterprise98 .
Overview
MongoDB Enterprise can provide database metrics via SNMP, in support of centralized data collection and
aggregation. This procedure explains the setup and configuration of a mongod.exe instance as an SNMP subagent, as well
as initializing and testing of SNMP support with MongoDB Enterprise.
See also:
Monitor MongoDB With SNMP on Linux (page 261) and Troubleshoot SNMP (page 265) for more information.
Considerations
Only mongod.exe instances provide SNMP support. mongos.exe and the other MongoDB binaries do not support
SNMP.
Configuration Files
mongod.conf.master:
The configuration file to run mongod.exe as the SNMP master. This file sets SNMP run-time configuration
options.
Procedure
Step 1: Copy configuration files. Use the following sequence of commands to move the SNMP configuration files
to the SNMP service configuration directory.
First, create the SNMP configuration directory if needed and then, from the installation directory, copy the
configuration files to the SNMP service configuration directory:
md C:\snmp\etc\config
copy MONGOD-MIB.txt C:\snmp\etc\config\MONGOD-MIB.txt
copy mongod.conf.subagent C:\snmp\etc\config\mongod.conf
The configuration filename is tool-dependent. For example, when using net-snmp the configuration file is
snmpd.conf.
Edit the configuration file to ensure that the communication between the agent (i.e. snmpd or the master) and
sub-agent (i.e. MongoDB) uses TCP.
Ensure that the agentXAddress specified in the SNMP configuration file for MongoDB matches the
agentXAddress in the SNMP master configuration file.
Step 2: Start MongoDB. Start mongod.exe with the snmp-subagent option to send data to the SNMP master.
mongod.exe --snmp-subagent
Step 3: Confirm SNMP data retrieval. Use snmpwalk to collect data from mongod.exe:
Connect an SNMP client to verify the ability to collect SNMP data from MongoDB.
Install the net-snmp99 package to access the snmpwalk client. net-snmp provides the snmpwalk SNMP client.
snmpwalk -m C:\snmp\etc\config\MONGOD-MIB.txt -v 2c -c mongodb 127.0.0.1:<port> 1.3.6.1.4.1.34601
<port> refers to the port defined by the SNMP master, not the primary port used by mongod.exe for client
communication.
You can run mongod.exe with the snmp-master option for testing purposes. To do this, use the SNMP master
configuration file instead of the subagent configuration file. From the directory containing the unpacked MongoDB
installation files:
copy mongod.conf.master C:\snmp\etc\config\mongod.conf
Troubleshoot SNMP
On this page
Overview (page 265)
Issues (page 265)
Enterprise Feature
SNMP is only available in MongoDB Enterprise.
Overview
MongoDB Enterprise can provide database metrics via SNMP, in support of centralized data collection and
aggregation. This document identifies common problems you may encounter when deploying MongoDB Enterprise with
SNMP as well as possible solutions for these issues.
See Monitor MongoDB With SNMP on Linux (page 261) and Monitor MongoDB Windows with SNMP (page 263) for
complete installation instructions.
Issues
AgentX is the SNMP agent extensibility protocol defined in Internet RFC 2741100 . It explains how to define additional
data to monitor over SNMP. When MongoDB fails to connect to the agentx master agent, use the following procedure
to ensure that the SNMP subagent can connect properly to the SNMP master.
1. Make sure the master agent is running.
2. Compare the SNMP master's configuration file with the subagent configuration file. Ensure that the agentx
socket definition is the same between the two.
3. Check the SNMP configuration files to see if they specify using UNIX Domain Sockets. If so, confirm that the
mongod has appropriate permissions to open a UNIX domain socket.
Error Parsing Command Line One of the following errors at the command line:
Error parsing command line: unknown option snmp-master
try 'mongod --help' for more information
mongod binaries that are not part of the Enterprise Edition produce this error. Install the Enterprise Edition (page 33)
and attempt to start mongod again.
Other MongoDB binaries, including mongos, will produce this error if you attempt to start them with snmp-master
or snmp-subagent. Only mongod supports SNMP.
100 http://www.ietf.org/rfc/rfc2741.txt
Error Starting SNMPAgent The following line in the log file indicates that mongod cannot read the
mongod.conf file:
[SNMPAgent] warning: error starting SNMPAgent as master err:1
If running on Linux, ensure mongod.conf exists in the /etc/snmp directory, and ensure that the mongod UNIX
user has permission to read the mongod.conf file.
If running on Windows, ensure mongod.conf exists in C:\snmp\etc\config.
The following tutorials describe backup and restoration for a mongod instance:
Backup and Restore with Filesystem Snapshots (page 266) An outline of procedures for creating MongoDB data set
backups using a system-level file snapshot tool, such as LVM or native storage appliance tools.
Restore a Replica Set from MongoDB Backups (page 270) Describes procedure for restoring a replica set from an
archived backup such as a mongodump or MongoDB Cloud Manager101 Backup file.
Back Up and Restore with MongoDB Tools (page 272) Describes a procedure for exporting the contents of a
database to either a binary dump or a textual exchange format, and for importing these files into a database.
Backup and Restore Sharded Clusters (page 277) Detailed procedures and considerations for backing up sharded
clusters and single shards.
Recover Data after an Unexpected Shutdown (page 289) Recover data from MongoDB data files that were not
properly closed or have an invalid state.
On this page
Snapshots Overview (page 267)
Back up and Restore Using LVM on Linux (page 268)
Back up Instances with Journal Files on Separate Volume or without Journaling (page 270)
Additional Resources (page 270)
This document describes a procedure for creating backups of MongoDB systems using system-level tools, such as
LVM or storage appliance, as well as the corresponding restoration strategies.
These filesystem snapshots, or block-level backup methods, use system level tools to create copies of the device
that holds MongoDB's data files. These methods complete quickly and work reliably, but require additional system
configuration outside of MongoDB.
Changed in version 3.2: Starting in MongoDB 3.2, the data files as well as the journal files can reside on separate
volumes to create volume-level backup of MongoDB instances using the WiredTiger (page 587) storage engine. With
previous versions, for the purpose of volume-level backup of MongoDB instances using WiredTiger, the data files and
the journal must reside on a single volume.
See also:
MongoDB Backup Methods (page 200) and Back Up and Restore with MongoDB Tools (page 272).
101 https://cloud.mongodb.com/?jmp=docs
Snapshots Overview
Snapshots work by creating pointers between the live data and a special snapshot volume. These pointers are
theoretically equivalent to hard links. As the working data diverges from the snapshot, the snapshot process uses a
copy-on-write strategy. As a result, the snapshot only stores modified data.
After making the snapshot, you mount the snapshot image on your file system and copy data from the snapshot. The
resulting backup contains a full copy of all data.
Considerations
Valid Database at the Time of Snapshot The database must be valid when the snapshot takes place. This means
that all writes accepted by the database need to be fully written to disk: either to the journal or to data files.
If all writes are not on disk when the backup occurs, the backup will not reflect these changes.
For the MMAPv1 storage engine (page 595), if writes are in progress when the backup occurs, the data files will reflect
an inconsistent state. With journaling (page 599), all data-file states resulting from in-progress writes are recoverable;
without journaling, you must flush all pending writes to disk before running the backup operation and must ensure
that no writes occur during the entire backup procedure. If you do use journaling, the journal must reside on the same
volume as the data.
For the WiredTiger storage engine (page 587), the data files reflect a consistent state as of the last checkpoint
(page 588), which occurs with every 2 GB of data or every minute.
Entire Disk Image Snapshots create an image of an entire disk. Unless you need to back up your entire
system, consider isolating your MongoDB data files, journal (if applicable), and configuration on one logical disk that
doesn't contain any other data.
Alternately, store all MongoDB data files on a dedicated device so that you can make backups without duplicating
extraneous data.
Site Failure Precaution Ensure that you copy data from snapshots onto other systems. This ensures that data is safe
from site failures.
No Incremental Backups This tutorial does not include procedures for incremental backups. Although different
snapshot methods provide different capabilities, the LVM method outlined below does not provide any capacity for
capturing incremental backups.
Snapshots With Journaling If your mongod instance has journaling enabled, then you can use any kind of file
system or volume/block level snapshot tool to create backups.
If you manage your own infrastructure on a Linux-based system, configure your system with LVM to provide your disk
packages and provide snapshot capability. You can also use LVM-based setups within a cloud/virtualized environment.
Note: Running LVM provides additional flexibility and enables the possibility of using snapshots to back up
MongoDB.
Snapshots with Amazon EBS in a RAID 10 Configuration If your deployment depends on Amazon's Elastic
Block Storage (EBS) with RAID configured within your instance, it is impossible to get a consistent state across all
disks using the platform's snapshot tool. As an alternative, you can do one of the following:
Flush all writes to disk and create a write lock to ensure consistent state during the backup process.
If you choose this option see Back up Instances with Journal Files on Separate Volume or without Journaling
(page 270).
Configure LVM to run and hold your MongoDB data files on top of the RAID within your system.
If you choose this option, perform the LVM backup operation described in Create a Snapshot (page 268).
This section provides an overview of a simple backup process using LVM on a Linux system. While the tools,
commands, and paths may be (slightly) different on your system, the following steps provide a high level overview of the
backup operation.
Note: Only use the following procedure as a guideline for a backup system and infrastructure. Production backup
systems must consider a number of application specific requirements and factors unique to specific environments.
Create a Snapshot Changed in version 3.2: Starting in MongoDB 3.2, for the purpose of volume-level backup
of MongoDB instances using WiredTiger, the data files and the journal are no longer required to reside on a single
volume.
To create a snapshot with LVM, issue a command as root in the following format:
lvcreate --size 100M --snapshot --name mdb-snap01 /dev/vg0/mongodb
This command creates an LVM snapshot (with the --snapshot option) named mdb-snap01 of the mongodb
volume in the vg0 volume group.
This example creates a snapshot named mdb-snap01 located at /dev/vg0/mdb-snap01. The location and
paths to your system's volume groups and devices may vary slightly depending on your operating system's LVM
configuration.
The snapshot has a cap of 100 megabytes, because of the parameter --size 100M. This size does not
reflect the total amount of the data on the disk, but rather the quantity of differences between the current state of
/dev/vg0/mongodb and the creation of the snapshot (i.e. /dev/vg0/mdb-snap01.)
Warning: Ensure that you create snapshots with enough space to account for data growth, particularly for the
period of time that it takes to copy data out of the system or to a temporary image.
If your snapshot runs out of space, the snapshot image becomes unusable. Discard this logical volume and create
another.
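As a rough planning aid, the snapshot size can be estimated from how much data changes while the snapshot is held. The Python sketch below is a rule of thumb rather than an LVM formula. The function name, the write rate, the copy duration, and the safety factor are all assumptions you would replace with your own measurements.

```python
import math


def snapshot_size_mb(write_rate_mb_per_s, copy_seconds, safety_factor=2.0):
    """Estimate the --size value (in MB) for a copy-on-write snapshot.

    Changed data is approximated as write rate multiplied by the time the
    snapshot must stay alive, padded by a safety factor.
    """
    changed = write_rate_mb_per_s * copy_seconds
    return int(math.ceil(changed * safety_factor))
```

For example, a volume that changes at about 0.5 MB per second and takes 10 minutes to copy out would call for roughly 600 MB under these assumptions.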
The snapshot will exist when the command returns. You can restore directly from the snapshot at any time or by
creating a new logical volume and restoring from this snapshot to the alternate image.
While snapshots are great for creating high quality backups very quickly, they are not ideal as a format for storing
backup data. Snapshots typically depend and reside on the same storage infrastructure as the original disk images.
Therefore, it's crucial that you archive these snapshots and store them elsewhere.
Archive a Snapshot After creating a snapshot, mount the snapshot and copy the data to separate storage. Your
system might try to compress the backup images as you move them offline. Alternatively, take a block level copy of
the snapshot image, such as with the following procedure:
umount /dev/vg0/mdb-snap01
dd if=/dev/vg0/mdb-snap01 | gzip > mdb-snap01.gz
Warning: This command will create a large gz file in your current working directory. Make sure that you
run this command in a file system that has enough free space.
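The dd and gzip pipeline above amounts to streaming the unmounted snapshot device into a compressed file. A minimal Python sketch of the same idea, using only the standard library (the function name is hypothetical, and reading a real device node would still require root privileges):

```python
import gzip
import shutil


def archive_image(device_path, archive_path, chunk_size=1024 * 1024):
    """Stream a block-level copy of device_path into a gzip file."""
    with open(device_path, "rb") as src, gzip.open(archive_path, "wb") as dst:
        # Copy in fixed-size chunks so the whole image never sits in memory.
        shutil.copyfileobj(src, dst, chunk_size)
```

As with the shell version, the archive can be large, so write it to a file system with enough free space.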
Restore a Snapshot To restore a snapshot created with the above method, issue the following sequence of com-
mands:
lvcreate --size 1G --name mdb-new vg0
gzip -d -c mdb-snap01.gz | dd of=/dev/vg0/mdb-new
mount /dev/vg0/mdb-new /srv/mongodb
Warning: This volume will have a maximum size of 1 gigabyte. The original file system must have had a
total size of 1 gigabyte or smaller, or else the restoration will fail.
Change 1G to your desired volume size.
Uncompresses and unarchives the mdb-snap01.gz into the mdb-new disk image.
Mounts the mdb-new disk image to the /srv/mongodb directory. Modify the mount point to correspond to
your MongoDB data file location, or other location as needed.
Note: The restored snapshot will have a stale mongod.lock file. If you do not remove this file from the
snapshot, MongoDB may assume that the stale lock file indicates an unclean shutdown. If you're running with
storage.journal.enabled enabled, and you do not use db.fsyncLock(), you do not need to remove
the mongod.lock file. If you use db.fsyncLock() you will need to remove the lock.
Restore Directly from a Snapshot To restore a backup without writing to a compressed gz file, use the following
sequence of commands:
umount /dev/vg0/mdb-snap01
lvcreate --size 1G --name mdb-new vg0
dd if=/dev/vg0/mdb-snap01 of=/dev/vg0/mdb-new
mount /dev/vg0/mdb-new /srv/mongodb
Remote Backup Storage You can implement off-system backups using the combined process (page 269) and SSH.
This sequence is identical to procedures explained above, except that it archives and compresses the backup on a
remote system using SSH.
Consider the following procedure:
umount /dev/vg0/mdb-snap01
dd if=/dev/vg0/mdb-snap01 | ssh username@example.com "gzip > /opt/backup/mdb-snap01.gz"
lvcreate --size 1G --name mdb-new vg0
ssh username@example.com gzip -d -c /opt/backup/mdb-snap01.gz | dd of=/dev/vg0/mdb-new
mount /dev/vg0/mdb-new /srv/mongodb
Changed in version 3.2: Starting in MongoDB 3.2, for the purpose of volume-level backup of MongoDB instances
using WiredTiger, the data files and the journal are no longer required to reside on a single volume.
If your mongod instance is either running without journaling or has the journal files on a separate volume, you must
flush all writes to disk and lock the database to prevent writes during the backup process. If you have a replica set
configuration, then for your backup use a secondary which is not receiving reads (i.e. a hidden member).
Important: In the following procedure to create backups, you must issue the db.fsyncLock() and
db.fsyncUnlock() operations on the same connection. The client that issues db.fsyncLock() is solely
responsible for issuing a db.fsyncUnlock() operation and must be able to handle potential error conditions so that
it can perform the db.fsyncUnlock() before terminating the connection.
Step 1: Flush writes to disk and lock the database to prevent further writes. To flush writes to disk and to lock
the database, issue the db.fsyncLock() method in the mongo shell:
db.fsyncLock();
Step 3: After the snapshot completes, unlock the database. To unlock the database after the snapshot has
completed, use the following command in the mongo shell:
db.fsyncUnlock();
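When a script drives this procedure, the requirement that the locking client always issues the unlock maps naturally onto a try/finally block. The Python sketch below is one possible shape, not a driver API: run_admin_command is a hypothetical callable that sends a single command document to the admin database over one fixed connection, and the documents are the fsync and fsyncUnlock database commands.

```python
from contextlib import contextmanager


@contextmanager
def fsync_locked(run_admin_command):
    """Hold the fsync lock for the duration of a with-block.

    run_admin_command is a hypothetical callable bound to a single connection.
    """
    run_admin_command({"fsync": 1, "lock": True})
    try:
        yield
    finally:
        # Runs even if the snapshot step raises, so the lock is not left behind.
        run_admin_command({"fsyncUnlock": 1})
```

The snapshot command would run inside the with-block. Any exception it raises still propagates to the caller, but only after the unlock has been sent on the same connection.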
Additional Resources
See also MongoDB Cloud Manager102 for seamless automation, backup, and monitoring.
On this page
Restore Database into a Single Node Replica Set (page 271)
Add Members to the Replica Set (page 271)
This procedure outlines the process for taking MongoDB data and restoring that data into a new replica set. Use this
approach for seeding test deployments from production backups as well as part of disaster recovery.
102 https://cloud.mongodb.com/?jmp=docs
You cannot restore a single data set to three new mongod instances and then create a replica set. In this situation
MongoDB will force the secondaries to perform an initial sync. The procedures in this document describe the correct
and efficient ways to deploy a replica set.
You can also use mongorestore to restore database files using data created with mongodump. See Back Up and
Restore with MongoDB Tools (page 272) for more information.
Step 1: Obtain backup MongoDB Database files. The backup files may come from a file system snapshot
(page 266). The MongoDB Cloud Manager103 produces MongoDB database files for stored snapshots104 and point in
time snapshots105 . For Ops Manager, an on-premise solution available in MongoDB Enterprise Advanced106 , see also
the Ops Manager Backup overview107 .
Step 2: Start a mongod using data files from the backup as the data path. Start a mongod instance for a new
single-node replica set. Specify the path to the backup data files with --dbpath option and the replica set name with
the --replSet option. For config server replica set (CSRS) (page 735), include the --configsvr option.
mongod --dbpath /data/db --replSet <replName>
Step 3: Connect a mongo shell to the mongod instance. For example, to connect to a mongod running on
localhost on the default port of 27017, simply issue:
mongo
Step 4: Initiate the new replica set. Use rs.initiate() on one and only one member of the replica set:
rs.initiate()
MongoDB initiates a set that consists of the current member and that uses the default replica set configuration.
MongoDB provides two options for restoring secondary members of a replica set:
Manually copy the database files to each data directory.
Allow initial sync (page 648) to distribute data automatically.
The following sections outline both approaches.
Note: If your database is large, initial sync can take a long time to complete. For large databases, it might be
preferable to copy the database files onto each host.
Copy Database Files and Restart mongod Instance Use the following sequence of operations to seed additional
members of the replica set with the restored data by copying MongoDB data files directly.
103 https://cloud.mongodb.com/?jmp=docs
104 https://docs.cloud.mongodb.com/tutorial/restore-from-snapshot/
105 https://docs.cloud.mongodb.com/tutorial/restore-from-point-in-time-snapshot/
106 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
107 https://docs.opsmanager.mongodb.com/current/core/backup-overview
Step 1: Shut down the mongod instance that you restored. Use --shutdown or db.shutdownServer()
to ensure a clean shut down.
Step 2: Copy the primary's data directory to each secondary. Copy the primary's data directory into the dbPath
of the other members of the replica set. The dbPath is /data/db by default.
Step 4: Add the secondaries to the replica set. In a mongo shell connected to the primary, add the secondaries to
the replica set using rs.add(). See Deploy a Replica Set (page 657) for more information about deploying a replica
set.
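As a rough illustration of Step 4, the helper below calls rs.add() once per secondary. The function name addSecondaries and the host names in the usage comment are hypothetical; rs is assumed to be the replica set helper object of a mongo shell connected to the primary.

```javascript
// Hypothetical helper: adds each host to the replica set and returns
// the result of every rs.add() call so the caller can inspect them.
function addSecondaries(rs, hosts) {
  return hosts.map(function (host) {
    return rs.add(host);
  });
}

// Example call in a mongo shell connected to the primary (host names are placeholders):
// addSecondaries(rs, ["mongodb2.example.net:27017", "mongodb3.example.net:27017"]);
```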
Update Secondaries using Initial Sync Use the following sequence of operations to seed additional members of
the replica set with the restored data using the default initial sync operation.
Step 1: Ensure that the data directories on the prospective replica set members are empty.
Step 2: Add each prospective member to the replica set. When you add a member to the replica set, Initial Sync
(page 648) copies the data from the primary to the new member.
On this page
Binary BSON Dumps (page 273)
Human Intelligible Import/Export Formats (page 275)
This document describes the process for creating backups and restoring data using the utilities provided with
MongoDB.
Because all of these tools primarily operate by interacting with a running mongod instance, they can impact the
performance of your running database.
Not only do they create traffic for a running database instance, they also force the database to read all data through
memory. When MongoDB reads infrequently used data, it can supplant more frequently accessed data, causing a
deterioration in performance for the database's regular workload.
No matter how you decide to import or export your data, consider the following guidelines:
Label files so that you can identify the contents of the export or backup as well as the point in time the
export or backup reflects.
Do not create or apply exports if the backup process itself will have an adverse effect on a production system.
Make sure that the backups reflect a consistent data state. Export or backup processes can impact data integrity
(i.e. type fidelity) and consistency if updates continue during the backup process.
Test backups and exports by restoring and importing to ensure that the backups are useful.
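One possible way to follow the labeling guideline is to derive a name from the backup's contents and a UTC timestamp. The sketch below only illustrates such a scheme; backupLabel and the label format are assumptions, not a MongoDB convention.

```javascript
// Hypothetical naming helper: combines a description of the contents with
// the UTC time the backup reflects, in a form that is safe for file names.
function backupLabel(contents, when) {
  // toISOString() always reports UTC, e.g. "2013-10-25T08:30:00.000Z".
  var stamp = when.toISOString()
    .replace(/\.\d{3}Z$/, "Z") // drop milliseconds
    .replace(/:/g, "");        // colons are awkward in file names
  return contents + "-" + stamp;
}
```

For instance, backupLabel("users", new Date()) yields a directory or file name that records both what was exported and when.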
See also:
MongoDB Backup Methods (page 200) or MongoDB Cloud Manager Backup documentation108 for more information
on backing up MongoDB instances. Additionally, consider the following references for the MongoDB import/export
tools:
mongoimport
mongoexport
mongorestore
mongodump
The mongorestore and mongodump utilities work with BSON (page 194) data dumps, and are useful for creating
backups of small deployments. For resilient and non-disruptive backups, use a file system or block-level disk snapshot
function, such as the methods described in the MongoDB Backup Methods (page 200) document.
Use these tools for backups if other backup methods, such as the MongoDB Cloud Manager109 or file system snapshots
(page 266) are unavailable.
Exclude local Database mongodump excludes the content of the local database in its output.
Required Access To run mongodump against a MongoDB deployment that has access control (page 331) enabled,
you must have privileges that grant find (page 429) action for each database to back up. The built-in backup
(page 420) role provides the required privileges to perform backup of any and all databases.
Changed in version 3.2.1: The backup (page 420) role provides additional privileges to back up the
system.profile (page 300) collections that exist when running with database profiling (page 234). Previously,
users required an additional read access on this collection.
Basic mongodump Operations The mongodump utility backs up data by connecting to a running mongod or
mongos instance.
The utility can create a backup for an entire server, database or collection, or can use a query to back up only part of a
collection.
When you run mongodump without any arguments, the command connects to the MongoDB instance on the local
system (e.g. 127.0.0.1 or localhost) on port 27017 and creates a database backup named dump/ in the
current directory.
To back up data from a mongod or mongos instance running on the same machine and on the default port of 27017,
use the following command:
mongodump
The data format used by mongodump from version 2.2 or later is incompatible with earlier versions of mongod. Do
not use recent versions of mongodump to back up older data stores.
108 https://docs.cloud.mongodb.com/tutorial/nav/backup-use/
109 https://cloud.mongodb.com/?jmp=docs
You can also specify the --host and --port of the MongoDB instance that the mongodump should connect to.
For example:
mongodump --host mongodb.example.net --port 27017
mongodump will write BSON files that hold a copy of data accessible via the mongod listening on port 27017 of
the mongodb.example.net host. See Create Backups from Non-Local mongod Instances (page 274) for more
information.
To specify a different output directory, you can use the --out or -o option:
mongodump --out /data/backup/
To limit the amount of data included in the database dump, you can specify --db and --collection as options to
mongodump. For example:
mongodump --collection myCollection --db test
This operation creates a dump of the collection named myCollection from the database test in a dump/
subdirectory of the current working directory.
mongodump overwrites output files if they exist in the backup data folder. Before running the mongodump command
multiple times, either ensure that you no longer need the files in the output folder (the default is the dump/ folder) or
rename the folders or files.
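The options described above can be combined. The following sketch assembles a mongodump argument list from a plain object, which may help a script pass a distinct --out directory on each run so that earlier dumps are not overwritten. The helper name mongodumpArgs and the shape of the options object are assumptions for illustration; only the --host, --port, --db, --collection, and --out flags come from this section.

```javascript
// Hypothetical helper: turns an options object into mongodump arguments.
// Options that are not set are simply left out, so mongodump falls back to
// its defaults (localhost, port 27017, output in dump/).
function mongodumpArgs(opts) {
  var args = [];
  if (opts.host) { args.push("--host", opts.host); }
  if (opts.port) { args.push("--port", String(opts.port)); }
  if (opts.db) { args.push("--db", opts.db); }
  if (opts.collection) { args.push("--collection", opts.collection); }
  if (opts.out) { args.push("--out", opts.out); }
  return args;
}
```

The resulting array can be joined with spaces for display or handed to whatever process-spawning facility the calling script uses.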
Point in Time Operation Using Oplogs Use the --oplog option with mongodump to collect the oplog entries
to build a point-in-time snapshot of a database within a replica set. With --oplog, mongodump copies all the data
from the source database as well as all of the oplog entries from the beginning to the end of the backup procedure. This
operation, in conjunction with mongorestore --oplogReplay, allows you to restore a backup that reflects the
specific moment in time that corresponds to when mongodump completed creating the dump file.
Create Backups from Non-Local mongod Instances The --host and --port options for mongodump allow
you to connect to and back up from a remote host. Consider the following example:
mongodump --host mongodb1.example.net --port 3017 --username user --password pass --out /opt/backup/m
On any mongodump command you may, as above, supply username and password credentials to authenticate to the
database.
Access Control To restore data to a MongoDB deployment that has access control (page 331) enabled, the restore
(page 420) role provides access to restore any database if the backup data does not include system.profile
(page 300) collection data.
If the backup data includes system.profile (page 300) collection data and the target database does not contain
the system.profile (page 300) collection, mongorestore attempts to create the collection even though the
program does not actually restore system.profile documents. As such, the user requires additional privileges to
perform createCollection (page 430) and convertToCapped (page 432) actions on the system.profile
(page 300) collection for a database.
If running mongorestore with --oplogReplay, the restore (page 420) role alone is not sufficient. To replay
the oplog, create an additional user-defined role (page 375) that has anyAction (page 434) on anyResource
(page 429), and grant it only to users who must run mongorestore with --oplogReplay.
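A user-defined role with anyAction on anyResource could be described by a role document like the one built below. This is a sketch under stated assumptions: oplogReplayRole is a hypothetical helper and the role name in the usage comment is a placeholder. Because such a role is extremely permissive, it should be granted as narrowly as the text above advises.

```javascript
// Hypothetical helper: builds a role document that grants anyAction on
// anyResource, suitable for passing to db.createRole() in the admin database.
function oplogReplayRole(roleName) {
  return {
    role: roleName,
    privileges: [
      { resource: { anyResource: true }, actions: ["anyAction"] }
    ],
    roles: []
  };
}

// Example call in a mongo shell (role name is a placeholder):
// db.getSiblingDB("admin").createRole(oplogReplayRole("interimOplogReplayRole"));
```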
Basic mongorestore Operations The mongorestore utility restores a binary backup created by
mongodump. By default, mongorestore looks for a database backup in the dump/ directory.
The mongorestore utility restores data by connecting to a running mongod or mongos directly.
mongorestore can restore either an entire database backup or a subset of the backup.
To use mongorestore to connect to an active mongod or mongos, use a command with the following prototype
form:
mongorestore --port <port number> <path to the backup>
For example, if <path to the backup> is dump-2013-10-25/, mongorestore imports the database backup in the
dump-2013-10-25 directory to the mongod instance listening on the given port of the localhost interface.
Restore Point in Time Oplog Backup If you created your database dump using the --oplog option to ensure a
point-in-time snapshot, call mongorestore with the --oplogReplay option, as in the following example:
mongorestore --oplogReplay
You may also consider using the mongorestore --objcheck option to check the integrity of objects while
inserting them into the database, or you may consider the mongorestore --drop option to drop each collection
from the database before restoring from backups.
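Because --oplog and --oplogReplay are meant to be used together, a script might generate both command lines from the same backup directory. The following is a sketch under that assumption; pointInTimeCommands is a hypothetical helper that only builds argument arrays and does not run anything.

```javascript
// Hypothetical helper: pairs a mongodump --oplog invocation with the
// matching mongorestore --oplogReplay invocation for one backup directory.
function pointInTimeCommands(backupDir) {
  return {
    dump: ["mongodump", "--oplog", "--out", backupDir],
    restore: ["mongorestore", "--oplogReplay", backupDir]
  };
}
```

Keeping the two commands side by side makes it less likely that a dump taken with --oplog is later restored without replaying the captured oplog entries.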
Restore Backups to Non-Local mongod Instances By default, mongorestore connects to a MongoDB instance
running on the localhost interface (e.g. 127.0.0.1) and on the default port (27017). If you want to restore to a
different host or port, use the --host and --port options.
Consider the following example:
mongorestore --host mongodb1.example.net --port 3017 --username user --password pass /opt/backup/mong
As above, you may specify username and password credentials if your mongod requires authentication.
MongoDB's mongoimport and mongoexport tools allow you to work with your data in a human-readable
Extended JSON or CSV format. This is useful for simple ingestion to or from a third-party system, and when
you want to back up or export a small subset of your data. For more complex data migration tasks, you may want to
write your own import and export scripts using a client driver to interact with the database.
The examples in this section use the MongoDB tools mongoimport and mongoexport. These tools may also be
useful for importing data into a MongoDB database from third party applications.
If you want to simply copy a database or collection from one instance to another, consider using the copydb,
clone, or cloneCollection commands, which may be more suited to this task. The mongo shell provides
the db.copyDatabase() method.
Warning: Avoid using mongoimport and mongoexport for full instance production backups. They do not
reliably preserve all rich BSON data types, because JSON can only represent a subset of the types supported by
BSON. Use mongodump and mongorestore as described in MongoDB Backup Methods (page 200) for this
kind of functionality.
Export in CSV Format Changed in version 3.0.0: mongoexport removed the --csv option. Use the
--type=csv option to specify CSV format for the output.
In the following example, mongoexport exports data from the contacts collection in the users
database in CSV format to the file /opt/backups/contacts.csv.
The mongod instance that mongoexport connects to is running on the localhost port number 27017.
When you export in CSV format, you must specify the fields in the documents to export. The operation specifies the
name and address fields to export.
mongoexport --db users --collection contacts --type=csv --fields name,address --out /opt/backups/cont
For CSV exports only, you can also specify the fields in a file containing the line-separated list of fields to export. The
file must have only one field per line.
For example, you can specify the name and address fields in a file fields.txt:
name
address
Then, using the --fieldFile option, specify the fields to export with the file:
mongoexport --db users --collection contacts --type=csv --fieldFile fields.txt --out /opt/backups/con
Changed in version 3.0.0: mongoexport removed the --csv option and replaced it with the --type option.
Export in JSON Format This example creates an export of the contacts collection from the MongoDB instance
running on the localhost port number 27017. This writes the export to the contacts.json file in JSON format.
mongoexport --db sales --collection contacts --out contacts.json
Export from Remote Host Running with Authentication The following example exports the contacts
collection from the marketing database, which requires authentication.
This data resides on the MongoDB instance located on the host mongodb1.example.net running on port 37017,
which requires the username user and the password pass.
mongoexport --host mongodb1.example.net --port 37017 --username user --password pass --collection con
Export Query Results You can export only the results of a query by supplying a query filter with the --query
option, and limit the results to a single database using the --db option.
For instance, this command returns all documents in the sales database's contacts collection that contain a field
named field with a value of 1.
mongoexport --db sales --collection contacts --query '{"field": 1}'
You must enclose the query in single quotes (e.g. ') to ensure that it does not interact with your shell environment.
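When a script generates the --query value, the quoting can be done programmatically. The sketch below assumes a POSIX-style shell, where a single quote inside a single-quoted string is written as '\''; shellSingleQuote and queryArgument are hypothetical helper names.

```javascript
// Wraps text in single quotes for a POSIX-style shell. An embedded single
// quote is emitted as '\'' (close quote, escaped quote, reopen quote).
function shellSingleQuote(text) {
  return "'" + text.replace(/'/g, "'\\''") + "'";
}

// Serializes a query filter object and quotes it for use after --query.
function queryArgument(filter) {
  return shellSingleQuote(JSON.stringify(filter));
}
```

For the filter used above, queryArgument({ field: 1 }) produces a quoted JSON string that the shell passes to mongoexport unchanged.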
Simple Usage mongoimport restores a database from a backup taken with mongoexport. Most of the
arguments to mongoexport also exist for mongoimport.
In the following example, mongoimport imports the JSON data from the contacts.json file into
the collection contacts in the users database.
mongoimport --db users --collection contacts --file contacts.json
Import JSON to Remote Host Running with Authentication In the following example, mongoimport imports
data from the file /opt/backups/mdb1-examplenet.json into the contacts collection within the database
marketing on a remote MongoDB database with authentication enabled.
mongoimport connects to the mongod instance running on the host mongodb1.example.net over port
37017. It authenticates with the username user and the password pass.
mongoimport --host mongodb1.example.net --port 37017 --username user --password pass --collection con
CSV Import In the following example, mongoimport imports the CSV-formatted data in the
/opt/backups/contacts.csv file into the collection contacts in the users database on the
MongoDB instance running on the localhost port number 27017.
Specifying --headerline instructs mongoimport to determine the name of the fields using the first line in the
CSV file.
mongoimport --db users --collection contacts --type csv --headerline --file /opt/backups/contacts.csv
mongoimport uses the input file name, without the extension, as the collection name if -c or --collection is
unspecified. The following example is therefore equivalent:
mongoimport --db users --type csv --headerline --file /opt/backups/contacts.csv
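The default collection name described above can be approximated as follows. This is an illustrative sketch, not mongoimport's actual implementation: defaultCollectionName is a hypothetical function, and it assumes forward-slash path separators and that only the final extension is removed.

```javascript
// Hypothetical sketch: takes the last path component and removes the final
// extension, mirroring the "file name without the extension" rule.
function defaultCollectionName(filePath) {
  var base = filePath.split("/").pop();
  var dot = base.lastIndexOf(".");
  return dot === -1 ? base : base.slice(0, dot);
}
```

Applied to /opt/backups/contacts.csv, this yields contacts, which matches the collection named explicitly in the first command.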
Use the --ignoreBlanks option to ignore blank fields. For CSV and TSV imports, this option provides the
desired functionality in most cases because it avoids inserting fields with null values into your collection.
Additional Resources
The following tutorials describe backup and restoration for sharded clusters:
Backup a Small Sharded Cluster with mongodump (page 278) If your sharded cluster holds a small data set, you
can use mongodump to capture the entire backup in a reasonable amount of time.
Backup a Sharded Cluster with Filesystem Snapshots (page 279) Use file system snapshots to back up each
component in the sharded cluster individually. The procedure involves stopping the cluster balancer. If your system
configuration allows file system backups, this might be more efficient than using MongoDB tools.
Backup a Sharded Cluster with Database Dumps (page 281) Create backups using mongodump to back up each
component in the cluster individually.
Schedule Backup Window for Sharded Clusters (page 284) Limit the operation of the cluster balancer to provide a
window for regular backup operations.
Restore a Single Shard (page 285) An outline of the procedure and consideration for restoring a single shard from a
backup.
Restore a Sharded Cluster (page 285) An outline of the procedure and consideration for restoring an entire sharded
cluster from backup.
On this page
Overview (page 278)
Considerations (page 278)
Procedure (page 278)
Additional Resources (page 279)
Overview If your sharded cluster holds a small data set, you can connect to a mongos using mongodump. You can
create backups of your MongoDB cluster, if your backup infrastructure can capture the entire backup in a reasonable
amount of time and if you have a storage system that can hold the complete MongoDB data set.
See MongoDB Backup Methods (page 200) and Backup and Restore Sharded Clusters (page 277) for complete
information on backups in MongoDB and backups of sharded clusters in particular.
Considerations If you use mongodump without specifying a database or collection, mongodump will capture
collection data and the cluster meta-data from the config servers (page 734).
You cannot use the --oplog option for mongodump when capturing data from mongos. As a result, if you need
to capture a backup that reflects a single moment in time, you must stop all writes to the cluster for the duration of the
backup operation.
To run mongodump against a MongoDB deployment that has access control (page 331) enabled, you must have
privileges that grant find (page 429) action for each database to back up. The built-in backup (page 420) role
provides the required privileges to perform backup of any and all databases.
Changed in version 3.2.1: The backup (page 420) role provides additional privileges to back up the
system.profile (page 300) collections that exist when running with database profiling (page 234). Previously,
users required an additional read access on this collection.
Procedure
Capture Data You can perform a backup of a sharded cluster by connecting mongodump to a mongos. Use the
following operation at your system's prompt:
mongodump --host mongos3.example.net --port 27017
mongodump will write BSON files that hold a copy of data stored in the sharded cluster accessible via the mongos
listening on port 27017 of the mongos3.example.net host.
Restore Data Backups created with mongodump do not reflect the chunks or the distribution of data in the sharded
collection or collections. Like all mongodump output, these backups contain separate directories for each database
and BSON files for each collection in that database.
You can restore mongodump output to any MongoDB instance, including a standalone, a replica set, or a new sharded
cluster. When restoring data to a sharded cluster, you must deploy and configure sharding before restoring data from
the backup. See Deploy a Sharded Cluster (page 757) for more information.
Additional Resources See also MongoDB Cloud Manager114 for seamless automation, backup, and monitoring.
On this page
Overview (page 279)
Considerations (page 279)
Procedure (page 280)
Additional Resources (page 281)
Changed in version 3.2: Starting in MongoDB 3.2, the procedure can be used with the MMAPv1 (page 595) and
the WiredTiger (page 587) storage engines. With previous versions of MongoDB, the procedure applied to MMAPv1
(page 595) only.
Overview This document describes a procedure for taking a backup of all components of a sharded cluster. This
procedure uses file system snapshots to capture a copy of the mongod instance. An alternate procedure uses mongodump
to create binary database dumps when file-system snapshots are not available. See Backup a Sharded Cluster with
Database Dumps (page 281) for the alternate procedure.
Important: To capture a point-in-time backup from a sharded cluster you must stop all writes to the cluster. On a
running production system, you can only capture an approximation of a point-in-time snapshot.
For more information on backups in MongoDB and backups of sharded clusters in particular, see MongoDB Backup
Methods (page 200) and Backup and Restore Sharded Clusters (page 277).
Considerations
Balancer It is essential that you stop the balancer (page 750) before capturing a backup.
If the balancer is active while you capture backups, the backup artifacts may be incomplete and/or have duplicate data,
as chunks may migrate while recording backups.
114 https://cloud.mongodb.com/?jmp=docs
Precision In this procedure, you will stop the cluster balancer and take a backup of the config database, and
then take backups of each shard in the cluster using a file-system snapshot tool. If you need an exact moment-in-time
snapshot of the system, you will need to stop all application writes before taking the filesystem snapshots; otherwise
the snapshot will only approximate a moment in time.
For approximate point-in-time snapshots, you can minimize the impact on the cluster by taking the backup from a
secondary member of each replica set shard.
Consistency If the journal and data files are on the same logical volume, you can use a single point-in-time snapshot
to capture a consistent copy of the data files.
If the journal and data files are on different file systems, you must use db.fsyncLock() and
db.fsyncUnlock() to ensure that the data files do not change, providing consistency for the purposes of
creating backups.
Procedure
Step 1: Disable the balancer. To disable the balancer (page 750), connect the mongo shell to a mongos instance
and run sh.stopBalancer() in the config database.
use config
sh.stopBalancer()
For more information, see the Disable the Balancer (page 794) procedure.
Step 2: If necessary, lock one secondary member of each replica set. If your secondary does not have journaling
enabled or its journal and data files are on different volumes, you must lock the secondary's mongod instance before
capturing a backup.
If your secondary has journaling enabled and its journal and data files are on the same volume, you may skip this step.
Important: If your deployment requires this step, you must perform it on one secondary of each shard and, if
your sharded cluster uses a replica set for the config servers, one secondary of the config server replica set (CSRS)
(page 735).
Ensure that the oplog has sufficient capacity to allow these secondaries to catch up to the state of the primaries after
finishing the backup procedure. See Oplog Size (page 647) for more information.
Lock shard replica set secondary. For each shard replica set in the sharded cluster, connect a mongo shell to the
secondary member's mongod instance and run db.fsyncLock().
db.fsyncLock()
When calling db.fsyncLock(), ensure that the connection is kept open to allow a subsequent call to
db.fsyncUnlock().
Lock config server replica set secondary. Connect a mongo shell to the secondary member's mongod instance
and run db.fsyncLock().
db.fsyncLock()
When calling db.fsyncLock(), ensure that the connection is kept open to allow a subsequent call to
db.fsyncUnlock().
Step 3: Back up one of the config servers. Backing up a config server (page 734) backs up the sharded cluster's
metadata. You only need to back up one config server, as they all hold the same data. If you are using CSRS config
servers, perform this step against the locked config server.
If the sharded cluster uses CSRS Confirm that the locked secondary member of the CSRS recognizes that the
balancer is disabled. In a mongo shell connected to the secondary member's mongod instance, perform the following.
use config
rs.slaveOk()
db.settings.find({ "_id" : "balancer", "stopped" : true })
If the member recognizes that the balancer is disabled, the query should return a document. Otherwise, wait until the
query returns a document.
To confirm that the CSRS secondary member has replicated past the last completed migration, check the changelog
collection in the config database. The last logged moveChunk operation should be a commit.
use config;
db.changelog.find({what:/^moveChunk/}).sort({time:-1}).next().what
The query should return "moveChunk.commit". If not, wait until the chunk migration completes.
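The same check can be expressed over an array of changelog documents, for example the result of calling toArray() on a db.changelog.find() cursor. The sketch below is a hypothetical helper, not a MongoDB function; treating an empty set of moveChunk entries as "nothing to wait for" is an assumption.

```javascript
// Hypothetical helper: returns true when the most recent moveChunk entry
// (by the time field) is "moveChunk.commit".
function lastMigrationCommitted(changelogDocs) {
  var moves = changelogDocs.filter(function (doc) {
    return /^moveChunk/.test(doc.what);
  });
  if (moves.length === 0) {
    return true; // assumption: no logged migrations means nothing to wait for
  }
  // Newest first; subtracting Date objects compares their millisecond values.
  moves.sort(function (a, b) { return b.time - a.time; });
  return moves[0].what === "moveChunk.commit";
}
```

A backup script could poll with this helper and proceed to the snapshot only once it returns true.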
Take a file-system snapshot of the config server. To create a file-system snapshot of the config server, follow the
procedure in Create a Snapshot (page 268).
Step 4: Back up a replica set member for each shard. If you locked a member of the replica set shards, perform
this step against the locked secondary.
You may back up the shards in parallel. For each shard, create a snapshot, using the procedure in Backup and Restore
with Filesystem Snapshots (page 266).
Step 5: Unlock all locked replica set members. If you locked any mongod instances to capture the backup, unlock
them.
To unlock the replica set members, use the db.fsyncUnlock() method in the mongo shell. For each locked member,
use the same mongo shell used to lock the instance.
db.fsyncUnlock()
Step 6: Enable the balancer. To re-enable the balancer, connect the mongo shell to a mongos instance and run
sh.setBalancerState().
sh.setBalancerState(true)
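The ordering of Steps 1 through 6 can be summarized in a small driver function. This is a sketch under stated assumptions rather than a supported tool: runShardedBackup and backupFn are hypothetical names, sh is the mongo shell's sharding helper on a mongos connection, and each entry of secondaryDbs is assumed to be a database object on its own open connection to a secondary, so that lock and unlock happen on the same connection.

```javascript
// Hypothetical driver: stops the balancer, locks the chosen secondaries,
// runs the backup, then always unlocks and re-enables the balancer.
function runShardedBackup(sh, secondaryDbs, backupFn) {
  var locked = [];
  sh.stopBalancer();
  try {
    secondaryDbs.forEach(function (db) {
      db.fsyncLock();
      locked.push(db); // remember only members that were actually locked
    });
    return backupFn();
  } finally {
    var unlockErrors = [];
    locked.forEach(function (db) {
      try { db.fsyncUnlock(); } catch (e) { unlockErrors.push(e); }
    });
    sh.setBalancerState(true);
    if (unlockErrors.length > 0) { throw unlockErrors[0]; }
  }
}
```

The finally block reflects the earlier guidance that a client which locks a member remains responsible for unlocking it, even if the backup step fails.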
Additional Resources See also MongoDB Cloud Manager115 for seamless automation, backup, and monitoring.
115 https://cloud.mongodb.com/?jmp=docs
On this page
Overview (page 282)
Prerequisites (page 282)
Consideration (page 282)
Procedure (page 282)
Additional Resources (page 284)
Changed in version 3.2: Starting in MongoDB 3.2, the following procedure can be used with the MMAPv1 (page 595)
and the WiredTiger (page 587) storage engines. With previous versions of MongoDB, the procedure applied to
MMAPv1 (page 595) only.
Overview This document describes a procedure for taking a backup of all components of a sharded cluster. This
procedure uses mongodump to create dumps of the mongod instance. An alternate procedure uses file system
snapshots to capture the backup data, and may be more efficient in some situations if your system configuration
allows file system backups.
For more information on backups in MongoDB and backups of sharded clusters in particular, see MongoDB Backup
Methods (page 200) and Backup and Restore Sharded Clusters (page 277).
Prerequisites
Important: To capture a point-in-time backup from a sharded cluster you must stop all writes to the cluster. On a
running production system, you can only capture an approximation of a point-in-time snapshot.
Access Control The backup (page 420) role provides the required privileges to perform backup on a sharded
cluster that has access control enabled.
Changed in version 3.2.1: The backup (page 420) role provides additional privileges to back up the
system.profile (page 300) collections that exist when running with database profiling (page 234). Previously,
users required an additional read access on this collection.
Consideration To create these backups of a sharded cluster, you will stop the cluster balancer and take a backup of
the config database, and then take backups of each shard in the cluster using mongodump to capture the backup data.
To capture a more exact moment-in-time snapshot of the system, you will need to stop all application writes before
taking the backups; otherwise the backups will only approximate a moment in time.
For approximate point-in-time snapshots, you can minimize the impact on the cluster by taking the backup from a
secondary member of each replica set shard.
Procedure
Step 1: Disable the balancer process. To disable the balancer (page 750), connect the mongo shell to a mongos
instance and run sh.stopBalancer() in the config database.
use config
sh.stopBalancer()
For more information, see the Disable the Balancer (page 794) procedure.
Warning: If you do not stop the balancer, the backup could have duplicate data or omit data as chunks migrate
while recording backups.
Step 2: Lock one secondary member of each replica set. Lock a secondary member of each replica set in the
sharded cluster, and, if your sharded cluster uses a replica set for the config servers, one secondary of the config server
replica set (CSRS) (page 735).
Ensure that the oplog has sufficient capacity to allow these secondaries to catch up to the state of the primaries after
finishing the backup procedure. See Oplog Size (page 647) for more information.
Lock shard replica set secondary. For each shard replica set in the sharded cluster, connect a mongo shell to the
secondary member's mongod instance and run db.fsyncLock().
db.fsyncLock()
When calling db.fsyncLock(), ensure that the connection is kept open to allow a subsequent call to
db.fsyncUnlock().
Lock config server replica set secondary. If locking a secondary of the CSRS, confirm that the member recognizes
that the balancer is disabled and the last migration has finished. Connect a mongo shell to the secondary member's
mongod instance. To confirm that the member recognizes that the balancer is disabled:
use config
rs.slaveOk()
db.settings.find({ "_id" : "balancer", "stopped" : true })
If the member recognizes that the balancer is disabled, the query should return a document. Otherwise, wait until the
query returns a document.
To confirm that the CSRS secondary member has replicated past the last completed migration, check the changelog
collection in the config database. The last logged moveChunk operation should be a commit.
use config;
db.changelog.find({what:/^moveChunk/}).sort({time:-1}).next().what
The query should return "moveChunk.commit". If not, wait until the chunk migration completes.
If the secondary member recognizes that the balancer is disabled and the last migration is complete, lock the member.
db.fsyncLock()
When calling db.fsyncLock(), ensure that the connection is kept open to allow a subsequent call to
db.fsyncUnlock().
Step 3: Back up one config server. Run mongodump against a config server mongod instance to back up the
cluster's metadata. You only need to back up one config server. If you are using CSRS config servers and locked a
config server secondary in the previous step, perform this step against the locked config server.
Use mongodump with the --oplog option to back up one of the config servers (page 734).
mongodump --oplog
If your deployment uses CSRS config servers, unlock the config server node before proceeding to the next step. To
unlock the CSRS member, use the db.fsyncUnlock() method in the mongo shell used to lock the instance.
db.fsyncUnlock()
Step 4: Back up a replica set member for each shard. Back up the locked replica set members of the shards using
mongodump with the --oplog option. You may back up the shards in parallel.
mongodump --oplog
Step 5: Unlock replica set members for each shard. To unlock the replica set members, use the
db.fsyncUnlock() method in the mongo shell. For each locked member, use the same mongo shell used to
lock the instance.
db.fsyncUnlock()
Step 6: Re-enable the balancer process. To re-enable the balancer, connect the mongo shell to a mongos instance
and run sh.setBalancerState().
use config
sh.setBalancerState(true)
Additional Resources See also MongoDB Cloud Manager116 for seamless automation, backup, and monitoring.
Overview In a sharded cluster, the balancer process is responsible for distributing sharded data around the cluster,
so that each shard has roughly the same amount of data.
However, when creating backups from a sharded cluster, it is important to disable the balancer for the duration of
the backup so that no chunk migrations affect the content of the backup. Using the procedure outlined in the section
Disable the Balancer (page 794), you can manually stop the balancer process temporarily. As an alternative, you can
use this procedure to define a balancing window so that the balancer is always disabled during your automated
backup operation.
Procedure If you have an automated backup schedule, you can disable all balancing operations for a period of time.
For instance, consider the following command:
use config
db.settings.update( { _id : "balancer" }, { $set : { activeWindow : { start : "6:00", stop : "23:00" } } }, { upsert : true } )
This operation configures the balancer to run between 6:00am and 11:00pm, server time. Schedule your backup
operation to run and complete outside of this time. Ensure that the backup can complete outside the window when
the balancer is running and that the balancer can effectively balance the collection among the shards in the window
allotted to each.
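To sanity-check a backup schedule against a balancing window, it can help to test individual times of day against the window. The following is a minimal sketch in plain JavaScript: the function names are hypothetical, and the boundary handling (start inclusive, stop exclusive) is an assumption, not documented balancer behavior.

```javascript
// Convert an "H:MM" or "HH:MM" string to minutes since midnight.
function toMinutes(hhmm) {
  var parts = hhmm.split(":");
  return parseInt(parts[0], 10) * 60 + parseInt(parts[1], 10);
}

// Hypothetical helper: true when `timeOfDay` falls inside the active window.
// Handles windows that wrap past midnight (for example 23:00 to 6:00).
function balancerMayRun(activeWindow, timeOfDay) {
  var start = toMinutes(activeWindow.start);
  var stop = toMinutes(activeWindow.stop);
  var now = toMinutes(timeOfDay);
  if (start <= stop) {
    return now >= start && now < stop;
  }
  return now >= start || now < stop; // window wraps past midnight
}

var activeWindow = { start: "6:00", stop: "23:00" };
var backupAt2amIsSafe = !balancerMayRun(activeWindow, "2:00");
```

With the window from the command above, a backup that starts at 2:00 and finishes before 6:00 runs entirely outside the balancing window.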
116 https://cloud.mongodb.com/?jmp=docs
Overview Restoring a single shard from backup with other unaffected shards requires a number of special
considerations and practices. This document outlines the additional tasks you must perform when restoring a single shard.
Consider the following resources on backups in general as well as backup and restoration of sharded clusters
specifically:
Backup and Restore Sharded Clusters (page 277)
Restore a Sharded Cluster (page 285)
MongoDB Backup Methods (page 200)
Procedure Always restore sharded clusters as a whole. When you restore a single shard, keep in mind that the
balancer process might have moved chunks to or from this shard since the last backup. If that's the case, you must
manually move those chunks, as described in this procedure.
Step 1: Restore the shard as you would any other mongod instance. See MongoDB Backup Methods (page 200)
for overviews of these procedures.
Step 2: Manage the chunks. For all chunks that migrated away from this shard, you do not need to do anything at
this time. You do not need to delete these documents from the shard because mongos automatically filters the chunks
out of queries. You can remove these documents from the shard, if you like, at your leisure.
For chunks that migrated to this shard after the most recent backup, you must manually recover the chunks using
backups of other shards, or some other source. To determine what chunks have moved, view the changelog collection
in the Config Database (page 816).
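One way to read the changelog for this purpose is to keep only the committed migrations that are newer than the backup and that involve the restored shard. The following is a minimal sketch in plain JavaScript over documents already fetched from the changelog. The function name and sample data are hypothetical, and the details.from and details.to field layout is an assumption to verify against your own changelog entries.

```javascript
// Hypothetical helper: groups committed chunk migrations that happened after
// `backupTime` by whether they left or entered `shard`.
function chunksMovedSince(entries, shard, backupTime) {
  var result = { movedAway: [], movedIn: [] };
  entries.forEach(function (e) {
    if (e.what !== "moveChunk.commit" || e.time <= backupTime) {
      return; // not a committed migration, or already covered by the backup
    }
    if (e.details.from === shard) {
      result.movedAway.push(e);
    } else if (e.details.to === shard) {
      result.movedIn.push(e);
    }
  });
  return result;
}

// Hypothetical sample data for illustration only.
var backupTime = new Date("2016-03-01T00:00:00Z");
var changelog = [
  { what: "moveChunk.commit", ns: "test.c", time: new Date("2016-02-28T12:00:00Z"),
    details: { from: "shardA", to: "shardB" } },
  { what: "moveChunk.commit", ns: "test.c", time: new Date("2016-03-02T12:00:00Z"),
    details: { from: "shardB", to: "shardA" } },
  { what: "moveChunk.commit", ns: "test.c", time: new Date("2016-03-03T12:00:00Z"),
    details: { from: "shardA", to: "shardC" } }
];
var moved = chunksMovedSince(changelog, "shardA", backupTime);
```

Entries in moved.movedIn are the chunks to recover manually; entries in moved.movedAway need no immediate action.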
Overview You can restore a sharded cluster either from snapshots (page 266) or from BSON database dumps
(page 281) created by the mongodump tool. This document describes the following procedures:
Restore a Sharded Cluster with Filesystem Snapshots (page 286)
Restore a Sharded Cluster with Database Dumps (page 287)
Procedures
Restore a Sharded Cluster with Filesystem Snapshots The following procedure outlines the steps to restore a
sharded cluster from filesystem snapshots. To create filesystem snapshots of sharded clusters, see Backup a Sharded
Cluster with Filesystem Snapshots (page 279).
Step 1: Shut down the entire cluster. Stop all mongos and mongod processes, including all shards and all config
servers. To stop all members, connect to each member and issue the following operations:
use admin
db.shutdownServer()
Step 2: Restore the data files. On each server, extract the data files to the location where the mongod instance will
access them and restore the following:
Data files for each server in each shard.
For each shard replica set, restore all the members of the replica set. See Restore a Replica Set from MongoDB
Backups (page 270).
Data files for each config server.
Changed in version 3.2: If restoring to a config server replica set (CSRS) (page 735), restore the members of the
replica set. See Restore a Replica Set from MongoDB Backups (page 270).
Else, if restoring to 3 mirrored config servers, restore the files on each config server mongod instance as you
would a standalone node.
See also:
Restore a Snapshot (page 269).
Or, if restoring to three mirrored mongod instances, start exactly three mongod config server instances.
mongod --configsvr --dbpath <config dbpath> --port 27019
Step 5: If shard hostnames have changed, update the config database. If shard hostnames have changed,
connect a mongo shell to the mongos instance and update the shards (page 821) collection in the Config Database
(page 816) to reflect the new hostnames.
Step 6: Clear per-shard sharding recovery information. If the backup data was from a deployment using CSRS
(page 735), clear out the no longer applicable recovery information on each shard. For each shard:
1. Restart the replica set members for the shard with the recoverShardingState parameter set to false.
Include additional options as required for your specific configuration.
mongod --setParameter=recoverShardingState=false --replSet <replSetName>
2. Connect a mongo shell to the primary of the replica set and delete from the admin.system.version collection
the document where _id equals "minOpTimeRecovery". Use write concern "majority".
use admin
db.system.version.remove(
{ _id: "minOpTimeRecovery" },
{ writeConcern: { w: "majority" } }
)
Step 7: Restart all the shard mongod instances. Do not include the recoverShardingState parameter.
Step 9: Verify that the cluster is operational. Connect to a mongos instance from a mongo shell and use the
db.printShardingStatus() method to ensure that the cluster is operational.
db.printShardingStatus()
show collections
Restore a Sharded Cluster with Database Dumps The following procedure outlines the steps to restore a sharded
cluster from the BSON database dumps created by mongodump. For information on using mongodump to back up
sharded clusters, see Backup a Sharded Cluster with Database Dumps (page 281).
Changed in version 3.0: mongorestore requires a running MongoDB instance. Earlier versions of
mongorestore did not require a running MongoDB instance and instead used the --dbpath option. For
instructions specific to your version of mongorestore, refer to the appropriate version of the manual.
Step 1: Deploy a new replica set for each shard. For each shard, deploy a new replica set:
1. Start a new mongod for each member of the replica set. Include any other configuration as appropriate.
2. Connect a mongo shell to one of the mongod instances. In the mongo shell:
(a) Run rs.initiate().
(b) Use rs.add() to add the other members of the replica set.
For detailed instructions on deploying a replica set, see Deploy a Replica Set (page 657).
Step 2: Deploy new config servers. To deploy config servers as a replica set (CSRS), see Deploy the Config Server
Replica Set (page 758).
To deploy config servers as 3 mirrored mongod instances, see Start 3 Mirrored Config Servers (Deprecated)
(page 761).
Step 3: Start the mongos instances. Start the mongos instances, specifying the new config servers with
--configdb. Include any other configuration as appropriate.
For sharded clusters with CSRS, see Start the mongos Instances (page 759).
For sharded clusters with 3 mirrored config servers, see Start the mongos Instances (Deprecated) (page 762).
Step 4: Add shards to the cluster. Connect a mongo shell to a mongos instance. Use sh.addShard() to add
each replica set as a shard.
For detailed instructions on adding shards to the cluster, see Add Shards to the Cluster (page 759).
Step 5: Shut down the mongos instances. Once the new sharded cluster is up, shut down all mongos instances.
Step 6: Restore the shard data. For each shard, use mongorestore to restore the data dump to the primary's
data directory. Include the --drop option to drop the collections before restoring and, because the backup procedure
(page 281) included the --oplog option, include the --oplogReplay option for mongorestore.
For example, on the primary for ShardA, run mongorestore. Specify any other configuration as appropriate.
mongorestore --drop --oplogReplay /data/dump/shardA
After you have finished restoring all the shards, shut down all shard instances.
Step 9: If shard hostnames have changed, update the config database. If shard hostnames have changed,
connect a mongo shell to the mongos instance and update the shards (page 821) collection in the Config Database
(page 816) to reflect the new hostnames.
Step 10: Restart all the shard mongod instances. Do not include the recoverShardingState parameter.
Step 12: Verify that the cluster is operational. Connect to a mongos instance from a mongo shell and use the
db.printShardingStatus() method to ensure that the cluster is operational.
db.printShardingStatus()
show collections
See also:
MongoDB Backup Methods (page 200), Backup and Restore Sharded Clusters (page 277)
If MongoDB does not shut down cleanly, the on-disk representation of the data files will likely reflect an inconsistent
state which could lead to data corruption. 117
To prevent data inconsistency and corruption, always shut down the database cleanly and use durability journaling.
MongoDB writes data to the journal, by default, every 100 milliseconds, such that MongoDB can always recover to a
consistent state even in the case of an unclean shutdown due to power loss or other system failure.
If you are not running as part of a replica set and do not have journaling enabled, use the following procedure to
recover data that may be in an inconsistent state. If you are running as part of a replica set, you should always restore
from a backup or restart the mongod instance with an empty dbPath and allow MongoDB to perform an initial sync
to restore the data.
To ensure a clean shutdown, use one of the following methods:
db.shutdownServer() from the mongo shell,
Your system's init script,
Control-C when running mongod in interactive mode,
kill $(pidof mongod) or kill -2 $(pidof mongod),
On Linux, the mongod --shutdown option.
See also:
The Administration (page 199) documents, including Replica Set Syncing (page 646), and the documentation on the
--repair, repairPath, and storage.journal.enabled settings.
117 You can also use the db.collection.validate() method to test the integrity of a single collection. However, this process is time
consuming, and without journaling you can safely assume that the data is in an invalid state and you should either run the repair operation or resync
from an intact member of the replica set.
Process
Indications When you are aware of a mongod instance running without journaling that stops unexpectedly and
you're not running with replication, you should always run the repair operation before starting MongoDB again. If
you're using replication, then restore from a backup and allow replication to perform an initial sync (page 646) to
restore data.
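The guidance above amounts to a short decision rule. The following is a minimal sketch in plain JavaScript that encodes it. The function name, the input fields, and the returned labels are all hypothetical and exist only to summarize the text.

```javascript
// Hypothetical helper: picks a recovery path after an unclean shutdown.
// `state.journaling` and `state.replicaSetMember` are booleans.
function recoveryAction(state) {
  if (state.journaling) {
    return "restart"; // the journal lets mongod recover to a consistent state
  }
  if (state.replicaSetMember) {
    return "restore-or-resync"; // restore from backup or perform an initial sync
  }
  return "repair"; // standalone without journaling: run the repair operation
}

var action = recoveryAction({ journaling: false, replicaSetMember: false });
```

For a standalone instance without journaling, the sketch selects the repair path that the rest of this section describes.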
If the mongod.lock file in the data directory specified by dbPath, /data/db by default, is not a zero-byte file,
then mongod will refuse to start, and you will find a message that contains the following line in your MongoDB log
or output:
Unclean shutdown detected.
This indicates that you need to run mongod with the --repair option. If you run repair when the mongod.lock
file exists in your dbPath, or the optional --repairpath, you will see a message that contains the following line:
old lock file: /data/db/mongod.lock. probably means unclean shutdown
If you see this message, as a last resort you may remove the lockfile and run the repair operation before starting the
database normally, as in the following procedure:
There are two processes to repair data files that result from an unexpected shutdown:
Use the --repair option in conjunction with the --repairpath option. mongod will read the existing
data files, and write the existing data to new data files.
You do not need to remove the mongod.lock file before using this procedure.
Use the --repair option. mongod will read the existing data files, write the existing data to new files and
replace the existing, possibly corrupt, files with new files.
You must remove the mongod.lock file before using this procedure.
Note: --repair functionality is also available in the shell with the db.repairDatabase() helper for the
repairDatabase command.
Procedures
Important: Always run mongod as the same user to avoid changing the permissions of the MongoDB data files.
Repair Data Files and Preserve Original Files Use the following procedure to repair your data files using the
--repairpath option, which preserves the original data files unmodified.
Step 1: Start mongod using the options to write the repaired files to a new directory. Start the mongod
instance using the --repair option and the --repairpath option. Issue a command similar to the following:
mongod --dbpath /data/db --repair --repairpath /data/db0
When this completes, the new repaired data files will be in the /data/db0 directory.
Step 2: Start mongod with the new data directory. Start mongod using the following invocation to point the
dbPath at /data/db0:
mongod --dbpath /data/db0
Once you confirm that the data files are operational you may delete or archive the old data files in the /data/db
directory. You may also wish to move the repaired files to the old database location or update the dbPath to indicate
the new location.
Repair Data Files without Preserving Original Files To repair your data files without preserving the original files,
do not use the --repairpath option, as in the following procedure:
Warning: After you remove the mongod.lock file you must run the --repair process before using your
database.
Replace /data/db with your dbPath where your MongoDB instance's data files reside.
Step 2: Start mongod using the option to replace the original files with the repaired files. Start the mongod
instance using the --repair option, which replaces the original data files with the repaired data files. Issue a
command similar to the following:
mongod --dbpath /data/db --repair
When this completes, the repaired data files will replace the original data files in the /data/db directory.
Step 3: Start mongod as usual. Start mongod using the following invocation to point the dbPath at /data/db:
mongod --dbpath /data/db
mongod.lock
In normal operation, you should never remove the mongod.lock file and start mongod. Instead, use one
of the above methods to recover the database and remove the lock files. In dire situations you can remove the lockfile,
start the database using the possibly corrupt files, and attempt to recover data from the database; however, it's
impossible to predict the state of the database in these situations.
If you are not running with journaling, and your database shuts down unexpectedly for any reason, you should always
proceed as if your database is in an inconsistent and likely corrupt state. If at all possible restore from backup
(page 200) or, if running as a replica set, restore by performing an initial sync using data from an intact member of the
set, as described in Resync a Member of a Replica Set (page 690).
This page lists the tutorials available as part of the MongoDB Manual. In addition to these tutorials in the manual,
MongoDB provides Getting Started Guides in various driver editions. If there is a process or pattern that you would
like to see included here, please open a Jira Case118 .
Installation
Administration
Replica Sets
Sharding
Basic Operations
Security
Development Patterns
UNIX ulimit Settings (page 295) Describes user resources limits (i.e. ulimit) and introduces the considerations
and optimal configurations for systems that run MongoDB deployments.
System Collections (page 299) Introduces the internal collections that MongoDB uses to track per-database metadata,
including indexes, collections, and authentication credentials.
Database Profiler Output (page 300) Describes the data collected by MongoDB's operation profiler, which
introspects operations and reports data for analysis on performance and behavior.
Server-side JavaScript (page 306) Describes MongoDB's support for executing JavaScript code for server-side
operations.
Exit Codes and Statuses (page 307) Lists the unique codes returned by mongos and mongod processes upon exit.
Most UNIX-like operating systems, including Linux and OS X, provide ways to limit and control the usage of system
resources such as threads, files, and network connections on a per-process and per-user basis. These ulimits prevent
single users from using too many system resources. Sometimes, these limits have low default values that can cause a
number of issues in the course of normal MongoDB operation.
Note: Red Hat Enterprise Linux and CentOS 6 place a max process limitation of 1024 which overrides ulimit
settings. Create a file named /etc/security/limits.d/99-mongodb-nproc.conf with new soft nproc
and hard nproc values to increase the process limit. See /etc/security/limits.d/90-nproc.conf file
as an example.
Resource Utilization
mongod and mongos each use threads and file descriptors to track connections and manage internal operations. This
section outlines the general resource utilization patterns for MongoDB. Use these figures in combination with the
actual information about your deployment and its use to determine ideal ulimit settings.
Generally, all mongod and mongos instances:
track each incoming connection with a file descriptor and a thread.
mongod
1 file descriptor for each data file in use by the mongod instance.
1 file descriptor for each journal file used by the mongod instance when storage.journal.enabled is
true.
In replica sets, each mongod maintains a connection to all other members of the set.
mongod uses background threads for a number of internal processes, including TTL collections (page 231),
replication, and replica set health checks, which may require a small number of additional resources.
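The per-connection and per-file costs listed above can be combined into a rough estimate. The following is a minimal sketch in plain JavaScript: the function name and all input values are hypothetical, and the result is only a lower bound because it ignores connections to other replica set members and other internal descriptors.

```javascript
// Hypothetical helper: rough lower bound on file descriptors for one mongod.
function estimateMongodFileDescriptors(opts) {
  var journalFiles = opts.journalEnabled ? opts.journalFiles : 0;
  // one descriptor per incoming connection, per data file, per journal file
  return opts.connections + opts.dataFiles + journalFiles;
}

// Example values only; substitute figures from your own deployment.
var estimate = estimateMongodFileDescriptors({
  connections: 5000,
  dataFiles: 120,
  journalEnabled: true,
  journalFiles: 3
});
var openFilesLimit = 64000; // example value for the open files (-n) limit
var headroom = openFilesLimit - estimate;
```

Comparing the estimate with the open files limit of the user that runs mongod shows how much headroom remains for connection growth.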
mongos
In addition to the threads and file descriptors for client connections, mongos must maintain connections to all config
servers and all shards, which includes all members of all replica sets.
For mongos, consider the following behaviors:
mongos instances maintain a connection pool to each shard so that the mongos can reuse connections and
quickly fulfill requests without needing to create new connections.
You can limit the number of incoming connections using the maxIncomingConnections run-time option.
By restricting the number of incoming connections you can prevent a cascade effect where the mongos creates
too many connections on the mongod instances.
Note: Changed in version 2.6: MongoDB removed the upper limit on the maxIncomingConnections
setting.
ulimit
You can use the ulimit command at the system prompt to check system limits, as in the following example:
$ ulimit -a
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 8192
-c: core file size (blocks) 0
-m: resident set size (kbytes) unlimited
-u: processes 192276
-n: file descriptors 21000
-l: locked-in-memory size (kb) 40000
-v: address space (kb) unlimited
-x: file locks unlimited
-i: pending signals 192276
-q: bytes in POSIX msg queues 819200
-e: max nice 30
-r: max rt priority 65
-N 15: unlimited
ulimit refers to the per-user limitations for various resources. Therefore, if your mongod instance executes as
a user that is also running multiple processes, or multiple mongod processes, you might see contention for these
resources. Also, be aware that the processes value (i.e. -u) refers to the combined number of distinct processes
and sub-process threads.
You can change ulimit settings by issuing a command in the following form:
ulimit -n <value>
Both hard and soft ulimits affect MongoDB's performance. The hard ulimit refers to
the maximum number of processes that a user can have active at any time. This is the ceiling: no non-root process
can increase the hard ulimit. In contrast, the soft ulimit is the limit that is actually enforced for a session or
process, but any process can increase it up to the hard ulimit maximum.
A low soft ulimit can cause "can't create new thread, closing connection" errors if the number
of connections grows too high. For this reason, it is extremely important to set both ulimit values to the
recommended values.
ulimit will modify both hard and soft values unless the -H or -S modifiers are specified when modifying limit
values.
For many distributions of Linux you can change values by substituting the -n option for any possible value in the
output of ulimit -a. On OS X, use the launchctl limit command. See your operating system documentation
for the precise procedure for changing system limits on running systems.
After changing the ulimit settings, you must restart the process to take advantage of the modified settings. You can
use the /proc file system to see the current limitations on a running process.
Depending on your system's configuration, and default settings, any change to system limits made using ulimit
may revert following a system restart. Check your distribution and operating system documentation for more
information.
Note: SUSE Linux Enterprise Server and potentially other SUSE distributions ship with virtual memory address
space limited to 8 GB by default. You must adjust this in order to prevent virtual memory allocation failures as the
database grows.
The SLES packages for MongoDB adjust these limits in the default scripts, but you will need to make this change
manually if you are using custom scripts and/or the tarball release rather than the SLES packages.
Every deployment may have unique requirements and settings; however, the following thresholds and settings are
particularly important for mongod and mongos deployments:
-f (file size): unlimited
-t (cpu time): unlimited
-v (virtual memory): unlimited 119
-n (open files): 64000
-m (memory size): unlimited 119 120
-u (processes/threads): 64000
119 If you limit virtual or resident memory size on a system running MongoDB the operating system will refuse to honor additional allocation
requests.
120 The -m parameter to ulimit has no effect on Linux systems with kernel versions more recent than 2.4.30. You may omit -m if you wish.
Always remember to restart your mongod and mongos instances after changing the ulimit settings to ensure that
the changes take effect.
For Linux distributions that use Upstart, you can specify limits within service scripts if you start mongod and/or
mongos instances as Upstart services. You can do this by using limit stanzas121 .
Specify the Recommended ulimit Settings (page 297), as in the following example:
limit fsize unlimited unlimited # (file size)
limit cpu unlimited unlimited # (cpu time)
limit as unlimited unlimited # (virtual memory size)
limit nofile 64000 64000 # (open files)
limit nproc 64000 64000 # (processes/threads)
Each limit stanza sets the soft limit to the first value specified and the hard limit to the second.
After changing limit stanzas, ensure that the changes take effect by restarting the application services, using the
following form:
restart <service name>
For Linux distributions that use systemd, you can specify limits within the [Service] sections of service scripts
if you start mongod and/or mongos instances as systemd services. You can do this by using resource limit
directives122 .
Specify the Recommended ulimit Settings (page 297), as in the following example:
[Service]
# Other directives omitted
# (file size)
LimitFSIZE=infinity
# (cpu time)
LimitCPU=infinity
# (virtual memory size)
LimitAS=infinity
# (open files)
LimitNOFILE=64000
# (processes/threads)
LimitNPROC=64000
Each systemd limit directive sets both the hard and soft limits to the value specified.
After changing limit directives, ensure that the changes take effect by restarting the application services, using the
following form:
systemctl restart <service name>
121 http://upstart.ubuntu.com/wiki/Stanzas#limit
122 http://www.freedesktop.org/software/systemd/man/systemd.exec.html#LimitCPU=
The /proc file-system stores the per-process limits in the file system object located at /proc/<pid>/limits,
where <pid> is the process's PID or process identifier. You can use the following bash function to return the content
of the limits object for a process or processes with a given name:
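return-limits(){
  for process in "$@"; do
    # Collect the PIDs of all running processes with this exact name.
    process_pids=$(ps -C "$process" -o pid --no-headers)
    if [ -z "$process_pids" ]; then
      echo "[no $process running]"
    else
      for pid in $process_pids; do
        echo "[$process #$pid -- limits]"
        cat "/proc/$pid/limits"
      done
    fi
  done
}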
You can copy and paste this function into a current shell session or load it as part of a script. Call the function with
one of the following invocations:
return-limits mongod
return-limits mongos
return-limits mongod mongos
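The limits object is a fixed-width text table with a header row followed by one row per limit, each giving a name, a soft limit, a hard limit, and optional units. To compare soft and hard values programmatically, the text can be parsed into a lookup table. The following is a minimal sketch in plain JavaScript: the function name is hypothetical and the sample text uses made-up values.

```javascript
// Hypothetical helper: parses the text of /proc/<pid>/limits into an object
// keyed by limit name, with soft, hard, and units kept as strings.
function parseProcLimits(text) {
  var limits = {};
  text.split("\n").slice(1).forEach(function (line) { // skip the header row
    var m = /^(.+?)\s{2,}(\S+)\s+(\S+)(?:\s+(\S+))?\s*$/.exec(line);
    if (m) {
      limits[m[1]] = { soft: m[2], hard: m[3], units: m[4] || "" };
    }
  });
  return limits;
}

// Made-up sample in the layout of /proc/<pid>/limits.
var sampleLimits = [
  "Limit                     Soft Limit           Hard Limit           Units     ",
  "Max cpu time              unlimited            unlimited            seconds   ",
  "Max processes             1024                 64000                processes ",
  "Max open files            1024                 4096                 files     "
].join("\n");
var parsed = parseProcLimits(sampleLimits);
var openFilesSoft = parsed["Max open files"].soft;
```

A soft value far below the hard value, as in the sample rows, indicates a limit that the process could raise without root privileges.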
Synopsis
MongoDB stores system information in collections that use the <database>.system.* namespace, which
MongoDB reserves for internal use. Do not create collections that begin with system.
MongoDB also stores some additional instance-local metadata in the local database (page 715), specifically for
replication purposes.
Collections
The admin.system.roles (page 299) collection stores custom roles that administrators create and assign
to users to provide access to specific resources.
admin.system.users
Changed in version 2.6.
The admin.system.users (page 300) collection stores the user's authentication credentials as well as any
roles assigned to the user. Users may define authorization roles in the admin.system.roles (page 299)
collection.
admin.system.version
New in version 2.6.
Stores the schema version of the user credential documents.
System collections also include these collections stored directly in each database:
<database>.system.namespaces
Deprecated since version 3.0: Access this data using listCollections.
The <database>.system.namespaces (page 300) collection contains information about all of the
database's collections.
<database>.system.indexes
Deprecated since version 3.0: Access this data using listIndexes.
The <database>.system.indexes (page 300) collection lists all the indexes in the database.
<database>.system.profile
The <database>.system.profile (page 300) collection stores database profiling information. For
information on profiling, see Database Profiling (page 234).
<database>.system.js
The <database>.system.js (page 300) collection holds special JavaScript code for use in server side
JavaScript (page 306). See Store a JavaScript Function on the Server (page 257) for more information.
The database profiler captures information about read and write operations, cursor operations, and database
commands. To configure the database profiler and set the thresholds for capturing profile data, see the Analyze Performance
of Database Operations (page 249) section.
The database profiler writes data in the system.profile (page 300) collection, which is a capped collection. To
view the profiler's output, use normal MongoDB queries on the system.profile (page 300) collection.
Note: Because the database profiler writes data to the system.profile (page 300) collection in a database, the
profiler will profile some write activity, even for databases that are otherwise read-only.
The documents in the system.profile (page 300) collection have the following form. This example document
reflects a find operation:
{
"op" : "query",
"ns" : "test.c",
"query" : {
"find" : "c",
"filter" : {
"a" : 1
}
},
"keysExamined" : 2,
"docsExamined" : 2,
"cursorExhausted" : true,
"keyUpdates" : 0,
"writeConflicts" : 0,
"numYield" : 0,
"locks" : {
"Global" : {
"acquireCount" : {
"r" : NumberLong(2)
}
},
"Database" : {
"acquireCount" : {
"r" : NumberLong(1)
}
},
"Collection" : {
"acquireCount" : {
"r" : NumberLong(1)
}
}
},
"nreturned" : 2,
"responseLength" : 108,
"millis" : 0,
"execStats" : {
"stage" : "FETCH",
"nReturned" : 2,
"executionTimeMillisEstimate" : 0,
"works" : 3,
"advanced" : 2,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 2,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 2,
"executionTimeMillisEstimate" : 0,
"works" : 3,
"advanced" : 2,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"a" : 1
},
"indexName" : "a_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"a" : [
"[1.0, 1.0]"
]
},
"keysExamined" : 2,
"dupsTested" : 0,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
},
"ts" : ISODate("2015-09-03T15:26:14.948Z"),
"client" : "127.0.0.1",
"allUsers" : [ ],
"user" : ""
}
Output Reference
For any single operation, the documents created by the database profiler will include a subset of the following fields.
The precise selection of fields in these documents depends on the type of operation.
Changed in version 3.2.0: system.profile.query.skip replaces the system.profile.ntoskip field.
Changed in version 3.2.0: The information in the system.profile.ntoreturn field has been replaced
by two separate fields, system.profile.query.limit and system.profile.query.batchSize.
Older drivers or older versions of the mongo shell may still use ntoreturn; this will appear as
system.profile.query.ntoreturn.
Note: For the output specific to the version of your MongoDB, refer to the appropriate version of the MongoDB
Manual.
system.profile.op
The type of operation. The possible values are:
insert
query
update
remove
getmore
command
system.profile.ns
The namespace the operation targets. Namespaces in MongoDB take the form of the database, followed by a
dot (.), followed by the name of the collection.
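Because database names cannot contain a dot while collection names can, a namespace string splits unambiguously at its first dot. The following is a minimal sketch in plain JavaScript. The function name is hypothetical.

```javascript
// Hypothetical helper: splits "<database>.<collection>" at the first dot.
function splitNamespace(ns) {
  var dot = ns.indexOf(".");
  if (dot <= 0 || dot === ns.length - 1) {
    throw new Error("not a <database>.<collection> namespace: " + ns);
  }
  return { db: ns.slice(0, dot), collection: ns.slice(dot + 1) };
}

var parts = splitNamespace("test.system.profile");
// parts.db is "test" and parts.collection is "system.profile"
```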
system.profile.query
The query document (page 103) used, or for an insert operation, the inserted document. If the document exceeds
50 kilobytes, the value is a string summary of the object. If the string summary exceeds 50 kilobytes, the string
summary is truncated, denoted with an ellipsis (...) at the end of the string.
Changed in version 3.0.4: For "getmore" (page 302) operations on cursors returned from a
db.collection.find() or a db.collection.aggregate(), the query (page 303) field contains
respectively the query predicate or the issued aggregate command document. For details on the aggregate
command document, see the aggregate reference page.
system.profile.command
The command operation. If the command document exceeds 50 kilobytes, the value is a string summary of the
object. If the string summary exceeds 50 kilobytes, the string summary is truncated, denoted with an ellipsis
(...) at the end of the string.
system.profile.updateobj
The <update> document passed in during an update (page 110) operation. If the document exceeds 50
kilobytes, the value is a string summary of the object. If the string summary exceeds 50 kilobytes, the string
summary is truncated, denoted with an ellipsis (...) at the end of the string.
system.profile.cursorid
The ID of the cursor accessed by query and getmore operations.
system.profile.keysExamined
Changed in version 3.2.0: Renamed from system.profile.nscanned.
The number of index (page 487) keys that MongoDB scanned in order to carry out the operation.
In general, if keysExamined (page 303) is much higher than nreturned (page 305), the database is scanning
many index keys to find the result documents. Consider creating or adjusting indexes to improve query
performance.
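The comparison between keysExamined and nreturned can be sketched as a small filter over profiler entries. This is an illustration only: the threshold of 10 keys per returned document, the function name, and the sample entries are assumptions, not values recommended by MongoDB.

```javascript
// Hypothetical threshold: flag operations that examine more than 10 index
// keys per returned document. Tune this for your own workload.
const RATIO_THRESHOLD = 10;

// Return the profile entries whose keysExamined / nreturned ratio exceeds
// the threshold.
function findInefficientIndexUse(profileEntries, threshold = RATIO_THRESHOLD) {
  return profileEntries.filter((entry) => {
    const keys = entry.keysExamined || 0;
    // Treat zero returned documents as one to avoid dividing by zero.
    const returned = Math.max(entry.nreturned || 0, 1);
    return keys / returned > threshold;
  });
}

// Invented sample entries shaped like system.profile documents.
const sampleEntries = [
  { op: "query", ns: "test.orders", keysExamined: 5000, nreturned: 20 },
  { op: "query", ns: "test.users", keysExamined: 12, nreturned: 10 },
];

for (const entry of findInefficientIndexUse(sampleEntries)) {
  console.log(`${entry.ns}: ${entry.keysExamined} keys examined for ${entry.nreturned} documents`);
}
```

In a mongo shell session, the input could be the array returned by a query against the system.profile collection; the sketch itself does not connect to a server.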
system.profile.docsExamined
Changed in version 3.2.0: Renamed from system.profile.nscannedObjects.
The number of documents in the collection that MongoDB scanned in order to carry out the operation.
system.profile.moved
Changed in version 3.0.0: Only appears when using the MMAPv1 storage engine.
This field appears with a value of true when an update operation moved one or more documents to a new
location on disk. If the operation did not result in a move, this field does not appear. Operations that result in a
move take more time than in-place updates and typically occur as a result of document growth.
system.profile.nmoved
Changed in version 3.0.0: Only appears when using the MMAPv1 storage engine.
The number of documents the operation moved on disk. This field appears only if the operation resulted in a
move. The field's implicit value is zero, and the field is present only when non-zero.
system.profile.hasSortStage
Changed in version 3.2.0: Renamed from system.profile.scanAndOrder.
hasSortStage (page 303) is a boolean that is true when a query cannot use the ordering in the index to
return the requested sorted results; i.e. MongoDB must sort the documents after it receives the documents from
a cursor. The field only appears when the value is true.
system.profile.ndeleted
The number of documents deleted by the operation.
system.profile.ninserted
The number of documents inserted by the operation.
system.profile.nMatched
New in version 2.6.
The number of documents that match the system.profile.query (page 303) condition for the update
operation.
system.profile.nModified
New in version 2.6.
The number of documents modified by the update operation.
system.profile.upsert
A boolean that indicates the update operation's upsert option value. Only appears if upsert is true.
system.profile.keyUpdates
The number of index (page 487) keys the update changed in the operation. Changing an index key carries a
small performance cost because the database must remove the old key and insert a new key into the B-tree
index.
system.profile.writeConflicts
New in version 3.0.0.
The number of conflicts encountered during the write operation; e.g. an update operation attempts to modify
the same document as another update operation. See also write conflict.
system.profile.numYield
The number of times the operation yielded to allow other operations to complete. Typically, operations yield
when they need access to data that MongoDB has not yet fully read into memory. This allows other operations
that have data in memory to complete while MongoDB reads in data for the yielding operation. For more
information, see the FAQ on when operations yield (page 837).
system.profile.locks
New in version 3.0.0: locks (page 304) replaces the lockStats field.
The system.profile.locks (page 304) provides information for various lock types and lock modes
(page 835) held during the operation.
The possible lock types are:
Lock Type       Description
Global          Represents global lock.
MMAPV1Journal   Represents MMAPv1 storage engine specific lock to synchronize journal writes; for
                non-MMAPv1 storage engines, the mode for MMAPV1Journal is empty.
Database        Represents database lock.
Collection      Represents collection lock.
Metadata        Represents metadata lock.
oplog           Represents lock on the oplog.
The possible locking modes for the lock types are as follows:
Lock Mode Description
R Represents Shared (S) lock.
W Represents Exclusive (X) lock.
r Represents Intent Shared (IS) lock.
w Represents Intent Exclusive (IX) lock.
The returned lock information for the various lock types includes:
system.profile.locks.acquireCount
Number of times the operation acquired the lock in the specified mode.
system.profile.locks.acquireWaitCount
Number of times the operation had to wait for the acquireCount (page 305) lock acquisitions because
the locks were held in a conflicting mode. acquireWaitCount (page 305) is less than or equal to
acquireCount (page 305).
system.profile.locks.timeAcquiringMicros
Cumulative time in microseconds that the operation had to wait to acquire the locks.
timeAcquiringMicros (page 305) divided by acquireWaitCount (page 305) gives an approximate
average wait time for the particular lock mode.
system.profile.locks.deadlockCount
Number of times the operation encountered deadlocks while waiting for lock acquisitions.
For more information on lock modes, see What type of locking does MongoDB use? (page 835).
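The average wait calculation described for timeAcquiringMicros and acquireWaitCount can be sketched as follows. The sketch assumes that each lock type holds sub-documents keyed by lock mode (R, W, r, w); the function name and the sample document are invented for illustration.

```javascript
// Human-readable names for the lock mode codes listed above.
const LOCK_MODE_NAMES = {
  R: "Shared (S)",
  W: "Exclusive (X)",
  r: "Intent Shared (IS)",
  w: "Intent Exclusive (IX)",
};

// Return one row per lock type and mode that had to wait, with the
// approximate average wait in microseconds.
function averageLockWaits(locks) {
  const rows = [];
  for (const [lockType, stats] of Object.entries(locks)) {
    const waitCounts = stats.acquireWaitCount || {};
    const waitTimes = stats.timeAcquiringMicros || {};
    for (const [mode, waits] of Object.entries(waitCounts)) {
      const count = Number(waits);
      if (count === 0) continue; // nothing waited, so no average to report
      rows.push({
        lockType,
        mode: LOCK_MODE_NAMES[mode] || mode,
        avgWaitMicros: Number(waitTimes[mode] || 0) / count,
      });
    }
  }
  return rows;
}

// Invented sample shaped like a system.profile.locks document (assumption).
const sampleLocks = {
  Global: { acquireCount: { r: 4, w: 2 } },
  Database: {
    acquireCount: { w: 2 },
    acquireWaitCount: { w: 1 },
    timeAcquiringMicros: { w: 3500 },
  },
};

console.log(averageLockWaits(sampleLocks));
```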
system.profile.nreturned
The number of documents returned by the operation.
system.profile.responseLength
The length in bytes of the operation's result document. A large responseLength (page 305) can affect
performance. To limit the size of the result document for a query operation, you can use any of the following:
Projections (page 115)
The limit() method
The batchSize() method
Note: When MongoDB writes query profile information to the log, the responseLength (page 305) value
is in a field named reslen.
system.profile.millis
The time in milliseconds from the perspective of the mongod from the beginning of the operation to the end of
the operation.
system.profile.execStats
Changed in version 3.0.
A document that contains the execution statistics of the query operation. For other operations, the value is an
empty document.
The system.profile.execStats (page 305) presents the statistics as a tree; each node provides the
statistics for the operation executed during that stage of the query operation.
Note: The following fields list for execStats (page 305) is not meant to be exhaustive as the returned fields
vary per stage.
system.profile.execStats.stage
New in version 3.0: stage (page 305) replaces the type field.
The descriptive name for the operation performed as part of the query execution; e.g.
COLLSCAN for a collection scan
IXSCAN for scanning index keys
FETCH for retrieving documents
system.profile.execStats.inputStages
New in version 3.0: inputStages (page 306) replaces the children field.
An array that contains statistics for the operations that are the input stages of the current stage.
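Because execStats is a tree of stages, a short recursive walk is often enough to see which stages an operation used. The sketch below is an assumption-laden illustration: the function name and sample documents are invented, and it also accepts a single child under an inputStage field, as seen in explain output, in addition to the inputStages array described above.

```javascript
// Collect stage names from an execStats tree, depth first.
function collectStages(node, stages = []) {
  if (!node || typeof node !== "object") return stages;
  if (node.stage) stages.push(node.stage);
  const children = [];
  // Assumption: some stages report a single child under inputStage,
  // others report an array under inputStages.
  if (node.inputStage) children.push(node.inputStage);
  if (Array.isArray(node.inputStages)) children.push(...node.inputStages);
  for (const child of children) collectStages(child, stages);
  return stages;
}

// Invented sample trees (assumption, not real profiler output).
const sampleExecStats = {
  stage: "FETCH",
  inputStage: { stage: "IXSCAN" },
};
const sampleOrExecStats = {
  stage: "OR",
  inputStages: [{ stage: "IXSCAN" }, { stage: "IXSCAN" }],
};

console.log(collectStages(sampleExecStats));
console.log(collectStages(sampleOrExecStats));
// A collection scan anywhere in the tree may indicate a missing index.
console.log(collectStages({ stage: "COLLSCAN" }).includes("COLLSCAN"));
```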
system.profile.ts
The timestamp of the operation.
system.profile.client
The IP address or hostname of the client connection where the operation originates.
For some operations, such as db.eval(), the client is 0.0.0.0:0 instead of an actual client.
system.profile.allUsers
An array of authenticated user information (user name and database) for the session. See also Users (page 318).
system.profile.user
The authenticated user who ran the operation. If the operation was not run by an authenticated user, this field's
value is an empty string.
On this page
Overview (page 306)
Running .js files via a mongo shell Instance on the Server (page 306)
Concurrency (page 307)
Disable Server-Side Execution of JavaScript (page 307)
Overview
MongoDB provides the following commands, methods, and operator that perform server-side execution of JavaScript
code:
mapReduce and the corresponding mongo shell method db.collection.mapReduce(). mapReduce
operations map, or associate, values to keys, and for keys with multiple values, reduce the values for each key
to a single object. For more information, see Map-Reduce (page 462).
$where operator that evaluates a JavaScript expression or a function in order to query for documents.
You can also specify a JavaScript file to the mongo shell to run on the server. For more information, see Running .js
files via a mongo shell Instance on the Server (page 306).
JavaScript in MongoDB
Although these methods use JavaScript, most interactions with MongoDB do not use JavaScript but use an
idiomatic driver in the language of the interacting application.
You can also disable server-side execution of JavaScript. For details, see Disable Server-Side Execution of JavaScript
(page 307).
You can specify a JavaScript (.js) file to a mongo shell instance to execute the file on the server. This is a good
technique for performing batch administrative work. When you run mongo shell on the server, connecting via the
Concurrency
Changed in version 3.2: MongoDB 3.2 uses SpiderMonkey as the JavaScript engine for the mongo shell. For
information on this change, see JavaScript Changes in MongoDB 3.2 (page 892).
Refer to the individual method or operator documentation for any concurrency information. See also the concurrency
table (page 837).
You can disable all server-side execution of JavaScript by passing the --noscripting option on the command
line or by setting security.javascriptEnabled to false in a configuration file.
See also:
Store a JavaScript Function on the Server (page 257)
MongoDB will return one of the following codes and statuses when exiting. Use this guide to interpret logs and when
troubleshooting issues with mongod and mongos instances.
0
Returned by MongoDB applications upon successful exit.
2
The specified options are in error or are incompatible with other options.
3
Returned by mongod if there is a mismatch between hostnames specified on the command line and in the
local.sources (page 717) collection. mongod may also return this status if oplog collection in the local
database is not readable.
4
The version of the database is different from the version supported by the mongod (or mongod.exe) instance.
The instance exits cleanly. Restart mongod with the --upgrade option to upgrade the database to the version
supported by this mongod instance.
5
Returned by mongod if a moveChunk operation fails to confirm a commit.
12
Returned by the mongod.exe process on Windows when it receives a Control-C, Close, Break or Shutdown
event.
14
Returned by MongoDB applications which encounter an unrecoverable error, an uncaught exception or uncaught
signal. The system exits without performing a clean shut down.
20
Message: ERROR: wsastartup failed <reason>
Returned by MongoDB applications on Windows following an error in the WSAStartup function.
Message: NT Service Error
Returned by MongoDB applications for Windows due to failures installing, starting or removing the NT Service
for the application.
45
Returned when a MongoDB application cannot open a file or cannot obtain a lock on a file.
47
MongoDB applications exit cleanly following a large clock skew (32768 milliseconds) event.
48
mongod exits cleanly if the server socket closes. The server socket is on port 27017 by default, or as specified
to the --port run-time option.
49
Returned by mongod.exe or mongos.exe on Windows when either receives a shutdown message from the
Windows Service Control Manager.
100
Returned by mongod when the process throws an uncaught exception.
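When wrapping mongod or mongos in a supervisor script, the codes above can be kept in a small lookup table. The following is a sketch only: the table paraphrases the descriptions in this list, and the names EXIT_CODES and describeExitCode are illustrative.

```javascript
// Exit codes and short descriptions, paraphrased from the list above.
const EXIT_CODES = {
  0: "Successful exit.",
  2: "Specified options are in error or incompatible with other options.",
  3: "Hostname mismatch with local.sources, or the oplog collection is not readable.",
  4: "Database version differs from the version supported by this mongod.",
  5: "A moveChunk operation failed to confirm a commit.",
  12: "mongod.exe received a Control-C, Close, Break or Shutdown event (Windows).",
  14: "Unrecoverable error, uncaught exception or uncaught signal.",
  20: "WSAStartup failure or NT Service error (Windows).",
  45: "Cannot open a file or cannot obtain a lock on a file.",
  47: "Clean exit following a large clock skew event.",
  48: "Clean exit because the server socket closed.",
  49: "Shutdown message from the Windows Service Control Manager.",
  100: "mongod threw an uncaught exception.",
};

// Look up a description; codes outside the list are reported as unknown.
function describeExitCode(code) {
  return EXIT_CODES[code] || `Unknown exit code ${code}; not in the list above.`;
}

console.log(describeExitCode(48));
console.log(describeExitCode(7));
```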
On this page
Additional Resources (page 313)
The following checklists provide recommendations that will help you avoid issues in your production MongoDB
deployment.
On this page
Filesystem (page 308)
Replication (page 309)
Sharding (page 309)
Journaling: MMAPv1 Storage Engine (page 309)
Hardware (page 310)
Deployments to Cloud Hardware (page 310)
Operating System Configuration (page 310)
Backups (page 311)
Monitoring (page 311)
Load Balancing (page 311)
The following checklist, along with the Development (page 311) list, provides recommendations to help you avoid
issues in your production MongoDB deployment.
Filesystem
Avoid using NFS drives for your dbPath. Using NFS drives can result in degraded and unstable performance.
See: Remote Filesystems (page 220) for more information.
VMware users should use VMware virtual drives over NFS.
Linux/Unix: format your drives into XFS or EXT4. If possible, use XFS as it generally performs better with
MongoDB.
With the WiredTiger storage engine, use of XFS is strongly recommended to avoid performance issues
found when using EXT4 with WiredTiger.
If using RAID, you may need to configure XFS with your RAID geometry.
Windows: use the NTFS file system. Do not use any FAT file system (i.e. FAT 16/32/exFAT).
Replication
Verify that all non-hidden replica set members are identically provisioned in terms of their RAM, CPU, disk,
network setup, etc.
Configure the oplog size (page 684) to suit your use case:
The replication oplog window should cover normal maintenance and downtime windows to avoid the need
for a full resync.
The replication oplog window should cover the time needed to restore a replica set member, either by an
initial sync or by restoring from the last backup.
Ensure that your replica set includes at least three data-bearing nodes with w:majority write concern
(page 141). Three data-bearing nodes are required for replica set-wide data durability.
Use hostnames when configuring replica set members, rather than IP addresses.
Ensure full bidirectional network connectivity between all mongod instances.
Ensure that each host can resolve itself.
Ensure that your replica set contains an odd number of voting members.
Ensure that mongod instances have 0 or 1 votes.
For high availability, deploy your replica set into a minimum of three data centers.
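Several of the replication items above (hostnames rather than IP addresses, 0 or 1 votes per member, an odd number of voting members) can be checked mechanically against a replica set configuration document. The sketch below is illustrative: the function name and sample configuration are invented, an omitted votes field is assumed to default to 1, and the IP check covers only literal IPv4 addresses.

```javascript
// Rough check for a literal IPv4 address with an optional port.
const IPV4_HOST = /^\d{1,3}(\.\d{1,3}){3}(:\d+)?$/;

// Return a list of checklist problems found in a replica set configuration.
function checkReplicaSetConfig(config) {
  const problems = [];
  let votingMembers = 0;
  for (const member of config.members) {
    // Assumption: an omitted votes field means the default of 1 vote.
    const votes = member.votes === undefined ? 1 : member.votes;
    if (votes !== 0 && votes !== 1) {
      problems.push(`${member.host}: votes must be 0 or 1, found ${votes}`);
    }
    if (votes > 0) votingMembers += 1;
    if (IPV4_HOST.test(member.host)) {
      problems.push(`${member.host}: use a hostname instead of an IP address`);
    }
  }
  if (votingMembers % 2 === 0) {
    problems.push(`even number of voting members (${votingMembers})`);
  }
  return problems;
}

// Hypothetical configuration; host names and the set name are invented.
const sampleConfig = {
  _id: "rs0",
  members: [
    { _id: 0, host: "db0.example.net:27017" },
    { _id: 1, host: "db1.example.net:27017" },
    { _id: 2, host: "10.0.0.7:27017", votes: 0 },
  ],
};

console.log(checkReplicaSetConfig(sampleConfig));
```

In the mongo shell, the document returned by rs.conf() has this members array shape, so it could be passed to the function directly.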
Sharding
Place your config servers (page 734) on dedicated hardware for optimal performance in large clusters. Ensure
that the hardware has enough RAM to hold the data files entirely in memory and that it has dedicated storage.
Use NTP to synchronize the clocks on all components of your sharded cluster.
Ensure full bidirectional network connectivity between mongod, mongos and config servers.
Use CNAMEs to identify your config servers to the cluster so that you can rename and renumber your config
servers without downtime.
Hardware
Windows Azure: Adjust the TCP keepalive (tcp_keepalive_time) to 100-120. The default TTL for TCP
connections on Windows Azure load balancers is too slow for MongoDB's connection pooling behavior.
Use MongoDB version 2.6.4 or later on systems with high-latency storage, such as Windows Azure, as these
versions include performance improvements for those systems. See: Azure Deployment Recommendations123
for more information.
Linux
Turn off transparent hugepages and defrag. See Transparent Huge Pages Settings (page 241) for more
information.
Adjust the readahead settings (page 222) on the devices storing your database files to suit your use case. If your
working set is bigger than the available RAM, and the document access pattern is random, consider lowering the
readahead to 32 or 16. Evaluate different settings to find an optimal value that maximizes the resident memory
and lowers the number of page faults.
Use the noop or deadline disk schedulers for SSD drives.
Use the noop disk scheduler for virtualized drives in guest VMs.
Disable NUMA or set vm.zone_reclaim_mode to 0 and run mongod instances with node interleaving. See:
MongoDB and NUMA Hardware (page 219) for more information.
Adjust the ulimit values on your hardware to suit your use case. If multiple mongod or mongos instances
are running under the same user, scale the ulimit values accordingly. See: UNIX ulimit Settings (page 295)
for more information.
Use noatime for the dbPath mount point.
Configure sufficient file handles (fs.file-max), kernel pid limit (kernel.pid_max), and maximum
threads per process (kernel.threads-max) for your deployment. For large systems, values of 98000,
32768, and 64000 are a good starting point.
Ensure that your system has swap space configured. Refer to your operating system's documentation for details
on appropriate sizing.
Ensure that the system default TCP keepalive is set correctly. A value of 300 often provides better performance
for replica sets and sharded clusters. See: Does TCP keepalive time affect MongoDB Deployments? (page 858)
in the Frequently Asked Questions for more information.
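The kernel limits mentioned in the list above (fs.file-max, kernel.pid_max, kernel.threads-max) can be compared against the suggested starting points with a few lines of code. This is a sketch under assumptions: the current values are passed in as a plain object (how you collect them is up to you), and the function and variable names are invented.

```javascript
// Starting points for large systems, taken from the checklist item above.
const STARTING_POINTS = {
  "fs.file-max": 98000,
  "kernel.pid_max": 32768,
  "kernel.threads-max": 64000,
};

// Return the settings that are missing or below their starting point.
function findLowKernelSettings(current) {
  return Object.entries(STARTING_POINTS)
    .filter(([name, minimum]) => !(current[name] >= minimum))
    .map(([name, minimum]) => ({ name, current: current[name], minimum }));
}

// Invented sample values for a hypothetical host.
const sampleSettings = {
  "fs.file-max": 65536,
  "kernel.pid_max": 32768,
  "kernel.threads-max": 128000,
};

console.log(findLowKernelSettings(sampleSettings));
```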
123 https://docs.mongodb.org/ecosystem/platforms/windows-azure
Windows
Consider disabling NTFS last access time updates. This is analogous to disabling atime on Unix-like
systems.
Backups
Schedule periodic tests of your backup and restore process to have time estimates on hand, and to verify its
functionality.
Monitoring
Use MongoDB Cloud Manager124 or Ops Manager, an on-premise solution available in MongoDB Enterprise
Advanced125 or another monitoring system to monitor key database metrics and set up alerts for them. Include
alerts for the following metrics:
lock percent (for the MMAPv1 storage engine (page 595))
replication lag
replication oplog window
assertions
queues
page faults
Monitor hardware statistics for your servers. In particular, pay attention to the disk use, CPU, and available disk
space.
In the absence of disk space monitoring, or as a precaution:
Create a dummy 4 GB file on the storage.dbPath drive to ensure available space if the disk becomes
full.
A combination of cron+df can alert when disk space hits a high-water mark, if no other monitoring tool
is available.
Load Balancing
Configure load balancers to enable sticky sessions or client affinity, with a sufficient timeout for existing
connections.
Avoid placing load balancers between MongoDB cluster or replica set components.
5.4.2 Development
124 https://cloud.mongodb.com/?jmp=docs
125 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
On this page
Data Durability (page 312)
Schema Design (page 312)
Replication (page 312)
Sharding (page 312)
Drivers (page 312)
The following checklist, along with the Operations Checklist (page 308), provides recommendations to help you avoid
issues in your production MongoDB deployment.
Data Durability
Ensure that your replica set includes at least three data-bearing nodes with w:majority write concern
(page 141). Three data-bearing nodes are required for replica set-wide data durability.
Ensure that all instances use journaling (page 598).
Schema Design
Ensure that your schema design does not rely on indexed arrays that grow in length without bound. Typically,
best performance can be achieved when such indexed arrays have fewer than 1000 elements.
Replication
Do not use secondary reads to scale overall read throughput. See: Can I use more replica nodes to scale126 for
an overview of read scaling. For information about secondary reads, see: Read Preference (page 641).
Sharding
Ensure that your shard key distributes the load evenly on your shards. See: Considerations for Selecting Shard
Keys (page 763) for more information.
Use targeted queries (page 744) for workloads that need to scale with the number of shards.
Always read from primary nodes for non-targeted queries that may be sensitive to stale or orphaned data127.
Pre-split and manually balance chunks (page 800) when inserting large data sets into a new non-hashed sharded
collection. Pre-splitting and manually balancing enables the insert load to be distributed among the shards,
increasing performance for the initial load.
Drivers
Make use of connection pooling. Most MongoDB drivers support connection pooling. Adjust the connection
pool size to suit your use case, beginning at 110-115% of the typical number of concurrent database requests.
Ensure that your applications handle transient write and read errors during replica set elections.
Ensure that your applications handle failed requests and retry them if applicable. Drivers do not automatically
retry failed requests.
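The connection pool guideline above (start at 110-115% of the typical number of concurrent database requests) is simple arithmetic, sketched here. The function name and the sample figure of 200 concurrent requests are assumptions for illustration; how the pool size is actually set depends on your driver.

```javascript
// Suggest a starting range for the connection pool size from the typical
// number of concurrent database requests.
function suggestedPoolSizeRange(typicalConcurrentRequests) {
  // Integer arithmetic avoids floating-point rounding surprises that can
  // occur when multiplying by 1.1 or 1.15 directly.
  return {
    min: Math.ceil((typicalConcurrentRequests * 110) / 100),
    max: Math.ceil((typicalConcurrentRequests * 115) / 100),
  };
}

// Hypothetical workload with about 200 concurrent requests.
console.log(suggestedPoolSizeRange(200));
```

Treat the result as a starting point to adjust after observing the application under load.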
126 http://askasya.com/post/canreplicashelpscaling
127 http://blog.mongodb.org/post/74730554385/background-indexing-on-secondaries-and-orphaned
128 https://www.mongodb.com/products/consulting?jmp=docs#s_product_readiness
129 https://www.mongodb.com/products/consulting?jmp=docs#ops_optimization
Security
On this page
Additional Resources (page 441)
Maintaining a secure MongoDB deployment requires administrators to implement controls to ensure that users and
applications have access to only the data that they require. MongoDB provides features that allow administrators to
implement these controls and restrictions for any MongoDB deployment.
If you are already familiar with security and MongoDB security practices, consider the Security Checklist (page 315)
for a collection of recommended actions to protect a MongoDB deployment.
On this page
Enable Access Control and Enforce Authentication (page 315)
Configure Role-Based Access Control (page 316)
Encrypt Communication (page 316)
Limit Network Exposure (page 316)
Audit System Activity (page 316)
Encrypt and Protect Data (page 316)
Run MongoDB with a Dedicated User (page 316)
Run MongoDB with Secure Configuration Options (page 317)
Request a Security Technical Implementation Guide (where applicable) (page 317)
Consider Security Standards Compliance (page 317)
This document provides a list of security measures that you should implement to protect your MongoDB installation.
Enable access control and specify the authentication mechanism. You can use the default MongoDB authentication
mechanism or an existing external framework. Authentication requires that all clients and servers provide valid
credentials before they can connect to the system. In clustered deployments, enable authentication for each MongoDB
server.
See Authentication (page 317) and Enable Client Access Control (page 344).
Create a user administrator first, then create additional users. Create a unique MongoDB user for each person and
application that accesses the system.
Create roles that define the exact access a set of users needs. Follow a principle of least privilege. Then create users
and assign them only the roles they need to perform their operations. A user can be a person or a client application.
See Role-Based Access Control (page 331) and Manage User and Roles (page 373).
Configure MongoDB to use TLS/SSL for all incoming and outgoing connections. Use TLS/SSL to encrypt
communication between the mongod and mongos components of a MongoDB deployment as well as between all
applications and MongoDB.
See Configure mongod and mongos for TLS/SSL (page 382).
Ensure that MongoDB runs in a trusted network environment and limit the interfaces on which MongoDB instances
listen for incoming connections. Allow only trusted clients to access the network interfaces and ports on which
MongoDB instances are available.
See Security Hardening (page 341) and the bindIp setting.
Track access and changes to database configurations and data. MongoDB Enterprise1 includes a system auditing
facility that can record system events (e.g. user operations, connection events) on a MongoDB instance. These audit
records permit forensic analysis and allow administrators to verify proper controls.
See Auditing (page 340) and Configure Auditing (page 404).
Encrypt MongoDB data on each host using file-system, device, or physical encryption. Protect MongoDB data using
file-system permissions. MongoDB data includes data files, configuration files, auditing logs, and key files.
Run MongoDB processes with a dedicated operating system user account. Ensure that the account has permissions to
access data but no unnecessary permissions.
See Install MongoDB (page 5) for more information on running MongoDB.
1 http://www.mongodb.com/products/mongodb-enterprise?jmp=docs
MongoDB supports the execution of JavaScript code for certain server-side operations: mapReduce, group, and
$where. If you do not use these operations, disable server-side scripting by using the --noscripting option on
the command line.
Use only the MongoDB wire protocol on production deployments. Do not enable the following, all
of which enable the web server interface: net.http.enabled, net.http.JSONPEnabled, and
net.http.RESTInterfaceEnabled. Leave these disabled, unless required for backwards compatibility.
Deprecated since version 3.2: HTTP interface for MongoDB
Keep input validation enabled. MongoDB enables input validation by default through the wireObjectCheck
setting. This ensures that all documents stored by the mongod instance are valid BSON.
See Security Hardening (page 341) for more information on hardening MongoDB configuration.
The Security Technical Implementation Guide (STIG) contains security guidelines for deployments within the United
States Department of Defense. MongoDB Inc. provides its STIG, upon request, for situations where it is required.
Please request a copy2 for more information.
For applications requiring HIPAA or PCI-DSS compliance, please refer to the MongoDB Security Reference Architecture3
to learn more about how you can use the key security capabilities to build compliant application infrastructure.
6.2 Authentication
On this page
Authentication Methods (page 317)
Authentication Mechanisms (page 318)
Internal Authentication (page 318)
Authentication on Sharded Clusters (page 318)
Authentication is the process of verifying the identity of a client. When access control, i.e. authorization (page 331),
is enabled, MongoDB requires all clients to authenticate themselves in order to determine their access.
Although authentication and authorization (page 331) are closely connected, authentication is distinct from
authorization. Authentication verifies the identity of a user; authorization determines the verified user's access to
resources and operations.
MongoDB supports a number of authentication mechanisms (page 320) that clients can use to verify their identity.
These mechanisms allow MongoDB to integrate into your existing authentication system.
MongoDB supports multiple authentication mechanisms:
SCRAM-SHA-1 (page 321)
MongoDB Challenge and Response (MONGODB-CR) (page 322)
Changed in version 3.0: New challenge-response users created in 3.0 will use SCRAM-SHA-1. If using 2.6 user
data, MongoDB 3.0 will continue to use the MONGODB-CR.
x.509 Certificate Authentication (page 322).
In addition to supporting the aforementioned mechanisms, MongoDB Enterprise also supports the following
mechanisms:
LDAP proxy authentication (page 325), and
Kerberos authentication (page 325).
In addition to verifying the identity of a client, MongoDB can require members of replica sets and sharded clusters to
authenticate their membership (page 329) to their respective replica set or sharded cluster. See Internal Authentication
(page 329) for more information.
In sharded clusters, clients generally authenticate directly to the mongos instances. However, some maintenance
operations may require authenticating directly to a specific shard. For more information on authentication and sharded
clusters, see Sharded Cluster Users (page 319).
Users
On this page
User Management Interface (page 318)
Authentication Database (page 319)
Authenticate a User (page 319)
Centralized User Data (page 319)
Sharded Cluster Users (page 319)
Localhost Exception (page 320)
To add a user, MongoDB provides the db.createUser() method. When adding a user, you can assign roles
(page 331) to the user in order to grant privileges.
Note: The first user created in the database should be a user administrator who has the privileges to manage other
users. See Enable Client Access Control (page 344).
You can also update existing users, for example to change a password or to grant or revoke roles. For a full list of user
management methods, see user-management-methods.
Authentication Database
When adding a user, you create the user in a specific database. This database is the authentication database for the
user.
A user can have privileges across different databases; i.e. a user's privileges are not limited to the authentication
database. By assigning to the user roles in other databases, a user created in one database can have permissions to act
on other databases. For more information on roles, see Role-Based Access Control (page 331).
The user's name and authentication database serve as a unique identifier for that user. That is, if two users have the
same name but are created in different databases, they are two separate users. If you intend to have a single user with
permissions on multiple databases, create a single user with roles in the applicable databases instead of creating the
user multiple times in different databases.
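The advice above, one user with roles in several databases rather than several users with the same name, can be sketched by building the document that db.createUser() accepts. The helper name, user name, database names, and placeholder password are all hypothetical; read and readWrite are built-in roles. The block is plain JavaScript and does not connect to a server.

```javascript
// Build the argument for db.createUser(): one user, created once in a single
// authentication database, with roles in several databases.
function buildUserDocument(name, password, rolesByDatabase) {
  return {
    user: name,
    pwd: password,
    roles: Object.entries(rolesByDatabase).map(([db, role]) => ({ role, db })),
  };
}

// Hypothetical user and database names; replace the placeholder password.
const reportingUser = buildUserDocument("reportsUser", "<password>", {
  reporting: "readWrite",
  sales: "read",
});

console.log(JSON.stringify(reportingUser, null, 2));

// In the mongo shell, after switching to the intended authentication
// database (for example with `use reporting`), the document would be
// passed as: db.createUser(reportingUser)
```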
Authenticate a User
Sharded Cluster Users
To create users for a sharded cluster, connect to the mongos instance and add the users. Clients then authenticate
these users through the mongos instances.
Changed in version 2.6: MongoDB stores this sharded cluster user data in the admin database of the config servers.
Previously, the credentials for authenticating to a database on a sharded cluster resided on the primary shard (page 734)
for that database.
Shard Local Users
However, some maintenance operations, such as cleanupOrphaned, compact, and rs.reconfig(), require
direct connections to specific shards in a sharded cluster. To perform these operations,
you must connect directly to the shard and authenticate as a shard local administrative user.
To create a shard local administrative user, connect directly to the shard and create the user. MongoDB stores shard
local users in the admin database of the shard itself.
These shard local users are completely independent from the users added to the sharded cluster via mongos. Shard
local users are local to the shard and are inaccessible by mongos.
Direct connections to a shard should only be for shard-specific maintenance and configuration. In general, clients
should connect to the sharded cluster through the mongos.
Localhost Exception
The localhost exception allows you to enable access control and then create the first user in the system. With the
localhost exception, after you enable access control, connect to the localhost interface and create the first user in
the admin database. The first user must have privileges to create other users, such as a user with the userAdmin
(page 417) or userAdminAnyDatabase (page 421) role.
Changed in version 3.0: The localhost exception changed so that these connections only have access to create the first
user on the admin database. In previous versions, connections that gained access using the localhost exception had
unrestricted access to the MongoDB instance.
The localhost exception applies only when there are no users created in the MongoDB instance.
In the case of a sharded cluster, the localhost exception applies to each shard individually as well as to the cluster as
a whole. Once you create a sharded cluster and add a user administrator through the mongos instance, you must still
prevent unauthorized access to the individual shards. Follow one of the following steps for each shard in your cluster:
Create an administrative user, or
Disable the localhost exception at startup. To disable the localhost exception, set the
enableLocalhostAuthBypass parameter to 0.
Authentication Mechanisms
On this page
Default Authentication Mechanism (page 321)
Specify Authentication Mechanism (page 321)
To specify the authentication mechanism to use, set the authenticationMechanisms parameter for mongod
and mongos.
Clients specify the authentication mechanism in the db.auth() method. For the mongo shell and the MongoDB
tools, you can also specify the authentication mechanism from the command line.
Note: A driver upgrade is necessary to use the SCRAM-SHA-1 authentication mechanism if your current driver
version does not support SCRAM-SHA-1. See required driver versions (page 950) for details.
SCRAM-SHA-1 and MONGODB-CR User Credentials
SCRAM-SHA-1 is the default mechanism for MongoDB versions beginning with the 3.0 series. However, if you are
upgrading a MongoDB 2.6 instance that already has user credentials, MongoDB will continue to use MONGODB-CR
for challenge-response authentication until you upgrade the authentication schema.
4 https://tools.ietf.org/html/rfc5802
Even when using the MONGODB-CR authentication mechanism, clients and drivers that support MongoDB 3.0 features
(see Driver Compatibility Changes (page 942)) will use the SCRAM communication protocol. That is, the MONGODB-CR
authentication mechanism also implies SCRAM-SHA-1 (page 321).
For details on upgrading the authentication schema model to SCRAM-SHA-1, see Upgrade to SCRAM-SHA-1
(page 949).
Warning: The procedure to upgrade to SCRAM-SHA-1 discards the MONGODB-CR credentials used by 2.6. As
such, the procedure is irreversible, short of restoring from backups.
The procedure also disables MONGODB-CR as an authentication mechanism.
Additional Information
Blog Post: Improved Password-Based Authentication in MongoDB 3.0: SCRAM Explained (Part 1)5
Blog Post: Improved Password-Based Authentication in MongoDB 3.0: SCRAM Explained (Part 2)6
MONGODB-CR
On this page
MONGODB-CR and SCRAM-SHA-1 (page 322)
MONGODB-CR is a challenge-response mechanism that authenticates users through passwords. MONGODB-CR verifies
supplied user credentials against the user's name (page 426), password (page 426) and authentication
database (page 426). The authentication database is the database where the user was created, and the user's database
and the user's name together serve to identify the user.
Certificate Authority
For production use, your MongoDB deployment should use valid certificates generated and
signed by a single certificate authority. You or your organization can generate and maintain an independent certificate
authority, or use certificates generated by a third-party SSL vendor. Obtaining and managing certificates is beyond the
scope of this documentation.
Client x.509 Certificates
To authenticate to servers, clients can use x.509 certificates instead of usernames and
passwords.
Client Certificate Requirements
The client certificate must have the following properties:
A single Certificate Authority (CA) must issue the certificates for both the client and the server.
Client certificates must contain the following fields:
keyUsage = digitalSignature
extendedKeyUsage = clientAuth
Warning: If a client x.509 certificate's subject has the same O, OU, and DC combination as the Member
x.509 Certificate (page 356), the client will be identified as a cluster member and granted full permission on
the system.
MongoDB User and $external Database
To authenticate with a client certificate, you must first add the value
of the subject from the client certificate as a MongoDB user. Each unique x.509 client certificate corresponds to a
single MongoDB user; i.e. you cannot use a single client certificate to authenticate more than one MongoDB user.
Add the user in the $external database; i.e. the Authentication Database (page 319) is the $external database.
Authenticate
To authenticate using an x.509 client certificate, connect to MongoDB over a TLS/SSL connection; i.e.
include the --ssl and --sslPEMKeyFile command line options.
Then in the $external database, use db.auth() to authenticate the user corresponding to the client certificate
(page 323).
For an example, see Use x.509 Certificates to Authenticate Clients (page 353).
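The two steps above (add the certificate subject as a user in $external, then authenticate as that user) can be sketched as follows. This is a minimal illustration that only builds the documents involved. The subject string, the granted role, and the helper names are hypothetical, and the commented-out shell calls assume a session that is already connected over TLS/SSL and, for the first call, already authorized to create users.

```javascript
// Subject of a hypothetical client certificate.
const subject = "CN=myName,OU=myOrgUnit,O=myOrg,L=myLocality,ST=myState,C=myCountry";

// Command document that adds the certificate subject as a user.
// No password is stored; the role granted here is only an example.
function x509CreateUserCommand(certSubject, roles) {
  return { createUser: certSubject, roles: roles };
}

// Document for db.auth(): x.509 users are identified by the subject alone.
function x509AuthDocument(certSubject) {
  return { mechanism: "MONGODB-X509", user: certSubject };
}

const createCmd = x509CreateUserCommand(subject, [{ role: "read", db: "test" }]);
const authDoc = x509AuthDocument(subject);
console.log(JSON.stringify(createCmd));
console.log(JSON.stringify(authDoc));

// In a mongo shell started with --ssl and --sslPEMKeyFile:
//   db.getSiblingDB("$external").runCommand(createCmd)
//   db.getSiblingDB("$external").auth(authDoc)
```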
Member x.509 Certificates
For internal authentication, members of sharded clusters and replica sets can use x.509
certificates instead of keyfiles, which use the MONGODB-CR (page 322) authentication mechanism.
Member Certificate Requirements
The member certificate, used for internal authentication to verify membership
to the sharded cluster or a replica set, must have the following properties:
A single Certificate Authority (CA) must issue all the x.509 certificates for the members of a sharded cluster or
a replica set.
The Distinguished Name (DN), found in the member certificate's subject, must specify a non-empty value
for at least one of the following attributes: Organization (O), the Organizational Unit (OU) or the Domain
Component (DC).
The Organization attributes (Os), the Organizational Unit attributes (OUs), and the Domain Components (DCs)
must match those from the certificates for the other cluster members. To match, the certificate must match all
specifications of these attributes, or even the non-specification of these attributes. The order of the attributes
does not matter.
In the following example, the two DNs contain matching specifications for O, OU as well as the non-specification
of the DC attribute.
CN=host1,OU=Dept1,O=MongoDB,ST=NY,C=US
C=US, ST=CA, O=MongoDB, OU=Dept1, CN=host2
However, the following two DNs contain a mismatch for the OU attribute since one contains two OU specifications
and the other, only one specification.
CN=host1,OU=Dept1,OU=Sales,O=MongoDB
CN=host2,OU=Dept1,O=MongoDB
Either the Common Name (CN) or one of the Subject Alternative Name (SAN) entries must match the hostname
of the server, used by the other members of the cluster.
For example, the certificates for a cluster could have the following subjects:
subject= CN=<myhostname1>,OU=Dept1,O=MongoDB,ST=NY,C=US
subject= CN=<myhostname2>,OU=Dept1,O=MongoDB,ST=NY,C=US
subject= CN=<myhostname3>,OU=Dept1,O=MongoDB,ST=NY,C=US
If the certificate includes the Extended Key Usage (extendedKeyUsage) setting, the value must include
clientAuth (TLS Web Client Authentication).
extendedKeyUsage = clientAuth
You can also use a certificate that does not include the Extended Key Usage (EKU).
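The O, OU, and DC matching rule can be made concrete with a small comparison routine. The sketch below is an illustration of the rule as stated above, not MongoDB's implementation. Its DN parser is deliberately simplified (it splits on commas and does not handle escaped or quoted commas), and all function names are hypothetical.

```javascript
// Simplified DN parser: maps each attribute type to the list of its values.
// Assumes plain subjects like the examples above (no escaped commas).
function parseDN(dn) {
  const attrs = {};
  for (const part of dn.split(",")) {
    const idx = part.indexOf("=");
    if (idx < 0) continue;
    const type = part.slice(0, idx).trim().toUpperCase();
    const value = part.slice(idx + 1).trim();
    if (!attrs[type]) attrs[type] = [];
    attrs[type].push(value);
  }
  return attrs;
}

// Sorted values of one attribute type, or [] when the type is not specified.
function valuesOf(attrs, type) {
  return (attrs[type] || []).slice().sort();
}

// True when O, OU, and DC are specified identically, or are identically absent.
function memberAttributesMatch(dnA, dnB) {
  const a = parseDN(dnA);
  const b = parseDN(dnB);
  return ["O", "OU", "DC"].every(function (type) {
    return JSON.stringify(valuesOf(a, type)) === JSON.stringify(valuesOf(b, type));
  });
}

// True when at least one of O, OU, or DC has a non-empty value.
function hasMembershipAttribute(dn) {
  const attrs = parseDN(dn);
  return ["O", "OU", "DC"].some(function (type) {
    return valuesOf(attrs, type).some(function (v) { return v.length > 0; });
  });
}

console.log(memberAttributesMatch(
  "CN=host1,OU=Dept1,O=MongoDB,ST=NY,C=US",
  "C=US, ST=CA, O=MongoDB, OU=Dept1, CN=host2")); // true
console.log(memberAttributesMatch(
  "CN=host1,OU=Dept1,OU=Sales,O=MongoDB",
  "CN=host2,OU=Dept1,O=MongoDB")); // false
```

Sorting the values before comparing is what makes attribute order irrelevant, while the length of each list still distinguishes one OU from two.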
MongoDB Configuration
To specify x.509 for internal authentication, in addition to the other SSL configurations
appropriate for your deployment, for each member of the replica set or sharded cluster, include either:
security.clusterAuthMode and net.ssl.clusterFile if using a configuration file, or
--clusterAuthMode and --sslClusterFile command line options.
Member Certificate and PEMKeyFile
To configure MongoDB for client certificate authentication, the mongod
and mongos specify a PEMKeyFile to prove their identity to clients, either through the net.ssl.PEMKeyFile setting
in the configuration file or the --sslPEMKeyFile command line option.
If no clusterFile certificate is specified for internal member authentication, MongoDB will attempt to use the
PEMKeyFile certificate for member authentication. To use the PEMKeyFile certificate for internal authentication
as well as for client authentication, the PEMKeyFile certificate must either:
Omit extendedKeyUsage or
Specify extendedKeyUsage values that include clientAuth in addition to serverAuth.
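The two alternatives above reduce to a simple check on the certificate's extendedKeyUsage values. The following is a minimal sketch of that rule. The function name is hypothetical, and the input is assumed to be a list of usage names already extracted from the certificate by some other tool (or undefined when the extension is omitted).

```javascript
// eku: array of extendedKeyUsage names, or undefined when the extension is omitted.
// Returns true when one PEMKeyFile certificate could serve both client-facing
// and internal member authentication under the rule stated above.
function usableForClientAndMemberAuth(eku) {
  if (eku === undefined) {
    return true; // extension omitted entirely
  }
  return eku.includes("clientAuth") && eku.includes("serverAuth");
}

console.log(usableForClientAndMemberAuth(undefined));                    // true
console.log(usableForClientAndMemberAuth(["serverAuth"]));               // false
console.log(usableForClientAndMemberAuth(["serverAuth", "clientAuth"])); // true
```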
For an example of x.509 internal authentication, see Use x.509 Certificate for Membership Authentication (page 355).
On this page
Kerberos Authentication (page 325)
LDAP Proxy Authority Authentication (page 325)
In addition to the authentication mechanisms offered, MongoDB Enterprise provides integration with the following
authentication mechanisms.
Kerberos Authentication
MongoDB Enterprise7 supports authentication using a Kerberos service. Kerberos is an industry standard
authentication protocol for large client/server systems.
To use MongoDB with Kerberos, you must have a properly configured Kerberos deployment, configured Kerberos
service principals (page 326) for MongoDB, and added Kerberos user principal (page 326) to MongoDB.
For more information on Kerberos and MongoDB, see:
Kerberos Authentication (page 325),
Configure MongoDB with Kerberos Authentication on Linux (page 359) and
Configure MongoDB with Kerberos Authentication on Windows (page 363).
LDAP Proxy Authority Authentication
MongoDB Enterprise (excluding Windows version)8 supports proxy authentication through a Lightweight Directory
Access Protocol (LDAP) service.
LDAP support for user authentication requires proper configuration of the saslauthd daemon process as well as
the MongoDB server.
For more information on LDAP and MongoDB, see
LDAP Proxy Authority Authentication (page 328),
Authenticate Using SASL and LDAP with OpenLDAP (page 370) and
Authenticate Using SASL and LDAP with ActiveDirectory (page 367).
Kerberos Authentication
On this page
Overview (page 326)
Kerberos Components and MongoDB (page 326)
Operational Considerations (page 327)
Kerberized MongoDB Environments (page 327)
Additional Resources (page 328)
Overview
MongoDB Enterprise provides support for Kerberos authentication of MongoDB clients to mongod and
mongos. Kerberos is an industry standard authentication protocol for large client/server systems. Kerberos allows
MongoDB and applications to take advantage of existing authentication infrastructure and processes.
Principals
In a Kerberos-based system, every participant in the authenticated communication is known as a principal,
and every principal must have a unique name.
Principals belong to administrative units called realms. For each realm, the Kerberos Key Distribution Center (KDC)
maintains a database of the realm's principals and the principals' associated secret keys.
For a client-server authentication, the client requests from the KDC a ticket for access to a specific asset. The KDC
uses the client's secret and the server's secret to construct the ticket, which allows the client and server to mutually
authenticate each other while keeping the secrets hidden.
For the configuration of MongoDB for Kerberos support, two kinds of principal names are of interest: user principals
(page 326) and service principals (page 326).
User Principal
To authenticate using Kerberos, you must add the Kerberos user principals to MongoDB in the
$external database. User principal names have the form:
<username>@<KERBEROS REALM>
For every user you want to authenticate using Kerberos, you must create a corresponding user in MongoDB in the
$external database.
For examples of adding a user to MongoDB as well as authenticating as that user, see Configure MongoDB with
Kerberos Authentication on Linux (page 359) and Configure MongoDB with Kerberos Authentication on Windows
(page 363).
See also:
Configure Users and Roles (page 373) for general information regarding creating and managing users in MongoDB.
Service Principal
Every MongoDB mongod and mongos instance (or mongod.exe or mongos.exe on Windows)
must have an associated service principal. Service principal names have the form:
<service>/<fully qualified domain name>@<KERBEROS REALM>
For MongoDB, the <service> defaults to mongodb. For example, if m1.example.com is a MongoDB server,
and example.com maintains the EXAMPLE.COM Kerberos realm, then m1 should have the service principal name
mongodb/m1.example.com@EXAMPLE.COM.
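The two name forms can be assembled mechanically from their parts. The helpers below are a small sketch with hypothetical names. They perform string formatting only and do not contact a KDC or validate that the realm or host exists.

```javascript
// <username>@<KERBEROS REALM>
function userPrincipal(username, realm) {
  return username + "@" + realm;
}

// <service>/<fully qualified domain name>@<KERBEROS REALM>
// The service part defaults to "mongodb" when not given.
function servicePrincipal(fqdn, realm, service) {
  const name = service === undefined ? "mongodb" : service;
  return name + "/" + fqdn + "@" + realm;
}

console.log(servicePrincipal("m1.example.com", "EXAMPLE.COM"));
// mongodb/m1.example.com@EXAMPLE.COM
console.log(userPrincipal("application", "EXAMPLE.COM"));
// application@EXAMPLE.COM
```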
To specify a different value for <service>, use serviceName during the startup of mongod or mongos (or
mongod.exe or mongos.exe). The mongo shell or other clients may also specify a different service principal name
using serviceName.
Each instance must be reachable over the network using the fully qualified domain name (FQDN) part of its
service principal name.
By default, Kerberos attempts to identify hosts using the /etc/krb5.conf file before using DNS to resolve hosts.
On Windows, if running MongoDB as a service, see Assign Service Principal Name to MongoDB Windows Service
(page 365).
Linux Keytab Files
Linux systems can store Kerberos authentication keys for a service principal (page 326) in
keytab files. Each Kerberized mongod and mongos instance running on Linux must have access to a keytab file
containing keys for its service principal (page 326).
To keep keytab files secure, use file permissions that restrict access to only the user that runs the mongod or mongos
process.
Tickets
On Linux, MongoDB clients can use Kerberos's kinit program to initialize a credential cache for
authenticating the user principal to servers.
Windows Active Directory
Unlike on Linux systems, mongod and mongos instances running on Windows do
not require access to keytab files. Instead, the mongod and mongos instances read their server credentials from a
credential store specific to the operating system.
However, from the Windows Active Directory, you can export a keytab file for use on Linux systems. See Ktpass9 for
more information.
Authenticate With Kerberos
To configure MongoDB for Kerberos support and authenticate, see Configure MongoDB
with Kerberos Authentication on Linux (page 359) and Configure MongoDB with Kerberos Authentication on
Windows (page 363).
Operational Considerations
The HTTP Console
The MongoDB HTTP Console10 interface does not support Kerberos authentication.
Deprecated since version 3.2: HTTP interface for MongoDB
DNS
Each host that runs a mongod or mongos instance must have both A and PTR DNS records to provide forward
and reverse lookup.
Without A and PTR DNS records, the host cannot resolve the components of the Kerberos domain or the Key
Distribution Center (KDC).
System Time Synchronization
To successfully authenticate, the system time for each mongod and mongos instance
must be within 5 minutes of the system time of the other hosts in the Kerberos infrastructure.
Node.js15
PHP16
Python17
Ruby18
Use with Additional MongoDB Authentication Mechanism
Although MongoDB supports the use of Kerberos
authentication with other authentication mechanisms, only add the other mechanisms as necessary. See
the Incorporate Additional Authentication Mechanisms section in Configure MongoDB with Kerberos
Authentication on Linux (page 359) and Configure MongoDB with Kerberos Authentication on Windows
(page 363) for details.
Additional Resources
MongoDB LDAP and Kerberos Authentication with Dell (Quest) Authentication Services19
MongoDB with Red Hat Enterprise Linux Identity Management and Kerberos20
LDAP Proxy Authority Authentication
On this page
Considerations (page 328)
MongoDB Configuration (page 328)
LDAP User (page 329)
Additional Information (page 329)
MongoDB Enterprise21 supports proxy authentication through a Lightweight Directory Access Protocol (LDAP)
service.
Considerations
MongoDB Enterprise for Windows does not include LDAP support for authentication. However,
MongoDB Enterprise for Linux supports using LDAP authentication with an ActiveDirectory server.
MongoDB does not support LDAP authentication in mixed sharded cluster deployments that contain both version 2.4
and version 2.6 shards. See Upgrade MongoDB to 2.6 (page 1001) for upgrade instructions.
Use secure encrypted or trusted connections between clients and the server, as well as between saslauthd and the
LDAP server. The LDAP server uses the SASL PLAIN mechanism, sending and receiving data in plain text. You
should use only a trusted channel such as a VPN, a connection encrypted with TLS/SSL, or a trusted wired network.
MongoDB Configuration
To configure the MongoDB server to use the LDAP authentication mechanism, use the following
command line options:
--auth to enable access control,
--setParameter with authenticationMechanisms set to PLAIN, and
--setParameter with saslauthdPath set to the path to the Unix-domain Socket of the saslauthd instance.
15 http://mongodb.github.io/node-mongodb-native/2.0/tutorials/enterprise_features/
16 http://php.net/manual/en/mongoclient.construct.php
17 http://api.mongodb.org/python/current/examples/authentication.html
18 https://docs.mongodb.org/ecosystem/tutorial/ruby-driver-tutorial/#gssapi-kerberos-mechanism
19 https://www.mongodb.com/blog/post/mongodb-ldap-and-kerberos-authentication-dell-quest-authentication-services?jmp=docs
20 http://docs.mongodb.org/ecosystem/tutorial/manage-red-hat-enterprise-linux-identity-management?jmp=docs
21 http://www.mongodb.com/products/mongodb-enterprise?jmp=docs
Or, if using the YAML configuration file, use the following settings:
security.authorization set to enabled,
setParameter.authenticationMechanisms set to PLAIN, and
setParameter.saslauthdPath set to the path to the Unix-domain Socket of the saslauthd instance.
LDAP User
In order to authenticate a user with the LDAP authentication mechanism, add a corresponding user
(page 318) to the $external database. You do not need to save the user's password in MongoDB.
The $external database is the authentication database (page 319) for the LDAP user. To authenticate the LDAP
user, you must authenticate against the $external database. When authenticating, specify PLAIN for the
authentication mechanism.
LDAP authentication requires that MongoDB forward the user's password in plain text. As such, you must specify
digestPassword set to false during authentication.
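The requirements above (PLAIN mechanism, $external database, digestPassword set to false) translate into a credentials document like the one sketched below. The helper name, user name, and password are hypothetical placeholders. The commented-out call assumes a mongo shell session connected over a trusted or encrypted channel, since the password travels in plain text.

```javascript
// Hypothetical helper: builds the document for db.auth() for an LDAP-backed user.
function ldapAuthDocument(user, cleartextPassword) {
  return {
    mechanism: "PLAIN",          // SASL PLAIN, as required for LDAP proxy authentication
    user: user,
    pwd: cleartextPassword,      // forwarded as-is to saslauthd
    digestPassword: false        // the password must not be digested by the client
  };
}

const ldapDoc = ldapAuthDocument("exampleLdapUser", "examplePassword");
console.log(JSON.stringify(ldapDoc));

// In the mongo shell:
//   db.getSiblingDB("$external").auth(ldapDoc)
```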
Additional Information
For information on configuring MongoDB to use LDAP and authenticating users using
LDAP, see:
Authenticate Using SASL and LDAP with OpenLDAP (page 370) and
Authenticate Using SASL and LDAP with ActiveDirectory (page 367).
Internal Authentication
On this page
Keyfiles (page 329)
x.509 (page 330)
You can authenticate members of replica sets and sharded clusters. For the internal authentication of the members,
MongoDB can use either keyfiles or x.509 (page 322) certificates.
Note: Enabling internal authentication also enables client authorization (page 331).
Keyfiles
Keyfiles use the SCRAM-SHA-1 (page 321) challenge and response authentication mechanism. The contents of the keyfiles
serve as the shared password for the members. A key's length must be between 6 and 1024 characters and may only
contain characters in the base64 set.
MongoDB strips whitespace characters (e.g. x0d, x09, and x20) for cross-platform convenience. As a result, the
following operations produce identical keys:
echo -e "my secret key" > key1
echo -e "my secret key\n" > key2
echo -e "my    secret    key" > key3
echo -e "my\r\nsecret\r\nkey\r\n" > key4
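To see why such files yield the same key, the whitespace stripping can be imitated in a few lines. This is a rough sketch, not MongoDB's code. It assumes that line feeds are stripped along with the carriage return, tab, and space characters named above, and that the length limit applies after stripping. The function names and sample strings are hypothetical.

```javascript
// Removes carriage returns, line feeds, tabs, and spaces from keyfile contents.
function normalizeKey(contents) {
  return contents.replace(/[\r\n\t ]/g, "");
}

// Checks the stated constraints: 6 to 1024 characters, base64 character set only.
function isValidKey(contents) {
  const key = normalizeKey(contents);
  return key.length >= 6 && key.length <= 1024 && /^[A-Za-z0-9+\/=]+$/.test(key);
}

// Strings resembling what the echo commands would write (echo appends a newline).
const variants = [
  "my secret key\n",
  "my secret key\n\n",
  "my    secret    key\n",
  "my\r\nsecret\r\nkey\r\n\n"
];

for (const contents of variants) {
  console.log(JSON.stringify(normalizeKey(contents)), isValidKey(contents));
}
// Each line prints "mysecretkey" true
```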
On UNIX systems, the keyfile must not have group or world permissions. On Windows systems, keyfile permissions
are not checked.
The content of the keyfile must be the same on all mongod and mongos instances that connect to each other. You
must store the keyfile on each member of the replica set or sharded cluster.
To specify the keyfile, use the security.keyFile setting or --keyFile command line option.
For an example of keyfile internal authentication, see Enable Internal Authentication (page 347).
x.509
Members of a replica set or sharded cluster can use x.509 certificates for internal authentication instead of using
keyfiles. MongoDB supports x.509 certificate authentication for use with a secure TLS/SSL connection.
Member Certificate Requirements
The member certificate, used for internal authentication to verify membership
to the sharded cluster or a replica set, must have the following properties:
A single Certificate Authority (CA) must issue all the x.509 certificates for the members of a sharded cluster or
a replica set.
The Distinguished Name (DN), found in the member certificate's subject, must specify a non-empty value
for at least one of the following attributes: Organization (O), the Organizational Unit (OU) or the Domain
Component (DC).
The Organization attributes (Os), the Organizational Unit attributes (OUs), and the Domain Components (DCs)
must match those from the certificates for the other cluster members. To match, the certificate must match all
specifications of these attributes, or even the non-specification of these attributes. The order of the attributes
does not matter.
In the following example, the two DNs contain matching specifications for O, OU as well as the non-specification
of the DC attribute.
CN=host1,OU=Dept1,O=MongoDB,ST=NY,C=US
C=US, ST=CA, O=MongoDB, OU=Dept1, CN=host2
However, the following two DNs contain a mismatch for the OU attribute since one contains two OU specifications
and the other, only one specification.
CN=host1,OU=Dept1,OU=Sales,O=MongoDB
CN=host2,OU=Dept1,O=MongoDB
Either the Common Name (CN) or one of the Subject Alternative Name (SAN) entries must match the hostname
of the server, used by the other members of the cluster.
For example, the certificates for a cluster could have the following subjects:
subject= CN=<myhostname1>,OU=Dept1,O=MongoDB,ST=NY,C=US
subject= CN=<myhostname2>,OU=Dept1,O=MongoDB,ST=NY,C=US
subject= CN=<myhostname3>,OU=Dept1,O=MongoDB,ST=NY,C=US
If the certificate includes the Extended Key Usage (extendedKeyUsage) setting, the value must include
clientAuth (TLS Web Client Authentication).
extendedKeyUsage = clientAuth
You can also use a certificate that does not include the Extended Key Usage (EKU).
MongoDB Configuration
To specify x.509 for internal authentication, in addition to the other SSL configurations
appropriate for your deployment, for each member of the replica set or sharded cluster, include either:
security.clusterAuthMode and net.ssl.clusterFile if using a configuration file, or
--clusterAuthMode and --sslClusterFile command line options.
Member Certificate and PEMKeyFile
To configure MongoDB for client certificate authentication, the mongod
and mongos specify a PEMKeyFile to prove their identity to clients, either through the net.ssl.PEMKeyFile setting
in the configuration file or the --sslPEMKeyFile command line option.
If no clusterFile certificate is specified for internal member authentication, MongoDB will attempt to use the
PEMKeyFile certificate for member authentication. To use the PEMKeyFile certificate for internal authentication
as well as for client authentication, the PEMKeyFile certificate must either:
Omit extendedKeyUsage or
Specify extendedKeyUsage values that include clientAuth in addition to serverAuth.
For an example of x.509 internal authentication, see Use x.509 Certificate for Membership Authentication (page 355).
To upgrade from keyfile internal authentication to x.509 internal authentication, see Upgrade from Keyfile
Authentication to x.509 Authentication (page 357).
On this page
Enable Access Control (page 331)
Roles (page 331)
Users and Roles (page 332)
Built-In Roles and User-Defined Roles (page 332)
MongoDB employs Role-Based Access Control (RBAC) to govern access to a MongoDB system. A user is granted
one or more roles (page 331) that determine the user's access to database resources and operations. Outside of role
assignments, the user has no access to the system.
MongoDB does not enable access control by default. You can enable authorization using the --auth option or the
security.authorization setting. Enabling internal authentication (page 329) also enables client
authorization.
Once access control is enabled, users must authenticate (page 317) themselves.
6.3.2 Roles
A role grants privileges to perform the specified actions (page 429) on a resource (page 427). Each privilege is either
specified explicitly in the role or inherited from another role or both.
Privileges
A privilege consists of a specified resource and the actions permitted on the resource.
A resource (page 427) is either a database, collection, set of collections, or the cluster. If the resource is the cluster,
the affiliated actions affect the state of the system rather than a specific database or collection. For information on the
resource documents, see Resource Document (page 427).
An action (page 429) specifies the operation allowed on the resource. For available actions see Privilege Actions
(page 429).
Inherited Privileges
A role can include one or more existing roles in its definition, in which case the role inherits all the privileges of the
included roles.
A role can inherit privileges from other roles in its database. A role created on the admin database can inherit
privileges from roles in any database.
You can view the privileges for a role by issuing the rolesInfo command with the showPrivileges and
showBuiltinRoles fields both set to true.
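Privilege inheritance can be pictured as a walk over the roles that a role includes. The sketch below works on a hypothetical in-memory set of role definitions (the role names and the "db.role" keying are illustrative only) and is not how the server stores or resolves roles. On a live deployment, the rolesInfo command with showPrivileges set to true reports this information.

```javascript
// Hypothetical role definitions keyed by "<db>.<role>"; not read from a server.
const roleDefinitions = {
  "products.inventoryReader": {
    privileges: [
      { resource: { db: "products", collection: "inventory" }, actions: ["find"] }
    ],
    roles: []
  },
  "products.inventoryEditor": {
    privileges: [
      { resource: { db: "products", collection: "inventory" }, actions: ["update", "insert"] }
    ],
    roles: [{ role: "inventoryReader", db: "products" }]
  }
};

// Collects a role's own privileges plus those of every role it includes.
// The 'seen' set guards against revisiting a role.
function effectivePrivileges(roleName, db, seen) {
  seen = seen || new Set();
  const key = db + "." + roleName;
  if (seen.has(key) || !roleDefinitions[key]) {
    return [];
  }
  seen.add(key);
  let result = roleDefinitions[key].privileges.slice();
  for (const inherited of roleDefinitions[key].roles) {
    result = result.concat(effectivePrivileges(inherited.role, inherited.db, seen));
  }
  return result;
}

const privs = effectivePrivileges("inventoryEditor", "products");
console.log(JSON.stringify(privs));

// Against a server, in the mongo shell:
//   db.getSiblingDB("products").runCommand({ rolesInfo: "inventoryEditor", showPrivileges: true })
```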
You can assign roles to users during user creation. You can also update existing users to grant or revoke roles. For
a full list of user management methods, see user-management-methods.
A user assigned a role receives all the privileges of that role. A user can have multiple roles. By assigning to the user
roles in various databases, a user created in one database can have permissions to act on other databases.
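As an illustration of roles that span databases, a user document can list roles on several databases at once. The user name, password, database names, and helper below are hypothetical, and the commented-out call assumes a mongo shell session with the privileges to create users.

```javascript
// Hypothetical user document: created in one database, with roles on two databases.
const userDoc = {
  user: "reportingUser",
  pwd: "examplePassword",
  roles: [
    { role: "readWrite", db: "products" },
    { role: "read", db: "sales" }
  ]
};

// Lists the distinct databases on which the user document grants a role.
function databasesWithRoles(doc) {
  const names = doc.roles.map(function (r) { return r.db; });
  return Array.from(new Set(names)).sort();
}

console.log(databasesWithRoles(userDoc).join(", ")); // products, sales

// In the mongo shell, creating the user in the products database:
//   db.getSiblingDB("products").createUser(userDoc)
```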
Note: The first user created in the database should be a user administrator who has the privileges to manage other
users. See Enable Client Access Control (page 344).
MongoDB provides built-in roles (page 332) that provide a set of privileges commonly needed in a database system.
If these built-in roles cannot provide the desired set of privileges, MongoDB provides methods to create and modify
user-defined roles (page 335).
Built-In Roles
On this page
Database User Roles (page 333)
Database Administration Roles (page 333)
Cluster Administration Roles (page 333)
Backup and Restoration Roles (page 334)
All-Database Roles (page 334)
Superuser Roles (page 334)
Internal Role (page 335)
MongoDB provides built-in roles that provide the different levels of access commonly needed in a database system.
Built-in database user roles (page 415) and database administration roles (page 416) exist in each database. The
admin database contains additional roles.
This page provides a brief description of the built-in roles. For the specific privileges granted by each role, see the
Built-In Roles (page 414) reference page.
Cluster Administration Roles
The admin database includes the following roles for administering the whole system rather than a specific database.
These roles include but are not limited to replica set and sharded cluster administrative functions.
Role: Short Description
clusterAdmin (page 417): Provides the greatest cluster-management access. This role combines the privileges granted
by the clusterManager (page 417), clusterMonitor (page 418), and hostManager (page 419) roles. Additionally, the
role provides the dropDatabase (page 432) action.
clusterManager (page 417): Provides management and monitoring actions on the cluster. A user with this role can
access the config and local databases, which are used in sharding and replication, respectively. For the specific
privileges granted by the role, see clusterManager (page 417).
clusterMonitor (page 418): Provides read-only access to monitoring tools, such as the MongoDB Cloud Manager22 and
Ops Manager23 monitoring agent. For the specific privileges granted by the role, see clusterMonitor (page 418).
hostManager (page 419): Provides the ability to monitor and manage servers. For the specific privileges granted by
the role, see hostManager (page 419).
22 https://cloud.mongodb.com/?jmp=docs
Backup and Restoration Roles
The admin database includes the following roles for backing up and restoring data:
Role: Short Description
backup (page 420): Provides privileges needed to back up data. This role provides sufficient privileges to use the
MongoDB Cloud Manager24 backup agent, Ops Manager25 backup agent, or to use mongodump. For the specific privileges
granted by the role, see backup (page 420).
restore (page 420): Provides privileges needed to restore data with mongorestore without the --oplogReplay
option or without system.profile collection data. For the specific privileges granted by the role, see restore
(page 420).
All-Database Roles
The admin database provides the following roles that apply to all databases in a mongod instance and are roughly
equivalent to their single-database equivalents:
Role: Short Description
readAnyDatabase (page 421): Provides the same read-only permissions as read (page 415), except it applies to all
databases in the cluster. The role also provides the listDatabases (page 434) action on the cluster as a whole.
For the specific privileges granted by the role, see readAnyDatabase (page 421).
readWriteAnyDatabase (page 421): Provides the same read and write permissions as readWrite (page 415), except it
applies to all databases in the cluster. The role also provides the listDatabases (page 434) action on the cluster
as a whole. For the specific privileges granted by the role, see readWriteAnyDatabase (page 421).
userAdminAnyDatabase (page 421): Provides the same access to user administration operations as userAdmin
(page 417), except it applies to all databases in the cluster. Since the userAdminAnyDatabase (page 421) role allows
users to grant any privilege to any user, including themselves, the role also indirectly provides superuser
(page 422) access. For the specific privileges granted by the role, see userAdminAnyDatabase (page 421).
dbAdminAnyDatabase (page 422): Provides the same access to database administration operations as dbAdmin
(page 416), except it applies to all databases in the cluster. The role also provides the listDatabases (page 434)
action on the cluster as a whole. For the specific privileges granted by the role, see dbAdminAnyDatabase (page 422).
Superuser Roles
Internal Role
User-Defined Roles
On this page
Role Management Interface (page 335)
Scope (page 335)
Centralized Role Data (page 335)
To add a role, MongoDB provides the db.createRole() method. MongoDB also provides methods to update
existing user-defined roles. For a full list of role management methods, see role-management-methods.
Scope
When adding a role, you create the role in a specific database. MongoDB uses the combination of the database and
the role name to uniquely define a role.
Except for roles created in the admin database, a role can only include privileges that apply to its database and can
only inherit from other roles in its database.
A role created in the admin database can include privileges that apply to the admin database, other databases or to
the cluster (page 428) resource, and can inherit from roles in other databases as well as the admin database.
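The scope rules above can be expressed as a pre-flight check on a role definition before it is created. The sketch below is illustrative only: the function name is hypothetical, it looks only at the db and cluster fields of each resource document, and the server performs its own validation when db.createRole() is called.

```javascript
// Returns a list of scope problems for a role that would be created in roleDb.
// An empty list means the definition respects the scope rules described above.
function scopeProblems(roleDb, privileges, inheritedRoles) {
  if (roleDb === "admin") {
    return []; // roles in admin may span databases and the cluster resource
  }
  const problems = [];
  for (const p of privileges) {
    if (p.resource.cluster === true || p.resource.db !== roleDb) {
      problems.push("privilege outside " + roleDb + ": " + JSON.stringify(p.resource));
    }
  }
  for (const r of inheritedRoles) {
    if (r.db !== roleDb) {
      problems.push("inherits from another database: " + r.db + "." + r.role);
    }
  }
  return problems;
}

const ok = scopeProblems(
  "products",
  [{ resource: { db: "products", collection: "inventory" }, actions: ["find"] }],
  [{ role: "read", db: "products" }]
);
const bad = scopeProblems(
  "products",
  [{ resource: { cluster: true }, actions: ["listDatabases"] }],
  [{ role: "read", db: "sales" }]
);
console.log(ok.length, bad.length); // 0 2
```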
MongoDB stores all role information in the system.roles (page 423) collection in the admin database.
Do not access this collection directly but instead use the role management commands to view and edit custom roles.
On this page
Privileges and Scope (page 336)
Additional Information (page 336)
Collection-level access control allows administrators to grant users privileges that are scoped to specific collections.
Administrators can implement collection-level access control through user-defined roles (page 335). By creating a role
with privileges (page 331) that are scoped to a specific collection in a particular database, administrators can provision
users with roles that grant privileges on a collection level.
A privilege consists of actions (page 429) and the resources (page 427) upon which the actions are permissible; i.e.
the resources define the scope of the actions for that privilege.
By specifying both the database and the collection in the resource document (page 427) for a privilege, administrators
can limit the privilege actions just to a specific collection in a specific database. Each privilege action in a role can be
scoped to a different collection.
For example, a user-defined role can contain the following privileges:
privileges: [
{ resource: { db: "products", collection: "inventory" }, actions: [ "find", "update", "insert" ] },
{ resource: { db: "products", collection: "orders" }, actions: [ "find" ] }
]
The first privilege scopes its actions to the inventory collection of the products database. The second privilege
scopes its actions to the orders collection of the products database.
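To make the scoping concrete, the privileges from the example can be evaluated against a requested operation. The check below is a simplified sketch with hypothetical names. It only handles exact database and collection matches and does not model resource documents that use empty strings to cover multiple databases or collections. The role name in the comment is also hypothetical.

```javascript
// The privileges array from the example above.
const privileges = [
  { resource: { db: "products", collection: "inventory" }, actions: ["find", "update", "insert"] },
  { resource: { db: "products", collection: "orders" }, actions: ["find"] }
];

// True when some privilege names exactly this database and collection
// and lists the requested action.
function isPermitted(privs, db, collection, action) {
  return privs.some(function (p) {
    return p.resource.db === db &&
      p.resource.collection === collection &&
      p.actions.includes(action);
  });
}

console.log(isPermitted(privileges, "products", "inventory", "update")); // true
console.log(isPermitted(privileges, "products", "orders", "update"));    // false

// A role carrying these privileges could be created in the mongo shell with:
//   db.getSiblingDB("products").createRole({ role: "inventoryClerk", privileges: privileges, roles: [] })
```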
Additional Information
For more information on user-defined roles and the MongoDB authorization model, see Role-Based Access Control
(page 331). For a tutorial on creating user-defined roles, see Manage User and Roles (page 373).
6.4 Encryption
On this page
Transport Encryption (page 336)
Encryption at Rest (page 337)
You can use TLS/SSL (Transport Layer Security/Secure Sockets Layer) to encrypt all of MongoDB's network traffic.
TLS/SSL ensures that MongoDB network traffic is only readable by the intended client.
See Transport Encryption (page 337) for more information.
There are two broad classes of approaches to encrypting data at rest with MongoDB: Application Level Encryption
and Storage Encryption. You can use these solutions together or independently.
New in version 3.2: MongoDB Enterprise 3.2 introduces a native encryption option for the WiredTiger storage engine.
This feature allows MongoDB to encrypt data files such that only parties with the decryption key can decode and read
the data.
See Encryption At Rest (page 338) for more information.
Transport Encryption
On this page
TLS/SSL (page 337)
Certificates (page 337)
Identity Verification (page 337)
FIPS Mode (page 338)
TLS/SSL
MongoDB supports TLS/SSL (Transport Layer Security/Secure Sockets Layer) to encrypt all of MongoDB's network
traffic. TLS/SSL ensures that MongoDB network traffic is only readable by the intended client.
MongoDB's TLS/SSL implementation uses OpenSSL libraries. MongoDB's SSL encryption only allows use of strong
SSL ciphers with a minimum of 128-bit key length for all connections.
Certificates
Before you can use SSL, you must have a .pem file containing a public key certificate and its associated private key.
MongoDB can use any valid SSL certificate, whether issued by a certificate authority or self-signed. If you use a
self-signed certificate, the communications channel is encrypted but the server's identity is not validated. This
prevents eavesdropping on the connection but leaves you vulnerable to a man-in-the-middle attack. A certificate
signed by a trusted certificate authority permits MongoDB drivers to verify the server's identity.
For example, see TLS/SSL Configuration for Clients (page 386).
Identity Verification
In addition to encrypting connections, SSL allows for authentication using certificates, both for client authentication
(page 317) and for internal authentication (page 329) of members of replica sets and sharded clusters.
For more information, see:
Configure mongod and mongos for TLS/SSL (page 382)
TLS/SSL Configuration for Clients (page 386)
Use x.509 Certificates to Authenticate Clients (page 353)
Use x.509 Certificate for Membership Authentication (page 355)
FIPS Mode
Enterprise Feature
Available in MongoDB Enterprise only.
The Federal Information Processing Standard (FIPS) is a U.S. government computer security standard used to certify
software modules and libraries that encrypt and decrypt data securely. You can configure MongoDB to run with a
FIPS 140-2 certified library for OpenSSL. Configure FIPS to run by default or as needed from the command line.
For an example, see Configure MongoDB for FIPS (page 391).
Encryption At Rest
On this page
Encrypted Storage Engine (page 338)
Application Level Encryption (page 339)
Third Party Storage Encryption (page 339)
Encryption at rest, when used in conjunction with transport encryption and good security policies that protect relevant
accounts, passwords, and encryption keys, can help ensure compliance with security and privacy standards, including
HIPAA, PCI-DSS, and FERPA.
Encrypted Storage Engine
Enterprise Feature
Available in MongoDB Enterprise only.
MongoDB Enterprise 3.2 introduces a native encryption option for the WiredTiger storage engine. This feature allows
MongoDB to encrypt data files such that only parties with the decryption key can decode and read the data.
Encryption Process
If encryption is enabled, the default encryption mode that MongoDB Enterprise uses is AES256-CBC (256-bit
Advanced Encryption Standard in Cipher Block Chaining mode) via OpenSSL. AES-256 uses a symmetric key; i.e. the
same key both encrypts and decrypts the text. MongoDB Enterprise also supports the authenticated encryption mode
AES256-GCM (256-bit Advanced Encryption Standard in Galois/Counter Mode). FIPS mode encryption is also available.
The data encryption process includes:
Generating a master key.
Generating keys for each database.
Encrypting data with the database keys.
Encrypting the database keys with the master key.
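The four steps above follow a pattern often called envelope encryption. The sketch below walks through them with
the Node.js crypto module and AES-256-CBC. It is a conceptual illustration only, not the encrypted storage engine's
implementation: the encrypt and decrypt helpers, the products database name, and the sample record are all
assumptions made for the example.

```javascript
const crypto = require("crypto");

// Encrypt a buffer with AES-256-CBC; a fresh random IV is kept with the ciphertext.
function encrypt(key, plaintext) {
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv("aes-256-cbc", key, iv);
  const data = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { iv: iv, data: data };
}

// Reverse of encrypt(): the same symmetric key decrypts the data.
function decrypt(key, box) {
  const decipher = crypto.createDecipheriv("aes-256-cbc", key, box.iv);
  return Buffer.concat([decipher.update(box.data), decipher.final()]);
}

// 1. Generate a master key (managed outside the server's data files).
const masterKey = crypto.randomBytes(32);
// 2. Generate a key for each database ("products" is a hypothetical name).
const databaseKeys = { products: crypto.randomBytes(32) };
// 3. Encrypt data with the database key.
const record = encrypt(databaseKeys.products, Buffer.from('{"item":"abc"}', "utf8"));
// 4. Encrypt the database key with the master key; only this wrapped form is stored.
const wrappedKey = encrypt(masterKey, databaseKeys.products);

// Reading reverses the steps: unwrap the database key, then decrypt the data.
const unwrapped = decrypt(masterKey, wrappedKey);
console.log(decrypt(unwrapped, record).toString("utf8")); // {"item":"abc"}
```

In this arrangement, rotating or protecting a single master key is enough to control access to every database key,
which is why only the master key needs external management.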
The encryption occurs transparently in the storage layer; i.e. all data files are fully encrypted from a filesystem
perspective, and data only exists in an unencrypted state in memory and during transmission.
To encrypt all of MongoDB's network traffic, you can use TLS/SSL (Transport Layer Security/Secure Sockets Layer).
See Configure mongod and mongos for TLS/SSL (page 382) and TLS/SSL Configuration for Clients (page 386).
Key Management
Important: Secure management of the encryption keys is critical.
The database keys are internal to the server and are only paged to disk in an encrypted format. MongoDB never pages
the master key to disk under any circumstances.
Only the master key is external to the server (i.e. kept separate from the data and the database keys), and it
requires external management. To manage the master key, MongoDB's encrypted storage engine supports two key
management options:
Integration with a third party key management appliance via the Key Management Interoperability Protocol
(KMIP). Recommended.
Local key management via a keyfile.
To configure MongoDB for encryption and use one of the two key management options, see Configure Encryption
(page 399).
Application Level Encryption
Application Level Encryption provides encryption on a per-field or per-document basis within the application layer.
To encrypt document or field level data, write custom encryption and decryption routines or use a commercial solution
such as the Vormetric Data Security Platform [26].
Third Party Storage Encryption
A number of third-party libraries can integrate with the operating system to provide transparent disk-level encryption.
For example:
Linux Unified Key Setup (LUKS): LUKS is available for most Linux distributions. For configuration explanation,
see the LUKS documentation from Red Hat [27].
IBM Guardium Data Encryption: IBM Guardium Data Encryption [28] provides support for disk-level encryption
for Linux and Windows operating systems.
Vormetric Data Security Platform: The Vormetric Data Security Platform [29] provides disk and file-level
encryption in addition to application level encryption.
Bitlocker Drive Encryption: Bitlocker Drive Encryption [30] is a feature available on Windows Server 2008 and
2012 that provides disk encryption.
26 http://www.vormetric.com/sites/default/files/sb-MongoDB-Letter-2014-0611.pdf
27 https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Security_Guide/sec-Encryption.html
28 http://www-03.ibm.com/software/products/en/infosphere-guardium-data-encryption
6.5 Auditing
On this page
Enable and Configure Audit Output (page 340)
Audit Events and Filter (page 340)
Audit Guarantee (page 340)
Enable and Configure Audit Output
The auditing facility can write audit events to the console, the syslog, a JSON file, or a BSON file. To enable auditing
for MongoDB Enterprise, see Configure Auditing (page 404).
For information on the audit log messages, see System Event Audit Messages (page 434).
Audit Events and Filter
Once enabled, the auditing system can record the following operations:
schema (DDL),
replica set and sharded cluster,
authentication and authorization, and
CRUD operations (requires auditAuthorizationSuccess set to true).
For details on audited actions, see Audit Event Actions, Details, and Results (page 435).
With the auditing system, you can set up filters (page 406) to restrict the events captured. To set up filters, see Configure
Audit Filters (page 406).
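As an illustration, an audit filter is a query-style document matched against the fields of each audit event, such
as atype. The sketch below, in plain JavaScript, shows a filter that limits capture to collection creation and
removal. The matchesFilter helper is hypothetical: it mimics only equality and $in matching on top-level fields so
that the effect of the filter can be seen, and it is not how the server evaluates filters.

```javascript
// Example filter: capture only collection creation and removal events.
const auditFilter = { atype: { $in: ["createCollection", "dropCollection"] } };

// Hypothetical, simplified matcher supporting only equality and $in on
// top-level fields; the server's own filter evaluation is far richer.
function matchesFilter(filter, event) {
  return Object.keys(filter).every(function (field) {
    const cond = filter[field];
    if (cond !== null && typeof cond === "object" && Array.isArray(cond.$in)) {
      return cond.$in.indexOf(event[field]) !== -1;
    }
    return event[field] === cond;
  });
}

console.log(matchesFilter(auditFilter, { atype: "dropCollection" })); // true
console.log(matchesFilter(auditFilter, { atype: "authenticate" }));   // false
```

Events that do not match the filter are simply not recorded, which keeps the audit log focused on the operations
of interest.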
Audit Guarantee
The auditing system writes every audit event [31] to an in-memory buffer of audit events. MongoDB writes this buffer to
disk periodically. For events collected from any single connection, the events have a total order: if MongoDB writes
one event to disk, the system guarantees that it has written all prior events for that connection to disk.
29 http://www.vormetric.com/sites/default/files/sb-MongoDB-Letter-2014-0611.pdf
30 http://technet.microsoft.com/en-us/library/hh831713.aspx
31 Audit configuration can include a filter (page 406) to limit events to audit.
If an audit event entry corresponds to an operation that affects the durable state of the database, such as a modification
to data, MongoDB will always write the audit event to disk before writing to the journal for that entry.
That is, before adding an operation to the journal, MongoDB writes all audit events on the connection that triggered
the operation, up to and including the entry for the operation.
These auditing guarantees require that MongoDB run with journaling enabled.
Warning: MongoDB may lose events if the server terminates before it commits the events to the audit log.
The client may receive confirmation of the event before MongoDB commits to the audit log. For example, while
auditing an aggregation operation, the server might crash after returning the result but before the audit log flushes.
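The ordering guarantee described in this section can be pictured with a toy model. The sketch below is plain
JavaScript and does not reflect MongoDB internals: the AuditedStore type and its methods are hypothetical. It keeps
an in-memory buffer that is flushed in first-in, first-out order, and it forces a flush before any operation is
added to the journal.

```javascript
// Toy model (not MongoDB internals): an in-memory audit buffer flushed in
// FIFO order, and a journal that forces a flush before each durable write.
function AuditedStore() {
  this.buffer = [];   // audit events not yet on disk
  this.auditLog = []; // audit events "on disk"
  this.journal = [];  // journaled operations
}

// Record an audit event in memory only.
AuditedStore.prototype.audit = function (connId, event) {
  this.buffer.push({ connId: connId, event: event });
};

// FIFO flush: an event never reaches the log before an earlier one.
AuditedStore.prototype.flush = function () {
  this.auditLog = this.auditLog.concat(this.buffer);
  this.buffer = [];
};

// A durable write audits the operation, flushes, and only then journals it.
AuditedStore.prototype.durableWrite = function (connId, op) {
  this.audit(connId, op);
  this.flush();
  this.journal.push(op);
};

const store = new AuditedStore();
store.audit(1, "authenticate");
store.durableWrite(1, "insert");
console.log(store.auditLog.map(function (e) { return e.event; })); // [ 'authenticate', 'insert' ]
console.log(store.buffer.length); // 0
```

In this model an event that is still in the buffer when the process stops is lost, which mirrors the warning above
about events that have not yet been committed to the audit log.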
6.6 Security Hardening
On this page
MongoDB Configuration Hardening (page 341)
Network Hardening (page 341)
To reduce the risk exposure of the entire MongoDB system, ensure that only trusted hosts have access to MongoDB.
For MongoDB, ensure that the HTTP status interface and the REST API are disabled in production to prevent potential
data exposure to attackers.
Deprecated since version 3.2: HTTP interface for MongoDB
For more information, see MongoDB Configuration Hardening (page 341).
To restrict exposure to MongoDB, configure firewalls to control access to MongoDB systems. Use of VPNs can also
provide a secure tunnel.
For more information, see Hardening Network Infrastructure (page 343).
On this page
HTTP Status Interface (page 342)
REST API (page 342)
bind_ip (page 343)
HTTP Status Interface
Warning: Ensure that the HTTP status interface, the REST API, and the JSON API are all disabled in production
environments to prevent potential data exposure and vulnerability to attackers.
Warning: If you enable the interface, you should only allow trusted clients to access this port. See Firewalls
(page 343).
REST API
Warning: Ensure that the HTTP status interface, the REST API, and the JSON API are all disabled in production
environments to prevent potential data exposure and vulnerability to attackers.
The REST API to MongoDB provides additional information and write access on top of the HTTP status interface.
While the REST API does not provide any support for insert, update, or remove operations, it does provide adminis-
trative access, and its accessibility represents a vulnerability in a secure environment.
Deprecated since version 3.2: HTTP interface for MongoDB
The REST interface is disabled by default and is not recommended for production use.
The net.http.RESTInterfaceEnabled setting for mongod enables a fully interactive administrative REST
interface, which is disabled by default. Enabling the REST API enables the HTTP interface, even if the HTTP interface
option is disabled, and makes the HTTP interface fully interactive.
The REST API does not include support for authentication other than MONGODB-CR.
Warning: If you enable the interface, you should only allow trusted clients to access this port. See Firewalls
(page 343).
Changed in version 3.0: Neither the HTTP status interface nor the REST API supports the SCRAM-SHA-1 (page 321)
challenge-response user authentication mechanism introduced in version 3.0.
bind_ip
The net.bindIp setting (or the --bind_ip command line option) for mongod and mongos instances limits the
network interfaces on which MongoDB programs will listen for incoming connections.
Warning: Make sure that your mongod and mongos instances are only accessible on trusted networks. If your
system has more than one network interface, bind MongoDB programs to the private or internal network interface.
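When a host has several addresses, it can help to pick the value for net.bindIp programmatically. The sketch below,
in plain JavaScript, keeps only loopback and RFC 1918 private IPv4 addresses from a candidate list and joins them
into a comma-separated value. The isInternalIPv4 and bindIpValue helpers and the sample addresses are hypothetical,
and treating every private-range address as trusted is an assumption that may not hold on your network.

```javascript
// Hypothetical helper: true if addr is a loopback or RFC 1918 private IPv4 address.
function isInternalIPv4(addr) {
  const parts = addr.split(".");
  const valid = parts.length === 4 && parts.every(function (p) {
    return /^\d{1,3}$/.test(p) && Number(p) <= 255;
  });
  if (!valid) {
    return false;
  }
  const a = Number(parts[0]);
  const b = Number(parts[1]);
  return a === 127 ||                     // loopback, 127.0.0.0/8
    a === 10 ||                           // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) ||  // 172.16.0.0/12
    (a === 192 && b === 168);             // 192.168.0.0/16
}

// Build a comma-separated net.bindIp value, dropping public addresses.
function bindIpValue(candidates) {
  return candidates.filter(isInternalIPv4).join(",");
}

console.log(bindIpValue(["127.0.0.1", "203.0.113.10", "10.0.0.5"])); // 127.0.0.1,10.0.0.5
```

The resulting string can be used as the net.bindIp setting or passed to the --bind_ip option.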
See also:
Firewalls (page 343), Security Considerations (page 211)
On this page
Firewalls (page 343)
Virtual Private Networks (page 343)
Firewalls
Firewalls allow administrators to filter and control access to a system by providing granular control over network
communications. For administrators of MongoDB, the following capabilities are important: limiting incoming traffic
on a specific port to specific systems and limiting incoming traffic from untrusted hosts.
On Linux systems, the iptables interface provides access to the underlying netfilter firewall. On Windows
systems, the netsh command line interface provides access to the underlying Windows Firewall. For additional
information about firewall configuration, see:
Configure Linux iptables Firewall for MongoDB (page 392) and
Configure Windows netsh Firewall for MongoDB (page 396).
For best results and to minimize overall exposure, ensure that only traffic from trusted sources can reach mongod and
mongos instances and that the mongod and mongos instances can only connect to trusted outputs.
See also:
For MongoDB deployments on Amazon's web services, see the Amazon EC2 [32] page, which addresses Amazon's
Security Groups and other EC2-specific security features.
Virtual Private Networks
Virtual private networks, or VPNs, make it possible to link two networks over an encrypted and limited-access trusted
network. Typically, MongoDB users who use VPNs use TLS/SSL rather than IPSEC VPNs for performance reasons.
Depending on configuration and implementation, VPNs provide for certificate validation and a choice of encryption
protocols, which requires a rigorous level of authentication and identification of all clients. Furthermore, because
VPNs provide a secure tunnel, by using a VPN connection to control access to your MongoDB instance, you can
prevent tampering and man-in-the-middle attacks.
32 https://docs.mongodb.org/ecosystem/platforms/amazon-ec2
6.7 Security Tutorials
The following tutorials provide instructions for enabling and using the security features available in MongoDB.
Before enabling role based access control, you should first consider the users of the system. Once the users have been
identified, determine the roles required by the users. Roles may inherit from other roles to provide a hierarchy.
Enable Access Control (page 344) Tutorials for enabling access control.
Authentication Mechanisms (page 352) Tutorials for specifying various authentication mechanisms supported by
MongoDB.
Configure Users and Roles (page 373) Tutorials for managing users and roles.
Network (page 382) Tutorials for securing your network via TLS/SSL and firewall configuration.
Encryption (page 399) Tutorials for storage encryption.
Auditing (page 404) Tutorials for configuring auditing.
Miscellaneous (page 409) Tutorial illustrating field-level redaction or instructions for reporting a security vulnerabil-
ity to MongoDB.
The tutorials in this section enable access control. Once access control is enabled, users must authenticate (page 317)
themselves. The following tutorials use the default authentication mechanism (page 321).
Important: Before enabling role based access control, you should first consider the users of the system. Once the
users have been identified, determine the roles required by the users. Roles may inherit from other roles to provide a
hierarchy.
A user should have only the minimal set of privileges required to ensure a system of least privilege.
Each application and user of a MongoDB system should map to a distinct user in MongoDB; i.e. do not create a group
user that is shared among multiple individuals. This access isolation facilitates access revocation and ongoing user
maintenance.
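The role hierarchy mentioned above can be pictured as follows. In this plain JavaScript sketch, each role lists its
own privileges and the roles it inherits from, and a user's effective privileges are the union of both. The role
names (ordersReader, inventoryEditor) and the effectivePrivileges helper are hypothetical and only model the idea;
they are not MongoDB APIs.

```javascript
// Hypothetical role definitions: own privileges plus inherited roles.
const roles = {
  ordersReader: {
    privileges: [{ resource: { db: "products", collection: "orders" }, actions: ["find"] }],
    roles: []
  },
  inventoryEditor: {
    privileges: [{ resource: { db: "products", collection: "inventory" }, actions: ["find", "update", "insert"] }],
    roles: ["ordersReader"]
  }
};

// Collect the privileges of a role plus those of every role it inherits from.
// The seen map guards against cycles and repeated roles.
function effectivePrivileges(roleName, seen) {
  seen = seen || {};
  if (seen[roleName] || !roles[roleName]) {
    return [];
  }
  seen[roleName] = true;
  const role = roles[roleName];
  return role.roles.reduce(function (acc, parent) {
    return acc.concat(effectivePrivileges(parent, seen));
  }, role.privileges.slice());
}

console.log(effectivePrivileges("inventoryEditor").length); // 2
console.log(effectivePrivileges("ordersReader").length);    // 1
```

Granting a user only the narrowest role that covers the required privileges keeps the deployment close to a system
of least privilege.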
Enable Client Access Control (page 344) Describes the process for enabling client access control for MongoDB de-
ployments.
Enable Internal Authentication (page 347) Describes the process for enabling internal authentication for members of
replica sets and sharded clusters. Enabling internal authentication implicitly enables client access control.
On this page
Overview (page 345)
Considerations (page 345)
Procedures (page 345)
Additional Information (page 347)
Overview
Enabling access control requires authentication of every user. Once authenticated, users only have the privileges as
defined in the roles granted to the users.
To enable access control, use either the --auth command line option or the security.authorization
configuration file setting.
Note: The tutorial enables access control and uses the default authentication mechanism (page 321). To specify a
different authentication mechanism, see Authentication Mechanisms (page 352).
You can also enable client access control by enabling internal authentication (page 347) of replica sets or sharded
clusters. For instructions on enabling internal authentication, see Enable Internal Authentication (page 347).
Considerations
With access control enabled, ensure you have a user with userAdmin (page 417) or userAdminAnyDatabase
(page 421) role in the admin database.
This tutorial assumes a standalone environment.
The Enable Internal Authentication (page 347) tutorial has steps specific to enabling access control on replica sets and
sharded clusters.
You can create users before enabling access control or you can create users after enabling access control. If you
enable access control before creating any user, MongoDB provides a localhost exception (page 320) which allows you
to create a user administrator in the admin database. Once created, authenticate as the user administrator to create
additional users as needed.
Procedures
Add Users Before Enabling Access Control
The following procedure first adds a user administrator to a MongoDB instance running without access control and
then enables access control.
Step 1: Start MongoDB without access control. For example, the following starts a standalone mongod instance
without access control.
mongod --port 27017 --dbpath /data/db1
For details on starting a mongod or mongos, see Manage mongod Processes (page 245) or Deploy a Sharded Cluster
(page 757).
Step 2: Connect to the instance. For example, connect a mongo shell to the instance.
mongo --port 27017
Specify additional command line options as appropriate to connect the mongo shell to your deployment, such as
--host.
Step 3: Create the user administrator. Add a user with the userAdminAnyDatabase (page 421) role. For
example, the following creates the user myUserAdmin on the admin database:
use admin
db.createUser(
  {
    user: "myUserAdmin",
    pwd: "abc123",
    roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
  }
)
Step 4: Re-start the MongoDB instance with access control. Re-start the mongod instance with the --auth
command line option or, if using a configuration file, the security.authorization setting.
mongod --auth --port 27017 --dbpath /data/db1
Step 5: Authenticate as the user administrator. Either connect a new mongo shell to the MongoDB instance with
the -u <username>, -p <password>, and --authenticationDatabase <database> options:
mongo --port 27017 -u "myUserAdmin" -p "abc123" --authenticationDatabase "admin"
The mongo shell executes a number of commands at start up. As a result, when you log in as the user administrator,
you may see authentication errors from one or more commands. You may ignore these errors, which are expected,
because the userAdminAnyDatabase (page 421) role does not have permissions to run some of the start up
commands.
Or, in the mongo shell connected without authentication, switch to the authentication database and use the
db.auth() method to authenticate:
use admin
db.auth("myUserAdmin", "abc123")
Step 6: Create additional users as needed for your deployment.
If you need to disable access control for any reason, restart the MongoDB instance without the --auth command
line option or, if using a configuration file, without the security.authorization setting.
Add Users After Enabling Access Control
The following procedure first enables access control, and then uses the localhost exception (page 320) to add a
user administrator.
Step 1: Start the MongoDB instance with access control. Start the mongod instance with the --auth command
line option or, if using a configuration file, the security.authorization setting.
mongod --auth --port 27017 --dbpath /data/db1
Step 2: Connect to the MongoDB instance via the localhost exception. To add the first user using Localhost
Exception (page 320), connect a mongo shell to the mongod instance. Run the mongo shell from the same host as
the mongod instance.
Step 3: Create the system user administrator. Add the user with the userAdminAnyDatabase (page 421)
role, and only that role.
The following example creates the user myUserAdmin on the admin database:
use admin
db.createUser(
  {
    user: "myUserAdmin",
    pwd: "abc123",
    roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
  }
)
After you create the user administrator, the localhost exception (page 320) is no longer available.
Step 4: Authenticate as the user administrator. Either connect a new mongo shell to the MongoDB instance with
the -u <username>, -p <password>, and --authenticationDatabase <database> options:
mongo --port 27017 -u "myUserAdmin" -p "abc123" --authenticationDatabase "admin"
The mongo shell executes a number of commands at start up. As a result, when you log in as the user administrator,
you may see authentication errors from one or more commands. You may ignore these errors, which are expected,
because the userAdminAnyDatabase (page 421) role does not have permissions to run some of the start up
commands.
Or, in the mongo shell connected without authentication, switch to the authentication database and use the
db.auth() method to authenticate:
use admin
db.auth("myUserAdmin", "abc123")
Additional Information
On this page
Overview (page 347)
Considerations (page 348)
Procedures (page 348)
x.509 Internal Authentication (page 352)
Overview
When authentication is enabled on a replica set or a sharded cluster, members of the replica set or the sharded clusters
must provide credentials to authenticate.
To enable authentication on a replica set or a sharded cluster, you must enable authentication individually for each
member. For a sharded cluster, this means enabling authentication on each mongos and each mongod, including the
config servers and each member of a shard's replica set.
The following tutorial uses a keyfile (page 329) to enable internal authentication. You can also use an x.509
certificate for internal authentication. For details on using x.509, see Use x.509 Certificate for Membership
Authentication (page 355).
Considerations
Access Control
Enabling internal authentication enables access control (page 331). The following tutorial assumes no users have
been created in the system before enabling internal authentication, and uses the Localhost Exception (page 320)
to add a user administrator after access control has been enabled.
If you prefer, you can create the users before enabling internal authentication.
Sharded Cluster
It is not possible to convert an existing sharded cluster that does not enforce access control to require
authentication without taking all components of the cluster offline for a short period of time.
For sharded clusters, the Localhost Exception (page 320) will apply to the individual shards unless you either create
an administrative user or disable the localhost exception on each shard.
Procedures
Step 1: Create a keyfile. Create the keyfile (page 329) your deployment will use to authenticate members to each
other. You can generate a keyfile using any method you choose. Ensure that the password stored in the keyfile is both
long and highly random.
For example, the following operation uses the openssl command to generate pseudo-random data to use for a keyfile:
openssl rand -base64 741 > /srv/mongodb/mongodb-keyfile
chmod 600 /srv/mongodb/mongodb-keyfile
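Since any method of generating the keyfile is acceptable, the same result can be sketched in Node.js. The example
below writes base64-encoded random bytes to a file and restricts it to owner read and write, which is what the
openssl and chmod commands above do. The writeKeyfile helper and the temporary file location are assumptions made
for illustration; unlike openssl, this sketch writes the key on a single line.

```javascript
const crypto = require("crypto");
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: write base64-encoded random bytes to keyfilePath,
// readable and writable only by the owner (mode 600).
function writeKeyfile(keyfilePath, numBytes) {
  const key = crypto.randomBytes(numBytes).toString("base64");
  fs.writeFileSync(keyfilePath, key + "\n", { mode: 0o600 });
  fs.chmodSync(keyfilePath, 0o600); // enforce 600 even if the file already existed
  return key;
}

// Example location in a temporary directory (an assumption; use your own path).
const keyfilePath = path.join(os.tmpdir(), "mongodb-keyfile-example");
const key = writeKeyfile(keyfilePath, 741);
console.log(key.length); // 988
```

The 741 random bytes encode to 988 base64 characters, matching the amount of key material produced by the openssl
command.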
Step 2: Enable authentication for each member of the sharded cluster or replica set. For each mongod in the
replica set or for each mongos and mongod in the sharded cluster, including all config servers and shards, specify
the keyfile using either a configuration file or a command line option.
In a configuration file, set the security.keyFile option to the keyfile's path and then start the component, as in
the following example:
security:
  keyFile: /srv/mongodb/mongodb-keyfile
Step 3: Connect to the MongoDB instance via the localhost exception. To add the first user using Localhost
Exception (page 320):
For a replica set, connect a mongo shell to the primary. Run the mongo shell from the same host as the primary.
For a sharded cluster, connect a mongo shell to the mongos. Run the mongo shell from the same host as the
mongos.
Step 4: Add first user. Add a user with the userAdminAnyDatabase (page 421) role. For example, the follow-
ing creates the user myUserAdmin on the admin database:
use admin
db.createUser(
  {
    user: "myUserAdmin",
    pwd: "abc123",
    roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
  }
)
After you create the user administrator, for a replica set, the localhost exception (page 320) is no longer available.
For sharded clusters, you must still prevent unauthorized access to the individual shards. Follow one of the following
steps for each shard in your cluster:
Create an administrative user, or
Disable the Localhost Exception (page 320) at startup. To disable the localhost exception, set the
enableLocalhostAuthBypass parameter to 0.
Step 5: Authenticate as the user administrator. Either connect a new mongo shell to the MongoDB instance with
the -u <username>, -p <password>, and --authenticationDatabase <database> options:
mongo --port 27017 -u "myUserAdmin" -p "abc123" --authenticationDatabase "admin"
The mongo shell executes a number of commands at start up. As a result, when you log in as the user administrator,
you may see authentication errors from one or more commands. You may ignore these errors, which are expected,
because the userAdminAnyDatabase (page 421) role does not have permissions to run some of the start up
commands.
Or, in the mongo shell connected without authentication, switch to the authentication database and use the
db.auth() method to authenticate:
use admin
db.auth("myUserAdmin", "abc123")
Step 1: Start one member of the replica set. This mongod should not enable auth.
Step 2: Create administrative users. The following operations will create two users: a user administrator that will
be able to create and modify users (myUserAdmin), and a root (page 422) user (siteRootAdmin) that you will
use to complete the remainder of the tutorial:
use admin
db.createUser( {
  user: "myUserAdmin",
  pwd: "<password>",
  roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
});
db.createUser( {
  user: "siteRootAdmin",
  pwd: "<password>",
  roles: [ { role: "root", db: "admin" } ]
});
Step 4: Create the key file to be used by each member of the replica set. Create the key file your deployment will
use to authenticate servers to each other.
To generate pseudo-random data to use for a keyfile, issue the following openssl command:
openssl rand -base64 741 > mongodb-keyfile
chmod 600 mongodb-keyfile
You may generate a key file using any method you choose. Always ensure that the password stored in the key file is
both long and contains a high amount of entropy. Using openssl in this manner helps generate such a key.
Step 5: Copy the key file to each member of the replica set. Copy the mongodb-keyfile to all hosts where
components of a MongoDB deployment run. Set the permissions of these files to 600 so that only the owner of the
file can read or write it; this prevents other users on the system from accessing the shared secret.
Step 6: Start each member of the replica set with the appropriate options. For each member, start a mongod
and specify the key file and the name of the replica set. Also specify other parameters as needed for your deployment.
For replication-specific parameters required by your deployment, see the replication options of mongod.
If your application connects to more than one replica set, each set should have a distinct name. Some drivers group
replica set connections by replica set name.
The following example specifies parameters through the --keyFile and --replSet command-line options:
mongod --keyFile /mysecretdirectory/mongodb-keyfile --replSet "rs0"
In production deployments, you can configure an init script to manage this process. Init scripts are beyond the scope of
this document.
Step 7: Connect to the member of the replica set where you created the administrative users. Connect to
the replica set member you started and authenticate as the siteRootAdmin user. From the mongo shell, use the
following operation to authenticate:
use admin
db.auth("siteRootAdmin", "<password>");
Step 8: Initiate the replica set. Use rs.initiate() on one and only one member of the replica set:
rs.initiate()
MongoDB initiates a set that consists of the current member and that uses the default replica set configuration.
Step 9: Verify the initial replica set configuration. Use rs.conf() to display the replica set configuration object
(page 709):
rs.conf()
Step 10: Add the remaining members to the replica set. Add the remaining members with the rs.add()
method. You must be connected to the primary to add members to a replica set.
rs.add() can, in some cases, trigger an election. If the mongod you are connected to becomes a secondary, you
need to connect the mongo shell to the new primary to continue adding new replica set members. Use rs.status()
to identify the primary in the replica set.
The following example adds two members:
rs.add("mongodb1.example.net")
rs.add("mongodb2.example.net")
When complete, you have a fully functional replica set. The new replica set will elect a primary.
Step 11: Check the status of the replica set. Use the rs.status() operation:
rs.status()
Step 12: Create additional users to address operational requirements. You can use built-in roles (page 414) to
create common types of database users, such as the dbOwner (page 416) role to create a database administrator, the
readWrite (page 415) role to create a user who can update data, or the read (page 415) role to create a user who
can read data but not modify it. You also can define custom roles (page 335).
For example, the following creates a database administrator for the products database:
use products
db.createUser(
  {
    user: "productsDBAdmin",
    pwd: "password",
    roles:
      [
        {
          role: "dbOwner",
          db: "products"
        }
      ]
  }
)
For an overview of roles and privileges, see Role-Based Access Control (page 331). For more information on adding
users, see Manage User and Roles (page 373).
For details on using x.509 for internal authentication, see Use x.509 Certificate for Membership Authentication
(page 355).
To upgrade from keyfile internal authentication to x.509 internal authentication, see Upgrade from Keyfile Authentica-
tion to x.509 Authentication (page 357).
The following tutorials provide information on configuring MongoDB to use authentication mechanisms other than
the default authentication mechanism (page 321). For tutorials on using default authentication mechanism (page 321),
see Enable Access Control (page 344).
Use x.509 Certificates to Authenticate Clients (page 353) Use x.509 for client authentication.
Use x.509 Certificate for Membership Authentication (page 355) Use x.509 for internal member authentication for
replica sets and sharded clusters.
Upgrade from Keyfile Authentication to x.509 Authentication (page 357) Upgrade from keyfile internal authentica-
tion to x.509 internal authentication.
Configure MongoDB with Kerberos Authentication on Linux (page 359) For MongoDB Enterprise for Linux,
describes the process to enable Kerberos-based authentication for MongoDB deployments.
Configure MongoDB with Kerberos Authentication on Windows (page 363) For MongoDB Enterprise for Win-
dows, describes the process to enable Kerberos-based authentication for MongoDB deployments.
Troubleshoot Kerberos Authentication (page 365) Steps to troubleshoot Kerberos-based authentication for Mon-
goDB deployments.
Authenticate Using SASL and LDAP with ActiveDirectory (page 367) Describes the process for authentication us-
ing SASL/LDAP with ActiveDirectory.
Authenticate Using SASL and LDAP with OpenLDAP (page 370) Describes the process for authentication using
SASL/LDAP with OpenLDAP.
On this page
Prerequisites (page 353)
Procedures (page 353)
Prerequisites
Important: A full description of TLS/SSL, PKI (Public Key Infrastructure) certificates, in particular x.509 cer-
tificates, and Certificate Authority is beyond the scope of this document. This tutorial assumes prior knowledge of
TLS/SSL as well as access to valid x.509 certificates.
Certificate Authority
For production use, your MongoDB deployment should use valid certificates generated and signed by a single
certificate authority. You or your organization can generate and maintain an independent certificate authority, or
use certificates generated by a third-party SSL vendor. Obtaining and managing certificates is beyond the scope of
this documentation.
Client x.509 Certificate
The client certificate must have the following properties:
A single Certificate Authority (CA) must issue the certificates for both the client and the server.
Client certificates must contain the following fields:
keyUsage = digitalSignature
extendedKeyUsage = clientAuth
Warning: If a client x.509 certificate's subject has the same O, OU, and DC combination as the Member
x.509 Certificate (page 356), the client will be identified as a cluster member and granted full permission on
the system.
Procedures
Use Command-line Options You can configure the MongoDB server from the command line, e.g.:
mongod --clusterAuthMode x509 --sslMode requireSSL --sslPEMKeyFile <path to SSL certificate and key P
Warning: If the --sslCAFile option and its target file are not specified, x.509 client and member authentication
will not function. mongod, and mongos in sharded systems, will not be able to verify the certificates of
connecting processes against the trusted certificate authority (CA) that issued them, breaking the certificate
chain.
As of version 2.6.4, mongod will not start with x.509 authentication enabled if the CA file is not specified.
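As a rough illustration of the warning above, the following Python sketch assembles the mongod argument list and refuses to continue when no CA file is given. The helper name build_mongod_args and the example paths are hypothetical; only the option names come from this section, and the helper starts nothing.

```python
def build_mongod_args(pem_key_file, ca_file):
    """Return an argv list for a mongod that uses x.509 authentication.

    Hypothetical helper: it only assembles arguments.
    """
    if not ca_file:
        # Without --sslCAFile, x.509 client and member authentication cannot work.
        raise ValueError("x.509 authentication requires --sslCAFile")
    return [
        "mongod",
        "--clusterAuthMode", "x509",
        "--sslMode", "requireSSL",
        "--sslPEMKeyFile", pem_key_file,
        "--sslCAFile", ca_file,
    ]


if __name__ == "__main__":
    # Example paths are placeholders, not defaults.
    print(" ".join(build_mongod_args("/etc/ssl/mongodb.pem", "/etc/ssl/ca.pem")))
```

Failing early in a wrapper like this mirrors the server's own behavior of refusing to start without a CA file.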
Use Configuration File You may also specify these options in the configuration file.
Starting in MongoDB 2.6, you can specify the configuration for MongoDB in YAML format, e.g.:
security:
   clusterAuthMode: x509
net:
   ssl:
      mode: requireSSL
      PEMKeyFile: <path to TLS/SSL certificate and key PEM file>
      CAFile: <path to root CA PEM file>
For backwards compatibility, you can also specify the configuration using the older configuration file format
(https://docs.mongodb.org/v2.4/reference/configuration-options), e.g.:
clusterAuthMode = x509
sslMode = requireSSL
sslPEMKeyFile = <path to TLS/SSL certificate and key PEM file>
sslCAFile = <path to the root CA PEM file>
Include any additional options, TLS/SSL or otherwise, that are required for your specific configuration.
Add x.509 Certificate subject as a User To authenticate with a client certificate, you must first add the value of
the subject from the client certificate as a MongoDB user. Each unique x.509 client certificate corresponds to a
single MongoDB user; i.e. you cannot use a single client certificate to authenticate more than one MongoDB user.
Note: The RDNs in the subject string must be compatible with the RFC 2253 standard
(https://www.ietf.org/rfc/rfc2253.txt).
1. You can retrieve the RFC2253 formatted subject from the client certificate with the following command:
openssl x509 -in <pathToClient PEM> -inform PEM -subject -nameopt RFC2253
2. Add the RFC2253 compliant value of the subject as a user. Omit spaces as needed.
For example, in the mongo shell, to add the user with both the readWrite role in the test database and the
userAdminAnyDatabase role which is defined only in the admin database:
db.getSiblingDB("$external").runCommand(
  {
    createUser: "CN=myName,OU=myOrgUnit,O=myOrg,L=myLocality,ST=myState,C=myCountry",
    roles: [
      { role: 'readWrite', db: 'test' },
      { role: 'userAdminAnyDatabase', db: 'admin' }
    ],
    writeConcern: { w: "majority" , wtimeout: 5000 }
  }
)
In the above example, to add the user with the readWrite role in the test database, the role specification
document specified test in the db field. To add userAdminAnyDatabase role for the user, the above
example specified admin in the db field.
Note: Some roles are defined only in the admin database, including: clusterAdmin,
readAnyDatabase, readWriteAnyDatabase, dbAdminAnyDatabase, and
userAdminAnyDatabase. To add a user with these roles, specify admin in the db.
See Manage User and Roles (page 373) for details on adding a user with roles.
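The two steps above (read the subject, then add it as a user) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the helper names subject_to_username and create_user_command are hypothetical, the input is assumed to look like a "subject=" line printed by openssl, and the naive comma split does not handle escaped commas inside attribute values.

```python
def subject_to_username(openssl_line):
    """Turn 'subject= CN=a, OU=b' style text into 'CN=a,OU=b'.

    Simplification: splits on every comma, so escaped commas are not handled.
    """
    text = openssl_line.strip()
    if text.lower().startswith("subject="):
        text = text[len("subject="):]
    # Omit spaces around each RDN.
    return ",".join(part.strip() for part in text.split(","))


def create_user_command(username, roles):
    """Build a createUser command document for the $external database.

    roles is a list of (role, db) pairs.
    """
    return {
        "createUser": username,
        "roles": [{"role": role, "db": db} for role, db in roles],
        "writeConcern": {"w": "majority", "wtimeout": 5000},
    }


if __name__ == "__main__":
    name = subject_to_username("subject= CN=myName, OU=myOrgUnit, O=myOrg")
    print(create_user_command(
        name, [("readWrite", "test"), ("userAdminAnyDatabase", "admin")]
    ))
```

The resulting dictionary has the same shape as the command document passed to runCommand in the shell example above.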
Authenticate with an x.509 Certificate To authenticate with a client certificate, you must first add a MongoDB user
that corresponds to the client certificate. See Add x.509 Certificate subject as a User (page 354).
To authenticate, use the db.auth() method in the $external database, specifying "MONGODB-X509" for the
mechanism field, and the user that corresponds to the client certificate (page 354) for the user field.
For example, if using the mongo shell:
1. Connect the mongo shell to the mongod that is set up for TLS/SSL:
mongo --ssl --sslPEMKeyFile <path to CA signed client PEM file> --sslCAFile <path to root CA PEM
2. To perform the authentication, use the db.auth() method in the $external database. For the mechanism
field, specify "MONGODB-X509", and for the user field, specify the user, or the subject, that corresponds
to the client certificate.
db.getSiblingDB("$external").auth(
  {
    mechanism: "MONGODB-X509",
    user: "CN=myName,OU=myOrgUnit,O=myOrg,L=myLocality,ST=myState,C=myCountry"
  }
)
On this page
Member x.509 Certificate (page 356)
Configure Replica Set/Sharded Cluster (page 357)
Additional Information (page 357)
MongoDB supports x.509 certificate authentication for use with a secure TLS/SSL connection (page 382). Sharded
cluster members and replica set members can use x.509 certificates to verify their membership to the cluster or the
replica set instead of using keyfiles (page 329). The membership authentication is an internal process.
For client authentication with x.509, see Use x.509 Certificates to Authenticate Clients (page 353).
Important: A full description of TLS/SSL, PKI (Public Key Infrastructure) certificates, in particular x.509
certificates, and Certificate Authority is beyond the scope of this document. This tutorial assumes prior knowledge of
TLS/SSL as well as access to valid x.509 certificates.
Certificate Requirements The member certificate, used for internal authentication to verify membership to the
sharded cluster or a replica set, must have the following properties:
A single Certificate Authority (CA) must issue all the x.509 certificates for the members of a sharded cluster or
a replica set.
The Distinguished Name (DN), found in the member certificate's subject, must specify a non-empty value
for at least one of the following attributes: Organization (O), the Organizational Unit (OU) or the Domain
Component (DC).
The Organization attributes (Os), the Organizational Unit attributes (OUs), and the Domain Components (DCs)
must match those from the certificates for the other cluster members. To match, the certificate must match all
specifications of these attributes, or even the non-specification of these attributes. The order of the attributes
does not matter.
In the following example, the two DNs contain matching specifications for O, OU as well as the non-specification
of the DC attribute.
CN=host1,OU=Dept1,O=MongoDB,ST=NY,C=US
C=US, ST=CA, O=MongoDB, OU=Dept1, CN=host2
However, the following two DNs contain a mismatch for the OU attribute since one contains two OU
specifications and the other, only one specification.
CN=host1,OU=Dept1,OU=Sales,O=MongoDB
CN=host2,OU=Dept1,O=MongoDB
Either the Common Name (CN) or one of the Subject Alternative Name (SAN) entries must match the hostname
of the server, used by the other members of the cluster.
For example, the certificates for a cluster could have the following subjects:
subject= CN=<myhostname1>,OU=Dept1,O=MongoDB,ST=NY,C=US
subject= CN=<myhostname2>,OU=Dept1,O=MongoDB,ST=NY,C=US
subject= CN=<myhostname3>,OU=Dept1,O=MongoDB,ST=NY,C=US
If the certificate includes the Extended Key Usage (extendedKeyUsage) setting, the value must include
clientAuth (TLS Web Client Authentication).
extendedKeyUsage = clientAuth
You can also use a certificate that does not include the Extended Key Usage (EKU).
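The O, OU, and DC matching rule above can be sketched in Python. This is an illustrative sketch, not the server's implementation: the function names are hypothetical, and the DN parser is deliberately naive (it splits on commas and ignores escaping).

```python
from collections import Counter

IDENTITY_ATTRIBUTES = {"O", "OU", "DC"}


def identity_attributes(dn):
    """Collect the O, OU, and DC values of a DN as a multiset.

    A multiset makes the comparison independent of attribute order while
    still noticing a repeated attribute such as a second OU.
    """
    found = Counter()
    for rdn in dn.split(","):
        key, _, value = rdn.strip().partition("=")
        if key.upper() in IDENTITY_ATTRIBUTES and value.strip():
            found[(key.upper(), value.strip())] += 1
    return found


def has_required_attribute(dn):
    """True when at least one of O, OU, or DC has a non-empty value."""
    return len(identity_attributes(dn)) > 0


def dns_match(dn_a, dn_b):
    """True when both DNs specify exactly the same O, OU, and DC values."""
    return identity_attributes(dn_a) == identity_attributes(dn_b)


if __name__ == "__main__":
    print(dns_match("CN=host1,OU=Dept1,O=MongoDB,ST=NY,C=US",
                    "C=US, ST=CA, O=MongoDB, OU=Dept1, CN=host2"))
    print(dns_match("CN=host1,OU=Dept1,OU=Sales,O=MongoDB",
                    "CN=host2,OU=Dept1,O=MongoDB"))
```

The same comparison is a quick way to check that a client certificate does not accidentally share its O, OU, and DC combination with the member certificates.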
Member Certificate and PEMKeyFile To configure MongoDB for client certificate authentication, each mongod
and mongos specifies a PEMKeyFile to prove its identity to clients, through either the net.ssl.PEMKeyFile setting
in the configuration file or the --sslPEMKeyFile command-line option.
If no clusterFile certificate is specified for internal member authentication, MongoDB will attempt to use the
PEMKeyFile certificate for member authentication. To use the PEMKeyFile certificate for internal authentication
as well as for client authentication, the PEMKeyFile certificate must either:
Omit extendedKeyUsage or
Specify extendedKeyUsage values that include clientAuth in addition to serverAuth.
Use Command-line Options To specify the x.509 certificate for internal cluster member authentication, append the
additional TLS/SSL options --clusterAuthMode and --sslClusterFile, as in the following example for a
member of a replica set:
mongod --replSet <name> --sslMode requireSSL --clusterAuthMode x509 --sslClusterFile <path to members
Include any additional options, TLS/SSL or otherwise, that are required for your specific configuration. For instance,
if the membership key is encrypted, set the --sslClusterPassword to the passphrase to decrypt the key or have
MongoDB prompt for the passphrase. See SSL Certificate Passphrase (page 386) for details.
Warning: If the --sslCAFile option and its target file are not specified, x.509 client and member authentication
will not function. mongod, and mongos in sharded systems, will not be able to verify the certificates of
connecting processes against the trusted certificate authority (CA) that issued them, breaking the certificate
chain.
As of version 2.6.4, mongod will not start with x.509 authentication enabled if the CA file is not specified.
Use Configuration File You can specify the configuration for MongoDB in a YAML formatted configuration
file, as in the following example:
security:
   clusterAuthMode: x509
net:
   ssl:
      mode: requireSSL
      PEMKeyFile: <path to TLS/SSL certificate and key PEM file>
      CAFile: <path to root CA PEM file>
      clusterFile: <path to x.509 membership certificate and key PEM file>
Additional Information
To upgrade from keyfile internal authentication to x.509 internal authentication, see Upgrade from Keyfile
Authentication to x.509 Authentication (page 357).
On this page
Clusters Currently Using TLS/SSL (page 358)
Clusters Currently Not Using TLS/SSL (page 358)
To upgrade clusters that are currently using keyfile authentication (page 329) to x.509 authentication, use the following
rolling upgrade processes.
For clusters using TLS/SSL and keyfile authentication, to upgrade to x.509 cluster authentication, use the following
rolling upgrade process:
1. For each node of a cluster, start the node with the option --clusterAuthMode set to sendKeyFile and
the option --sslClusterFile set to the appropriate path of the node's certificate. Include other TLS/SSL
options (page 382) as well as any other options that are required for your specific configuration. For example:
mongod --replSet <name> --sslMode requireSSL --clusterAuthMode sendKeyFile --sslClusterFile <pat
With this setting, each node continues to use its keyfile to authenticate itself as a member. However, each
node can now accept either a keyfile or an x.509 certificate from other members to authenticate those members.
Upgrade all nodes of the cluster to this setting.
2. Then, for each node of a cluster, connect to the node and use the setParameter command to update the
clusterAuthMode to sendX509. 35 For example,
db.getSiblingDB('admin').runCommand( { setParameter: 1, clusterAuthMode: "sendX509" } )
With this setting, each node uses its x.509 certificate, specified with the --sslClusterFile option in the
previous step, to authenticate itself as a member. However, each node continues to accept either a keyfile or an
x.509 certificate from other members to authenticate those members. Upgrade all nodes of the cluster to this
setting.
3. Optional but recommended. Finally, for each node of the cluster, connect to the node and use the
setParameter command to update the clusterAuthMode to x509 to only use the x.509 certificate for
authentication. 35 For example:
db.getSiblingDB('admin').runCommand( { setParameter: 1, clusterAuthMode: "x509" } )
4. After the upgrade of all nodes, edit the configuration file with the appropriate x.509 settings to ensure
that upon subsequent restarts, the cluster uses x.509 authentication.
See --clusterAuthMode for the various modes and their descriptions.
For clusters using keyfile authentication but not TLS/SSL, to upgrade to x.509 authentication, use the following rolling
upgrade process:
1. For each node of a cluster, start the node with the option --sslMode set to allowSSL, the option
--clusterAuthMode set to sendKeyFile and the option --sslClusterFile set to the appropriate
path of the node's certificate. Include other TLS/SSL options (page 382) as well as any other options that are
required for your specific configuration. For example:
mongod --replSet <name> --sslMode allowSSL --clusterAuthMode sendKeyFile --sslClusterFile <path
The --sslMode allowSSL setting allows the node to accept both TLS/SSL and non-TLS/non-SSL incoming
connections. Its outgoing connections do not use TLS/SSL.
The --clusterAuthMode sendKeyFile setting allows each node to continue to use its keyfile to authenticate
itself as a member. However, each node can now accept either a keyfile or an x.509 certificate from other
members to authenticate those members.
Upgrade all nodes of the cluster to these settings.
35 As an alternative to using the setParameter command, you can also restart the nodes with the appropriate TLS/SSL and x509 options and
values.
2. Then, for each node of a cluster, connect to the node and use the setParameter command to update the
sslMode to preferSSL and the clusterAuthMode to sendX509. 35 For example:
db.getSiblingDB('admin').runCommand( { setParameter: 1, sslMode: "preferSSL", clusterAuthMode: "
With the sslMode set to preferSSL, the node accepts both TLS/SSL and non-TLS/non-SSL incoming
connections, and its outgoing connections use TLS/SSL.
With the clusterAuthMode set to sendX509, each node uses its x.509 certificate, specified with the
--sslClusterFile option in the previous step, to authenticate itself as a member. However, each node
continues to accept either a keyfile or an x.509 certificate from other members to authenticate those members.
Upgrade all nodes of the cluster to these settings.
3. Optional but recommended. Finally, for each node of the cluster, connect to the node and use the
setParameter command to update the sslMode to requireSSL and the clusterAuthMode to x509. 35
For example:
db.getSiblingDB('admin').runCommand( { setParameter: 1, sslMode: "requireSSL", clusterAuthMode:
With the sslMode set to requireSSL, the node only uses TLS/SSL connections.
With the clusterAuthMode set to x509, the node only uses the x.509 certificate for authentication.
4. After the upgrade of all nodes, edit the configuration file with the appropriate TLS/SSL and x.509
settings to ensure that upon subsequent restarts, the cluster uses x.509 authentication.
See --clusterAuthMode for the various modes and their descriptions.
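The two rolling upgrade procedures above follow the same three stages. The Python sketch below models them, assuming the send/accept behavior described in the steps; the names CLUSTER_AUTH_MODES, upgrade_stages, set_parameter_command, and can_authenticate are hypothetical, and the code only builds command documents without contacting any server.

```python
# What each clusterAuthMode value sends and accepts, as described in the steps.
CLUSTER_AUTH_MODES = {
    "sendKeyFile": {"sends": "keyfile", "accepts": {"keyfile", "x509"}},
    "sendX509": {"sends": "x509", "accepts": {"keyfile", "x509"}},
    "x509": {"sends": "x509", "accepts": {"x509"}},
}


def upgrade_stages(uses_tls):
    """Return the ordered settings that every node passes through."""
    auth_modes = ["sendKeyFile", "sendX509", "x509"]
    if uses_tls:
        return [{"clusterAuthMode": mode} for mode in auth_modes]
    ssl_modes = ["allowSSL", "preferSSL", "requireSSL"]
    return [
        {"sslMode": ssl, "clusterAuthMode": auth}
        for ssl, auth in zip(ssl_modes, auth_modes)
    ]


def set_parameter_command(stage):
    """Build the setParameter command document used in steps 2 and 3."""
    command = {"setParameter": 1}
    command.update(stage)
    return command


def can_authenticate(sender_mode, receiver_mode):
    """True when a node in sender_mode is accepted by a node in receiver_mode."""
    sent = CLUSTER_AUTH_MODES[sender_mode]["sends"]
    return sent in CLUSTER_AUTH_MODES[receiver_mode]["accepts"]


if __name__ == "__main__":
    for stage in upgrade_stages(uses_tls=False)[1:]:
        print(set_parameter_command(stage))
```

The can_authenticate check suggests why every node must reach one stage before any node moves to the next: adjacent stages accept each other's credentials, but a node still sending its keyfile would be rejected by a node already set to x509.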
On this page
Overview (page 359)
Prerequisites (page 359)
Procedure (page 360)
Additional Considerations (page 361)
Additional Resources (page 363)
Overview
MongoDB Enterprise supports authentication using a Kerberos service (page 325). Kerberos is an industry standard
authentication protocol for large client/server systems.
Prerequisites
Setting up and configuring a Kerberos deployment is beyond the scope of this document. This tutorial assumes you
have configured a Kerberos service principal (page 326) for each mongod and mongos instance in your MongoDB
deployment, and you have a valid keytab file (page 327) for each mongod and mongos instance.
To verify MongoDB Enterprise binaries:
mongod --version
In the output from this command, look for the string modules: subscription or modules: enterprise
to confirm your system has MongoDB Enterprise.
Procedure
The following procedure outlines the steps to add a Kerberos user principal to MongoDB, configure a standalone
mongod instance for Kerberos support, and connect using the mongo shell and authenticate the user principal.
Step 1: Start mongod without Kerberos. For the initial addition of Kerberos users, start mongod without Kerberos
support.
If a Kerberos user is already in MongoDB and has the privileges required to create a user, you can start mongod with
Kerberos support.
Step 2: Connect to mongod. Connect via the mongo shell to the mongod instance. If mongod has --auth
enabled, ensure you connect with the privileges required to create a user.
Step 3: Add Kerberos principal(s) to MongoDB. Add each Kerberos principal, such as the
application/reporting@EXAMPLE.NET principal used in Step 5, as a user in the $external database.
Add additional principals as needed. For every user you want to authenticate using Kerberos, you must
create a corresponding user in MongoDB. For more information about creating and managing users, see
https://docs.mongodb.org/manual/reference/command/nav-user-management.
Step 4: Start mongod with Kerberos support. To start mongod with Kerberos support, set the environmental
variable KRB5_KTNAME to the path of the keytab file and the mongod parameter authenticationMechanisms
to GSSAPI in the following form:
env KRB5_KTNAME=<path to keytab file> \
mongod \
--setParameter authenticationMechanisms=GSSAPI \
<additional mongod options>
For example, the following starts a standalone mongod instance with Kerberos support:
env KRB5_KTNAME=/opt/mongodb/mongod.keytab \
/opt/mongodb/bin/mongod --auth \
--setParameter authenticationMechanisms=GSSAPI \
--dbpath /opt/mongodb/data
The path to your mongod as well as your keytab file (page 327) may differ. Modify or include additional mongod
options as required for your configuration. The keytab file (page 327) must be accessible only to the owner of the
mongod process.
With the official .deb or .rpm packages, you can set the KRB5_KTNAME in an environment settings file. See
KRB5_KTNAME (page 361) for details.
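The start-up form in Step 4 can be sketched in Python as a pair of helpers: one prepares the environment and argument list, the other checks the keytab permissions. The helper names are hypothetical, and the permission check is an assumption about what "accessible only to the owner" means in practice (no group or other permission bits); it does not verify which user owns the file.

```python
import os
import stat


def mode_is_owner_only(mode):
    """True when the permission bits grant nothing to group or others."""
    return (stat.S_IMODE(mode) & 0o077) == 0


def keytab_is_owner_only(path):
    """Apply the permission check to a keytab file on disk."""
    return mode_is_owner_only(os.stat(path).st_mode)


def kerberos_launch(keytab, mongod="mongod", extra_args=()):
    """Return (environment, argv) for starting mongod with GSSAPI enabled."""
    env = dict(os.environ, KRB5_KTNAME=keytab)
    argv = [mongod, "--setParameter", "authenticationMechanisms=GSSAPI"]
    argv.extend(extra_args)
    return env, argv


if __name__ == "__main__":
    env, argv = kerberos_launch(
        "/opt/mongodb/mongod.keytab",
        "/opt/mongodb/bin/mongod",
        ["--auth", "--dbpath", "/opt/mongodb/data"],
    )
    print(env["KRB5_KTNAME"], argv)
```

The returned pair can be handed to subprocess.Popen(argv, env=env), which has the same effect as prefixing the command with env KRB5_KTNAME=... in a shell.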
Step 5: Connect mongo shell to mongod and authenticate. Connect the mongo shell client as the Kerberos principal
application/reporting@EXAMPLE.NET. Before connecting, you must have used Kerberos's kinit
program to get credentials for application/reporting@EXAMPLE.NET.
You can connect and authenticate from the command line.
mongo --authenticationMechanism=GSSAPI --authenticationDatabase='$external' \
--username application/reporting@EXAMPLE.NET
Or, alternatively, you can first connect mongo to the mongod, and then from the mongo shell, use the db.auth()
method to authenticate in the $external database.
use $external
db.auth( { mechanism: "GSSAPI", user: "application/reporting@EXAMPLE.NET" } )
Additional Considerations
KRB5_KTNAME If you installed MongoDB Enterprise using one of the official .deb or .rpm packages, and you
use the included init/upstart scripts to control the mongod instance, you can set the KRB5_KTNAME variable in the
default environment settings file instead of setting the variable each time.
For .rpm packages, the default environment settings file is /etc/sysconfig/mongod.
For .deb packages, the file is /etc/default/mongodb.
Set the KRB5_KTNAME value in a line that resembles the following:
export KRB5_KTNAME="<path to keytab>"
Configure mongos for Kerberos To start mongos with Kerberos support, set the environmental
variable KRB5_KTNAME to the path of its keytab file (page 327) and the mongos parameter
authenticationMechanisms to GSSAPI in the following form:
env KRB5_KTNAME=<path to keytab file> \
mongos \
--setParameter authenticationMechanisms=GSSAPI \
<additional mongos options>
For example, the following starts a mongos instance with Kerberos support:
env KRB5_KTNAME=/opt/mongodb/mongos.keytab \
mongos \
--setParameter authenticationMechanisms=GSSAPI \
--configdb shard0.example.net,shard1.example.net,shard2.example.net \
--keyFile /opt/mongodb/mongos.keyfile
The path to your mongos as well as your keytab file (page 327) may differ. The keytab file (page 327) must be
accessible only to the owner of the mongos process.
Modify or include any additional mongos options as required for your configuration. For example, instead of using
--keyFile for internal authentication of sharded cluster members, you can use x.509 member authentication
(page 355).
Use a Config File To configure mongod or mongos for Kerberos support using a configuration file,
specify the authenticationMechanisms setting in the configuration file:
If using the YAML configuration file format:
setParameter:
   authenticationMechanisms: GSSAPI
Modify or include any additional mongod options as required for your configuration. For example,
/opt/mongodb/mongod.conf might contain the following configuration settings for a standalone mongod:
security:
   authorization: enabled
setParameter:
   authenticationMechanisms: GSSAPI
storage:
   dbPath: /opt/mongodb/data
The path to your mongod, keytab file (page 327), and configuration file may differ. The keytab file (page 327) must
be accessible only to the owner of the mongod process.
Troubleshoot Kerberos Setup for MongoDB If you encounter problems when starting mongod or mongos with
Kerberos authentication, see Troubleshoot Kerberos Authentication (page 365).
Incorporate Additional Authentication Mechanisms Kerberos authentication (GSSAPI (page 325) (Kerberos))
can work alongside MongoDB's challenge/response authentication mechanisms (SCRAM-SHA-1 (page 321) and
MONGODB-CR (page 322)), MongoDB's authentication mechanism for LDAP (PLAIN (page 325) (LDAP SASL)),
and MongoDB's authentication mechanism for x.509 (MONGODB-X509 (page 322)). Specify the mechanisms as
follows:
--setParameter authenticationMechanisms=GSSAPI,SCRAM-SHA-1
Only add the other mechanisms if in use. This parameter setting does not affect MongoDB's internal authentication of
cluster members.
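As a small illustration of combining mechanisms, the following Python sketch builds the value passed to --setParameter from a list of mechanism names. The helper name mechanisms_parameter is hypothetical, and the accepted names are limited to the mechanisms mentioned in this section.

```python
# Mechanism names mentioned in this section.
KNOWN_MECHANISMS = ("GSSAPI", "SCRAM-SHA-1", "MONGODB-CR", "PLAIN", "MONGODB-X509")


def mechanisms_parameter(mechanisms):
    """Return the 'authenticationMechanisms=...' argument for --setParameter."""
    unknown = [name for name in mechanisms if name not in KNOWN_MECHANISMS]
    if unknown:
        raise ValueError("unknown mechanism(s): " + ", ".join(unknown))
    # Mechanisms are comma-separated with no spaces.
    return "authenticationMechanisms=" + ",".join(mechanisms)


if __name__ == "__main__":
    print("--setParameter", mechanisms_parameter(["GSSAPI", "SCRAM-SHA-1"]))
```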
Additional Resources
MongoDB LDAP and Kerberos Authentication with Dell (Quest) Authentication Services
MongoDB with Red Hat Enterprise Linux Identity Management and Kerberos
On this page
Overview (page 363)
Prerequisites (page 363)
Procedures (page 363)
Additional Considerations (page 364)
Overview
MongoDB Enterprise supports authentication using a Kerberos service (page 325). Kerberos is an industry standard
authentication protocol for large client/server systems. Kerberos allows MongoDB and applications to take advantage
of existing authentication infrastructure and processes.
Prerequisites
Setting up and configuring a Kerberos deployment is beyond the scope of this document. This tutorial assumes you have
configured a Kerberos service principal (page 326) for each mongod.exe and mongos.exe instance.
Procedures
Step 1: Start mongod.exe without Kerberos. For the initial addition of Kerberos users, start mongod.exe
without Kerberos support.
If a Kerberos user is already in MongoDB and has the privileges required to create a user, you can start mongod.exe
with Kerberos support.
Step 2: Connect to mongod. Connect via the mongo.exe shell to the mongod.exe instance. If mongod.exe
has --auth enabled, ensure you connect with the privileges required to create a user.
Step 3: Add Kerberos principal(s) to MongoDB. Add a Kerberos principal to MongoDB in the $external database.
The following example adds the principal reportingapp@EXAMPLE.NET with read-only access to the records
database:
use $external
db.createUser(
  {
    user: "reportingapp@EXAMPLE.NET",
    roles: [ { role: "read", db: "records" } ]
  }
)
Add additional principals as needed. For every user you want to authenticate using Kerberos, you must
create a corresponding user in MongoDB. For more information about creating and managing users, see
https://docs.mongodb.org/manual/reference/command/nav-user-management.
Step 4: Start mongod.exe with Kerberos support. You must start mongod.exe as the service principal account
(page 365).
To start mongod.exe with Kerberos support, set the mongod.exe parameter authenticationMechanisms
to GSSAPI:
mongod.exe --setParameter authenticationMechanisms=GSSAPI <additional mongod.exe options>
For example, the following starts a standalone mongod.exe instance with Kerberos support:
mongod.exe --auth --setParameter authenticationMechanisms=GSSAPI
Step 5: Connect mongo.exe shell to mongod.exe and authenticate. Connect the mongo.exe shell client as
the Kerberos principal reportingapp@EXAMPLE.NET.
You can connect and authenticate from the command line.
mongo.exe --authenticationMechanism=GSSAPI --authenticationDatabase='$external' \
--username reportingapp@EXAMPLE.NET
Or, alternatively, you can first connect mongo.exe to the mongod.exe, and then from the mongo.exe shell, use
the db.auth() method to authenticate in the $external database.
use $external
db.auth( { mechanism: "GSSAPI", user: "reportingapp@EXAMPLE.NET" } )
Additional Considerations
Configure mongos.exe for Kerberos To start mongos.exe with Kerberos support, set the mongos.exe parameter
authenticationMechanisms to GSSAPI. You must start mongos.exe as the service principal account
(page 365):
mongos.exe --setParameter authenticationMechanisms=GSSAPI <additional mongos options>
For example, the following starts a mongos instance with Kerberos support:
mongos.exe --setParameter authenticationMechanisms=GSSAPI --configdb shard0.example.net, shard1.examp
Modify or include any additional mongos.exe options as required for your configuration. For example, instead of
using --keyFile for internal authentication of sharded cluster members, you can use x.509 member authentication
(page 355).
Assign Service Principal Name to MongoDB Windows Service Use setspn.exe to assign the service principal
name (SPN) to the account running the mongod.exe and the mongos.exe service:
setspn.exe -A <service>/<fully qualified domain name> <service account name>
For example, if mongod.exe runs as a service named mongodb on testserver.mongodb.com with the service
account name mongodtest, assign the SPN as follows:
setspn.exe -A mongodb/testserver.mongodb.com mongodtest
Incorporate Additional Authentication Mechanisms Kerberos authentication (GSSAPI (page 325) (Kerberos))
can work alongside MongoDB's challenge/response authentication mechanisms (SCRAM-SHA-1 (page 321) and
MONGODB-CR (page 322)), MongoDB's authentication mechanism for LDAP (PLAIN (page 325) (LDAP SASL)),
and MongoDB's authentication mechanism for x.509 (MONGODB-X509 (page 322)). Specify the mechanisms as
follows:
--setParameter authenticationMechanisms=GSSAPI,SCRAM-SHA-1
Only add the other mechanisms if in use. This parameter setting does not affect MongoDB's internal authentication of
cluster members.
On this page
Kerberos Configuration Checklist (page 365)
Debug with More Verbose Logs on Linux (page 366)
Common Error Messages (page 366)
If you have difficulty starting mongod or mongos with Kerberos (page 325), ensure that:
The mongod and the mongos binaries are from MongoDB Enterprise.
To verify MongoDB Enterprise binaries:
mongod --version
In the output from this command, look for the string modules: subscription or modules:
enterprise to confirm your system has MongoDB Enterprise.
You are not using the HTTP Console (https://docs.mongodb.org/ecosystem/tools/http-interface/#http-console).
MongoDB Enterprise does not support Kerberos authentication over the HTTP Console interface.
On Linux, either the service principal name (SPN) in the keytab file (page 327) matches the SPN for
the mongod or mongos instance, or the mongod or the mongos instance uses the --setParameter
saslHostName=<host name> to match the name in the keytab file.
The canonical system hostname of the system that runs the mongod or mongos instance is a resolvable, fully
qualified domain name for this host. You can test the system hostname resolution with the hostname -f command
at the system prompt.
Each host that runs a mongod or mongos instance has both the A and PTR DNS records to provide forward
and reverse lookup. The records allow the host to resolve the components of the Kerberos infrastructure.
Both the Kerberos Key Distribution Center (KDC) and the system running the mongod or mongos instance must
be able to resolve each other using DNS. By default, Kerberos attempts to resolve hosts using the content of
/etc/krb5.conf before using DNS to resolve hosts.
The clocks of the systems running the mongod or mongos instances and of the Kerberos infrastructure
are within the maximum time skew (default is 5 minutes) of each other. Time differences greater than
the maximum time skew will prevent successful authentication.
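Two of the checklist items above lend themselves to a quick script. The Python sketch below checks whether two timestamps fall within the maximum time skew and whether a hostname looks fully qualified. The function names are hypothetical, the 5 minute default is the one stated above, and how you obtain the KDC's time is left open.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_MAX_SKEW = timedelta(minutes=5)


def within_max_skew(host_time, kdc_time, max_skew=DEFAULT_MAX_SKEW):
    """True when the two clocks differ by no more than max_skew."""
    return abs(host_time - kdc_time) <= max_skew


def looks_fully_qualified(hostname):
    """Rough shape check: at least two non-empty, dot-separated labels."""
    labels = hostname.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(within_max_skew(now, now - timedelta(minutes=3)))
    print(looks_fully_qualified("mongodb.example.net"))
```

The shape check does not replace hostname -f or a real DNS lookup; it only catches short names such as localhost before they reach Kerberos.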
If you still encounter problems with Kerberos on Linux, you can start both mongod and mongo (or another client)
with the environment variable KRB5_TRACE set to different files to produce more verbose logging of the Kerberos
process to help further troubleshooting. For example, the following starts a standalone mongod with KRB5_TRACE
set:
env KRB5_KTNAME=/opt/mongodb/mongod.keytab \
KRB5_TRACE=/opt/mongodb/log/mongodb-kerberos.log \
/opt/mongodb/bin/mongod --dbpath /opt/mongodb/data \
--fork --logpath /opt/mongodb/log/mongod.log \
--auth --setParameter authenticationMechanisms=GSSAPI
In some situations, MongoDB will return error messages from the GSSAPI interface if there is a problem with the
Kerberos service. Some common error messages are:
GSSAPI error in client while negotiating security context. This error occurs on the
client and reflects insufficient credentials or a malicious attempt to authenticate.
If you receive this error, ensure that you are using the correct credentials and the correct fully qualified domain
name when connecting to the host.
GSSAPI error acquiring credentials. This error occurs during the start of the mongod or mongos
and reflects improper configuration of the system hostname or a missing or incorrectly configured keytab file.
If you encounter this problem, consider the items in the Kerberos Configuration Checklist (page 365), in particular,
whether the SPN in the keytab file (page 327) matches the SPN for the mongod or mongos instance.
To determine whether the SPNs match:
1. Examine the keytab file, with the following command:
klist -k <keytab>
2. Check the fully qualified hostname of the system, for example with the hostname -f command.
Ensure that this name matches the name in the keytab file, or start mongod or mongos with the
--setParameter saslHostName=<hostname>.
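The comparison between a keytab principal and the system hostname can be sketched in Python. This is a minimal sketch assuming principals of the form service/host@REALM; the function names and the example principal are illustrative, and the comparison is an exact string match.

```python
def split_spn(principal):
    """Split 'service/host@REALM' into (service, host, realm)."""
    name, _, realm = principal.partition("@")
    service, _, host = name.partition("/")
    return service, host, realm


def spn_matches_host(principal, hostname):
    """True when the host part of the principal equals the hostname exactly."""
    _, host, _ = split_spn(principal)
    return host == hostname


if __name__ == "__main__":
    # Illustrative principal; substitute one listed by klist -k.
    principal = "mongodb/testserver.mongodb.com@EXAMPLE.NET"
    print(split_spn(principal))
    print(spn_matches_host(principal, "testserver.mongodb.com"))
```

When the host part differs from the system hostname, the saslHostName parameter mentioned above is the way to tell mongod or mongos which name to use.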
See also:
Kerberos Authentication (page 325)
On this page
Considerations (page 367)
Configure saslauthd (page 367)
Configure MongoDB (page 368)
MongoDB Enterprise provides support for proxy authentication of users. This allows administrators to configure
a MongoDB cluster to authenticate users by proxying authentication requests to a specified Lightweight Directory
Access Protocol (LDAP) service.
Considerations
MongoDB Enterprise for Windows does not include LDAP support for authentication. However, MongoDB Enterprise
for Linux supports using LDAP authentication with an ActiveDirectory server.
MongoDB does not support LDAP authentication in mixed sharded cluster deployments that contain both version 2.4
and version 2.6 shards. See Upgrade MongoDB to 2.6 (page 1001) for upgrade instructions.
Use secure encrypted or trusted connections between clients and the server, as well as between saslauthd and the
LDAP server. The LDAP server uses the SASL PLAIN mechanism, sending and receiving data in plain text. You
should use only a trusted channel such as a VPN, a connection encrypted with TLS/SSL, or a trusted wired network.
Configure saslauthd
LDAP support for user authentication requires proper configuration of the saslauthd daemon process as well as
the MongoDB server.
Step 1: Specify the mechanism. On systems that configure saslauthd with the
/etc/sysconfig/saslauthd file, such as Red Hat Enterprise Linux, Fedora, CentOS, and Amazon
Linux AMI, set the mechanism MECH to ldap:
MECH=ldap
On systems that configure saslauthd with the /etc/default/saslauthd file, such as Ubuntu, set the
MECHANISMS option to ldap:
MECHANISMS="ldap"
Step 2: Adjust caching behavior. On certain Linux distributions, saslauthd starts with the caching of authentication
credentials enabled. Until restarted or until the cache expires, saslauthd will not contact the LDAP server
to re-authenticate users in its authentication cache. This allows saslauthd to successfully authenticate users in its
cache, even if the LDAP server is down or if the cached users' credentials are revoked.
To set the expiration time (in seconds) for the authentication cache, see the -t option of saslauthd
(http://www.linuxcommand.org/man_pages/saslauthd8.html).
Step 3: Configure LDAP Options with ActiveDirectory. If the saslauthd.conf file does not exist, create it.
The saslauthd.conf file usually resides in the /etc folder. If specifying a different file path, see the -O option41
of saslauthd.
To use with ActiveDirectory, start saslauthd with the following configuration options set in the
saslauthd.conf file:
ldap_servers: <ldap uri>
ldap_use_sasl: yes
ldap_mech: DIGEST-MD5
ldap_auth_method: fastbind
For the <ldap uri>, specify the URI of the LDAP server. For example, ldap_servers:
ldaps://ad.example.net.
For more information on saslauthd configuration, see http://www.openldap.org/doc/admin24/guide.html#Configuringsaslauthd.
Step 4: Test the saslauthd configuration. Use the testsaslauthd utility to test the saslauthd configuration.
For example:
testsaslauthd -u testuser -p testpassword -f /var/run/saslauthd/mux
Note: The /var/run/saslauthd directory must have permissions set to 755 for MongoDB to authenticate
successfully.
Configure MongoDB
Step 1: Add user to MongoDB for authentication. Add the user to the $external database in MongoDB. To
specify the user's privileges, assign roles (page 331) to the user.
For example, the following adds a user with read-only access to the records database.
db.getSiblingDB("$external").createUser(
{
user : <username>,
roles: [ { role: "read", db: "records" } ]
}
)
Add additional principals as needed. For more information about creating and managing users, see
https://docs.mongodb.org/manual/reference/command/nav-user-management.
Step 2: Configure MongoDB server. To configure the MongoDB server to use the saslauthd instance for proxy
authentication, start the mongod with the following options:
--auth,
authenticationMechanisms parameter set to PLAIN, and
saslauthdPath parameter set to the path to the Unix-domain Socket of the saslauthd instance.
Configure the MongoDB server using either the command line option --setParameter or the configuration
file. Specify additional options as appropriate for your deployment.
If you use the authorization option to enforce authentication, you will need privileges to create a user.
41 http://www.linuxcommand.org/man_pages/saslauthd8.html
Use specific saslauthd socket path. For a socket path of /<some>/<path>/saslauthd, set the
saslauthdPath to /<some>/<path>/saslauthd/mux, as in the following command line example:
mongod --auth --setParameter saslauthdPath=/<some>/<path>/saslauthd/mux --setParameter authentication
Or if using a YAML format configuration file, specify the following settings in the file:
security:
authorization: enabled
setParameter:
saslauthdPath: /<some>/<path>/saslauthd/mux
authenticationMechanisms: PLAIN
Use default Unix-domain socket path. To use the default Unix-domain socket path, set the saslauthdPath to
the empty string "", as in the following command line example:
mongod --auth --setParameter saslauthdPath="" --setParameter authenticationMechanisms=PLAIN
Or if using a YAML format configuration file, specify the following settings in the file:
security:
authorization: enabled
setParameter:
saslauthdPath: ""
authenticationMechanisms: PLAIN
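The two command-line variants above differ only in the value given to saslauthdPath. As a rough illustration, the following plain JavaScript sketch assembles the mongod argument list from the options named in this step. The function name buildMongodArgs is hypothetical and is not part of MongoDB.

```javascript
// Hypothetical helper: assembles mongod arguments for saslauthd proxy
// authentication, using only the options named in this tutorial.
function buildMongodArgs(socketDir) {
  // An empty string selects the default Unix-domain socket path;
  // otherwise the socket file is <socketDir>/mux.
  var saslauthdPath = socketDir ? socketDir.replace(/\/+$/, "") + "/mux" : "";
  return [
    "--auth",
    "--setParameter", "saslauthdPath=" + saslauthdPath,
    "--setParameter", "authenticationMechanisms=PLAIN"
  ];
}
```

Passing "/var/run/saslauthd" yields the specific-path form, and passing "" yields the default-path form.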
Step 3: Authenticate the user in the mongo shell. To perform the authentication in the mongo shell, use the
db.auth() method in the $external database.
Specify the value "PLAIN" in the mechanism field, the user and password in the user and pwd fields respectively,
and the value false in the digestPassword field. You must specify false for digestPassword since the
server must receive an undigested password to forward on to saslauthd, as in the following example:
db.getSiblingDB("$external").auth(
{
mechanism: "PLAIN",
user: <username>,
pwd: <cleartext password>,
digestPassword: false
}
)
The server forwards the password in plain text. In general, use only on a trusted channel (VPN, TLS/SSL, trusted
wired network). See Considerations.
On this page
Considerations (page 370)
Configure saslauthd (page 370)
Configure MongoDB (page 371)
MongoDB Enterprise provides support for proxy authentication of users. This allows administrators to configure
a MongoDB cluster to authenticate users by proxying authentication requests to a specified Lightweight Directory
Access Protocol (LDAP) service.
Considerations
MongoDB Enterprise for Windows does not include LDAP support for authentication. However, MongoDB Enterprise
for Linux supports using LDAP authentication with an ActiveDirectory server.
MongoDB does not support LDAP authentication in mixed sharded cluster deployments that contain both version 2.4
and version 2.6 shards. See Upgrade MongoDB to 2.6 (page 1001) for upgrade instructions.
Use secure encrypted or trusted connections between clients and the server, as well as between saslauthd and the
LDAP server. The LDAP server uses the SASL PLAIN mechanism, sending and receiving data in plain text. You
should use only a trusted channel such as a VPN, a connection encrypted with TLS/SSL, or a trusted wired network.
Configure saslauthd
LDAP support for user authentication requires proper configuration of the saslauthd daemon process as well as
the MongoDB server.
Step 1: Specify the mechanism. On systems that configure saslauthd with the
/etc/sysconfig/saslauthd file, such as Red Hat Enterprise Linux, Fedora, CentOS, and Amazon
Linux AMI, set the mechanism MECH to ldap:
MECH=ldap
On systems that configure saslauthd with the /etc/default/saslauthd file, such as Ubuntu, set the
MECHANISMS option to ldap:
MECHANISMS="ldap"
Step 2: Adjust caching behavior. On certain Linux distributions, saslauthd starts with the caching of authentication
credentials enabled. Until restarted or until the cache expires, saslauthd will not contact the LDAP server
to re-authenticate users in its authentication cache. This allows saslauthd to successfully authenticate users in its
cache, even if the LDAP server is down or if the cached users' credentials are revoked.
To set the expiration time (in seconds) for the authentication cache, see the -t option (footnote 44) of saslauthd.
44 http://www.linuxcommand.org/man_pages/saslauthd8.html
Step 3: Configure LDAP Options with OpenLDAP. If the saslauthd.conf file does not exist, create it. The
saslauthd.conf file usually resides in the /etc folder. If specifying a different file path, see the -O option
(footnote 45) of saslauthd.
To connect to an OpenLDAP server, update the saslauthd.conf file with the following configuration options:
ldap_servers: <ldap uri>
ldap_search_base: <search base>
ldap_filter: <filter>
The ldap_servers option specifies the URI of the LDAP server used for authentication. In general, for OpenLDAP
installed on the local machine, you can specify the value ldap://localhost:389 or, if using LDAP over TLS/SSL,
the value ldaps://localhost:636.
The ldap_search_base option specifies the distinguished name to which the search is relative. The search includes
the base and the objects below it.
The ldap_filter specifies the search filter.
The values for these configuration options should correspond to the values specific for your test. For example, to filter
on email, specify ldap_filter: (mail=%n) instead.
OpenLDAP Example A sample saslauthd.conf file for OpenLDAP includes the following content:
ldap_servers: ldaps://ad.example.net
ldap_search_base: ou=Users,dc=example,dc=com
ldap_filter: (uid=%u)
To use this sample OpenLDAP configuration, create users with a uid attribute (login name) and place them under the
Users organizational unit (ou) under the domain components (dc) example and com.
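To make the role of the ldap_filter template concrete, the following plain JavaScript sketch shows how a template such as (uid=%u) becomes a search filter for one login name. It assumes that %u stands for the login name supplied by the client. The function expandFilter is hypothetical, and saslauthd itself supports additional tokens and performs its own processing.

```javascript
// Illustrative only: expands the %u token of an ldap_filter template.
// Assumes %u stands for the login name; not saslauthd's actual code.
function expandFilter(template, loginName) {
  // Escape the characters that are special in LDAP search filters
  // (RFC 4515): backslash, asterisk, parentheses, and NUL.
  var escaped = loginName.replace(/[\\*()\0]/g, function (ch) {
    var hex = ch.charCodeAt(0).toString(16);
    return "\\" + (hex.length < 2 ? "0" + hex : hex);
  });
  return template.split("%u").join(escaped);
}
```

With the sample configuration, expandFilter("(uid=%u)", "alice") produces (uid=alice), which is then searched for under ou=Users,dc=example,dc=com.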
For more information on saslauthd configuration, see http://www.openldap.org/doc/admin24/guide.html#Configuringsaslauthd.
Step 4: Test the saslauthd configuration. Use the testsaslauthd utility to test the saslauthd configuration.
For example:
testsaslauthd -u testuser -p testpassword -f /var/run/saslauthd/mux
Note: The /var/run/saslauthd directory must have permissions set to 755 for MongoDB to authenticate
successfully.
Configure MongoDB
Step 1: Add user to MongoDB for authentication. Add the user to the $external database in MongoDB. To
specify the user's privileges, assign roles (page 331) to the user.
For example, the following adds a user with read-only access to the records database.
db.getSiblingDB("$external").createUser(
{
user : <username>,
roles: [ { role: "read", db: "records" } ]
}
)
45 http://www.linuxcommand.org/man_pages/saslauthd8.html
Add additional principals as needed. For more information about creating and managing users, see
https://docs.mongodb.org/manual/reference/command/nav-user-management.
Step 2: Configure MongoDB server. To configure the MongoDB server to use the saslauthd instance for proxy
authentication, start the mongod with the following options:
--auth,
authenticationMechanisms parameter set to PLAIN, and
saslauthdPath parameter set to the path to the Unix-domain Socket of the saslauthd instance.
Configure the MongoDB server using either the command line option --setParameter or the configuration
file. Specify additional options as appropriate for your deployment.
If you use the authorization option to enforce authentication, you will need privileges to create a user.
Use specific saslauthd socket path. For a socket path of /<some>/<path>/saslauthd, set the
saslauthdPath to /<some>/<path>/saslauthd/mux, as in the following command line example:
mongod --auth --setParameter saslauthdPath=/<some>/<path>/saslauthd/mux --setParameter authentication
Or if using a YAML format configuration file, specify the following settings in the file:
security:
authorization: enabled
setParameter:
saslauthdPath: /<some>/<path>/saslauthd/mux
authenticationMechanisms: PLAIN
Use default Unix-domain socket path. To use the default Unix-domain socket path, set the saslauthdPath to
the empty string "", as in the following command line example:
mongod --auth --setParameter saslauthdPath="" --setParameter authenticationMechanisms=PLAIN
Or if using a YAML format configuration file, specify the following settings in the file:
security:
authorization: enabled
setParameter:
saslauthdPath: ""
authenticationMechanisms: PLAIN
Step 3: Authenticate the user in the mongo shell. To perform the authentication in the mongo shell, use the
db.auth() method in the $external database.
Specify the value "PLAIN" in the mechanism field, the user and password in the user and pwd fields respectively,
and the value false in the digestPassword field. You must specify false for digestPassword since the
server must receive an undigested password to forward on to saslauthd, as in the following example:
db.getSiblingDB("$external").auth(
{
mechanism: "PLAIN",
user: <username>,
pwd: <cleartext password>,
digestPassword: false
}
)
The server forwards the password in plain text. In general, use only on a trusted channel (VPN, TLS/SSL, trusted
wired network). See Considerations.
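The authentication call in Step 3 can be wrapped in a small function so that the mechanism and the digestPassword setting are never omitted. This is a minimal sketch: authenticatePlain is a hypothetical name, and conn stands for any object that exposes getSiblingDB(), such as the db object in the mongo shell.

```javascript
// Sketch: authenticate an LDAP-backed user through the $external database.
// `conn` is any object exposing getSiblingDB(), e.g. the shell's `db`.
function authenticatePlain(conn, username, cleartextPassword) {
  return conn.getSiblingDB("$external").auth({
    mechanism: "PLAIN",
    user: username,
    pwd: cleartextPassword,
    digestPassword: false // the server must receive the undigested password
  });
}
```

In the mongo shell, a call would look like authenticatePlain(db, <username>, <cleartext password>). The same caution applies: the password travels in plain text, so use only a trusted channel.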
The following tutorials provide instructions on creating and managing users and roles.
Manage User and Roles (page 373) Manage users by creating new users, creating new roles, and modifying existing
users.
Change Your Password and Custom Data (page 380) Create a role with sufficient privileges to allow users to change
their own passwords and modify the optional custom data associated with their user credentials.
On this page
Overview (page 373)
Prerequisites (page 374)
Add a User (page 374)
Create a User-Defined Role (page 375)
Modify Access for Existing User (page 376)
Modify Password for Existing User (page 378)
View a User's Role (page 378)
View Role's Privileges (page 379)
Overview
Changed in version 2.6: MongoDB 2.6 introduces a new authorization model (page 331).
MongoDB employs Role-Based Access Control (RBAC) to determine access for users. A user is granted one or more
roles (page 331) that determine the user's access or privileges to MongoDB resources (page 427) and the actions
(page 429) that user can perform. A user should have only the minimal set of privileges required to ensure a system of
least privilege.
Each application and each person that uses a MongoDB system should map to a distinct MongoDB user. This access
isolation facilitates access revocation and ongoing user maintenance.
This tutorial provides examples for user and role management under MongoDB's authorization model.
Prerequisites
Important: If you have enabled access control (page 344) for your deployment, you must authenticate as a user
with the required privileges specified in each section. A user administrator with the userAdminAnyDatabase
(page 421) role, or userAdmin (page 417) role in the specific databases, provides the required privileges to perform
the operations listed in this tutorial. See Enable Client Access Control (page 344) for details on adding a user
administrator as the first user.
Add a User
To create a user, specify the user name, password, and roles (page 331). For users that authenticate using external
mechanisms (footnote 48), you do not need to provide the password when creating users.
When assigning roles, select the roles that have the exact required privileges (page 331). If the correct roles do not
exist, you can create new roles (page 375).
Prerequisites
To create a new user in a database, you must have the createUser (page 430) action (page 429) on that database
resource (page 427).
To grant roles to a user, you must have the grantRole (page 430) action (page 429) on the roles database.
Built-in roles userAdmin (page 417) and userAdminAnyDatabase (page 421) provide createUser
(page 430) and grantRole (page 430) actions on their respective resources (page 427).
Procedure
Step 1: Connect to MongoDB with the appropriate privileges. Connect to mongod or mongos as a user with
the privileges specified in the prerequisite section.
The following procedure uses the myUserAdmin created in Enable Client Access Control (page 344).
mongo --port 27017 -u myUserAdmin -p abc123 --authenticationDatabase admin
Step 2: Create the new user. Create the user in the database to which the user will belong. Pass a well-formed user
document to the db.createUser() method.
The following operation creates a user in the reporting database with the specified name, password, and roles.
use reporting
db.createUser(
{
user: "reportsUser",
pwd: "12345678",
roles: [
{ role: "read", db: "reporting" },
{ role: "read", db: "products" },
{ role: "read", db: "sales" },
{ role: "readWrite", db: "accounts" }
]
}
)
48 See x.509 (page 322), Kerberos Authentication (page 325), and LDAP Proxy Authority Authentication (page 325).
To authenticate the reportsUser, you must authenticate the user in the reporting database; i.e. specify
--authenticationDatabase reporting.
You can create a user without assigning roles, choosing instead to assign the roles later. To do so, create the user with
an empty roles (page 426) array.
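The two-stage approach described above, creating the user first and granting roles later, might look like the following sketch. The function name createThenGrant is hypothetical, and database stands for a mongo shell database object such as db after use reporting.

```javascript
// Sketch: create a user with no roles, then grant roles afterwards.
// `database` is a mongo shell database object (for example `db`).
function createThenGrant(database, name, password, rolesToGrant) {
  // An empty roles array creates the user without any privileges.
  database.createUser({ user: name, pwd: password, roles: [] });
  // Later, once the required access is known, grant the roles.
  database.grantRolesToUser(name, rolesToGrant);
}
```

Between the two calls the user can authenticate but holds no roles.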
Create a User-Defined Role
Roles grant users access to MongoDB resources. MongoDB provides a number of built-in roles (page 414) that
administrators can use to control access to a MongoDB system. However, if these roles cannot describe the desired set
of privileges, you can create new roles in a particular database.
Except for roles created in the admin database, a role can only include privileges that apply to its database and can
only inherit from other roles in its database.
A role created in the admin database can include privileges that apply to the admin database, other databases or to
the cluster (page 428) resource, and can inherit from roles in other databases as well as the admin database.
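The scoping rule in the two preceding paragraphs can be expressed as a small check. The following sketch is not MongoDB's own validation logic. The function violatesRoleScope is hypothetical and only mirrors the rule as stated here, taking a role document in the shape passed to db.createRole().

```javascript
// Sketch of the scoping rule, not MongoDB's implementation.
// roleDb is the database in which the role would be created.
function violatesRoleScope(roleDb, roleDoc) {
  if (roleDb === "admin") {
    return false; // admin roles may span databases and the cluster resource
  }
  var badPrivilege = (roleDoc.privileges || []).some(function (p) {
    // Outside admin, every privilege must target the role's own database.
    return p.resource.cluster === true || p.resource.db !== roleDb;
  });
  var badInheritance = (roleDoc.roles || []).some(function (r) {
    // A bare role name refers to the same database; { role, db } may not.
    return typeof r !== "string" && r.db !== roleDb;
  });
  return badPrivilege || badInheritance;
}
```

A role document that includes a privilege on the cluster resource, for example, fails this check for any database other than admin.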
To create a new role, use the db.createRole() method, specifying the privileges in the privileges array and
the inherited roles in the roles array.
MongoDB uses the combination of the database name and the role name to uniquely define a role. Each role is scoped
to the database in which you create the role, but MongoDB stores all role information in the admin.system.roles
(page 299) collection in the admin database.
Create a Role to Manage Current Operations The following example creates a role named manageOpRole
which provides only the privileges to run both db.currentOp() and db.killOp() (footnote 49).
Step 1: Connect to MongoDB with the appropriate privileges. Connect to mongod or mongos with the
privileges specified in the Prerequisites (page 375) section.
The following procedure uses the myUserAdmin created in Enable Client Access Control (page 344).
mongo --port 27017 -u myUserAdmin -p abc123 --authenticationDatabase admin
The myUserAdmin user has privileges to create roles in the admin database as well as other databases.
49 The built-in role clusterMonitor (page 418) also provides the privilege to run db.currentOp() along with other privileges, and the
built-in role hostManager (page 419) provides the privilege to run db.killOp() along with other privileges.
Step 2: Create a new role to manage current operations. manageOpRole has privileges that act on multiple
databases as well as the cluster resource (page 428). As such, you must create the role in the admin database.
use admin
db.createRole(
{
role: "manageOpRole",
privileges: [
{ resource: { cluster: true }, actions: [ "killop", "inprog" ] },
{ resource: { db: "", collection: "" }, actions: [ "killCursors" ] }
],
roles: []
}
)
Warning: Terminate running operations with extreme caution. Only use db.killOp() to terminate operations
initiated by clients and do not terminate internal database operations.
Create a Role to Run mongostat The following example creates a role named mongostatRole that provides
only the privileges to run mongostat (footnote 50).
Step 1: Connect to MongoDB with the appropriate privileges. Connect to mongod or mongos with the
privileges specified in the Prerequisites (page 375) section.
The following procedure uses the myUserAdmin created in Enable Client Access Control (page 344).
mongo --port 27017 -u myUserAdmin -p abc123 --authenticationDatabase admin
The myUserAdmin user has privileges to create roles in the admin database as well as other databases.
Step 2: Create a new role to run mongostat. mongostatRole has privileges that act on the cluster
resource (page 428). As such, you must create the role in the admin database.
use admin
db.createRole(
{
role: "mongostatRole",
privileges: [
{ resource: { cluster: true }, actions: [ "serverStatus" ] }
],
roles: []
}
)
Modify Access for Existing User
Prerequisites
You must have the grantRole (page 430) action (page 429) on a database to grant a role on that database.
You must have the revokeRole (page 430) action (page 429) on a database to revoke a role on that database.
50 The built-in role clusterMonitor (page 418) also provides the privilege to run mongostat along with other privileges.
To view a role's information, you must be either explicitly granted the role or must have the viewRole
(page 430) action (page 429) on the role's database.
Procedure
Step 1: Connect to MongoDB with the appropriate privileges. Connect to mongod or mongos as a user with
the privileges specified in the prerequisite section.
The following procedure uses the myUserAdmin created in Enable Client Access Control (page 344).
mongo --port 27017 -u myUserAdmin -p abc123 --authenticationDatabase admin
Step 2: Identify the user's roles and privileges. To display the roles and privileges of the user to be modified, use
the db.getUser() and db.getRole() methods.
For example, to view roles for reportsUser created in Add a User (page 374), issue:
use reporting
db.getUser("reportsUser")
To display the privileges granted to the user by the readWrite role on the "accounts" database, issue:
use accounts
db.getRole( "readWrite", { showPrivileges: true } )
Step 3: Identify the privileges to grant or revoke. If the user requires additional privileges, grant to the user the
role, or roles, with the required set of privileges. If such a role does not exist, create a new role (page 375) with the
appropriate set of privileges.
To revoke a subset of privileges provided by an existing role, revoke the original role and grant a role that contains
only the required privileges. You may need to create a new role (page 375) if a role does not exist.
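Narrowing a user's access therefore takes two operations. The following sketch combines them. The function name replaceRole is hypothetical, and database stands for the mongo shell database object of the user's database.

```javascript
// Sketch: swap one role for a narrower one on an existing user.
// `database` is the mongo shell database object for the user's database.
function replaceRole(database, username, oldRole, newRole) {
  // Revoke the broader role first, then grant the narrower one.
  database.revokeRolesFromUser(username, [oldRole]);
  database.grantRolesToUser(username, [newRole]);
}
```

For example, replaceRole(db, "reportsUser", { role: "readWrite", db: "accounts" }, { role: "read", db: "accounts" }) would leave reportsUser with read-only access to accounts. The two operations are separate, so for a brief moment the user holds neither role.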
Revoke a Role Revoke a role with the db.revokeRolesFromUser() method. The following example operation
removes the readWrite (page 415) role on the accounts database from the reportsUser:
use reporting
db.revokeRolesFromUser(
"reportsUser",
[
{ role: "readWrite", db: "accounts" }
]
)
Grant a Role Grant a role using the db.grantRolesToUser() method. For example, the following operation
grants the reportsUser user the read (page 415) role on the accounts database:
use reporting
db.grantRolesToUser(
"reportsUser",
[
{ role: "read", db: "accounts" }
]
)
For sharded clusters, the changes to the user are instant on the mongos on which the command runs. However,
for other mongos instances in the cluster, the user cache may wait up to 10 minutes to refresh. See
userCacheInvalidationIntervalSecs.
Modify Password for Existing User
Prerequisites To modify the password of another user on a database, you must have the changeAnyPassword
action (page 429) on that database.
Procedure
Step 1: Connect to MongoDB with the appropriate privileges. Connect to the mongod or mongos with the
privileges specified in the Prerequisites (page 378) section.
The following procedure uses the myUserAdmin created in Enable Client Access Control (page 344).
mongo --port 27017 -u myUserAdmin -p abc123 --authenticationDatabase admin
Step 2: Change the password. Pass the user's username and the new password to the
db.changeUserPassword() method.
The following operation changes the reporting user's password to SOh3TbYhxuLiW8ypJPxmt1oOfL:
db.changeUserPassword("reporting", "SOh3TbYhxuLiW8ypJPxmt1oOfL")
See also:
Change Your Password and Custom Data (page 380)
View a User's Role
Prerequisites To view another user's information, you must have the viewUser (page 430) action (page 429) on
the other user's database.
Users can view their own information.
Procedure
Step 1: Connect to MongoDB with the appropriate privileges. Connect to mongod or mongos as a user with
the privileges specified in the prerequisite section.
The following procedure uses the myUserAdmin created in Enable Client Access Control (page 344).
mongo --port 27017 -u myUserAdmin -p abc123 --authenticationDatabase admin
Step 2: Identify the user's roles. Use the usersInfo command or db.getUser() method to display user
information.
For example, to view roles for reportsUser created in Add a User (page 374), issue:
use reporting
db.getUser("reportsUser")
In the returned document, the roles (page 426) field displays all roles for reportsUser:
...
"roles" : [
{ "role" : "readWrite", "db" : "accounts" },
{ "role" : "read", "db" : "reporting" },
{ "role" : "read", "db" : "products" },
{ "role" : "read", "db" : "sales" }
]
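Because each entry in the roles array pairs a role name with a database, the document returned by db.getUser() can be filtered to answer questions such as which roles the user holds on one database. A minimal sketch follows, with rolesOnDatabase as a hypothetical helper name.

```javascript
// Sketch: list the role names a user document holds on one database.
// userDoc has the shape returned by db.getUser().
function rolesOnDatabase(userDoc, dbName) {
  return (userDoc.roles || [])
    .filter(function (r) { return r.db === dbName; })
    .map(function (r) { return r.role; });
}
```

For the document shown above, rolesOnDatabase(db.getUser("reportsUser"), "accounts") would return an array containing only "readWrite".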
View Role's Privileges
Prerequisites To view a role's information, you must be either explicitly granted the role or must have the
viewRole (page 430) action (page 429) on the role's database.
Procedure
Step 1: Connect to MongoDB with the appropriate privileges. Connect to mongod or mongos as a user with
the privileges specified in the prerequisite section.
The following procedure uses the myUserAdmin created in Enable Client Access Control (page 344).
mongo --port 27017 -u myUserAdmin -p abc123 --authenticationDatabase admin
Step 2: Identify the privileges granted by a role. For a given role, use the db.getRole() method, or the
rolesInfo command, with the showPrivileges option.
For example, to view the privileges granted by the read role on the products database, issue the following
operation:
use products
db.getRole( "read", { showPrivileges: true } )
The returned document includes the privileges and inheritedPrivileges arrays. The privileges array lists
the privileges directly specified by the role and excludes those privileges inherited from other roles. The
inheritedPrivileges array lists all privileges granted by this role, both directly specified and inherited. If the role
does not inherit from other roles, the two fields are the same.
...
"privileges" : [
{
"resource": { "db" : "products", "collection" : "" },
"actions": [ "collStats","dbHash","dbStats","find","killCursors","planCacheRead" ]
},
{
"resource" : { "db" : "products", "collection" : "system.js" },
"actions": [ "collStats","dbHash","dbStats","find","killCursors","planCacheRead" ]
}
],
"inheritedPrivileges" : [
{
"resource": { "db" : "products", "collection" : "" },
"actions": [ "collStats","dbHash","dbStats","find","killCursors","planCacheRead" ]
},
{
"resource" : { "db" : "products", "collection" : "system.js" },
"actions": [ "collStats","dbHash","dbStats","find","killCursors","planCacheRead" ]
}
]
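Since inheritedPrivileges is a superset of privileges, subtracting one from the other shows which actions reach a role only through inheritance. The following sketch illustrates that comparison on a document shaped like the db.getRole() output above. The names resourceKey and inheritedOnly are hypothetical.

```javascript
// Canonical string for a resource document so resources can be compared.
function resourceKey(resource) {
  return JSON.stringify([resource.db, resource.collection, resource.cluster === true]);
}

// Returns, per resource key, the actions present in inheritedPrivileges
// but absent from the directly specified privileges.
function inheritedOnly(roleDoc) {
  var direct = {};
  (roleDoc.privileges || []).forEach(function (p) {
    var key = resourceKey(p.resource);
    direct[key] = (direct[key] || []).concat(p.actions);
  });
  var result = {};
  (roleDoc.inheritedPrivileges || []).forEach(function (p) {
    var key = resourceKey(p.resource);
    var extra = p.actions.filter(function (a) {
      return (direct[key] || []).indexOf(a) === -1;
    });
    if (extra.length > 0) {
      result[key] = (result[key] || []).concat(extra);
    }
  });
  return result;
}
```

For a role that does not inherit from other roles, such as the built-in read role shown above, the result is an empty object.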
On this page
Overview (page 380)
Considerations (page 380)
Prerequisites (page 380)
Procedure (page 381)
Overview
Users with appropriate privileges can change their own passwords and custom data. Custom data (page 426) stores
optional user information.
Considerations
To generate a strong password for use in this procedure, you can use the openssl utility's rand command. For
example, issue openssl rand with the following options to create a base64-encoded string of 48 pseudo-random
bytes:
openssl rand -base64 48
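Where openssl is not available, a comparable string can be produced programmatically. The sketch below assumes a Node.js environment, because the mongo shell does not provide require, and uses Node's crypto module. The function name randomPassword is hypothetical.

```javascript
// Node.js only: a rough equivalent of `openssl rand -base64 48`.
var crypto = require("crypto");

// Returns byteCount random bytes encoded as a base64 string.
function randomPassword(byteCount) {
  return crypto.randomBytes(byteCount).toString("base64");
}
```

Because 48 is a multiple of 3, randomPassword(48) returns a 64-character string with no = padding.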
Prerequisites
To modify your own password and custom data, you must have privileges that grant changeOwnPassword
(page 430) and changeOwnCustomData (page 430) actions (page 429) respectively on the user's database.
Step 1: Connect as a user with privileges to manage users and roles. Connect to the mongod or mongos with
privileges to manage users and roles, such as a user with userAdminAnyDatabase (page 421) role. The following
procedure uses the myUserAdmin created in Enable Client Access Control (page 344).
mongo --port 27017 -u myUserAdmin -p abc123 --authenticationDatabase admin
Step 2: Create a role with appropriate privileges. In the admin database, create a new role with
changeOwnPassword (page 430) and changeOwnCustomData (page 430).
use admin
db.createRole(
{ role: "changeOwnPasswordCustomDataRole",
privileges: [
{
resource: { db: "", collection: ""},
actions: [ "changeOwnPassword", "changeOwnCustomData" ]
}
],
roles: []
}
)
Step 3: Add a user with this role. In the test database, create a new user with the created
"changeOwnPasswordCustomDataRole" role. For example, the following operation creates a user with both
the built-in role readWrite (page 415) and the user-created "changeOwnPasswordCustomDataRole".
use test
db.createUser(
{
user:"user123",
pwd:"12345678",
roles:[ "readWrite", { role:"changeOwnPasswordCustomDataRole", db:"admin" } ]
}
)
Procedure
Step 1: Connect with the appropriate privileges. Connect to the mongod or mongos as a user with appropriate
privileges.
For example, the following operation connects to MongoDB as user123 created in the Prerequisites (page 380)
section.
mongo --port 27017 -u user123 -p 12345678 --authenticationDatabase test
To check that you have the privileges specified in the Prerequisites (page 380) section as well as to see user information,
use the usersInfo command with the showPrivileges option.
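One way to issue that check from the mongo shell is sketched below. The function usersInfoWithPrivileges is a hypothetical wrapper, and database stands for the mongo shell database object in which the user was created.

```javascript
// Sketch: run usersInfo with showPrivileges for a user of `database`.
// `database` is a mongo shell database object (for example `db`).
function usersInfoWithPrivileges(database, username) {
  return database.runCommand({
    usersInfo: { user: username, db: database.getName() },
    showPrivileges: true
  });
}
```

After use test, calling usersInfoWithPrivileges(db, "user123") returns the command result, which can then be inspected for the changeOwnPassword and changeOwnCustomData actions.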
Step 2: Change your password and custom data. Use the db.updateUser() method to update the password
and custom data.
For example, the following operation changes the user's password to KNlZmiaNUp0B and custom data to
{ title: "Senior Manager" }:
use test
db.updateUser(
"user123",
{
pwd: "KNlZmiaNUp0B",
customData: { title: "Senior Manager" }
}
)
6.7.4 Network
The following tutorials provide information on handling network security for MongoDB.
Configure mongod and mongos for TLS/SSL (page 382) Configure MongoDB to support TLS/SSL.
TLS/SSL Configuration for Clients (page 386) Configure clients to connect to MongoDB instances that use
TLS/SSL.
Upgrade a Cluster to Use TLS/SSL (page 390) Rolling upgrade process to use TLS/SSL.
Configure MongoDB for FIPS (page 391) Configure for Federal Information Processing Standard (FIPS).
Configure Linux iptables Firewall for MongoDB (page 392) Basic firewall configuration patterns and examples for
iptables on Linux systems.
Configure Windows netsh Firewall for MongoDB (page 396) Basic firewall configuration patterns and examples for
netsh on Windows systems.
On this page
Overview (page 382)
Prerequisites (page 382)
Procedures (page 383)
Overview
This document helps you to configure MongoDB to support TLS/SSL. MongoDB clients can use TLS/SSL to encrypt
connections to mongod and mongos instances. MongoDB's TLS/SSL implementation uses OpenSSL libraries.
Note: Although TLS is the successor to SSL, this page uses the more familiar term SSL to refer to TLS/SSL.
These instructions assume that you have already installed a build of MongoDB that includes SSL support and that your
client driver supports SSL. For instructions on upgrading a cluster currently not using SSL to using SSL, see Upgrade
a Cluster to Use TLS/SSL (page 390).
Changed in version 2.6: MongoDB's SSL encryption only allows use of strong SSL ciphers with a minimum of 128-bit
key length for all connections.
Prerequisites
Important: A full description of TLS/SSL, PKI (Public Key Infrastructure) certificates, and Certificate Authority
is beyond the scope of this document. This page assumes prior knowledge of TLS/SSL as well as access to valid
certificates.
MongoDB Support New in version 3.0: Most MongoDB distributions now include support for SSL.
Certain distributions of MongoDB (footnote 51) do not contain support for SSL. To use SSL, be sure to choose a
package that supports SSL. All MongoDB Enterprise (footnote 52) supported platforms include SSL support.
Client Support See TLS/SSL Configuration for Clients (page 386) to learn about SSL support for Python, Java,
Ruby, and other clients.
Certificate Authorities For production use, your MongoDB deployment should use valid certificates generated and
signed by a single certificate authority. You or your organization can generate and maintain an independent certificate
authority, or use certificates generated by a third-party SSL vendor. Obtaining and managing certificates is beyond the
scope of this documentation.
.pem File Before you can use SSL, you must have a .pem file containing a public key certificate and its associated
private key.
MongoDB can use any valid SSL certificate issued by a certificate authority, or a self-signed certificate. If you use a
self-signed certificate, although the communications channel will be encrypted, there will be no validation of server
identity. Although such a situation will prevent eavesdropping on the connection, it leaves you vulnerable to a
man-in-the-middle attack. Using a certificate signed by a trusted certificate authority will permit MongoDB drivers
to verify the server's identity.
In general, avoid using self-signed certificates unless the network is trusted.
Additionally, with regard to authentication among replica set/sharded cluster members (page 329), in order to
minimize exposure of the private key and allow hostname validation, it is advisable to use different certificates on
different servers.
For testing purposes, you can generate a self-signed certificate and private key on a Unix system with a command that
resembles the following:
cd /etc/ssl/
openssl req -newkey rsa:2048 -new -x509 -days 365 -nodes -out mongodb-cert.crt -keyout mongodb-cert.k
This operation generates a new, self-signed certificate with no passphrase that is valid for 365 days. Once you have
the certificate, concatenate the certificate and private key to a .pem file, as in the following example:
cat mongodb-cert.key mongodb-cert.crt > mongodb.pem
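Before pointing mongod at the resulting file, it can be useful to confirm that it really contains both parts. The sketch below checks the text of a .pem file for the standard PEM header lines. The function describePem is hypothetical, and the check looks only at headers without validating the certificate or key.

```javascript
// Sketch: report whether PEM text contains a certificate and a private key.
// Only the header lines are inspected; nothing is cryptographically verified.
function describePem(pemText) {
  return {
    hasCertificate: /-----BEGIN CERTIFICATE-----/.test(pemText),
    // Matches PRIVATE KEY, RSA PRIVATE KEY, ENCRYPTED PRIVATE KEY, and so on.
    hasPrivateKey: /-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----/.test(pemText)
  };
}
```

A file produced by the cat command above should report true for both fields.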
See also:
Use x.509 Certificates to Authenticate Clients (page 353)
Procedures
Set Up mongod and mongos with SSL Certificate and Key To use SSL in your MongoDB deployment, include
the following run-time options with mongod and mongos:
net.ssl.mode set to requireSSL. This setting restricts each server to use only SSL encrypted connections.
You can also specify either the value allowSSL or preferSSL to set up the use of mixed SSL modes on a
port. See net.ssl.mode for details.
PEMKeyFile with the .pem file that contains the SSL certificate and key.
51 http://www.mongodb.org/downloads?jmp=docs
52 http://www.mongodb.com/products/mongodb-enterprise?jmp=docs
For example, given an SSL certificate located at /etc/ssl/mongodb.pem, configure mongod to use SSL encryption
for all connections with the following command:
mongod --sslMode requireSSL --sslPEMKeyFile /etc/ssl/mongodb.pem
Note:
Specify <pem> with the full path name to the certificate.
If the private key portion of the <pem> is encrypted, specify the passphrase. See SSL Certificate Passphrase
(page 386).
You may also specify these options in the configuration file, as in the following examples:
If using the YAML configuration file format:
net:
   ssl:
      mode: requireSSL
      PEMKeyFile: /etc/ssl/mongodb.pem
To connect to mongod and mongos instances using SSL, the mongo shell and MongoDB tools must include the
--ssl option. See TLS/SSL Configuration for Clients (page 386) for more information on connecting to mongod
and mongos running with SSL.
See also:
Upgrade a Cluster to Use TLS/SSL (page 390)
Set Up mongod and mongos with Certificate Validation To set up mongod or mongos for SSL encryption
using an SSL certificate signed by a certificate authority, include the following run-time options during startup:
net.ssl.mode set to requireSSL. This setting restricts each server to use only SSL encrypted connections.
You can also specify either the value allowSSL or preferSSL to set up the use of mixed SSL modes on a
port. See net.ssl.mode for details.
PEMKeyFile with the name of the .pem file that contains the signed SSL certificate and key.
CAFile with the name of the .pem file that contains the root certificate chain from the Certificate Authority.
Consider the following syntax for mongod:
mongod --sslMode requireSSL --sslPEMKeyFile <pem> --sslCAFile <ca>
For example, given a signed SSL certificate located at /etc/ssl/mongodb.pem and the certificate authority file
at /etc/ssl/ca.pem, you can configure mongod for SSL encryption as follows:
mongod --sslMode requireSSL --sslPEMKeyFile /etc/ssl/mongodb.pem --sslCAFile /etc/ssl/ca.pem
Note:
Specify the <pem> file and the <ca> file with either the full path name or the relative path name.
If the <pem> is encrypted, specify the passphrase. See SSL Certificate Passphrase (page 386).
You may also specify these options in the configuration file, as in the following examples:
If using the YAML configuration file format:
net:
   ssl:
      mode: requireSSL
      PEMKeyFile: /etc/ssl/mongodb.pem
      CAFile: /etc/ssl/ca.pem
To connect to mongod and mongos instances using SSL, the mongo tools must include both the --ssl and
--sslPEMKeyFile options. See TLS/SSL Configuration for Clients (page 386) for more information on connecting
to mongod and mongos running with SSL.
See also:
Upgrade a Cluster to Use TLS/SSL (page 390)
Block Revoked Certificates for Clients To prevent clients with revoked certificates from connecting, include the
sslCRLFile to specify a .pem file that contains revoked certificates.
For example, the following mongod with SSL configuration includes the sslCRLFile setting:
mongod --sslMode requireSSL --sslCRLFile /etc/ssl/ca-crl.pem --sslPEMKeyFile /etc/ssl/mongodb.pem --s
Clients with revoked certificates in the /etc/ssl/ca-crl.pem will not be able to connect to this mongod
instance.
Validate Only if a Client Presents a Certificate In most cases it is important to ensure that clients present valid
certificates. However, if you have clients that cannot present a client certificate, or are transitioning to using a certificate
authority, you may only want to validate certificates from clients that present a certificate.
If you want to bypass validation for clients that don't present certificates, include the
allowConnectionsWithoutCertificates run-time option with mongod and mongos. If the client
does not present a certificate, no validation occurs. These connections, though not validated, are still encrypted using
SSL.
For example, consider the following mongod with an SSL configuration that includes the
allowConnectionsWithoutCertificates setting:
mongod --sslMode requireSSL --sslAllowConnectionsWithoutCertificates --sslPEMKeyFile /etc/ssl/mongodb
Then, clients can connect either with the option --ssl and no certificate or with the option --ssl and a valid
certificate. See TLS/SSL Configuration for Clients (page 386) for more information on SSL connections for clients.
Note: If the client presents a certificate, the certificate must be a valid certificate.
All connections, including those that have not presented certificates, are encrypted using SSL.
For more information, including the protocols recognized by the option, see net.ssl.disabledProtocols or
the --sslDisabledProtocols option for mongod and mongos.
SSL Certificate Passphrase The PEM files for PEMKeyFile and ClusterFile may be encrypted. With encrypted
PEM files, you must specify the passphrase at startup with a command-line or a configuration file option or
enter the passphrase when prompted.
Changed in version 2.6: In previous versions, you could only specify the passphrase with a command-line or a
configuration file option.
To specify the passphrase in clear text on the command line or in a configuration file, use the PEMKeyPassword
and/or the ClusterPassword option.
To have MongoDB prompt for the passphrase at the start of mongod or mongos and avoid specifying the passphrase
in clear text, omit the PEMKeyPassword and/or the ClusterPassword option. MongoDB will prompt for each
passphrase as necessary.
Important: The passphrase prompt option is available if you run the MongoDB instance in the foreground with
a connected terminal. If you run mongod or mongos in a non-interactive session (e.g. without a terminal or as a
service on Windows), you cannot use the passphrase prompt option.
See Configure MongoDB for FIPS (page 391) for more details.
On this page
mongo Shell SSL Configuration (page 387)
MongoDB Cloud Manager and Ops Manager Monitoring Agent (page 388)
MongoDB Drivers (page 389)
MongoDB Tools (page 389)
Clients must have support for TLS/SSL to work with a mongod or a mongos instance that has TLS/SSL support
enabled.
Important: A full description of TLS/SSL, PKI (Public Key Infrastructure) certificates, and Certificate Authority
is beyond the scope of this document. This page assumes prior knowledge of TLS/SSL as well as access to valid
certificates.
Note: Although TLS is the successor to SSL, this page uses the more familiar term SSL to refer to TLS/SSL.
See also:
Configure mongod and mongos for TLS/SSL (page 382).
mongo Shell SSL Configuration
For SSL connections, you must use the mongo shell built with SSL support or distributed with MongoDB Enterprise.
New in version 3.0: Most MongoDB distributions now include support for SSL.
The mongo shell provides various SSL settings, including:
--ssl
--sslPEMKeyFile with the name of the .pem file that contains the SSL certificate and key.
--sslCAFile with the name of the .pem file that contains the certificate from the Certificate Authority (CA).
Changed in version 3.0: When running mongo with the --ssl option, you must include either --sslCAFile
or --sslAllowInvalidCertificates.
This restriction does not apply to the MongoDB tools. However, running the tools without --sslCAFile
creates the same vulnerability to invalid certificates.
Warning: For SSL connections (--ssl) to mongod and mongos, if the mongo shell (or Mon-
goDB tools (page 389)) runs without the --sslCAFile <CAFile> option (i.e. specifies the
--sslAllowInvalidCertificates instead), the mongo shell (or MongoDB tools (page 389)) will
not attempt to validate the server certificates. This creates a vulnerability to expired mongod and mongos
certificates as well as to foreign processes posing as valid mongod or mongos instances. Ensure that you
always specify the CA file to validate the server certificates in cases where intrusion is a possibility.
Connect to MongoDB Instance with SSL Encryption To connect to a mongod or mongos instance that requires
only an SSL encryption mode (page 383), start the mongo shell with --ssl and include the --sslCAFile option to validate
the server certificates.
Changed in version 3.0: When running mongo with the --ssl option, you must include either --sslCAFile or
--sslAllowInvalidCertificates.
This restriction does not apply to the MongoDB tools. However, running the tools without --sslCAFile creates the
same vulnerability to invalid certificates.
Connect to MongoDB Instance that Requires Client Certificates To connect to a mongod or mongos that re-
quires CA-signed client certificates (page 384), start the mongo shell with --ssl, the --sslPEMKeyFile option
to specify the signed certificate-key file, and the --sslCAFile to validate the server certificates.
mongo --ssl --sslPEMKeyFile /etc/ssl/client.pem --sslCAFile /etc/ssl/ca.pem
Changed in version 3.0: When running mongo with the --ssl option, you must include either --sslCAFile or
--sslAllowInvalidCertificates.
This restriction does not apply to the MongoDB tools. However, running the tools without --sslCAFile creates the
same vulnerability to invalid certificates.
Connect to MongoDB Instance that Validates when Presented with a Certificate To connect to a mongod or
mongos instance that only requires valid certificates when the client presents a certificate (page 385), start mongo
shell either:
with the --ssl, --sslCAFile, and no certificate or
with the --ssl, --sslCAFile, and a valid signed certificate.
Changed in version 3.0: When running mongo with the --ssl option, you must include either --sslCAFile or
--sslAllowInvalidCertificates.
This restriction does not apply to the MongoDB tools. However, running the tools without --sslCAFile creates the
same vulnerability to invalid certificates.
For example, if mongod is running with weak certificate validation, both of the following mongo shell clients can
connect to that mongod:
mongo --ssl --sslCAFile /etc/ssl/ca.pem
mongo --ssl --sslPEMKeyFile /etc/ssl/client.pem --sslCAFile /etc/ssl/ca.pem
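The choice between the two invocations can be sketched as a small wrapper. This is an illustrative sketch only: mongo_ssl_command is a hypothetical helper name, and it prints the command line rather than running it, so you can review the options before connecting. It always includes --sslCAFile so that the server certificate is validated.

```shell
# Hypothetical helper: prints (does not run) a mongo shell command line.
# $1 = CA file, $2 = optional client certificate-key .pem file.
mongo_ssl_command() {
    ca_file="$1"
    client_pem="${2:-}"
    if [ -n "$client_pem" ]; then
        # Client presents a certificate.
        echo "mongo --ssl --sslPEMKeyFile $client_pem --sslCAFile $ca_file"
    else
        # Client connects without a certificate.
        echo "mongo --ssl --sslCAFile $ca_file"
    fi
}

mongo_ssl_command /etc/ssl/ca.pem
mongo_ssl_command /etc/ssl/ca.pem /etc/ssl/client.pem
```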
MongoDB Cloud Manager and Ops Manager Monitoring Agent
The MongoDB Cloud Manager Monitoring agent will also have to connect via SSL in order to gather its statistics.
Because the agent already utilizes SSL for its communications to the MongoDB Cloud Manager servers, this is just
a matter of enabling SSL support in MongoDB Cloud Manager itself on a per host basis. See the MongoDB Cloud
Manager documentation56 for more information about SSL configuration.
For Ops Manager, see Ops Manager documentation57 .
56 https://docs.cloud.mongodb.com/
57 https://docs.opsmanager.mongodb.com/current/
MongoDB Drivers
The MongoDB drivers support connections to SSL-enabled MongoDB instances. See:
C Driver58
C++ Driver59
C# Driver60
Java Driver61
Node.js Driver62
Perl Driver63
PHP Driver64
Python Driver65
Ruby Driver66
Scala Driver67
MongoDB Tools
Changed in version 3.0: Most MongoDB distributions now include support for TLS/SSL. See Configure mongod
and mongos for TLS/SSL (page 382) and TLS/SSL Configuration for Clients (page 386) for more information about
TLS/SSL and MongoDB.
Important: A full description of TLS/SSL, PKI (Public Key Infrastructure) certificates, and Certificate Authority
is beyond the scope of this document. This page assumes prior knowledge of TLS/SSL as well as access to valid
certificates.
2. Switch all clients to use TLS/SSL. See TLS/SSL Configuration for Clients (page 386).
3. For each node of a cluster, use the setParameter command to update the sslMode to preferSSL. 69
With preferSSL as its net.ssl.mode, the node accepts both TLS/SSL and non-TLS/non-SSL incoming
connections, and its connections to other servers use TLS/SSL. For example:
db.getSiblingDB('admin').runCommand( { setParameter: 1, sslMode: "preferSSL" } )
5. After the upgrade of all nodes, edit the configuration file with the appropriate TLS/SSL settings to
ensure that upon subsequent restarts, the cluster uses TLS/SSL.
On this page
Overview (page 391)
Prerequisites (page 391)
Considerations (page 392)
Procedure (page 392)
Overview
The Federal Information Processing Standard (FIPS) is a U.S. government computer security standard used to certify
software modules and libraries that encrypt and decrypt data securely. You can configure MongoDB to run with a
FIPS 140-2 certified library for OpenSSL. Configure FIPS to run by default or as needed from the command line.
Prerequisites
Important: A full description of FIPS and TLS/SSL is beyond the scope of this document. This tutorial assumes
prior knowledge of FIPS and TLS/SSL.
Only the MongoDB Enterprise70 version supports FIPS mode. See Install MongoDB Enterprise (page 33) to download
and install MongoDB Enterprise71 to use FIPS mode.
Your system must have an OpenSSL library configured with the FIPS 140-2 module. At the command line, type
openssl version to confirm your OpenSSL software includes FIPS support.
For Red Hat Enterprise Linux 6.x (RHEL 6.x) or its derivatives such as CentOS 6.x, the OpenSSL toolkit must be
at least openssl-1.0.1e-16.el6_5 to use FIPS mode. To upgrade the toolkit for these platforms, issue the
following command:
sudo yum update openssl
Some versions of Linux periodically execute a process to prelink dynamic libraries with pre-assigned addresses. This
process modifies the OpenSSL libraries, specifically libcrypto. The OpenSSL FIPS mode will subsequently fail
the signature check performed upon startup to ensure libcrypto has not been modified since compilation.
To configure the Linux prelink process to not prelink libcrypto:
sudo bash -c "echo '-b /usr/lib64/libcrypto.so.*' >>/etc/prelink.conf.d/openssl-prelink.conf"
70 http://www.mongodb.com/products/mongodb-enterprise?jmp=docs
71 http://www.mongodb.com/products/mongodb-enterprise?jmp=docs
Considerations
FIPS is a property of the encryption system and not the access control system. However, if your environment requires
FIPS-compliant encryption and access control, you must ensure that the access control system uses only
FIPS-compliant encryption.
MongoDB's FIPS support covers the way that MongoDB uses OpenSSL for network encryption and X509 authentication.
If you use Kerberos or LDAP Proxy authentication, you must ensure that these external mechanisms are
FIPS-compliant. MONGODB-CR authentication is not FIPS compliant.
Procedure
Configure MongoDB to use TLS/SSL See Configure mongod and mongos for TLS/SSL (page 382) for details about
configuring OpenSSL.
Run mongod or mongos instance in FIPS mode Perform these steps after you Configure mongod and mongos
for TLS/SSL (page 382).
Step 1: Change configuration file. To configure your mongod or mongos instance to use FIPS mode, shut down
the instance and update the configuration file with the following setting:
net:
   ssl:
      FIPSMode: true
Step 2: Start mongod or mongos instance with configuration file. For example, run this command to start the
mongod instance with its configuration file:
mongod --config /etc/mongod.conf
Confirm FIPS mode is running Check the server log file for a message indicating that FIPS is active:
FIPS 140-2 mode activated
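The log check can be scripted. The following is a minimal sketch: fips_active is a hypothetical helper name, and the log path in the usage comment is an assumption, so substitute the path configured for your deployment.

```shell
# Hypothetical helper: succeeds if the given log file contains the
# activation message shown above.
fips_active() {
    grep -q 'FIPS 140-2 mode activated' "$1"
}

# Example usage (the log path is an assumption):
#   fips_active /var/log/mongodb/mongod.log && echo "FIPS mode is active"
```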
On this page
Overview (page 393)
Patterns (page 393)
Change Default Policy to DROP (page 395)
Manage and Maintain iptables Configuration (page 395)
On contemporary Linux systems, the iptables program provides methods for managing the Linux kernel's
netfilter or network packet filtering capabilities. These firewall rules make it possible for administrators to
control what hosts can connect to the system, and limit risk exposure by limiting the hosts that can connect to a
system.
This document outlines basic firewall configurations for iptables firewalls on Linux. Use these approaches as a
starting point for your larger networking organization. For a detailed overview of security practices and risk management
for MongoDB, see Security (page 315).
See also:
For MongoDB deployments on Amazon's web services, see the Amazon EC2 72 page, which addresses Amazon's
Security Groups and other EC2-specific security features.
Overview
Rules in iptables configurations fall into chains, which describe the process for filtering and processing specific
streams of traffic. Chains have an order, and packets must pass through earlier rules in a chain to reach later rules.
This document addresses only the following two chains:
INPUT Controls all incoming traffic.
OUTPUT Controls all outgoing traffic.
Given the default ports of all MongoDB processes, you must configure networking rules that permit only required
communication between your application and the appropriate mongod and mongos instances.
Be aware that the default policy of iptables is to allow all connections and traffic unless explicitly
disabled. The configuration changes outlined in this document will create rules that explicitly allow traffic from
specific addresses and on specific ports, using a default policy that drops all traffic that is not explicitly allowed. When
you have properly configured your iptables rules to allow only the traffic that you want to permit, you can Change
Default Policy to DROP (page 395).
Patterns
This section contains a number of patterns and examples for configuring iptables for use with MongoDB deployments.
If you have configured different ports using the port configuration setting, you will need to modify the rules
accordingly.
Traffic to and from mongod Instances This pattern is applicable to all mongod instances running as standalone
instances or as part of a replica set.
The goal of this pattern is to explicitly allow traffic to the mongod instance from the application server. In the
following examples, replace <ip-address> with the IP address of the application server:
iptables -A INPUT -s <ip-address> -p tcp --destination-port 27017 -m state --state NEW,ESTABLISHED -j
iptables -A OUTPUT -d <ip-address> -p tcp --source-port 27017 -m state --state ESTABLISHED -j ACCEPT
The first rule allows all incoming traffic from <ip-address> on port 27017, which allows the application server to
connect to the mongod instance. The second rule allows outgoing traffic from the mongod to reach the application
server.
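When the same pair of rules has to be generated for several application servers or ports, a small dry-run helper can reduce copy-and-paste mistakes. The sketch below is illustrative only: mongod_rules is a hypothetical helper name, it prints the commands instead of executing them, and the ACCEPT target on the INPUT rule is an assumption based on the description above.

```shell
# Hypothetical helper: prints (does not run) the INPUT/OUTPUT rule pair.
# $1 = application server address, $2 = optional port (default 27017).
mongod_rules() {
    app_ip="$1"
    port="${2:-27017}"
    # Inbound: new and established connections from the application server.
    # The ACCEPT target here is an assumption.
    echo "iptables -A INPUT -s $app_ip -p tcp --destination-port $port -m state --state NEW,ESTABLISHED -j ACCEPT"
    # Outbound: replies on established connections back to that server.
    echo "iptables -A OUTPUT -d $app_ip -p tcp --source-port $port -m state --state ESTABLISHED -j ACCEPT"
}

mongod_rules 198.51.100.55
```

Review the printed rules, then run them as root once they match your deployment.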
Optional
If you have only one application server, you can replace <ip-address> with the IP address itself, such as
198.51.100.55. You can also express this using CIDR notation as 198.51.100.55/32. If you want to permit
a larger block of possible IP addresses, you can allow traffic from a /24 using one of the following specifications for
the <ip-address>:
72 https://docs.mongodb.org/ecosystem/platforms/amazon-ec2
10.10.10.10/24
10.10.10.10/255.255.255.0
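As a quick check on how many addresses a prefix admits: a /N prefix leaves 32 - N host bits, so it matches 2^(32 - N) addresses. A /24 (netmask 255.255.255.0) therefore matches 256 addresses, here 10.10.10.0 through 10.10.10.255. The arithmetic can be sketched as follows; cidr_size is a hypothetical helper name.

```shell
# Number of IPv4 addresses matched by a CIDR prefix length.
cidr_size() {
    echo $(( 1 << (32 - $1) ))
}

cidr_size 32   # prints 1
cidr_size 24   # prints 256
```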
Traffic to and from mongos Instances mongos instances provide query routing for sharded clusters. Clients
connect to mongos instances, which behave from the client's perspective as mongod instances. In turn, the mongos
connects to all mongod instances that are components of the sharded cluster.
Use the same iptables command to allow traffic to and from these instances as you would from the mongod
instances that are members of the replica set. Take the configuration outlined in the Traffic to and from mongod
Instances (page 393) section as an example.
Traffic to and from a MongoDB Config Server Config servers host the config database that stores metadata
for sharded clusters. Each production cluster has three config servers, initiated using the mongod --configsvr
option. 73 Config servers listen for connections on port 27019. As a result, add the following iptables rules to the
config server to allow incoming and outgoing connections on port 27019, for connection to the other config servers.
iptables -A INPUT -s <ip-address> -p tcp --destination-port 27019 -m state --state NEW,ESTABLISHED -j
iptables -A OUTPUT -d <ip-address> -p tcp --source-port 27019 -m state --state ESTABLISHED -j ACCEPT
Replace <ip-address> with the address or address space of all the mongod instances that provide config servers.
Additionally, config servers need to allow incoming connections from all of the mongos instances in the cluster and
all mongod instances in the cluster. Add rules that resemble the following:
iptables -A INPUT -s <ip-address> -p tcp --destination-port 27019 -m state --state NEW,ESTABLISHED -j
Replace <ip-address> with the address of the mongos instances and the shard mongod instances.
Traffic to and from a MongoDB Shard Server Shard servers run as mongod --shardsvr. 74 Because
the default port number is 27018 when running with the shardsvr value for the clusterRole setting, you must
configure the following iptables rules to allow traffic to and from each shard:
iptables -A INPUT -s <ip-address> -p tcp --destination-port 27018 -m state --state NEW,ESTABLISHED -j
iptables -A OUTPUT -d <ip-address> -p tcp --source-port 27018 -m state --state ESTABLISHED -j ACCEPT
Replace the <ip-address> specification with the IP address of all mongod instances. This allows you to permit incoming
and outgoing traffic between all shards, including constituent replica set members, to:
all mongod instances in the shard's replica sets.
all mongod instances in other shards. 75
Furthermore, shards need to be able to make outgoing connections to:
all mongod instances in the config servers.
Create a rule that resembles the following, and replace the <ip-address> with the address of the config servers
and the mongos instances:
iptables -A OUTPUT -d <ip-address> -p tcp --source-port 27018 -m state --state ESTABLISHED -j ACCEPT
73 You also can run a config server by using the configsvr value for the clusterRole setting in a configuration file.
74 You can also specify the shard server option with the shardsvr value for the clusterRole setting in the configuration file. Shard members
are also often conventional replica sets using the default port.
75 All shards in a cluster need to be able to communicate with all other shards to facilitate chunk and balancing operations.
Replace <ip-address> with the address of the instance that needs access to the HTTP or REST interface.
For all deployments, you should restrict access to this port to only the monitoring instance.
Optional
For shard server mongod instances running with the shardsvr value for the clusterRole setting, the
rule would resemble the following:
iptables -A INPUT -s <ip-address> -p tcp --destination-port 28018 -m state --state NEW,ESTABLISH
For config server mongod instances running with the configsvr value for the clusterRole setting, the
rule would resemble the following:
iptables -A INPUT -s <ip-address> -p tcp --destination-port 28019 -m state --state NEW,ESTABLISH
Change Default Policy to DROP
The default policy for iptables chains is to allow all traffic. After completing all iptables configuration changes,
you must change the default policy to DROP so that all traffic that isn't explicitly allowed as above will not be able to
reach components of the MongoDB deployment. Issue the following commands to change this policy:
iptables -P INPUT DROP
Manage and Maintain iptables Configuration
This section contains a number of basic operations for managing and using iptables. There are various front end
tools that automate some aspects of iptables configuration, but at the core all iptables front ends provide the
same basic functionality:
Make all iptables Rules Persistent By default all iptables rules are only stored in memory. When your
system restarts, your firewall rules will revert to their defaults. When you have tested a rule set and have guaranteed
that it effectively controls traffic, you can use the following operations to make the rule set persistent.
On Red Hat Enterprise Linux, Fedora Linux, and related distributions you can issue the following command:
service iptables save
On Debian, Ubuntu, and related distributions, you can use the following command to dump the iptables rules to
the /etc/iptables.conf file:
iptables-save > /etc/iptables.conf
To restore the saved rules at boot, place a restore command such as iptables-restore < /etc/iptables.conf in your
rc.local file, or in the /etc/network/if-up.d/iptables file with other similar operations.
List all iptables Rules To list all currently applied iptables rules, use the following operation at the system
shell.
iptables -L
Flush all iptables Rules If you make a configuration mistake when entering iptables rules or simply need to
revert to the default rule set, you can use the following operation at the system shell to flush all rules:
iptables -F
If you've already made your iptables rules persistent, you will need to repeat the appropriate procedure in the
Make all iptables Rules Persistent (page 395) section.
On this page
Overview (page 396)
Patterns (page 397)
Manage and Maintain Windows Firewall Configurations (page 399)
On Windows Server systems, the netsh program provides methods for managing the Windows Firewall. These
firewall rules make it possible for administrators to control what hosts can connect to the system, and limit risk
exposure by limiting the hosts that can connect to a system.
This document outlines basic Windows Firewall configurations. Use these approaches as a starting point for your
larger networking organization. For a detailed overview of security practices and risk management for MongoDB, see
Security (page 315).
See also:
Windows Firewall76 documentation from Microsoft.
Overview
Windows Firewall processes rules in an order determined by rule type, parsed in the following order:
1. Windows Service Hardening
2. Connection security rules
3. Authenticated Bypass Rules
4. Block Rules
5. Allow Rules
6. Default Rules
76 http://technet.microsoft.com/en-us/network/bb545423.aspx
By default, the policy in Windows Firewall allows all outbound connections and blocks all incoming connections.
Given the default ports of all MongoDB processes, you must configure networking rules that permit only required
communication between your application and the appropriate mongod.exe and mongos.exe instances.
The configuration changes outlined in this document will create rules which explicitly allow traffic from specific
addresses and on specific ports, using a default policy that drops all traffic that is not explicitly allowed.
You can configure the Windows Firewall using the netsh command line tool or through a Windows application.
On Windows Server 2008 this application is Windows Firewall With Advanced Security in Administrative Tools. On
previous versions of Windows Server, access the Windows Firewall application in the System and Security control
panel.
The procedures in this document use the netsh command line tool.
Patterns
This section contains a number of patterns and examples for configuring Windows Firewall for use with MongoDB
deployments. If you have configured different ports using the port configuration setting, you will need to modify the
rules accordingly.
Traffic to and from mongod.exe Instances This pattern is applicable to all mongod.exe instances running as
standalone instances or as part of a replica set. The goal of this pattern is to explicitly allow traffic to the mongod.exe
instance from the application server.
netsh advfirewall firewall add rule name="Open mongod port 27017" dir=in action=allow protocol=TCP lo
This rule allows all incoming traffic to port 27017, which allows the application server to connect to the
mongod.exe instance.
Windows Firewall also allows enabling network access for an entire application rather than to a specific port, as in the
following example:
netsh advfirewall firewall add rule name="Allowing mongod" dir=in action=allow program=" C:\mongodb\b
You can allow all access for a mongos.exe server, with the following invocation:
netsh advfirewall firewall add rule name="Allowing mongos" dir=in action=allow program=" C:\mongodb\b
Traffic to and from mongos.exe Instances mongos.exe instances provide query routing for sharded clusters.
Clients connect to mongos.exe instances, which behave from the client's perspective as mongod.exe instances.
In turn, the mongos.exe connects to all mongod.exe instances that are components of the sharded cluster.
Use the same Windows Firewall command to allow traffic to and from these instances as you would from the
mongod.exe instances that are members of the replica set.
netsh advfirewall firewall add rule name="Open mongod shard port 27018" dir=in action=allow protocol=
Traffic to and from a MongoDB Config Server Configuration servers host the config database that stores metadata
for sharded clusters. Each production cluster has three configuration servers, initiated using the mongod
--configsvr option. 77 Configuration servers listen for connections on port 27019. As a result, add the following
Windows Firewall rules to the config server to allow incoming and outgoing connections on port 27019, for
connection to the other config servers.
77 You also can run a config server by using the configsvr value for the clusterRole setting in a configuration file.
netsh advfirewall firewall add rule name="Open mongod config svr port 27019" dir=in action=allow prot
Additionally, config servers need to allow incoming connections from all of the mongos.exe instances in the cluster
and all mongod.exe instances in the cluster. Add rules that resemble the following:
netsh advfirewall firewall add rule name="Open mongod config svr inbound" dir=in action=allow protoco
Replace <ip-address> with the addresses of the mongos.exe instances and the shard mongod.exe instances.
Traffic to and from a MongoDB Shard Server Shard servers run as mongod --shardsvr. 78 Because
the default port number is 27018 when running with the shardsvr value for the clusterRole setting, you must
configure the following Windows Firewall rules to allow traffic to and from each shard:
netsh advfirewall firewall add rule name="Open mongod shardsvr inbound" dir=in action=allow protocol=
netsh advfirewall firewall add rule name="Open mongod shardsvr outbound" dir=out action=allow protoco
Replace the <ip-address> specification with the IP address of all mongod.exe instances. This allows you to
permit incoming and outgoing traffic between all shards, including constituent replica set members, to:
all mongod.exe instances in the shard's replica sets.
all mongod.exe instances in other shards. 79
Furthermore, shards need to be able to make outgoing connections to:
all mongos.exe instances.
all mongod.exe instances in the config servers.
Create a rule that resembles the following, and replace the <ip-address> with the address of the config servers
and the mongos.exe instances:
netsh advfirewall firewall add rule name="Open mongod config svr outbound" dir=out action=allow proto
Replace <ip-address> with the address of the instance that needs access to the HTTP or REST interface.
For all deployments, you should restrict access to this port to only the monitoring instance.
Optional
For shard server mongod instances running with the shardsvr value for the clusterRole setting, the
rule would resemble the following:
netsh advfirewall firewall add rule name="Open mongos HTTP monitoring inbound" dir=in action=all
For config server mongod instances running with the configsvr value for the clusterRole setting, the
rule would resemble the following:
78 You can also specify the shard server option with the shardsvr value for the clusterRole setting in the configuration file. Shard members
are also often conventional replica sets using the default port.
79 All shards in a cluster need to be able to communicate with all other shards to facilitate chunk and balancing operations.
netsh advfirewall firewall add rule name="Open mongod configsvr HTTP monitoring inbound" dir=in
Manage and Maintain Windows Firewall Configurations
This section contains a number of basic operations for managing and using netsh. While you can use the GUI front
ends to manage the Windows Firewall, all core functionality is accessible from netsh.
Delete all Windows Firewall Rules To delete the firewall rule allowing mongod.exe traffic:
netsh advfirewall firewall delete rule name="Open mongod port 27017" protocol=tcp localport=27017
netsh advfirewall firewall delete rule name="Open mongod shard port 27018" protocol=tcp localport=270
List All Windows Firewall Rules To return a list of all Windows Firewall rules:
netsh advfirewall firewall show rule name=all
Backup and Restore Windows Firewall Rules To simplify administration of a larger collection of systems, you can
easily export or import firewall rules from different servers on Windows:
Export all firewall rules with the following command:
netsh advfirewall export "C:\temp\MongoDBfw.wfw"
Replace "C:\temp\MongoDBfw.wfw" with a path of your choosing. You can use a command in the following
form to import a file created using this operation:
netsh advfirewall import "C:\temp\MongoDBfw.wfw"
6.7.5 Encryption
Configure Encryption
On this page
Overview (page 400)
Key Manager (page 400)
Local Key Management (page 401)
Overview
Enterprise Feature
Available in MongoDB Enterprise only.
MongoDB Enterprise 3.2 introduces a native encryption option for the WiredTiger storage engine. With storage
encryption, the secure management of the encryption keys is critical.
Only the master key is external to the server and requires external management. To manage the master key, MongoDB's
encrypted storage engine supports two key management options:
Integration with a third party key management appliance via the Key Management Interoperability Protocol
(KMIP). Recommended
Use of local key management via a keyfile.
The following tutorial outlines the procedures to configure MongoDB for encryption and key management.
Key Manager
MongoDB Enterprise supports secure transfer of keys with compatible key management appliances, that is, appliances
compliant with the Key Management Interoperability Protocol (KMIP). Using a key manager allows the keys to be
stored in the key manager. While any appliance vendor that provides support for KMIP is expected to be
compatible, MongoDB has certified against SafeNet KeySecure and Vormetric Data Security Manager (DSM).
Recommended
Using a key manager meets regulatory key management guidelines, such as HIPAA, PCI-DSS, and FERPA, and is
recommended over the local key management.
Prerequisites
Your key manager must support the KMIP communication protocol, as Vormetric DSM and SafeNet KeySecure do.
For Vormetric or SafeNet, you must have valid certificates issued by the specific appliance vendor in order to
authenticate MongoDB to the KMIP server.
Encrypt Using a New Key To create a new key, connect mongod to the key manager by starting mongod with the
following options:
--enableEncryption,
--kmipServerName <KMIP Server Hostname>,
--kmipServerCAFile <path to KMIP Server's CA File>, and
--kmipClientCertificateFile <path to valid client certificate>.
Include any other options specific to your configuration.
This operation creates a new master key in your key manager for use by the mongod to wrap the keys mongod
generates for each database.
To verify that the key creation and usage was successful, check the log file. If successful, the process will log the
following messages:
[initandlisten] Created KMIP key with id: <UID>
[initandlisten] Encryption key manager initialized using master key with id: <UID>
See also:
encryption-key-management-options
Encrypt Using an Existing Key You can use an existing master key created and managed by your KMIP. To use an
existing key, connect mongod to the key manager by starting mongod with the following options:
--enableEncryption,
--kmipServerName <KMIP Server Hostname>,
--kmipServerCAFile <path to KMIP Server's CA File>,
--kmipClientCertificateFile <path to valid client certificate>, and
--kmipKeyIdentifier <UID>.
Include any other options specific to your configuration.
mongod --enableEncryption --kmipServerName <KMIP Server HostName> \
--kmipServerCAFile ca.pem --kmipClientCertificateFile client.pem \
--kmipKeyIdentifier <UID>
Important: If data is already encrypted with a key, you must specify that key's <UID> for the
--kmipKeyIdentifier option. Otherwise, MongoDB will not start and will log an error.
See also:
encryption-key-management-options
Local Key Management
Important: Using the keyfile method does not meet most regulatory key management guidelines and requires users
to securely manage their own keys.
The safe management of the keyfile is critical.
To encrypt using a keyfile, you must have a base64 encoded keyfile that contains a 16 or 32 character string. The
keyfile must only be accessible by the owner of the mongod process.
1. Create the base64 encoded keyfile with the 16 or 32 character string. You can generate the encoded keyfile using
any method you prefer. For example,
openssl rand -base64 32 > mongodb-keyfile
2. Update the file permissions so that only the owner of the mongod process can access the keyfile. For example,
chmod 600 mongodb-keyfile
3. To use the key file, start mongod with the following options:
--enableEncryption, and
--encryptionKeyFile <path to keyfile>.
mongod --enableEncryption --encryptionKeyFile mongodb-keyfile
4. Verify if the encryption key manager successfully initialized with the keyfile. If the operation was successful,
the process will log the following message:
[initandlisten] Encryption key manager initialized using master key with id:
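The keyfile steps above can also be scripted. The following Node.js sketch is an illustration only, not part of MongoDB: the helper name createKeyfile and the temporary output path are hypothetical. It writes 32 random bytes, base64 encoded, to a file and restricts the file to its owner (the mode setting has no real effect on Windows).

```javascript
const crypto = require("crypto");
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: writes a base64-encoded random key to keyfilePath
// and restricts the file to its owner (read and write only).
function createKeyfile(keyfilePath, numBytes = 32) {
  const encoded = crypto.randomBytes(numBytes).toString("base64");
  fs.writeFileSync(keyfilePath, encoded + "\n", { mode: 0o600 });
  // writeFileSync applies the mode only when it creates the file, so set
  // it explicitly in case the file already existed.
  fs.chmodSync(keyfilePath, 0o600);
  return keyfilePath;
}

// Example path for demonstration; choose a protected location in practice.
const keyfile = createKeyfile(path.join(os.tmpdir(), "mongodb-keyfile-example"));
console.log("wrote", keyfile);
```

The resulting path is what you would pass to --encryptionKeyFile.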
See also:
encryption-key-management-options
On this page
Rotate a Member of Replica Set (page 402)
KMIP Master Key Rotation (page 403)
Most regulatory requirements mandate that a managed key used to decrypt sensitive data must be rotated out and
replaced with a new key once a year.
MongoDB provides two options for key rotation. You can rotate out the binary with a new instance that uses a new
key. Or, if you are using a KMIP server for key management, you can rotate the master key.
Rotate a Member of Replica Set
During the initial sync process, the data is re-encrypted with an entirely new set of database keys as well as
a new system key.
4. Once the new node completes its initial sync process, remove the old node from the replica set and delete all its
data. For instructions, see Remove Members from Replica Set (page 673).
KMIP Master Key Rotation
If you are using a KMIP server for key management, you can rotate the master key, the only externally managed
key. With the new master key, the internal keystore is re-encrypted, but the database keys are otherwise left
unchanged. This obviates the need to re-encrypt the entire data set.
1. Rotate the master key for the secondary members of the replica set one at a time.
(a) Restart the secondary, including the --kmipRotateMasterKey parameter. Include any other options
specific to your configuration. If the member already includes the --kmipKeyIdentifier option,
either update the --kmipKeyIdentifier option with the new key to use or omit to request a new key
from the KMIP server:
mongod --enableEncryption --kmipRotateMasterKey \
--kmipServerName <KMIP Server HostName> \
--kmipServerCAFile ca.pem --kmipClientCertificateFile client.pem
2. Step down the replica set primary. Connect to the primary and use rs.stepDown() to step down the primary and
force an election of a new primary.
3. When rs.status() shows that the primary has stepped down and another member has assumed PRIMARY
state, rotate the master key for the stepped down member:
(a) Restart the stepped-down member, including the --kmipRotateMasterKey parameter. Include
any other options specific to your configuration. If the member already includes the
--kmipKeyIdentifier option, either update the --kmipKeyIdentifier option with the new
key to use or omit it to request a new key from the KMIP server:
mongod --enableEncryption --kmipRotateMasterKey \
--kmipServerName <KMIP Server HostName> \
--kmipServerCAFile ca.pem --kmipClientCertificateFile client.pem
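Before restarting the former primary, it can help to confirm what rs.status() reports. The following is a minimal sketch, assuming the members array with the name and stateStr fields that rs.status() returns; the helper name currentPrimary and the sample host names are hypothetical.

```javascript
// Hypothetical helper: returns the name of the member currently in PRIMARY
// state, or null if no member reports PRIMARY (for example, mid-election).
function currentPrimary(status) {
  var members = status.members || [];
  for (var i = 0; i < members.length; i++) {
    if (members[i].stateStr === "PRIMARY") {
      return members[i].name;
    }
  }
  return null;
}

// Sample document shaped like rs.status() output (hosts are made up).
var sampleStatus = {
  set: "rs0",
  members: [
    { name: "m1.example.net:27017", stateStr: "SECONDARY" },
    { name: "m2.example.net:27017", stateStr: "PRIMARY" },
    { name: "m3.example.net:27017", stateStr: "SECONDARY" }
  ]
};

console.log(currentPrimary(sampleStatus));
```

In the mongo shell you would call currentPrimary(rs.status()) and compare the result with the host you stepped down.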
6.7.6 Auditing
The following tutorials provide instructions on how to enable auditing for system events and specify which events to
audit.
Configure Auditing (page 404) Enable and configure the MongoDB Enterprise system event auditing feature.
Configure Audit Filters (page 406) Specify which events to audit.
Configure Auditing
On this page
Enable and Configure Audit Output (page 404)
Use the --auditDestination option to enable auditing and specify where to output the audit events.
Warning: For sharded clusters, if you enable auditing on mongos instances, you must enable auditing on all
mongod instances in the cluster, i.e. shards and config servers.
Output to Syslog To enable auditing and print audit events to the syslog (option is unavailable on Windows) in
JSON format, specify syslog for the --auditDestination setting. For example:
mongod --dbpath data/db --auditDestination syslog
Warning: The syslog message limit can result in the truncation of the audit messages. The auditing system will
neither detect the truncation nor error upon its occurrence.
Output to Console To enable auditing and print the audit events to standard output (i.e. stdout), specify
console for the --auditDestination setting. For example:
mongod --dbpath data/db --auditDestination console
80 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
Output to JSON File To enable auditing and print audit events to a file in JSON format, specify file
for the --auditDestination setting, JSON for the --auditFormat setting, and the output filename for
the --auditPath. The --auditPath option accepts either a full or a relative path name. For
example, the following enables auditing and records audit events to a file with the relative path name of
data/db/auditLog.json:
mongod --dbpath data/db --auditDestination file --auditFormat JSON --auditPath data/db/auditLog.json
The audit file rotates at the same time as the server log file.
You may also specify these options in the configuration file:
storage:
   dbPath: data/db
auditLog:
   destination: file
   format: JSON
   path: data/db/auditLog.json
Note: Printing audit events to a file in JSON format degrades server performance more than printing to a file in BSON
format.
Output to BSON File To enable auditing and print audit events to a file in BSON binary format, specify file
for the --auditDestination setting, BSON for the --auditFormat setting, and the output filename for
the --auditPath. The --auditPath option accepts either a full or a relative path name. For
example, the following enables auditing and records audit events to a BSON file with the relative path name of
data/db/auditLog.bson:
mongod --dbpath data/db --auditDestination file --auditFormat BSON --auditPath data/db/auditLog.bson
The audit file rotates at the same time as the server log file.
You may also specify these options in the configuration file:
storage:
   dbPath: data/db
auditLog:
   destination: file
   format: BSON
   path: data/db/auditLog.bson
To view the contents of the file, pass the file to the MongoDB utility bsondump. For example, the following converts
the audit log into a human-readable form and outputs it to the terminal:
bsondump data/db/auditLog.bson
See also:
Configure Audit Filters (page 406), Auditing (page 340), System Event Audit Messages (page 434)
On this page
--auditFilter Option (page 406)
Examples (page 406)
MongoDB Enterprise81 supports auditing (page 340) of various operations. When enabled (page 404), the audit
facility, by default, records all auditable operations as detailed in Audit Event Actions, Details, and Results (page 435).
To specify which events to record, the audit feature includes the --auditFilter option.
--auditFilter Option
The --auditFilter option takes a string representation of a query document of the form:
{ <field1>: <expression1>, ... }
The <field> can be any field in the audit message (page 434), including fields returned in the param
(page 435) document.
The <expression> is a query condition expression.
To specify an audit filter, enclose the filter document in single quotes to pass the document as a string.
To specify the audit filter in a configuration file, you must use the YAML format of the configuration file.
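To get a feel for which audit messages a filter selects, the following JavaScript sketch may help. It is not how mongod evaluates filters: it is a simplification that supports only equality, $in, regular expressions, and dotted field paths. The function names and the sample messages (reduced to the atype and param fields) are made up for illustration.

```javascript
// Resolve a dotted path such as "param.db" against a message.
function getPath(doc, path) {
  return path.split(".").reduce(function (value, key) {
    return value == null ? undefined : value[key];
  }, doc);
}

// Simplified matcher: equality, $in, and RegExp conditions only.
function matchesFilter(message, filter) {
  return Object.keys(filter).every(function (field) {
    var condition = filter[field];
    var value = getPath(message, field);
    if (condition instanceof RegExp) {
      return typeof value === "string" && condition.test(value);
    }
    if (condition !== null && typeof condition === "object" &&
        Array.isArray(condition.$in)) {
      return condition.$in.indexOf(value) !== -1;
    }
    return value === condition;
  });
}

// Made-up audit messages, trimmed to the fields the filters inspect.
var messages = [
  { atype: "authenticate", param: { user: "alice", db: "test" } },
  { atype: "authenticate", param: { user: "bob", db: "sales" } },
  { atype: "createCollection", param: { ns: "test.orders" } }
];

var filter = { atype: "authenticate", "param.db": "test" };
var selected = messages.filter(function (m) { return matchesFilter(m, filter); });
console.log(selected.length); // 1
```

Only the first sample message satisfies both conditions, which mirrors how every field in the filter document must match for an event to be recorded.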
Examples
Filter for Multiple Operation Types The following example audits only the createCollection (page 430)
and dropCollection (page 430) actions by using the filter:
{ atype: { $in: [ "createCollection", "dropCollection" ] } }
To specify an audit filter, enclose the filter document in single quotes to pass the document as a string.
mongod --dbpath data/db --auditDestination file --auditFilter '{ atype: { $in: [ "createCollection",
To specify the audit filter in a configuration file, you must use the YAML format of the configuration file.
storage:
   dbPath: data/db
auditLog:
   destination: file
   format: BSON
   path: data/db/auditLog.bson
   filter: '{ atype: { $in: [ "createCollection", "dropCollection" ] } }'
Filter on Authentication Operations on a Single Database The <field> can include any field in the audit
message (page 434). For authentication operations (i.e. atype: "authenticate"), the audit messages include
a db field in the param document.
The following example audits only the authenticate operations that occur against the test database by using
the filter:
{ atype: "authenticate", "param.db": "test" }
81 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
To specify an audit filter, enclose the filter document in single quotes to pass the document as a string.
mongod --dbpath data/db --auth --auditDestination file --auditFilter '{ atype: "authenticate", "param
To specify the audit filter in a configuration file, you must use the YAML format of the configuration file.
storage:
   dbPath: data/db
security:
   authorization: enabled
auditLog:
   destination: file
   format: BSON
   path: data/db/auditLog.bson
   filter: '{ atype: "authenticate", "param.db": "test" }'
To filter on all authenticate operations across databases, use the filter { atype: "authenticate" }.
Filter on Collection Creation and Drop Operations for a Single Database The <field> can include any field in
the audit message (page 434). For collection creation and drop operations (i.e. atype: "createCollection"
and atype: "dropCollection"), the audit messages include a namespace ns field in the param document.
The following example audits only the createCollection and dropCollection operations that occur against
the test database by using the filter:
{ atype: { $in: [ "createCollection", "dropCollection" ] }, "param.ns": /^test\\./ }
Note: The regular expression requires two backslashes (\\) to escape the dot (.).
To specify an audit filter, enclose the filter document in single quotes to pass the document as a string.
mongod --dbpath data/db --auth --auditDestination file --auditFilter '{ atype: { $in: [ "createCollec
To specify the audit filter in a configuration file, you must use the YAML format of the configuration file.
storage:
   dbPath: data/db
security:
   authorization: enabled
auditLog:
   destination: file
   format: BSON
   path: data/db/auditLog.bson
   filter: '{ atype: { $in: [ "createCollection", "dropCollection" ] }, "param.ns": /^test\\./ }'
Filter by Authorization Role The following example audits operations by users with readWrite (page 415) role
on the test database, including users with roles that inherit from readWrite (page 415), by using the filter:
{ roles: { role: "readWrite", db: "test" } }
To specify an audit filter, enclose the filter document in single quotes to pass the document as a string.
mongod --dbpath data/db --auth --auditDestination file --auditFilter '{ roles: { role: "readWrite", d
To specify the audit filter in a configuration file, you must use the YAML format of the configuration file.
storage:
   dbPath: data/db
security:
   authorization: enabled
auditLog:
   destination: file
   format: BSON
   path: data/db/auditLog.bson
   filter: '{ roles: { role: "readWrite", db: "test" } }'
Filter on Read and Write Operations To capture read and write operations in the audit, you must also enable
the audit system to log authorization successes using the auditAuthorizationSuccess parameter. 82
Note: Enabling auditAuthorizationSuccess degrades performance more than logging only the authorization
failures.
The following example audits the find(), insert(), remove(), update(), save(), and
findAndModify() operations by using the filter:
{ atype: "authCheck", "param.command": { $in: [ "find", "insert", "delete", "update", "findandmodify"
To specify an audit filter, enclose the filter document in single quotes to pass the document as a string.
mongod --dbpath data/db --auth --setParameter auditAuthorizationSuccess=true --auditDestination file
To specify the audit filter in a configuration file, you must use the YAML format of the configuration file.
storage:
   dbPath: data/db
security:
   authorization: enabled
auditLog:
   destination: file
   format: BSON
   path: data/db/auditLog.bson
   filter: '{ atype: "authCheck", "param.command": { $in: [ "find", "insert", "delete", "update", "fi
setParameter: { auditAuthorizationSuccess: true }
Filter on Read and Write Operations for a Collection To capture read and write operations in the audit,
you must also enable the audit system to log authorization successes using the auditAuthorizationSuccess
parameter. 1
Note: Enabling auditAuthorizationSuccess degrades performance more than logging only the authorization
failures.
The following example audits the find(), insert(), remove(), update(), save(), and
findAndModify() operations for the collection orders in the database test by using the filter:
{ atype: "authCheck", "param.ns": "test.orders", "param.command": { $in: [ "find", "insert", "delete"
To specify an audit filter, enclose the filter document in single quotes to pass the document as a string.
82 You can enable auditAuthorizationSuccess parameter without enabling --auth; however, all operations will return success for
authorization checks.
To specify the audit filter in a configuration file, you must use the YAML format of the configuration file.
storage:
   dbPath: data/db
security:
   authorization: enabled
auditLog:
   destination: file
   format: BSON
   path: data/db/auditLog.bson
   filter: '{ atype: "authCheck", "param.ns": "test.orders", "param.command": { $in: [ "find", "inser
setParameter: { auditAuthorizationSuccess: true }
See also:
Configure Auditing (page 404), Auditing (page 340), System Event Audit Messages (page 434)
6.7.7 Miscellaneous
On this page
Procedure (page 409)
The $redact pipeline operator restricts the contents of the documents based on information stored in the documents
themselves.
To store the access criteria data, add a field to the documents and embedded documents. To allow for multiple
combinations of access levels for the same data, consider setting the access field to an array of arrays. Each array element
contains a required set that allows a user with that set to access the data.
Then, include the $redact stage in the db.collection.aggregate() operation to restrict contents of the
result set based on the access required to view the data.
For more information on the $redact pipeline operator, including its syntax and associated system variables as well
as additional examples, see $redact.
Procedure
For example, a forecasts collection contains documents of the following form where the tags field determines
the access levels required to view the data:
{
_id: 1,
title: "123 Department Report",
tags: [ [ "G" ], [ "FDW" ] ],
year: 2014,
subsections: [
{
subtitle: "Section 1: Overview",
tags: [ [ "SI", "G" ], [ "FDW" ] ],
content: "Section 1: This is the content of section 1."
},
{
subtitle: "Section 2: Analysis",
tags: [ [ "STLW" ] ],
content: "Section 2: This is the content of section 2."
},
{
subtitle: "Section 3: Budgeting",
tags: [ [ "TK" ], [ "FDW", "TGE" ] ],
content: {
text: "Section 3: This is the content of section3.",
tags: [ [ "HCS"], [ "FDW", "TGE", "BX" ] ]
}
}
]
}
For each document, the tags field contains various access groupings necessary to view the data. For example, the
value [ [ "G" ], [ "FDW", "TGE" ] ] can specify that a user requires either access level ["G"] or both [
"FDW", "TGE" ] to view the data.
Consider a user who only has access to view information tagged with either "FDW" or "TGE". To run a query on all
documents with year 2014 for this user, include a $redact stage as in the following:
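A pipeline along the following lines applies that rule. It is reconstructed here from the operators this page references ($redact with $map, $setIsSubset, and $anyElementTrue), so treat it as a sketch. The redact() and hasAccess() functions after it are hypothetical plain-JavaScript stand-ins that mimic the rule outside the server; they are not MongoDB code, and pruning a level that has no tags field is an assumption of the sketch.

```javascript
// Access tags held by the user in this example.
var userAccess = [ "FDW", "TGE" ];

// Pipeline to pass to db.forecasts.aggregate() in the mongo shell.
var pipeline = [
  { $match: { year: 2014 } },
  { $redact: {
      $cond: {
        if: { $anyElementTrue: {
          $map: {
            input: "$tags",
            as: "fieldTag",
            in: { $setIsSubset: [ "$$fieldTag", userAccess ] }
          }
        } },
        then: "$$DESCEND",
        else: "$$PRUNE"
      }
  } }
];

// Emulation for illustration only (not server code).
// True when the user's access covers at least one required tag set.
function hasAccess(tags, access) {
  return Array.isArray(tags) && tags.some(function (required) {
    return required.every(function (tag) { return access.indexOf(tag) !== -1; });
  });
}

// Keeps a document level only if its tags grant access, then descends.
// Assumption of this sketch: a level without a tags field is pruned.
function redact(value, access) {
  if (Array.isArray(value)) {
    return value
      .map(function (item) { return redact(item, access); })
      .filter(function (item) { return item !== undefined; });
  }
  if (value === null || typeof value !== "object") { return value; }
  if (!hasAccess(value.tags, access)) { return undefined; } // like $$PRUNE
  var kept = {};
  Object.keys(value).forEach(function (key) {
    var field = redact(value[key], access);
    if (field !== undefined) { kept[key] = field; }
  });
  return kept; // like $$DESCEND
}

// Condensed copy of the forecasts document shown earlier.
var forecast = {
  _id: 1, title: "123 Department Report", tags: [ [ "G" ], [ "FDW" ] ], year: 2014,
  subsections: [
    { subtitle: "Section 1: Overview", tags: [ [ "SI", "G" ], [ "FDW" ] ],
      content: "Section 1: This is the content of section 1." },
    { subtitle: "Section 2: Analysis", tags: [ [ "STLW" ] ],
      content: "Section 2: This is the content of section 2." },
    { subtitle: "Section 3: Budgeting", tags: [ [ "TK" ], [ "FDW", "TGE" ] ],
      content: { text: "Section 3: This is the content of section3.",
                 tags: [ [ "HCS" ], [ "FDW", "TGE", "BX" ] ] } }
  ]
};

var redacted = redact(forecast, userAccess);
console.log(redacted.subsections.map(function (s) { return s.subtitle; }));
```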
The aggregation operation returns the following redacted document for the user:
{ "_id" : 1,
"title" : "123 Department Report",
"tags" : [ [ "G" ], [ "FDW" ] ],
"year" : 2014,
"subsections" :
[
{
"subtitle" : "Section 1: Overview",
"tags" : [ [ "SI", "G" ], [ "FDW" ] ],
"content" : "Section 1: This is the content of section 1."
},
{
"subtitle" : "Section 3: Budgeting",
"tags" : [ [ "TK" ], [ "FDW", "TGE" ] ]
}
]
}
See also:
$map, $setIsSubset, $anyElementTrue
On this page
Create the Report in JIRA (page 412)
Information to Provide (page 412)
Send the Report via Email (page 412)
Evaluation of a Vulnerability Report (page 412)
Disclosure (page 413)
If you believe you have discovered a vulnerability in MongoDB or have experienced a security incident related to
MongoDB, please report the issue to aid in its resolution.
To report an issue, we strongly suggest filing a ticket in the SECURITY83 project in JIRA. MongoDB, Inc. responds to
vulnerability notifications within 48 hours.
Create the Report in JIRA
Submit a Ticket84 in the Security85 project on our JIRA. The ticket number will become the reference identification
for the issue for its lifetime. You can use this identifier for tracking purposes.
Information to Provide
All vulnerability reports should contain as much information as possible so MongoDB's developers can move quickly
to resolve the issue. In particular, please include the following:
The name of the product.
Common Vulnerability information, if applicable, including:
CVSS (Common Vulnerability Scoring System) Score.
CVE (Common Vulnerabilities and Exposures) Identifier.
Contact information, including an email address and/or phone number, if applicable.
Send the Report via Email
While JIRA is the preferred reporting method, you may also report vulnerabilities via email to
security@mongodb.com.
You may encrypt email using MongoDB's public key at https://docs.mongodb.org/10gen-security-gpg-key.asc.
MongoDB, Inc. responds to vulnerability reports sent via email with a response email that contains a reference number
for a JIRA ticket posted to the SECURITY87 project.
Evaluation of a Vulnerability Report
MongoDB, Inc. validates all submitted vulnerabilities and uses JIRA to track all communications regarding a
vulnerability, including requests for clarification or additional information. If needed, MongoDB representatives set up a
conference call to exchange information regarding the vulnerability.
83 https://jira.mongodb.org/browse/SECURITY
84 https://jira.mongodb.org/secure/CreateIssue!default.jspa?project-field=%22Security%22
85 https://jira.mongodb.org/browse/SECURITY
87 https://jira.mongodb.org/browse/SECURITY
Disclosure
MongoDB, Inc. requests that you do not publicly disclose any information regarding the vulnerability or exploit the
issue until it has had the opportunity to analyze the vulnerability, to respond to the notification, and to notify key users,
customers, and partners.
The amount of time required to validate a reported vulnerability depends on the complexity and severity of the issue.
MongoDB, Inc. takes all reported vulnerabilities very seriously and will always ensure that there is a clear and open
channel of communication with the reporter.
After validating an issue, MongoDB, Inc. coordinates public disclosure of the issue with the reporter in a mutually
agreed timeframe and format. If required or requested, the reporter of a vulnerability will receive credit in the published
security bulletin.
On this page
Security Methods in the mongo Shell (page 413)
Security Reference Documentation (page 414)
The following lists the security-related methods available in the mongo shell as well as additional security reference
material (page 414).
Name Description
db.auth() Authenticates a user to a database.
db.createUser() Creates a new user.
db.updateUser() Updates user data.
db.changeUserPassword() Changes an existing user's password.
db.removeUser() Deprecated. Removes a user from a database.
db.dropAllUsers() Deletes all users associated with a database.
db.dropUser() Removes a single user.
db.grantRolesToUser() Grants a role and its privileges to a user.
db.revokeRolesFromUser() Removes a role from a user.
db.getUser() Returns information about the specified user.
db.getUsers() Returns information about all users associated with a database.
Name Description
db.createRole() Creates a role and specifies its privileges.
db.updateRole() Updates a user-defined role.
db.dropRole() Deletes a user-defined role.
db.dropAllRoles() Deletes all user-defined roles associated with a database.
db.grantPrivilegesToRole() Assigns privileges to a user-defined role.
db.revokePrivilegesFromRole() Removes the specified privileges from a user-defined role.
db.grantRolesToRole() Specifies roles from which a user-defined role inherits privileges.
db.revokeRolesFromRole() Removes inherited roles from a role.
db.getRole() Returns information for the specified role.
db.getRoles() Returns information for all the user-defined roles in a database.
Built-In Roles (page 414) Reference on MongoDB provided roles and corresponding access.
system.roles Collection (page 423) Describes the content of the collection that stores user-defined roles.
system.users Collection (page 425) Describes the content of the collection that stores users' credentials and role
assignments.
Resource Document (page 427) Describes the resource document for roles.
Privilege Actions (page 429) List of the actions available for privileges.
System Event Audit Messages (page 434) Reference on system event audit messages.
Built-In Roles
On this page
Database User Roles (page 415)
Database Administration Roles (page 416)
Cluster Administration Roles (page 417)
Backup and Restoration Roles (page 420)
All-Database Roles (page 421)
Superuser Roles (page 422)
Internal Role (page 422)
MongoDB grants access to data and commands through role-based authorization (page 331) and provides built-in
roles that provide the different levels of access commonly needed in a database system. You can additionally create
user-defined roles (page 335).
A role grants privileges to perform sets of actions (page 429) on defined resources (page 427). A given role applies to
the database on which it is defined and can grant access down to a collection level of granularity.
Each of MongoDB's built-in roles defines access at the database level for all non-system collections in the role's
database and at the collection level for all system collections (page 299).
MongoDB provides the built-in database user (page 415) and database administration (page 416) roles on every
database. MongoDB provides all other built-in roles only on the admin database.
This section describes the privileges for each built-in role. You can also view the privileges for a built-in role at any
time by issuing the rolesInfo command with the showPrivileges and showBuiltinRoles fields both set
to true.
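As an illustration, the command could be issued from the mongo shell as in the sketch below. The helper name listRolesWithPrivileges and the stub object are hypothetical; the stub only stands in for the shell's db object so the snippet also runs outside the shell, and it records the command instead of contacting a server.

```javascript
// Hypothetical helper: asks the server for the roles on the given database,
// built-in roles included, together with their privileges.
function listRolesWithPrivileges(db) {
  return db.runCommand({
    rolesInfo: 1,
    showPrivileges: true,
    showBuiltinRoles: true
  });
}

// Stand-in for the shell's db object (assumption for this sketch only).
var sent = [];
var stubDb = {
  runCommand: function (command) {
    sent.push(command);
    return { ok: 1, roles: [] };
  }
};

listRolesWithPrivileges(stubDb);
console.log(JSON.stringify(sent[0]));
```

In the mongo shell, pass the real db object instead of the stub and inspect the roles array of the result.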
dbOwner
The database owner can perform any administrative action on the database. This role combines the privileges
granted by the readWrite (page 415), dbAdmin (page 416) and userAdmin (page 417) roles.
userAdmin
Provides the ability to create and modify roles and users on the current database. This role also indirectly
provides superuser (page 422) access to either the database or, if scoped to the admin database, the cluster.
The userAdmin (page 417) role allows users to grant any user any privilege, including themselves.
The userAdmin (page 417) role explicitly provides the following actions:
changeCustomData (page 430)
changePassword (page 430)
createRole (page 430)
createUser (page 430)
dropRole (page 430)
dropUser (page 430)
grantRole (page 430)
revokeRole (page 430)
viewRole (page 430)
viewUser (page 430)
The admin database includes the following roles for administering the whole system rather than just a single database.
These roles include but are not limited to replica set and sharded cluster administrative functions.
clusterAdmin
Provides the greatest cluster-management access. This role combines the privileges granted by the
clusterManager (page 417), clusterMonitor (page 418), and hostManager (page 419) roles.
Additionally, the role provides the dropDatabase (page 432) action.
clusterManager
Provides management and monitoring actions on the cluster. A user with this role can access the config and
local databases, which are used in sharding and replication, respectively.
Provides the following actions on the cluster as a whole:
addShard (page 432)
applicationMessage (page 432)
cleanupOrphaned (page 431)
flushRouterConfig (page 432)
listShards (page 432)
removeShard (page 432)
replSetConfigure (page 431)
replSetGetStatus (page 431)
replSetStateChange (page 431)
The admin database includes the following roles for backing up and restoring data:
backup
Provides minimal privileges needed for backing up data. This role provides sufficient privileges to use the
MongoDB Cloud Manager90 backup agent, Ops Manager91 backup agent, or to use mongodump to back up an
entire mongod instance.
Provides the following actions (page 429) on the mms.backup collection in the admin database:
insert (page 429)
update (page 429)
Provides the listDatabases (page 434) action on the cluster as a whole.
Provides the listCollections (page 434) action on all databases.
Provides the listIndexes (page 434) action for all collections.
Provides the bypassDocumentValidation (page 429) action for collections that have document
validation (page 160).
Provides the find (page 429) action on the following:
all non-system collections in the cluster
all the following system collections in the cluster: system.indexes (page 300),
system.namespaces (page 300), and system.js (page 300)
the admin.system.users (page 300) and admin.system.roles (page 299) collections
legacy system.users collections from versions of MongoDB prior to 2.6
Changed in version 3.2.1: The backup (page 420) role provides additional privileges to back up the
system.profile (page 300) collections that exist when running with database profiling (page 234). Previ-
ously, users required an additional read access on this collection.
restore
Provides privileges needed to restore data from backups that do not include system.profile (page 300)
collection data. This role is sufficient when restoring data with mongorestore without the --oplogReplay
option.
If the backup data includes system.profile (page 300) collection data and the target database does not
contain the system.profile (page 300) collection, mongorestore attempts to create the collection
even though the program does not actually restore system.profile documents. As such, the user
requires additional privileges to perform createCollection (page 430) and convertToCapped
(page 432) actions on the system.profile (page 300) collection for a database.
The built-in roles dbAdmin (page 416) and dbAdminAnyDatabase (page 422) provide the additional
privileges.
90 https://cloud.mongodb.com/?jmp=docs
91 https://docs.opsmanager.mongodb.com/current/
If running mongorestore with --oplogReplay, the restore (page 420) role is insufficient to
replay the oplog. To replay the oplog, create a user-defined role (page 375) that has anyAction
(page 434) on anyResource (page 429) and grant only to users who must run mongorestore with
--oplogReplay.
Provides the following actions on all non-system collections and system.js (page 300) collections in the
cluster; on the admin.system.users (page 300) and admin.system.roles (page 299) collections in
the admin database; and on legacy system.users collections from versions of MongoDB prior to 2.6:
collMod (page 432)
createCollection (page 430)
createIndex (page 430)
dropCollection (page 430)
insert (page 429)
Provides the listCollections (page 434) action on all databases.
Provides the following additional actions on admin.system.users (page 300) and legacy
system.users collections:
find (page 429)
remove (page 429)
update (page 429)
Provides the find (page 429) action on all the system.namespaces (page 300) collections in the cluster.
Although restore (page 420) includes the ability to modify the documents in the admin.system.users
(page 300) collection using normal modification operations, only modify these data using the user management
methods.
All-Database Roles
The admin database provides the following roles that apply to all databases in a mongod instance and are roughly
equivalent to their single-database equivalents:
readAnyDatabase
Provides the same read-only permissions as read (page 415), except it applies to all databases in the cluster.
The role also provides the listDatabases (page 434) action on the cluster as a whole.
readWriteAnyDatabase
Provides the same read and write permissions as readWrite (page 415), except it applies to all databases in
the cluster. The role also provides the listDatabases (page 434) action on the cluster as a whole.
userAdminAnyDatabase
Provides the same access to user administration operations as userAdmin (page 417), except it applies to all
databases in the cluster. The role also provides the following actions on the cluster as a whole:
authSchemaUpgrade (page 431)
invalidateUserCache (page 431)
listDatabases (page 434)
The role also provides the following actions on the admin.system.users (page 300) and
admin.system.roles (page 299) collections on the admin database, and on legacy system.users
collections from versions of MongoDB prior to 2.6:
Superuser Roles
Internal Role
__system
MongoDB assigns this role to user objects that represent cluster members, such as replica set members and
mongos instances. The role entitles its holder to take any action against any object in the database.
Do not assign this role to user objects representing applications or human administrators, other than in
exceptional circumstances.
If you need access to all actions on all resources, for example to run applyOps commands, do not assign
this role. Instead, create a user-defined role (page 375) that grants anyAction (page 434) on anyResource
(page 429) and ensure that only the users who need access to these operations have this access.
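As a rough sketch of such a user-defined role, the document below shows the shape that could be passed to db.createRole() in the mongo shell while connected to the admin database. The role name fullAccessForApplyOps is a hypothetical placeholder, not a name taken from this manual.

```javascript
// Hypothetical role definition; "fullAccessForApplyOps" is a placeholder name.
const roleDefinition = {
  role: "fullAccessForApplyOps",
  privileges: [
    // anyAction on anyResource: grant only to users who truly need it.
    { resource: { anyResource: true }, actions: [ "anyAction" ] }
  ],
  roles: []
};

// In the mongo shell, connected to the admin database, the role could be
// created with:
//   db.createRole(roleDefinition)
```

Keeping the definition in a variable makes it easy to review before the role is created.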
system.roles Collection
On this page
system.roles Schema (page 423)
Examples (page 424)
The system.roles collection in the admin database stores the user-defined roles. To create and manage these
user-defined roles, MongoDB provides role management commands.
system.roles Schema
or
{ cluster : true }
Examples
Consider the following sample documents found in system.roles collection of the admin database.
A User-Defined Role Specifies Privileges The following is a sample document for a user-defined role appUser
defined for the myApp database:
{
_id: "myApp.appUser",
role: "appUser",
db: "myApp",
privileges: [
{ resource: { db: "myApp" , collection: "" },
actions: [ "find", "createCollection", "dbStats", "collStats" ] },
{ resource: { db: "myApp", collection: "logs" },
actions: [ "insert" ] },
{ resource: { db: "myApp", collection: "data" },
actions: [ "insert", "update", "remove", "compact" ] },
{ resource: { db: "myApp", collection: "system.js" },
actions: [ "find" ] },
],
roles: []
}
The privileges array lists the four privileges that the appUser role specifies:
The first privilege permits its actions ( "find", "createCollection", "dbStats", "collStats") on
all the collections in the myApp database excluding its system collections. See Specify a Database as Resource
(page 427).
The next two privileges permit additional actions on specific collections, logs and data, in the myApp
database. See Specify a Collection of a Database as Resource (page 427).
The last privilege permits actions on one system collection (page 299) in the myApp database. While the first
privilege gives database-wide permission for the find action, the action does not apply to myApp's system
collections. To give access to a system collection, a privilege must explicitly specify the collection. See Resource
Document (page 427).
As indicated by the empty roles array, appUser inherits no additional privileges from other roles.
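The rule that a database-wide privilege skips system collections can be illustrated with a small standalone model. This is a simplified sketch, not MongoDB's authorization code: it assumes that a system collection is any collection whose name starts with "system.", and it only handles resources that name a database. The helper names are hypothetical.

```javascript
// Simplified assumption: system collections have names starting with "system.".
function isSystemCollection(name) {
  return name.startsWith("system.");
}

// True if a single privilege covers `action` on `db`.`collection`.
function privilegeAllows(privilege, db, collection, action) {
  const res = privilege.resource;
  if (res.db !== db || !privilege.actions.includes(action)) return false;
  if (res.collection === "") {
    // Database-wide resource: system collections are excluded.
    return !isSystemCollection(collection);
  }
  return res.collection === collection;
}

// True if any privilege of the role covers the request.
function roleAllows(role, db, collection, action) {
  return role.privileges.some(p => privilegeAllows(p, db, collection, action));
}

// The appUser role document from the example above.
const appUser = {
  _id: "myApp.appUser", role: "appUser", db: "myApp",
  privileges: [
    { resource: { db: "myApp", collection: "" },
      actions: [ "find", "createCollection", "dbStats", "collStats" ] },
    { resource: { db: "myApp", collection: "logs" }, actions: [ "insert" ] },
    { resource: { db: "myApp", collection: "data" },
      actions: [ "insert", "update", "remove", "compact" ] },
    { resource: { db: "myApp", collection: "system.js" }, actions: [ "find" ] }
  ],
  roles: []
};
```

In this model, roleAllows(appUser, "myApp", "system.js", "find") is true only because of the explicit fourth privilege, while the same check against system.users is false.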
User-Defined Role Inherits from Other Roles The following is a sample document for a user-defined role
appAdmin defined for the myApp database. The document shows that the appAdmin role specifies privileges
and also inherits privileges from other roles:
{
_id: "myApp.appAdmin",
role: "appAdmin",
db: "myApp",
privileges: [
{
resource: { db: "myApp", collection: "" },
actions: [ "insert", "dbStats", "collStats", "compact", "repairDatabase" ]
}
],
roles: [
{ role: "appUser", db: "myApp" }
]
}
The privileges array lists the privileges that the appAdmin role specifies. This role has a single privilege that
permits its actions ( "insert", "dbStats", "collStats", "compact", "repairDatabase") on all the
collections in the myApp database excluding its system collections. See Specify a Database as Resource (page 427).
The roles array lists the roles, identified by the role names and databases, from which the role appAdmin inherits
privileges.
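Inheritance can be pictured as collecting a role's own privileges plus those of every role listed in its roles array. The sketch below is an illustrative model only: the effectivePrivileges helper is hypothetical, the role documents are abbreviated versions of the samples above, and roles missing from the lookup table (for example built-in roles) simply contribute nothing.

```javascript
// Collects the privileges a role specifies plus those it inherits.
// `rolesById` maps "<db>.<role>" identifiers to role documents.
function effectivePrivileges(roleId, rolesById, seen = new Set()) {
  if (seen.has(roleId)) return [];   // guard against revisiting a role
  seen.add(roleId);
  const role = rolesById[roleId];
  if (!role) return [];              // unknown role: nothing to add in this sketch
  let result = role.privileges.slice();
  for (const parent of role.roles) {
    const parentId = parent.db + "." + parent.role;
    result = result.concat(effectivePrivileges(parentId, rolesById, seen));
  }
  return result;
}

// Abbreviated role documents based on the samples above.
const rolesById = {
  "myApp.appUser": {
    privileges: [
      { resource: { db: "myApp", collection: "" }, actions: [ "find" ] },
      { resource: { db: "myApp", collection: "logs" }, actions: [ "insert" ] }
    ],
    roles: []
  },
  "myApp.appAdmin": {
    privileges: [
      { resource: { db: "myApp", collection: "" }, actions: [ "insert", "dbStats" ] }
    ],
    roles: [ { role: "appUser", db: "myApp" } ]
  }
};

const adminPrivileges = effectivePrivileges("myApp.appAdmin", rolesById);
```

Here adminPrivileges holds the one privilege that appAdmin specifies followed by the two inherited from the abbreviated appUser.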
system.users Collection
On this page
system.users Schema (page 426)
Example (page 426)
The system.users collection in the admin database stores user authentication (page 317) and authorization
(page 331) information. To manage data in this collection, MongoDB provides user management commands.
system.users Schema
Example
{
_id : "home.Kari",
user : "Kari",
db : "home",
credentials : {
"SCRAM-SHA-1" : {
"iterationCount" : 10000,
"salt" : "nkHYXEZTTYmn+hrY994y1Q==",
"storedKey" : "wxWGN3ElQ25WbPjACeXdUmN4nNo=",
"serverKey" : "h7vBq5tACT/BtrIElY2QTm+pQzM="
}
},
roles : [
{ role: "read", db: "home" },
{ role: "readWrite", db: "test" },
{ role: "appUser", db: "myApp" }
],
customData : { zipCode: "64157" }
}
The document shows that a user Kari is associated with the home database. Kari has the read (page 415) role
in the home database, the readWrite (page 415) role in the test database, and the appUser role in the myApp
database.
Resource Document
On this page
Database and/or Collection Resource (page 427)
Cluster Resource (page 428)
anyResource (page 429)
The resource document specifies the resources upon which a privilege permits actions.
Specify a Collection of a Database as Resource If the resource document specifies both the db and collection
fields as non-empty strings, the resource is the specified collection in the specified database. For example, the following
document specifies a resource of the inventory collection in the products database:
{ db: "products", collection: "inventory" }
For a user-defined role scoped for a non-admin database, the resource specification for its privileges must specify the
same database as the role. User-defined roles scoped for the admin database can specify other databases.
Specify a Database as Resource If only the collection field is an empty string (""), the resource is the specified
database, excluding the system collections (page 299). For example, the following resource document specifies the
resource of the test database, excluding the system collections:
{ db: "test", collection: "" }
For a user-defined role scoped for a non-admin database, the resource specification for its privileges must specify the
same database as the role. User-defined roles scoped for the admin database can specify other databases.
Note: When you specify a database as the resource, system collections are excluded, unless you name them explicitly,
as in the following:
{ db: "test", collection: "system.js" }
Specify Collections Across Databases as Resource If only the db field is an empty string (""), the resource is all
collections with the specified name across all databases. For example, the following document specifies the resource
of all the accounts collections across all the databases:
{ db: "", collection: "accounts" }
For user-defined roles, only roles scoped for the admin database can have this resource specification for their
privileges.
Specify All Non-System Collections in All Databases If both the db and collection fields are empty strings
(""), the resource is all collections, excluding the system collections (page 299), in all the databases:
{ db: "", collection: "" }
For user-defined roles, only roles scoped for the admin database can have this resource specification for their
privileges.
Cluster Resource
Use the cluster resource for actions that affect the state of the system rather than act on a specific set of databases
or collections. Examples of such actions are shutdown, replSetReconfig, and addShard. For example, the
following document grants the action shutdown on the cluster.
{ resource: { cluster : true }, actions: [ "shutdown" ] }
For user-defined roles, only roles scoped for the admin database can have this resource specification for their
privileges.
anyResource
The internal resource anyResource gives access to every resource in the system and is intended for internal use.
Do not use this resource, other than in exceptional circumstances. The syntax for this resource is { anyResource:
true }.
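The resource document forms described above can be summarized in a small classifier. This is an illustrative sketch; describeResource is a hypothetical helper written for this example, not part of MongoDB.

```javascript
// Returns a plain-language description of what a resource document covers.
function describeResource(resource) {
  if (resource.anyResource === true) return "any resource (internal use)";
  if (resource.cluster === true) return "cluster";
  const { db, collection } = resource;
  if (typeof db !== "string" || typeof collection !== "string") {
    throw new Error("unrecognized resource document");
  }
  if (db !== "" && collection !== "") {
    return "collection " + collection + " in database " + db;
  }
  if (db !== "") return "database " + db + ", excluding system collections";
  if (collection !== "") return "collection " + collection + " in every database";
  return "all non-system collections in all databases";
}

const examples = [
  { db: "products", collection: "inventory" },
  { db: "test", collection: "" },
  { db: "", collection: "accounts" },
  { db: "", collection: "" },
  { cluster: true }
].map(describeResource);
```

Each entry of examples corresponds to one of the resource forms in this section, in the order in which they are introduced.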
Privilege Actions
On this page
Query and Write Actions (page 429)
Database Management Actions (page 430)
Deployment Management Actions (page 431)
Replication Actions (page 431)
Sharding Actions (page 432)
Server Administration Actions (page 432)
Diagnostic Actions (page 433)
Internal Actions (page 434)
Privilege actions define the operations a user can perform on a resource (page 427). A MongoDB privilege (page 331)
comprises a resource (page 427) and the permitted actions. This page lists available actions grouped by common
purpose.
MongoDB provides built-in roles with pre-defined pairings of resources and permitted actions. For lists of the actions
granted, see Built-In Roles (page 414). To define custom roles, see Create a User-Defined Role (page 375).
find
User can perform the db.collection.find() method. Apply this action to database or collection
resources.
insert
User can perform the insert command. Apply this action to database or collection resources.
remove
User can perform the db.collection.remove() method. Apply this action to database or collection
resources.
update
User can perform the update command. Apply this action to database or collection resources.
bypassDocumentValidation
New in version 3.2.
User can bypass document validation on commands that support the bypassDocumentValidation option.
For a list of commands that support the bypassDocumentValidation option, see Document Validation
(page 883). Apply this action to database or collection resources.
changeCustomData
User can change the custom information of any user in the given database. Apply this action to database
resources.
changeOwnCustomData
Users can change their own custom information. Apply this action to database resources. See also Change Your
Password and Custom Data (page 380).
changeOwnPassword
Users can change their own passwords. Apply this action to database resources. See also Change Your Password
and Custom Data (page 380).
changePassword
User can change the password of any user in the given database. Apply this action to database resources.
createCollection
User can perform the db.createCollection() method. Apply this action to database or collection
resources.
createIndex
Provides access to the db.collection.createIndex() method and the createIndexes command.
Apply this action to database or collection resources.
createRole
User can create new roles in the given database. Apply this action to database resources.
createUser
User can create new users in the given database. Apply this action to database resources.
dropCollection
User can perform the db.collection.drop() method. Apply this action to database or collection
resources.
dropRole
User can delete any role from the given database. Apply this action to database resources.
dropUser
User can remove any user from the given database. Apply this action to database resources.
emptycapped
User can perform the emptycapped command. Apply this action to database or collection resources.
enableProfiler
User can perform the db.setProfilingLevel() method. Apply this action to database resources.
grantRole
User can grant any role in the database to any user from any database in the system. Apply this action to database
resources.
killCursors
User can kill cursors on the target collection.
revokeRole
User can remove any role from any user from any database in the system. Apply this action to database resources.
unlock
User can perform the db.fsyncUnlock() method. Apply this action to the cluster resource.
viewRole
User can view information about any role in the given database. Apply this action to database resources.
viewUser
User can view the information of any user in the given database. Apply this action to database resources.
authSchemaUpgrade
User can perform the authSchemaUpgrade command. Apply this action to the cluster resource.
cleanupOrphaned
User can perform the cleanupOrphaned command. Apply this action to the cluster resource.
cpuProfiler
User can enable and use the CPU profiler. Apply this action to the cluster resource.
inprog
User can use the db.currentOp() method to return pending and active operations. Apply this action to the
cluster resource.
invalidateUserCache
Provides access to the invalidateUserCache command. Apply this action to the cluster resource.
killop
User can perform the db.killOp() method. Apply this action to the cluster resource.
planCacheRead
User can perform the planCacheListPlans and planCacheListQueryShapes commands and the
PlanCache.getPlansByQuery() and PlanCache.listQueryShapes() methods. Apply this
action to database or collection resources.
planCacheWrite
User can perform the planCacheClear command and the PlanCache.clear() and
PlanCache.clearPlansByQuery() methods. Apply this action to database or collection resources.
storageDetails
User can perform the storageDetails command. Apply this action to database or collection resources.
Replication Actions
appendOplogNote
User can append notes to the oplog. Apply this action to the cluster resource.
replSetConfigure
User can configure a replica set. Apply this action to the cluster resource.
replSetGetStatus
User can perform the replSetGetStatus command. Apply this action to the cluster resource.
replSetHeartbeat
User can perform the replSetHeartbeat command. Apply this action to the cluster resource.
replSetStateChange
User can change the state of a replica set through the replSetFreeze, replSetMaintenance,
replSetStepDown, and replSetSyncFrom commands. Apply this action to the cluster resource.
resync
User can perform the resync command. Apply this action to the cluster resource.
Sharding Actions
addShard
User can perform the addShard command. Apply this action to the cluster resource.
enableSharding
User can enable sharding on a database using the enableSharding command and can shard a collection
using the shardCollection command. Apply this action to database or collection resources.
flushRouterConfig
User can perform the flushRouterConfig command. Apply this action to the cluster resource.
getShardMap
User can perform the getShardMap command. Apply this action to the cluster resource.
getShardVersion
User can perform the getShardVersion command. Apply this action to database resources.
listShards
User can perform the listShards command. Apply this action to the cluster resource.
moveChunk
User can perform the moveChunk command. In addition, user can perform the movePrimary command
provided that the privilege is applied to an appropriate database resource. Apply this action to database or
collection resources.
removeShard
User can perform the removeShard command. Apply this action to the cluster resource.
shardingState
User can perform the shardingState command. Apply this action to the cluster resource.
splitChunk
User can perform the splitChunk command. Apply this action to database or collection resources.
splitVector
User can perform the splitVector command. Apply this action to database or collection resources.
applicationMessage
User can perform the logApplicationMessage command. Apply this action to the cluster resource.
closeAllDatabases
User can perform the closeAllDatabases command. Apply this action to the cluster resource.
collMod
User can perform the collMod command. Apply this action to database or collection resources.
compact
User can perform the compact command. Apply this action to database or collection resources.
connPoolSync
User can perform the connPoolSync command. Apply this action to the cluster resource.
convertToCapped
User can perform the convertToCapped command. Apply this action to database or collection resources.
dropDatabase
User can perform the dropDatabase command. Apply this action to database resources.
dropIndex
User can perform the dropIndexes command. Apply this action to database or collection resources.
fsync
User can perform the fsync command. Apply this action to the cluster resource.
getParameter
User can perform the getParameter command. Apply this action to the cluster resource.
hostInfo
Provides information about the server the MongoDB instance runs on. Apply this action to the cluster
resource.
logRotate
User can perform the logRotate command. Apply this action to the cluster resource.
reIndex
User can perform the reIndex command. Apply this action to database or collection resources.
renameCollectionSameDB
Allows the user to rename collections on the current database using the renameCollection command.
Apply this action to database resources.
Additionally, the user must either have find (page 429) on the source collection or not have find (page 429)
on the destination collection.
If a collection with the new name already exists, the user must also have the dropCollection (page 430)
action on the destination collection.
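The conditions for renameCollectionSameDB can be read as a single boolean check. The sketch below encodes the three conditions stated above; the function and property names are hypothetical and only model the rule, they do not query a server.

```javascript
// Models the stated requirements for renaming a collection within a database.
function canRenameSameDb(p) {
  // The renameCollectionSameDB action on the database is always required.
  if (!p.hasRenameActionOnDb) return false;
  // Either find on the source, or no find on the destination.
  if (!(p.hasFindOnSource || !p.hasFindOnDestination)) return false;
  // An existing destination collection also requires dropCollection on it.
  if (p.destinationExists && !p.hasDropCollectionOnDestination) return false;
  return true;
}

const allowed = canRenameSameDb({
  hasRenameActionOnDb: true,
  hasFindOnSource: true,
  hasFindOnDestination: true,
  destinationExists: false,
  hasDropCollectionOnDestination: false
});
```

In this example allowed is true: the user holds the rename action and find on the source, and no collection with the new name exists.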
repairDatabase
User can perform the repairDatabase command. Apply this action to database resources.
setParameter
User can perform the setParameter command. Apply this action to the cluster resource.
shutdown
User can perform the shutdown command. Apply this action to the cluster resource.
touch
User can perform the touch command. Apply this action to the cluster resource.
Diagnostic Actions
collStats
User can perform the collStats command. Apply this action to database or collection resources.
connPoolStats
User can perform the connPoolStats and shardConnPoolStats commands. Apply this action to the
cluster resource.
cursorInfo
User can perform the cursorInfo command. Apply this action to the cluster resource.
dbHash
User can perform the dbHash command. Apply this action to database or collection resources.
dbStats
User can perform the dbStats command. Apply this action to database resources.
diagLogging
User can perform the diagLogging command. Apply this action to the cluster resource.
getCmdLineOpts
User can perform the getCmdLineOpts command. Apply this action to the cluster resource.
getLog
User can perform the getLog command. Apply this action to the cluster resource.
indexStats
User can perform the indexStats command. Apply this action to database or collection resources.
Changed in version 3.0: MongoDB 3.0 removes the indexStats command.
listDatabases
User can perform the listDatabases command. Apply this action to the cluster resource.
listCollections
User can perform the listCollections command. Apply this action to database resources.
listIndexes
User can perform the listIndexes command. Apply this action to database or collection resources.
netstat
User can perform the netstat command. Apply this action to the cluster resource.
serverStatus
User can perform the serverStatus command. Apply this action to the cluster resource.
validate
User can perform the validate command. Apply this action to database or collection resources.
top
User can perform the top command. Apply this action to the cluster resource.
Internal Actions
anyAction
Allows any action on a resource. Do not assign this action except for exceptional circumstances.
internal
Allows internal actions. Do not assign this action except for exceptional circumstances.
On this page
Audit Message (page 434)
Audit Event Actions, Details, and Results (page 435)
Audit Message
The event auditing feature (page 340) can record events in JSON format. To configure auditing output, see Configure
Auditing (page 404).
atype (string): Action type. See Audit Event Actions, Details, and Results (page 435).
ts (document): Document that contains the date and UTC time of the event, in ISO 8601 format.
local (document): Document that contains the local IP address and the port number of the running
instance.
remote (document): Document that contains the remote IP address and the port number of the
incoming connection associated with the event.
users (array): Array of user identification documents. Because MongoDB allows a session to log in
with a different user per database, this array can have more than one user. Each document contains a
user field for the username and a db field for the authentication database for that user.
roles (array): Array of documents that specify the roles (page 331) granted to the user. Each document
contains a role field for the name of the role and a db field for the database associated with the
role.
param (document): Specific details for the event. See Audit Event Actions, Details, and Results
(page 435).
result (integer): Error code. See Audit Event Actions, Details, and Results (page 435).
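To make the field list concrete, the sketch below builds one audit message by hand and summarizes it. Every value is invented for illustration, and the $date key inside ts and the ip and port keys inside local and remote are assumptions about the document layout rather than details stated in the list above. The summarizeAuditMessage helper is hypothetical.

```javascript
// Illustrative audit message; all values are made up for this example.
const sampleMessage = {
  atype: "createDatabase",
  ts: { $date: "2016-03-01T12:00:00.000Z" },    // assumed key name
  local: { ip: "127.0.0.1", port: 27017 },      // assumed key names
  remote: { ip: "127.0.0.1", port: 51234 },
  users: [ { user: "Kari", db: "home" } ],
  roles: [ { role: "readWrite", db: "test" } ],
  param: { ns: "test" },
  result: 0
};

// Produces a one-line summary: action type, users, and outcome.
function summarizeAuditMessage(msg) {
  const who = msg.users.map(u => u.user + "@" + u.db).join(", ")
    || "(no authenticated user)";
  const outcome = msg.result === 0 ? "success" : "error code " + msg.result;
  return msg.atype + " by " + who + ": " + outcome;
}

const summary = summarizeAuditMessage(sampleMessage);
```

A summary of this kind can help when scanning audit output for a particular atype or for non-zero result codes.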
The following table lists for each atype or action type, the associated param details and the result values, if any.
createDatabase 0 - Success
{ ns: <database> }
renameCollection 0 - Success
{
old: <database>.<collection>,
new: <database>.<collection>
}
93 Enabling auditAuthorizationSuccess degrades performance more than logging only the authorization failures.
dropAllUsersFromDatabase 0 - Success
{ db: <database> }
updateUser 0 - Success
{
user: <user name>,
db: <database>,
passwordChanged: <boolean>,
customData: <document>,
roles: [
{
role: <role name>,
db: <database>
},
...
]
}
The customData field is optional.
grantRolesToUser 0 - Success
{
user: <user name>,
db: <database>,
roles: [
{
role: <role name>,
db: <database>
},
...
]
}
dropAllRolesFromDatabase 0 - Success
{ db: <database> }
grantRolesToRole 0 - Success
{
role: <role name>,
db: <database>,
roles: [
{
role: <role name>,
db: <database>
},
...
]
}
grantPrivilegesToRole 0 - Success
{
role: <role name>,
db: <database>,
privileges: [
{
resource: <resource document>,
actions: [ <action>, ... ]
},
...
]
}
For details on the resource document, see Resource Document (page 427).
For a list of actions, see Privilege Actions (page 429).
revokePrivilegesFromRole 0 - Success
{
role: <role name>,
db: <database name>,
privileges: [
{
resource: <resource document>,
actions: [ <action>, ... ]
},
...
]
}
For details on the resource document, see Resource Document (page 427).
For a list of actions, see Privilege Actions (page 429).
shardCollection 0 - Success
{
ns: <database>.<collection>,
key: <shard key pattern>,
options: { unique: <boolean> }
}
Aggregation
On this page
Aggregation Pipeline (page 443)
Map-Reduce (page 445)
Single Purpose Aggregation Operations (page 446)
Additional Features and Behaviors (page 446)
Additional Resources (page 485)
Aggregation operations process data records and return computed results. Aggregation operations group values from
multiple documents together, and can perform a variety of operations on the grouped data to return a single result.
MongoDB provides three ways to perform aggregation: the aggregation pipeline (page 443), the map-reduce function
(page 445), and single purpose aggregation methods (page 446).
MongoDB 2.2 introduced a new aggregation framework (page 447), modeled on the concept of data processing
pipelines. Documents enter a multi-stage pipeline that transforms the documents into an aggregated result.
The most basic pipeline stages provide filters that operate like queries and document transformations that modify the
form of the output document.
Other pipeline operations provide tools for grouping and sorting documents by specific field or fields as well as tools
for aggregating the contents of arrays, including arrays of documents. In addition, pipeline stages can use operators
for tasks such as calculating the average or concatenating a string.
The pipeline provides efficient data aggregation using native operations within MongoDB, and is the preferred method
for data aggregation in MongoDB.
The aggregation pipeline can operate on a sharded collection (page 725).
The aggregation pipeline can use indexes to improve its performance during some of its stages. In addition, the
aggregation pipeline has an internal optimization phase. See Pipeline Operators and Indexes (page 448) and Aggregation
Pipeline Optimization (page 449) for details.
7.2 Map-Reduce
MongoDB also provides map-reduce (page 462) operations to perform aggregation. In general, map-reduce operations
have two phases: a map stage that processes each document and emits one or more objects for each input document,
and a reduce phase that combines the output of the map operation. Optionally, map-reduce can have a finalize stage to
make final modifications to the result. Like other aggregation operations, map-reduce can specify a query condition to
select the input documents as well as sort and limit the results.
Map-reduce uses custom JavaScript functions to perform the map and reduce operations, as well as the optional finalize
operation. While the custom JavaScript provides great flexibility compared to the aggregation pipeline, in general,
map-reduce is less efficient and more complex than the aggregation pipeline.
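The map, reduce, and finalize phases can be imitated in memory to show how the functions fit together. This is a toy emulation, not how MongoDB executes map-reduce: emit is passed to the map function as an argument here instead of being a global, and the cust_id and amount fields are hypothetical sample data.

```javascript
// Toy in-memory emulation of the map, reduce, and optional finalize phases.
function runMapReduce(docs, map, reduce, finalize) {
  const emitted = new Map();              // key -> array of emitted values
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  // Map phase: the function runs with `this` bound to each document.
  for (const doc of docs) map.call(doc, emit);
  const results = [];
  for (const [key, values] of emitted) {
    // Reduce phase: only keys with more than one value need reducing.
    let value = values.length === 1 ? values[0] : reduce(key, values);
    // Finalize phase (optional): last adjustment to the reduced value.
    if (finalize) value = finalize(key, value);
    results.push({ _id: key, value: value });
  }
  return results;
}

const orders = [
  { cust_id: "A", amount: 10 },
  { cust_id: "B", amount: 5 },
  { cust_id: "A", amount: 7 }
];

const totals = runMapReduce(
  orders,
  function (emit) { emit(this.cust_id, this.amount); },
  (key, values) => values.reduce((sum, v) => sum + v, 0)
);
```

With this sample data, totals contains one document per customer: a value of 17 for "A" and 5 for "B".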
Map-reduce can operate on a sharded collection (page 725). Map-reduce operations can also output to a sharded
collection. See Aggregation Pipeline and Sharded Collections (page 453) and Map-Reduce and Sharded Collections
(page 463) for details.
Note: Starting in MongoDB 2.4, certain mongo shell functions and properties are inaccessible in map-reduce
operations. MongoDB 2.4 also provides support for multiple JavaScript operations to run at the same time. Before
MongoDB 2.4, JavaScript code executed in a single thread, raising concurrency issues for map-reduce.
For a feature comparison of the aggregation pipeline, map-reduce, and the special group functionality, see Aggregation
Commands Comparison (page 480).
On this page
Pipeline (page 447)
Pipeline Expressions (page 448)
Aggregation Pipeline Behavior (page 448)
Additional Resources (page 461)
The aggregation pipeline is a framework for data aggregation modeled on the concept of data processing pipelines.
Documents enter a multi-stage pipeline that transforms the documents into aggregated results.
The aggregation pipeline provides an alternative to map-reduce and may be the preferred solution for aggregation tasks
where the complexity of map-reduce may be unwarranted.
The aggregation pipeline has some limitations on value types and result size. See Aggregation Pipeline Limits (page 453)
for details on limits and restrictions on the aggregation pipeline.
Pipeline
The MongoDB aggregation pipeline consists of stages. Each stage transforms the documents as they pass through the
pipeline. Pipeline stages do not need to produce one output document for every input document; e.g., some stages may
generate new documents or filter out documents. Pipeline stages can appear multiple times in the pipeline.
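The idea of documents flowing through successive stages can be sketched with a tiny in-memory runner. This is a teaching model under strong assumptions, not MongoDB's implementation: it supports only equality $match, $sort, $skip, and $limit, and the sample documents are hypothetical.

```javascript
// Each handler takes the documents from the previous stage and returns
// the documents passed on to the next stage.
const stageHandlers = {
  $match: (docs, spec) =>
    docs.filter(d => Object.keys(spec).every(k => d[k] === spec[k])),
  $sort: (docs, spec) => {
    const keys = Object.keys(spec);
    return docs.slice().sort((a, b) => {
      for (const k of keys) {
        if (a[k] < b[k]) return -spec[k];
        if (a[k] > b[k]) return spec[k];
      }
      return 0;
    });
  },
  $skip: (docs, n) => docs.slice(n),
  $limit: (docs, n) => docs.slice(0, n)
};

// Feeds the output of each stage into the next one.
function runPipeline(docs, pipeline) {
  return pipeline.reduce((current, stage) => {
    const name = Object.keys(stage)[0];
    const handler = stageHandlers[name];
    if (!handler) throw new Error("unsupported stage in this sketch: " + name);
    return handler(current, stage[name]);
  }, docs);
}

const people = [
  { name: "a", status: "A", age: 30 },
  { name: "b", status: "B", age: 40 },
  { name: "c", status: "A", age: 50 },
  { name: "d", status: "A", age: 20 }
];

const pipeline = [
  { $match: { status: "A" } },
  { $sort: { age: -1 } },
  { $limit: 2 }
];

const oldestActive = runPipeline(people, pipeline);
```

Four documents enter the pipeline and two leave it, which illustrates that stages need not emit one output document per input document. In the mongo shell the same pipeline array would be passed to db.collection.aggregate() on a real collection.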
MongoDB provides the db.collection.aggregate() method in the mongo shell and the aggregate
command for the aggregation pipeline. See aggregation-pipeline-operator-reference for the available stages.
For example usage of the aggregation pipeline, consider Aggregation with User Preference Data (page 457) and
Aggregation with the Zip Code Data Set (page 454).
Pipeline Expressions
Some pipeline stages take a pipeline expression as their operand. Pipeline expressions specify the transformation to
apply to the input documents. Expressions have a document (page 186) structure and can contain other expressions
(page 474).
Pipeline expressions can only operate on the current document in the pipeline and cannot refer to data from other
documents: expression operations provide in-memory transformation of documents.
Generally, expressions are stateless and are only evaluated when seen by the aggregation process with one exception:
accumulator expressions.
The accumulators, used in the $group stage, maintain their state (e.g. totals, maximums, minimums, and related
data) as documents progress through the pipeline.
Changed in version 3.2: Some accumulators are available in the $project stage; however, when used in the
$project stage, the accumulators do not maintain their state across documents.
For more information on expressions, see Expressions (page 474).
In MongoDB, the aggregate command operates on a single collection, logically passing the entire collection into
the aggregation pipeline. To optimize the operation, wherever possible, use the following strategies to avoid scanning
the entire collection.
The $match and $sort pipeline operators can take advantage of an index when they occur at the beginning of the
pipeline.
New in version 2.4: The $geoNear pipeline operator takes advantage of a geospatial index. When using $geoNear,
the $geoNear pipeline operation must appear as the first stage in an aggregation pipeline.
Changed in version 3.2: Starting in MongoDB 3.2, indexes can cover (page 70) an aggregation pipeline. In MongoDB
2.6 and 3.0, indexes could not cover an aggregation pipeline since even when the pipeline uses an index, aggregation
still requires access to the actual documents.
Early Filtering
If your aggregation operation requires only a subset of the data in a collection, use the $match, $limit, and $skip
stages to restrict the documents that enter at the beginning of the pipeline. When placed at the beginning of a pipeline,
$match operations use suitable indexes to scan only the matching documents in a collection.
Placing a $match pipeline stage followed by a $sort stage at the start of the pipeline is logically equivalent to a
single query with a sort and can use an index. When possible, place $match operators at the beginning of the pipeline.
Additional Features
The aggregation pipeline has an internal optimization phase that provides improved performance for certain sequences
of operators. For details, see Aggregation Pipeline Optimization (page 449).
The aggregation pipeline supports operations on sharded collections. See Aggregation Pipeline and Sharded
Collections (page 453).
Aggregation Pipeline Optimization
On this page
Projection Optimization (page 449)
Pipeline Sequence Optimization (page 449)
Pipeline Coalescence Optimization (page 450)
Examples (page 452)
Aggregation pipeline operations have an optimization phase which attempts to reshape the pipeline for improved
performance.
To see how the optimizer transforms a particular aggregation pipeline, include the explain option in the
db.collection.aggregate() method.
Optimizations are subject to change between releases.
Projection Optimization The aggregation pipeline can determine if it requires only a subset of the fields in the
documents to obtain the results. If so, the pipeline will only use those required fields, reducing the amount of data
passing through the pipeline.
$sort + $match Sequence Optimization When you have a sequence with $sort followed by a $match, the
$match moves before the $sort to minimize the number of objects to sort. For example, if the pipeline consists of
the following stages:
{ $sort: { age : -1 } },
{ $match: { status: 'A' } }
During the optimization phase, the optimizer transforms the sequence to the following:
{ $match: { status: 'A' } },
{ $sort: { age : -1 } }
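The reordering can be mimicked on a pipeline array to make the transformation explicit. The helper below is a hypothetical sketch that swaps adjacent $sort and $match stages in a single pass; it is not the optimizer's actual code.

```javascript
// Returns a copy of the pipeline in which each $sort that is directly
// followed by a $match has been swapped with that $match (single pass).
function moveMatchBeforeSort(pipeline) {
  const out = pipeline.slice();
  for (let i = 0; i + 1 < out.length; i++) {
    if ("$sort" in out[i] && "$match" in out[i + 1]) {
      const sortStage = out[i];
      out[i] = out[i + 1];
      out[i + 1] = sortStage;
    }
  }
  return out;
}

const reordered = moveMatchBeforeSort([
  { $sort: { age: -1 } },
  { $match: { status: "A" } }
]);
```

After the call, reordered starts with the $match stage, so fewer documents reach the sort.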
$skip + $limit Sequence Optimization When you have a sequence with $skip followed by a $limit, the
$limit moves before the $skip. With the reordering, the $limit value increases by the $skip amount.
For example, if the pipeline consists of the following stages:
{ $skip: 10 },
{ $limit: 5 }
During the optimization phase, the optimizer transforms the sequence to the following:
{ $limit: 15 },
{ $skip: 10 }
This optimization allows for more opportunities for $sort + $limit Coalescence (page 450), such as with $sort +
$skip + $limit sequences. See $sort + $limit Coalescence (page 450) for details on the coalescence and $sort +
$skip + $limit Sequence (page 452) for an example.
For aggregation operations on sharded collections (page 453), this optimization reduces the results returned from each
shard.
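The arithmetic behind the increased limit can be checked with array slices. The following sketch is plain JavaScript rather than an aggregation; the skip and limit helpers and the twenty numbered documents are assumptions made for illustration:

```javascript
// Twenty hypothetical documents, identified by the numbers 1 through 20.
const docs = Array.from({ length: 20 }, (_, i) => i + 1);

const skip = (input, n) => input.slice(n);      // mirrors { $skip: n }
const limit = (input, n) => input.slice(0, n);  // mirrors { $limit: n }

// As written: { $skip: 10 }, { $limit: 5 }
const original = limit(skip(docs, 10), 5);

// After reordering: { $limit: 15 }, { $skip: 10 }
const reordered = skip(limit(docs, 10 + 5), 10);

console.log(original);  // [ 11, 12, 13, 14, 15 ]
console.log(reordered); // [ 11, 12, 13, 14, 15 ]
```

Because the $limit now comes first, no more than 15 documents need to pass beyond that stage.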
$redact + $match Sequence Optimization When the pipeline has a $redact stage immediately followed by a
$match stage, the aggregation can sometimes add a portion of the $match stage before the $redact stage.
If the added $match stage is at the start of the pipeline, the aggregation can use an index, as well as
query the collection, to limit the number of documents that enter the pipeline. See Pipeline Operators and Indexes
(page 448) for more information.
For example, if the pipeline consists of the following stages:
{ $redact: { $cond: { if: { $eq: [ "$level", 5 ] }, then: "$$PRUNE", else: "$$DESCEND" } } },
{ $match: { year: 2014, category: { $ne: "Z" } } }
The optimizer can add a portion of the $match stage, here the condition on year, before the $redact stage:
{ $match: { year: 2014 } },
{ $redact: { $cond: { if: { $eq: [ "$level", 5 ] }, then: "$$PRUNE", else: "$$DESCEND" } } },
{ $match: { year: 2014, category: { $ne: "Z" } } }
During the optimization phase, the optimizer transforms the sequence to the following:
{ $sort: { age : -1 } },
{ $limit: 5 },
{ $project: { status: 1, name: 1 } }
This optimization allows for more opportunities for $sort + $limit Coalescence (page 450), such as with $sort +
$limit sequences. See $sort + $limit Coalescence (page 450) for details on the coalescence.
Pipeline Coalescence Optimization When possible, the optimization phase coalesces a pipeline stage into its pre-
decessor. Generally, coalescence occurs after any sequence reordering optimization.
$sort + $limit Coalescence When a $sort immediately precedes a $limit, the optimizer can coalesce the
$limit into the $sort. This allows the sort operation to only maintain the top n results as it progresses, where
n is the specified limit, and MongoDB only needs to store n items in memory. The optimization still applies when
allowDiskUse is true and the n items exceed the aggregation memory limit (page 453). See sort-and-memory for
more information.
$limit + $limit Coalescence When a $limit immediately follows another $limit, the two stages can
coalesce into a single $limit where the limit amount is the smaller of the two initial limit amounts. For example, a
pipeline contains the following sequence:
{ $limit: 100 },
{ $limit: 10 }
Then the second $limit stage can coalesce into the first $limit stage and result in a single $limit stage where
the limit amount 10 is the minimum of the two initial limits 100 and 10.
{ $limit: 10 }
$skip + $skip Coalescence When a $skip immediately follows another $skip, the two stages can coalesce
into a single $skip where the skip amount is the sum of the two initial skip amounts. For example, a pipeline contains
the following sequence:
{ $skip: 5 },
{ $skip: 2 }
Then the second $skip stage can coalesce into the first $skip stage and result in a single $skip stage where the
skip amount 7 is the sum of the two initial skip amounts 5 and 2.
{ $skip: 7 }
$match + $match Coalescence When a $match immediately follows another $match, the two stages can
coalesce into a single $match combining the conditions with an $and. For example, a pipeline contains the following
sequence:
{ $match: { year: 2014 } },
{ $match: { status: "A" } }
Then the second $match stage can coalesce into the first $match stage and result in a single $match stage:
{ $match: { $and: [ { "year" : 2014 }, { "status" : "A" } ] } }
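The three coalescence rules above can be sketched as a single pass over a pipeline array. The coalesce function below is a hypothetical illustration in plain JavaScript, not MongoDB's optimizer code, and it handles only adjacent $limit, $skip, and $match stages:

```javascript
// Illustrative only: merges adjacent stages of the same kind using the rules
// described above. Every other stage is copied through unchanged.
function coalesce(pipeline) {
  const result = [];
  for (const stage of pipeline) {
    const prev = result[result.length - 1];
    if (prev && "$limit" in prev && "$limit" in stage) {
      prev.$limit = Math.min(prev.$limit, stage.$limit);   // smaller limit wins
    } else if (prev && "$skip" in prev && "$skip" in stage) {
      prev.$skip = prev.$skip + stage.$skip;               // skips add up
    } else if (prev && "$match" in prev && "$match" in stage) {
      prev.$match = { $and: [prev.$match, stage.$match] }; // conditions combine
    } else {
      result.push(Object.assign({}, stage));               // shallow copy of the stage
    }
  }
  return result;
}

console.log(JSON.stringify(coalesce([{ $limit: 100 }, { $limit: 10 }])));
// [{"$limit":10}]
console.log(JSON.stringify(coalesce([{ $skip: 5 }, { $skip: 2 }])));
// [{"$skip":7}]
console.log(JSON.stringify(coalesce([{ $match: { year: 2014 } }, { $match: { status: "A" } }])));
// [{"$match":{"$and":[{"year":2014},{"status":"A"}]}}]
```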
The optimizer can coalesce the $unwind stage into the $lookup stage. If you run the aggregation with the explain
option, the explain output shows the coalesced stage:
{
$lookup: {
from: "otherCollection",
as: "resultingArray",
localField: "x",
foreignField: "y",
unwinding: { preserveNullAndEmptyArrays: false }
}
}
Examples The following examples are some sequences that can take advantage of both sequence reordering and
coalescence. Generally, coalescence occurs after any sequence reordering optimization.
$sort + $skip + $limit Sequence A pipeline contains a sequence of $sort followed by a $skip followed
by a $limit:
{ $sort: { age : -1 } },
{ $skip: 10 },
{ $limit: 5 }
First, the optimizer performs the $skip + $limit Sequence Optimization (page 449) to transform the sequence to the
following:
{ $sort: { age : -1 } },
{ $limit: 15 },
{ $skip: 10 }
The $skip + $limit Sequence Optimization (page 449) increases the $limit amount with the reordering. See $skip +
$limit Sequence Optimization (page 449) for details.
The reordered sequence now has $sort immediately preceding the $limit, and the pipeline can coalesce the two
stages to decrease memory usage during the sort operation. See $sort + $limit Coalescence (page 450) for more
information.
$limit + $skip + $limit + $skip Sequence A pipeline contains a sequence of alternating $limit and
$skip stages:
{ $limit: 100 },
{ $skip: 5 },
{ $limit: 10 },
{ $skip: 2 }
The $skip + $limit Sequence Optimization (page 449) reverses the position of the { $skip: 5 } and { $limit:
10 } stages and increases the limit amount:
{ $limit: 100 },
{ $limit: 15 },
{ $skip: 5 },
{ $skip: 2 }
The optimizer then coalesces the two $limit stages into a single $limit stage and the two $skip stages into a
single $skip stage. The resulting sequence is the following:
{ $limit: 15 },
{ $skip: 7 }
See $limit + $limit Coalescence (page 451) and $skip + $skip Coalescence (page 451) for details.
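To check that the optimized sequence returns the same documents as the original, both can be applied to an array. This is a plain JavaScript sketch; the run helper and the 200 numbered documents are assumptions made for illustration:

```javascript
// Two hundred hypothetical documents, identified by the numbers 1 through 200.
const docs = Array.from({ length: 200 }, (_, i) => i + 1);

// Applies a list of $limit and $skip stages to an array, in order.
function run(input, stages) {
  return stages.reduce((current, stage) => {
    if ("$limit" in stage) return current.slice(0, stage.$limit);
    if ("$skip" in stage) return current.slice(stage.$skip);
    throw new Error("unsupported stage");
  }, input);
}

const original = run(docs, [{ $limit: 100 }, { $skip: 5 }, { $limit: 10 }, { $skip: 2 }]);
const optimized = run(docs, [{ $limit: 15 }, { $skip: 7 }]);

console.log(original);  // [ 8, 9, 10, 11, 12, 13, 14, 15 ]
console.log(optimized); // [ 8, 9, 10, 11, 12, 13, 14, 15 ]
```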
See also:
explain option in the db.collection.aggregate()
On this page
Aggregation Pipeline Limits Result Size Restrictions (page 453)
Memory Restrictions (page 453)
Aggregation operations with the aggregate command have the following limitations.
On this page
Aggregation Pipeline and Sharded Collections Behavior (page 453)
Optimization (page 454)
The aggregation pipeline supports operations on sharded collections. This section describes behaviors specific to the
aggregation pipeline (page 447) and sharded collections.
primary shard for that database. The $out stage and the $lookup stage require running on the database's primary
shard.
Optimization When splitting the aggregation pipeline into two parts, the pipeline is split to ensure that the shards
perform as many stages as possible with consideration for optimization.
To see how the pipeline was split, include the explain option in the db.collection.aggregate() method.
Optimizations are subject to change between releases.
On this page
Data Model (page 454)
Aggregation with the Zip Code Data Set aggregate() Method (page 454)
Return States with Populations above 10 Million (page 455)
Return Average City Population by State (page 455)
Return Largest and Smallest Cities by State (page 456)
The examples in this document use the zipcodes collection. This collection is available at:
http://media.mongodb.org/zips.json. Use mongoimport to load this data set into your mongod instance.
Data Model Each document in the zipcodes collection has the following form:
{
"_id": "10280",
"city": "NEW YORK",
"state": "NY",
"pop": 5574,
"loc": [
-74.016323,
40.710537
]
}
aggregate() Method All of the following examples use the aggregate() helper in the mongo shell.
The aggregate() method uses the aggregation pipeline (page 447) to process documents into aggregated results.
An aggregation pipeline (page 447) consists of stages with each stage processing the documents as they pass along
the pipeline. Documents pass through the stages in sequence.
The aggregate() method in the mongo shell provides a wrapper around the aggregate database command. See
the documentation for your driver for a more idiomatic interface for data aggregation operations.
Return States with Populations above 10 Million The following aggregation operation returns all states with a total
population of at least 10 million:
db.zipcodes.aggregate( [
{ $group: { _id: "$state", totalPop: { $sum: "$pop" } } },
{ $match: { totalPop: { $gte: 10*1000*1000 } } }
] )
In this example, the aggregation pipeline (page 447) consists of the $group stage followed by the $match stage:
The $group stage groups the documents of the zipcodes collection by the state field, calculates the
totalPop field for each state, and outputs a document for each unique state.
The new per-state documents have two fields: the _id field and the totalPop field. The _id field contains
the value of the state; i.e. the group by field. The totalPop field is a calculated field that contains the total
population of each state. To calculate the value, $group uses the $sum operator to add the population field
(pop) for each state.
After the $group stage, the documents in the pipeline resemble the following:
{
"_id" : "AK",
"totalPop" : 550043
}
The $match stage filters these grouped documents to output only those documents whose totalPop value is
greater than or equal to 10 million. The $match stage does not alter the matching documents but outputs the
matching documents unmodified.
The equivalent SQL for this aggregation operation is:
SELECT state, SUM(pop) AS totalPop
FROM zipcodes
GROUP BY state
HAVING totalPop >= (10*1000*1000)
See also:
$group, $match, $sum
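For readers who want to trace the two stages by hand, the same grouping and filtering can be sketched in plain JavaScript over an in-memory array. The documents below are hypothetical: the state codes are placeholders and the pop values are invented, far larger than real zip code populations, so that a handful of documents can cross the threshold.

```javascript
// Hypothetical documents in the shape of the zipcodes collection.
const zipcodes = [
  { _id: "00001", city: "A", state: "XX", pop: 6000000 },
  { _id: "00002", city: "B", state: "XX", pop: 5000000 },
  { _id: "00003", city: "C", state: "YY", pop: 4000000 },
  { _id: "00004", city: "D", state: "YY", pop: 5999999 }
];

// $group: sum pop per state.
const totals = new Map();
for (const doc of zipcodes) {
  totals.set(doc.state, (totals.get(doc.state) || 0) + doc.pop);
}
const grouped = Array.from(totals, ([state, totalPop]) => ({ _id: state, totalPop }));

// $match: keep only groups whose total is at least 10 million.
const result = grouped.filter((g) => g.totalPop >= 10 * 1000 * 1000);

console.log(result); // [ { _id: 'XX', totalPop: 11000000 } ]
```

The state YY totals 9,999,999 and is filtered out, which shows that the filter applies to the grouped totals rather than to individual zip code documents.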
Return Average City Population by State The following aggregation operation returns the average populations for
cities in each state:
db.zipcodes.aggregate( [
{ $group: { _id: { state: "$state", city: "$city" }, pop: { $sum: "$pop" } } },
{ $group: { _id: "$_id.state", avgCityPop: { $avg: "$pop" } } }
] )
In this example, the aggregation pipeline (page 447) consists of the $group stage followed by another $group
stage:
The first $group stage groups the documents by the combination of city and state, uses the $sum
expression to calculate the population for each combination, and outputs a document for each city and state
combination. (A city can have more than one zip code associated with it, as different sections of the city can
each have a different zip code.)
After this stage in the pipeline, the documents resemble the following:
{
"_id" : {
"state" : "CO",
"city" : "EDGEWATER"
},
"pop" : 13154
}
A second $group stage groups the documents in the pipeline by the _id.state field (i.e. the state field
inside the _id document), uses the $avg expression to calculate the average city population (avgCityPop)
for each state, and outputs a document for each state.
The documents that result from this aggregation operation resemble the following:
{
"_id" : "MN",
"avgCityPop" : 5335
}
See also:
$group, $sum, $avg
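The reason for two $group stages is that the average must be taken over per-city totals, not over individual zip code documents. A plain JavaScript sketch of the same two steps, using hypothetical city names, state codes, and populations:

```javascript
// Hypothetical zip code documents; ALPHA has two zip codes.
const zipcodes = [
  { city: "ALPHA", state: "XX", pop: 100 },
  { city: "ALPHA", state: "XX", pop: 300 },
  { city: "BETA", state: "XX", pop: 200 },
  { city: "GAMMA", state: "YY", pop: 50 }
];

// First $group: total population per (state, city) pair.
const cityTotals = new Map();
for (const { state, city, pop } of zipcodes) {
  const key = JSON.stringify([state, city]);
  const entry = cityTotals.get(key) || { state, city, pop: 0 };
  entry.pop += pop;
  cityTotals.set(key, entry);
}

// Second $group: average of the per-city totals within each state.
const perState = new Map();
for (const { state, pop } of cityTotals.values()) {
  const entry = perState.get(state) || { sum: 0, count: 0 };
  entry.sum += pop;
  entry.count += 1;
  perState.set(state, entry);
}
const result = Array.from(perState, ([state, { sum, count }]) => ({
  _id: state,
  avgCityPop: sum / count
}));

console.log(result);
// [ { _id: 'XX', avgCityPop: 300 }, { _id: 'YY', avgCityPop: 50 } ]
```

Averaging the three XX zip code documents directly would give 200 instead of 300, because ALPHA would be counted as two separate cities.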
Return Largest and Smallest Cities by State The following aggregation operation returns the smallest and largest
cities by population for each state:
db.zipcodes.aggregate( [
{ $group:
{
_id: { state: "$state", city: "$city" },
pop: { $sum: "$pop" }
}
},
{ $sort: { pop: 1 } },
{ $group:
{
_id : "$_id.state",
biggestCity: { $last: "$_id.city" },
biggestPop: { $last: "$pop" },
smallestCity: { $first: "$_id.city" },
smallestPop: { $first: "$pop" }
}
},
{ $project:
{ _id: 0,
state: "$_id",
biggestCity: { name: "$biggestCity", pop: "$biggestPop" },
smallestCity: { name: "$smallestCity", pop: "$smallestPop" }
}
}
] )
In this example, the aggregation pipeline (page 447) consists of a $group stage, a $sort stage, another $group
stage, and a $project stage:
The first $group stage groups the documents by the combination of the city and state, calculates the sum
of the pop values for each combination, and outputs a document for each city and state combination.
At this stage in the pipeline, the documents resemble the following:
{
"_id" : {
"state" : "CO",
"city" : "EDGEWATER"
},
"pop" : 13154
}
The $sort stage orders the documents in the pipeline by the pop field value, from smallest to largest; i.e. by
increasing order. This operation does not alter the documents.
The next $group stage groups the now-sorted documents by the _id.state field (i.e. the state field inside
the _id document) and outputs a document for each state.
The stage also calculates the following four fields for each state. Using the $last expression, the $group
operator creates the biggestCity and biggestPop fields that store the city with the largest population
and that population. Using the $first expression, the $group operator creates the smallestCity and
smallestPop fields that store the city with the smallest population and that population.
The documents, at this stage in the pipeline, resemble the following:
{
"_id" : "WA",
"biggestCity" : "SEATTLE",
"biggestPop" : 520096,
"smallestCity" : "BENGE",
"smallestPop" : 2
}
The final $project stage renames the _id field to state and moves the biggestCity, biggestPop,
smallestCity, and smallestPop into biggestCity and smallestCity embedded documents.
The output documents of this aggregation operation resemble the following:
{
"state" : "RI",
"biggestCity" : {
"name" : "CRANSTON",
"pop" : 176404
},
"smallestCity" : {
"name" : "CLAYVILLE",
"pop" : 45
}
}
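The $first and $last expressions only yield the smallest and largest cities because the documents arrive sorted by pop. The following plain JavaScript sketch imitates the $sort, second $group, and $project stages; the per-city totals are hypothetical:

```javascript
// Hypothetical per-city totals, in the shape produced by the first $group stage.
const cities = [
  { _id: { state: "XX", city: "ALPHA" }, pop: 400 },
  { _id: { state: "XX", city: "BETA" }, pop: 200 },
  { _id: { state: "XX", city: "DELTA" }, pop: 900 },
  { _id: { state: "YY", city: "GAMMA" }, pop: 50 }
];

// $sort: { pop: 1 }
const sorted = cities.slice().sort((a, b) => a.pop - b.pop);

// $group with $first and $last, followed by the reshaping done by $project:
// the first document seen for a state is its smallest city, and the last
// document seen is its largest.
const byState = new Map();
for (const doc of sorted) {
  const state = doc._id.state;
  const city = { name: doc._id.city, pop: doc.pop };
  if (!byState.has(state)) {
    byState.set(state, { state, biggestCity: city, smallestCity: city });
  } else {
    byState.get(state).biggestCity = city;
  }
}
const result = Array.from(byState.values());

console.log(JSON.stringify(result, null, 2));
```

A state with a single city, such as YY here, reports that city as both its biggest and its smallest.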
On this page
Data Model (page 458)
Aggregation with User Preference Data Normalize and Sort Documents (page 458)
Return Usernames Ordered by Join Month (page 458)
Return Total Number of Joins per Month (page 459)
Return the Five Most Common Likes (page 460)
Data Model Consider a hypothetical sports club with a database that contains a users collection that tracks each
user's join date and sport preferences, and stores these data in documents that resemble the following:
{
_id : "jane",
joined : ISODate("2011-03-02"),
likes : ["golf", "racquetball"]
}
{
_id : "joe",
joined : ISODate("2012-07-02"),
likes : ["tennis", "golf", "swimming"]
}
Normalize and Sort Documents The following operation returns user names in upper case and in alphabetical order.
The aggregation includes user names for all documents in the users collection. You might do this to normalize user
names for processing.
db.users.aggregate(
[
{ $project : { name:{$toUpper:"$_id"} , _id:0 } },
{ $sort : { name : 1 } }
]
)
All documents from the users collection pass through the pipeline, which consists of the following operations:
The $project operator:
creates a new field called name.
converts the value of the _id to upper case, with the $toUpper operator, and stores the result in the
name field.
suppresses the _id field. $project will pass the _id field by default, unless explicitly suppressed.
The $sort operator orders the results by the name field.
The results of the aggregation would resemble the following:
{
"name" : "JANE"
},
{
"name" : "JILL"
},
{
"name" : "JOE"
}
Return Usernames Ordered by Join Month The following aggregation operation returns user names sorted by the
month they joined. This kind of aggregation could help generate membership renewal notices.
db.users.aggregate(
[
{ $project :
{
month_joined : { $month : "$joined" },
name : "$_id",
_id : 0
}
},
{ $sort : { month_joined : 1 } }
]
)
The pipeline passes all documents in the users collection through the following operations:
The $project operator:
Creates two new fields: month_joined and name.
Suppresses the _id from the results. The aggregate() method includes the _id, unless explicitly
suppressed.
The $month operator converts the values of the joined field to integer representations of the month. Then
the $project operator assigns those values to the month_joined field.
The $sort operator sorts the results by the month_joined field.
The operation returns results that resemble the following:
{
"month_joined" : 1,
"name" : "ruth"
},
{
"month_joined" : 1,
"name" : "harold"
},
{
"month_joined" : 1,
"name" : "kate"
},
{
"month_joined" : 2,
"name" : "jill"
}
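In plain JavaScript terms, $month corresponds to reading the UTC month of a date and adding 1, since JavaScript months are zero-based. A minimal sketch of the projection and sort; the join date for ruth is a hypothetical value chosen to fall in January:

```javascript
// jane and joe come from the data model above; ruth's date is hypothetical.
const users = [
  { _id: "jane", joined: new Date("2011-03-02T00:00:00Z") },
  { _id: "joe", joined: new Date("2012-07-02T00:00:00Z") },
  { _id: "ruth", joined: new Date("2012-01-14T00:00:00Z") }
];

const result = users
  // $project: compute month_joined, rename _id to name, drop _id.
  .map((u) => ({ month_joined: u.joined.getUTCMonth() + 1, name: u._id }))
  // $sort: { month_joined: 1 }
  .sort((a, b) => a.month_joined - b.month_joined);

console.log(result);
// [ { month_joined: 1, name: 'ruth' },
//   { month_joined: 3, name: 'jane' },
//   { month_joined: 7, name: 'joe' } ]
```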
Return Total Number of Joins per Month The following operation shows how many people joined each month of
the year. You might use this aggregated data for recruiting and marketing strategies.
db.users.aggregate(
[
{ $project : { month_joined : { $month : "$joined" } } } ,
{ $group : { _id : {month_joined:"$month_joined"} , number : { $sum : 1 } } },
{ $sort : { "_id.month_joined" : 1 } }
]
)
The pipeline passes all documents in the users collection through the following operations:
The $project operator creates a new field called month_joined.
The $month operator converts the values of the joined field to integer representations of the month. Then
the $project operator assigns the values to the month_joined field.
The $group operator collects all documents with a given month_joined value and counts how many docu-
ments there are for that value. Specifically, for each unique value, $group creates a new per-month document
with two fields:
_id, which contains a nested document with the month_joined field and its value.
number, which is a generated field. The $sum operator increments this field by 1 for every document
containing the given month_joined value.
The $sort operator sorts the documents created by $group according to the contents of the month_joined
field.
The result of this aggregation operation would resemble the following:
{
"_id" : {
"month_joined" : 1
},
"number" : 3
},
{
"_id" : {
"month_joined" : 2
},
"number" : 9
},
{
"_id" : {
"month_joined" : 3
},
"number" : 5
}
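The counting done by $sum : 1 can be sketched in plain JavaScript as a counter per month. The join dates below are hypothetical and are not taken from the users collection:

```javascript
// Hypothetical join dates.
const joinDates = ["2011-01-05", "2012-01-20", "2012-02-11", "2013-01-30", "2013-02-02"];

// $project + $group: derive the month, then add 1 to that month's counter.
const counts = new Map();
for (const iso of joinDates) {
  const month = new Date(iso + "T00:00:00Z").getUTCMonth() + 1;
  counts.set(month, (counts.get(month) || 0) + 1);
}

// $sort: { "_id.month_joined": 1 }
const result = Array.from(counts, ([month, number]) => ({
  _id: { month_joined: month },
  number
})).sort((a, b) => a._id.month_joined - b._id.month_joined);

console.log(JSON.stringify(result));
// [{"_id":{"month_joined":1},"number":3},{"_id":{"month_joined":2},"number":2}]
```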
Return the Five Most Common Likes The following aggregation collects the top five most liked activities in the
data set. This type of analysis could help inform planning and future development.
db.users.aggregate(
[
{ $unwind : "$likes" },
{ $group : { _id : "$likes" , number : { $sum : 1 } } },
{ $sort : { number : -1 } },
{ $limit : 5 }
]
)
The pipeline begins with all documents in the users collection, and passes these documents through the following
operations:
The $unwind operator separates each value in the likes array, and creates a new version of the source
document for every element in the array.
Example
Given the following document from the users collection:
{
_id : "jane",
joined : ISODate("2011-03-02"),
likes : ["golf", "racquetball"]
}
The $unwind operator would create the following documents:
{
_id : "jane",
joined : ISODate("2011-03-02"),
likes : "golf"
}
{
_id : "jane",
joined : ISODate("2011-03-02"),
likes : "racquetball"
}
The $group operator collects all documents with the same value for the likes field and counts each grouping.
With this information, $group creates a new document with two fields:
_id, which contains the likes value.
number, which is a generated field. The $sum operator increments this field by 1 for every document
containing the given likes value.
The $sort operator sorts these documents by the number field in reverse order.
The $limit operator only includes the first 5 result documents.
The results of the aggregation would resemble the following:
{
"_id" : "golf",
"number" : 33
},
{
"_id" : "racquetball",
"number" : 31
},
{
"_id" : "swimming",
"number" : 24
},
{
"_id" : "handball",
"number" : 19
},
{
"_id" : "tennis",
"number" : 18
}
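The four stages of this pipeline can be traced on a small in-memory array. The sketch below is plain JavaScript; the third user is hypothetical, and the limit is 2 rather than 5 to suit the small data set:

```javascript
// jane and joe come from the data model above; jill's likes are hypothetical.
const users = [
  { _id: "jane", likes: ["golf", "racquetball"] },
  { _id: "joe", likes: ["tennis", "golf", "swimming"] },
  { _id: "jill", likes: ["golf", "tennis"] }
];

// $unwind: one document per (user, like) pair.
const unwound = [];
for (const user of users) {
  for (const like of user.likes) {
    unwound.push({ _id: user._id, likes: like });
  }
}

// $group with $sum : 1: count the documents for each likes value.
const counts = new Map();
for (const doc of unwound) {
  counts.set(doc.likes, (counts.get(doc.likes) || 0) + 1);
}

// $sort by number in descending order, then $limit.
const top = Array.from(counts, ([like, number]) => ({ _id: like, number }))
  .sort((a, b) => b.number - a.number)
  .slice(0, 2);

console.log(top); // [ { _id: 'golf', number: 3 }, { _id: 'tennis', number: 2 } ]
```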
Additional Resources
MongoDB Analytics: Learn Aggregation by Example: Exploratory Analytics and Visualization Using Flight Data
(http://www.mongodb.com/presentations/mongodb-analytics-learn-aggregation-example-exploratory-analytics-and-visualization?jmp=docs)
MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregation Framework and Hadoop
(http://www.mongodb.com/presentations/mongodb-time-series-data-part-2-analyzing-time-series-data-using-aggregation-framework?jmp=docs)
The Aggregation Framework
(https://www.mongodb.com/presentations/aggregation-framework-0?jmp=docs)
7.4.2 Map-Reduce
On this page
Map-Reduce JavaScript Functions (page 463)
Map-Reduce Behavior (page 463)
Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. For
map-reduce operations, MongoDB provides the mapReduce database command.
Consider the following map-reduce operation:
In this map-reduce operation, MongoDB applies the map phase to each input document (i.e. the documents in the
collection that match the query condition). The map function emits key-value pairs. For those keys that have multiple
values, MongoDB applies the reduce phase, which collects and condenses the aggregated data. MongoDB then stores
the results in a collection. Optionally, the output of the reduce function may pass through a finalize function to further
condense or process the results of the aggregation.
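The phases described above can be imitated on an in-memory array. The simulateMapReduce helper below is a hypothetical sketch in plain JavaScript; it is not how mongod executes mapReduce, and it passes emit to the map function as an argument, whereas MongoDB makes emit available inside the map function without passing it in. The sample orders are also assumptions.

```javascript
// Hypothetical helper that imitates the map, reduce, and finalize phases.
function simulateMapReduce(docs, map, reduce, finalize) {
  const groups = new Map(); // key -> array of emitted values

  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };

  // Map phase: call map once per document, with `this` bound to the document.
  for (const doc of docs) map.call(doc, emit);

  // Reduce phase: only keys with multiple values are reduced.
  const results = [];
  for (const [key, values] of groups) {
    let value = values.length > 1 ? reduce(key, values) : values[0];
    if (finalize) value = finalize(key, value); // optional final step
    results.push({ _id: key, value });
  }
  return results;
}

// Hypothetical input documents.
const orders = [
  { cust_id: "abc123", price: 25 },
  { cust_id: "abc123", price: 40 },
  { cust_id: "xyz789", price: 10 }
];

const results = simulateMapReduce(
  orders,
  function (emit) { emit(this.cust_id, this.price); },
  (key, values) => values.reduce((a, b) => a + b, 0)
);

console.log(results);
// [ { _id: 'abc123', value: 65 }, { _id: 'xyz789', value: 10 } ]
```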
All map-reduce functions in MongoDB are JavaScript and run within the mongod process. Map-reduce operations
take the documents of a single collection as the input and can perform any arbitrary sorting and limiting before
beginning the map stage. mapReduce can return the results of a map-reduce operation as a document, or may write
the results to collections. The input and the output collections may be sharded.
Note: For most aggregation operations, the Aggregation Pipeline (page 447) provides better performance and a more
coherent interface. However, map-reduce operations provide some flexibility that is not presently available in the
aggregation pipeline.
In MongoDB, map-reduce operations use custom JavaScript functions to map, or associate, values to a key. If a key
has multiple values mapped to it, the operation reduces the values for the key to a single object.
The use of custom JavaScript functions provides flexibility to map-reduce operations. For instance, when processing a
document, the map function can create more than one key and value mapping or no mapping. Map-reduce operations
can also use a custom JavaScript function to make final modifications to the results at the end of the map and reduce
operation, such as performing additional calculations.
Map-Reduce Behavior
In MongoDB, the map-reduce operation can write results to a collection or return the results inline. If you write
map-reduce output to a collection, you can perform subsequent map-reduce operations on the same input collection
that replace, merge, or reduce new results with previous results. See mapReduce and Perform Incremental
Map-Reduce (page 467) for details and examples.
When returning the results of a map-reduce operation inline, the result documents must be within the BSON Document
Size limit, which is currently 16 megabytes. For additional information on limits and restrictions on map-reduce
operations, see the https://docs.mongodb.org/manual/reference/command/mapReduce reference page.
MongoDB supports map-reduce operations on sharded collections (page 725). Map-reduce operations can also output
the results to a sharded collection. See Map-Reduce and Sharded Collections (page 463).
On this page
Sharded Collection as Input (page 464)
Sharded Collection as Output (page 464)
Map-reduce supports operations on sharded collections, both as an input and as an output. This section describes the
behaviors of mapReduce specific to sharded collections.
Sharded Collection as Input When using a sharded collection as the input for a map-reduce operation, mongos will
automatically dispatch the map-reduce job to each shard in parallel. There is no special option required. mongos will
wait for jobs on all shards to finish.
Note:
During later map-reduce jobs, MongoDB splits chunks as needed.
Balancing of chunks for the output collection is automatically prevented during post-processing to avoid
concurrency issues.
In MongoDB 2.0:
mongos retrieves the results from each shard, performs a merge sort to order the results, and proceeds to the
reduce/finalize phase as needed. mongos then writes the result to the output collection in sharded mode.
This model requires only a small amount of memory, even for large data sets.
Shard chunks are not automatically split during insertion. This requires manual intervention until the chunks
are granular and balanced.
Important: For best results, only use the sharded output options for mapReduce in version 2.2 or later.
The map-reduce operation is composed of many tasks, including reads from the input collection, executions of the
map function, executions of the reduce function, writes to a temporary collection during processing, and writes to
the output collection.
During the operation, map-reduce takes the following locks:
The read phase takes a read lock. It yields every 100 documents.
The insert into the temporary collection takes a write lock for a single write.
If the output collection does not exist, the creation of the output collection takes a write lock.
If the output collection exists, then the output actions (i.e. merge, replace, reduce) take a write lock. This
write lock is global, and blocks all operations on the mongod instance.
Note: The final write lock during post-processing makes the results appear atomically. However, output actions
merge and reduce may take minutes to process. For merge and reduce, the nonAtomic flag is available,
which releases the lock between writing each output document. See the db.collection.mapReduce()
reference for more information.
Map-Reduce Examples
On this page
Return the Total Price Per Customer (page 465)
Calculate Order and Total Quantity with Average Quantity Per Item (page 466)
In the mongo shell, the db.collection.mapReduce() method is a wrapper around the mapReduce command.
The following examples use the db.collection.mapReduce() method:
Consider the following map-reduce operations on a collection orders that contains documents of the following
prototype:
{
_id: ObjectId("50a8240b927d5d8b5891743c"),
cust_id: "abc123",
ord_date: new Date("Oct 04, 2012"),
status: 'A',
price: 25,
items: [ { sku: "mmm", qty: 5, price: 2.5 },
{ sku: "nnn", qty: 5, price: 2.5 } ]
}
Return the Total Price Per Customer Perform the map-reduce operation on the orders collection to group by
the cust_id, and calculate the sum of the price for each cust_id:
1. Define the map function to process each input document:
In the function, this refers to the document that the map-reduce operation is processing.
The function maps the price to the cust_id for each document and emits the cust_id and price
pair.
var mapFunction1 = function() {
emit(this.cust_id, this.price);
};
2. Define the corresponding reduce function with two arguments keyCustId and valuesPrices:
The valuesPrices is an array whose elements are the price values emitted by the map function and
grouped by keyCustId.
The function reduces the valuesPrices array to the sum of its elements.
var reduceFunction1 = function(keyCustId, valuesPrices) {
return Array.sum(valuesPrices);
};
3. Perform the map-reduce on all documents in the orders collection using the mapFunction1 map function
and the reduceFunction1 reduce function.
db.orders.mapReduce(
mapFunction1,
reduceFunction1,
{ out: "map_reduce_example" }
)
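The reduce function can be exercised on its own outside of MongoDB. Array.sum is not part of standard JavaScript, so the sketch below substitutes Array.prototype.reduce; the price values are hypothetical. Because MongoDB can call the reduce function more than once for the same key, feeding earlier results back in, the second call checks that a partial sum combines to the same total:

```javascript
// Stand-in for reduceFunction1 that runs in any JavaScript environment.
const reduceFunction1 = function (keyCustId, valuesPrices) {
  return valuesPrices.reduce((total, price) => total + price, 0);
};

// All values reduced in one call.
const oneCall = reduceFunction1("abc123", [25, 40, 10]);

// Two calls: the result of the first is passed back in as one of the values.
const partial = reduceFunction1("abc123", [25, 40]);
const twoCalls = reduceFunction1("abc123", [partial, 10]);

console.log(oneCall);  // 75
console.log(twoCalls); // 75
```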
Calculate Order and Total Quantity with Average Quantity Per Item In this example, you will perform a
map-reduce operation on the orders collection for all documents that have an ord_date value greater than
01/01/2012. The operation groups by the items.sku field, and calculates the number of orders and the total
quantity ordered for each sku. The operation concludes by calculating the average quantity per order for each sku
value:
1. Define the map function to process each input document:
In the function, this refers to the document that the map-reduce operation is processing.
For each item, the function associates the sku with a new object value that contains the count of 1
and the item qty for the order and emits the sku and value pair.
var mapFunction2 = function() {
for (var idx = 0; idx < this.items.length; idx++) {
var key = this.items[idx].sku;
var value = {
count: 1,
qty: this.items[idx].qty
};
emit(key, value);
}
};
2. Define the corresponding reduce function with two arguments keySKU and countObjVals:
countObjVals is an array whose elements are the objects mapped to the grouped keySKU values
passed by the map function to the reducer function.
The function reduces the countObjVals array to a single object reducedVal that contains the
count and the qty fields.
In reducedVal, the count field contains the sum of the count fields from the individual array
elements, and the qty field contains the sum of the qty fields from the individual array elements.
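var reduceFunction2 = function(keySKU, countObjVals) {
    var reducedVal = { count: 0, qty: 0 };
    for (var idx = 0; idx < countObjVals.length; idx++) {
        reducedVal.count += countObjVals[idx].count;
        reducedVal.qty += countObjVals[idx].qty;
    }
    return reducedVal;
};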
3. Define a finalize function with two arguments key and reducedVal. The function modifies the
reducedVal object to add a computed field named avg and returns the modified object:
var finalizeFunction2 = function (key, reducedVal) {
    reducedVal.avg = reducedVal.qty/reducedVal.count;
    return reducedVal;
};
4. Perform the map-reduce operation on the orders collection using the mapFunction2,
reduceFunction2, and finalizeFunction2 functions.
db.orders.mapReduce( mapFunction2,
reduceFunction2,
{
out: { merge: "map_reduce_example" },
query: { ord_date:
{ $gt: new Date('01/01/2012') }
},
finalize: finalizeFunction2
}
)
This operation uses the query field to select only those documents with ord_date greater than new
Date('01/01/2012'). Then it outputs the results to a collection map_reduce_example. If the
map_reduce_example collection already exists, the operation will merge the existing contents with the
results of this map-reduce operation.
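The map, reduce, and finalize steps of this example can be traced outside of MongoDB by supplying a stand-in emit function. The sketch below is plain JavaScript; the emit implementation and the two orders are assumptions, and for simplicity it calls the reduce function for every key, including keys with a single value, which leaves a single value unchanged here:

```javascript
// Stand-in for the emit function that MongoDB provides to map functions.
const emitted = {};
function emit(key, value) {
  (emitted[key] = emitted[key] || []).push(value);
}

const mapFunction2 = function () {
  for (let idx = 0; idx < this.items.length; idx++) {
    emit(this.items[idx].sku, { count: 1, qty: this.items[idx].qty });
  }
};

const reduceFunction2 = function (keySKU, countObjVals) {
  const reducedVal = { count: 0, qty: 0 };
  for (const val of countObjVals) {
    reducedVal.count += val.count;
    reducedVal.qty += val.qty;
  }
  return reducedVal;
};

const finalizeFunction2 = function (key, reducedVal) {
  reducedVal.avg = reducedVal.qty / reducedVal.count;
  return reducedVal;
};

// Two hypothetical orders, reduced to the items field used by the map function.
const orders = [
  { items: [{ sku: "mmm", qty: 5 }, { sku: "nnn", qty: 5 }] },
  { items: [{ sku: "mmm", qty: 3 }] }
];
orders.forEach((order) => mapFunction2.apply(order));

const output = Object.keys(emitted).map((sku) => ({
  _id: sku,
  value: finalizeFunction2(sku, reduceFunction2(sku, emitted[sku]))
}));

console.log(JSON.stringify(output));
// [{"_id":"mmm","value":{"count":2,"qty":8,"avg":4}},{"_id":"nnn","value":{"count":1,"qty":5,"avg":5}}]
```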
On this page
Data Setup (page 467)
Initial Map-Reduce of Current Collection (page 468)
Subsequent Incremental Map-Reduce (page 469)
Map-reduce operations can handle complex aggregation tasks. To perform map-reduce operations, MongoDB provides
the mapReduce command and, in the mongo shell, the db.collection.mapReduce() wrapper method.
If the map-reduce data set is constantly growing, you may want to perform an incremental map-reduce rather than
performing the map-reduce operation over the entire data set each time.
To perform incremental map-reduce:
1. Run a map-reduce job over the current collection and output the result to a separate collection.
2. When you have more data to process, run a subsequent map-reduce job with:
the query parameter that specifies conditions that match only the new documents.
the out parameter that specifies the reduce action to merge the new results into the existing output
collection.
Consider the following example where you schedule a map-reduce operation on a sessions collection to run at the
end of each day.
Data Setup The sessions collection contains documents that log users' sessions each day, for example:
Initial Map-Reduce of Current Collection Run the first map-reduce operation as follows:
1. Define the map function that maps the userid to an object that contains the fields userid, total_time,
count, and avg_time:
var mapFunction = function() {
    var key = this.userid;
    var value = {
        userid: this.userid,
        total_time: this.length,
        count: 1,
        avg_time: 0
    };
    emit( key, value );
};
2. Define the corresponding reduce function with two arguments key and values to calculate the total time and
the count. The key corresponds to the userid, and the values is an array whose elements correspond to
the individual objects mapped to the userid in the mapFunction.
var reduceFunction = function(key, values) {
var reducedObject = {
userid: key,
total_time: 0,
count:0,
avg_time:0
};
values.forEach( function(value) {
reducedObject.total_time += value.total_time;
reducedObject.count += value.count;
}
);
return reducedObject;
};
3. Define the finalize function with two arguments key and reducedValue. The function modifies the
reducedValue document to set the avg_time field to the average and returns the modified document.
var finalizeFunction = function (key, reducedValue) {
if (reducedValue.count > 0)
reducedValue.avg_time = reducedValue.total_time / reducedValue.count;
return reducedValue;
};
4. Perform map-reduce on the sessions collection using the mapFunction, the reduceFunction, and the
finalizeFunction functions. Output the results to a collection session_stat. If the session_stat
collection already exists, the operation will replace the contents:
db.sessions.mapReduce( mapFunction,
reduceFunction,
{
out: "session_stat",
finalize: finalizeFunction
}
)
Subsequent Incremental Map-Reduce Later, as the sessions collection grows, you can run additional map-
reduce operations. For example, add new documents to the sessions collection:
db.sessions.save( { userid: "a", ts: ISODate('2011-11-05 14:17:00'), length: 100 } );
db.sessions.save( { userid: "b", ts: ISODate('2011-11-05 14:23:00'), length: 115 } );
db.sessions.save( { userid: "c", ts: ISODate('2011-11-05 15:02:00'), length: 125 } );
db.sessions.save( { userid: "d", ts: ISODate('2011-11-05 16:45:00'), length: 55 } );
At the end of the day, perform incremental map-reduce on the sessions collection, but use the query field to select
only the new documents. Output the results to the collection session_stat, but reduce the contents with the
results of the incremental map-reduce:
db.sessions.mapReduce( mapFunction,
reduceFunction,
{
query: { ts: { $gt: ISODate('2011-11-05 00:00:00') } },
out: { reduce: "session_stat" },
finalize: finalizeFunction
}
);
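The merge performed by out: { reduce: "session_stat" } can be pictured in plain JavaScript. The sketch below assumes that, for a key already present in the output collection, the reduce function is applied to the existing and the new value and the finalize function then runs on the result. The stored and incoming values are hypothetical numbers chosen for illustration, not output from the example data.

```javascript
// Reduce and finalize functions with the same shape as those in the tutorial.
var reduceFunction = function(key, values) {
    var reducedObject = { userid: key, total_time: 0, count: 0, avg_time: 0 };
    values.forEach(function(value) {
        reducedObject.total_time += value.total_time;
        reducedObject.count += value.count;
    });
    return reducedObject;
};

var finalizeFunction = function(key, reducedValue) {
    if (reducedValue.count > 0)
        reducedValue.avg_time = reducedValue.total_time / reducedValue.count;
    return reducedValue;
};

// Hypothetical value already stored in session_stat for userid "a".
var existing = { userid: "a", total_time: 200, count: 2, avg_time: 100 };

// Hypothetical value mapped from one new session for userid "a".
var incoming = { userid: "a", total_time: 100, count: 1, avg_time: 0 };

// Merge the stored result with the new one, then recompute the average.
var merged = finalizeFunction("a", reduceFunction("a", [ existing, incoming ]));
// merged is { userid: "a", total_time: 300, count: 3, avg_time: 100 }
```

Because the reduce function ignores avg_time and only sums total_time and count, a previously finalized value can be fed back into it safely.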
The map function is a JavaScript function that associates or maps a value with a key and emits the key and value
pair during a map-reduce (page 462) operation.
To verify the key and value pairs emitted by the map function, write your own emit function.
Consider a collection orders that contains documents of the following prototype:
{
_id: ObjectId("50a8240b927d5d8b5891743c"),
cust_id: "abc123",
ord_date: new Date("Oct 04, 2012"),
status: 'A',
price: 250,
items: [ { sku: "mmm", qty: 5, price: 2.5 },
{ sku: "nnn", qty: 5, price: 2.5 } ]
}
1. Define the map function that maps the price to the cust_id for each document and emits the cust_id and
price pair:
var map = function() {
emit(this.cust_id, this.price);
};
3. Invoke the map function with a single document from the orders collection:
var myDoc = db.orders.findOne( { _id: ObjectId("50a8240b927d5d8b5891743c") } );
map.apply(myDoc);
5. Invoke the map function with multiple documents from the orders collection:
var myCursor = db.orders.find( { cust_id: "abc123" } );
while (myCursor.hasNext()) {
var doc = myCursor.next();
print ("document _id= " + tojson(doc._id));
map.apply(doc);
print();
}
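One possible shape for such an emit function is sketched below. It collects the pairs in an array rather than printing them, so the same code runs in the mongo shell or in any other JavaScript runtime. The sampleDoc object is a hypothetical stand-in for a document read from the orders collection.

```javascript
// Collect emitted pairs in an array instead of printing them.
var emitted = [];
var emit = function(key, value) {
    emitted.push({ key: key, value: value });
};

// The map function from step 1.
var map = function() {
    emit(this.cust_id, this.price);
};

// Hypothetical stand-in for a document from the orders collection.
var sampleDoc = { cust_id: "abc123", status: "A", price: 250 };

// apply() binds `this` to the document, so the map function sees its fields.
map.apply(sampleDoc);
// emitted is now [ { key: "abc123", value: 250 } ]
```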
The reduce function is a JavaScript function that reduces to a single object all the values associated with a particular
key during a map-reduce (page 462) operation. The reduce function must meet various requirements. This
tutorial helps verify that the reduce function meets the following criteria:
The reduce function must return an object whose type must be identical to the type of the value emitted by
the map function.
The order of the elements in the valuesArray should not affect the output of the reduce function.
The reduce function must be idempotent.
For a list of all the requirements for the reduce function, see mapReduce, or the mongo shell helper method
db.collection.mapReduce().
Confirm Output Type You can test that the reduce function returns a value that is the same type as the value
emitted from the map function.
1. Define a reduceFunction1 function that takes the arguments keyCustId and valuesPrices.
valuesPrices is an array of integers:
var reduceFunction1 = function(keyCustId, valuesPrices) {
return Array.sum(valuesPrices);
};
5. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects.
valuesCountObjects is an array of documents that contain two fields count and qty:
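var reduceFunction2 = function(keySKU, valuesCountObjects) {
    var reducedValue = { count: 0, qty: 0 };
    for (var idx = 0; idx < valuesCountObjects.length; idx++) {
        reducedValue.count += valuesCountObjects[idx].count;
        reducedValue.qty += valuesCountObjects[idx].qty;
    }
    return reducedValue;
};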
8. Verify the reduceFunction2 returned a document with exactly the count and the qty field:
{ "count" : 6, "qty" : 30 }
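The type check can also be automated. The following sketch compares a value of the kind the map function emits with the value a reduce function returns. The helper sameShape, the stand-in sumPrices (a plain-JavaScript equivalent of a summing reduce over integers), and the sample inputs are all hypothetical, and sumCounts assumes a reduce function that sums the count and qty fields of its inputs.

```javascript
// Returns true when two values have the same type and, for documents,
// the same field names.
var sameShape = function(a, b) {
    if (typeof a !== typeof b) return false;
    if (a === null || b === null) return a === b;
    if (typeof a !== "object") return true;
    return Object.keys(a).sort().join(",") === Object.keys(b).sort().join(",");
};

// Stand-in for a reduce function over integers.
var sumPrices = function(keyCustId, valuesPrices) {
    var total = 0;
    for (var idx = 0; idx < valuesPrices.length; idx++) {
        total += valuesPrices[idx];
    }
    return total;
};

// Assumed reduce function over documents: sums the count and qty fields.
var sumCounts = function(keySKU, valuesCountObjects) {
    var reducedValue = { count: 0, qty: 0 };
    for (var idx = 0; idx < valuesCountObjects.length; idx++) {
        reducedValue.count += valuesCountObjects[idx].count;
        reducedValue.qty += valuesCountObjects[idx].qty;
    }
    return reducedValue;
};

var docs = [ { count: 1, qty: 5 }, { count: 2, qty: 10 }, { count: 3, qty: 15 } ];

var numbersOk = sameShape(5, sumPrices("myKey", [ 5, 5, 10 ]));   // true
var docsOk = sameShape(docs[0], sumCounts("myKey", docs));        // true
```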
Ensure Insensitivity to the Order of Mapped Values The reduce function takes a key and a values array as
its arguments. You can test that the result of the reduce function does not depend on the order of the elements in the
values array.
1. Define a sample values1 array and a sample values2 array that only differ in the order of the array elements:
var values1 = [
{ count: 1, qty: 5 },
{ count: 2, qty: 10 },
{ count: 3, qty: 15 }
];
var values2 = [
{ count: 3, qty: 15 },
{ count: 1, qty: 5 },
{ count: 2, qty: 10 }
];
2. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects.
valuesCountObjects is an array of documents that contain two fields count and qty:
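var reduceFunction2 = function(keySKU, valuesCountObjects) {
    var reducedValue = { count: 0, qty: 0 };
    for (var idx = 0; idx < valuesCountObjects.length; idx++) {
        reducedValue.count += valuesCountObjects[idx].count;
        reducedValue.qty += valuesCountObjects[idx].qty;
    }
    return reducedValue;
};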
3. Invoke the reduceFunction2 first with values1 and then with values2:
reduceFunction2('myKey', values1);
reduceFunction2('myKey', values2);
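To check the two results without reading them by eye, the calls can be compared in code. This is a minimal sketch that assumes reduceFunction2 sums the count and qty fields of its inputs. The variable names result1, result2, and orderInsensitive are introduced here for illustration.

```javascript
// Assumed reduce function: sums the count and qty fields of its inputs.
var reduceFunction2 = function(keySKU, valuesCountObjects) {
    var reducedValue = { count: 0, qty: 0 };
    for (var idx = 0; idx < valuesCountObjects.length; idx++) {
        reducedValue.count += valuesCountObjects[idx].count;
        reducedValue.qty += valuesCountObjects[idx].qty;
    }
    return reducedValue;
};

// The same elements in two different orders.
var values1 = [ { count: 1, qty: 5 }, { count: 2, qty: 10 }, { count: 3, qty: 15 } ];
var values2 = [ { count: 3, qty: 15 }, { count: 1, qty: 5 }, { count: 2, qty: 10 } ];

var result1 = reduceFunction2("myKey", values1);
var result2 = reduceFunction2("myKey", values2);

// The reduce function is order-insensitive if both results agree field by field.
var orderInsensitive = result1.count === result2.count &&
                       result1.qty === result2.qty;
// orderInsensitive is true; both results are { count: 6, qty: 30 }
```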
Ensure Reduce Function Idempotence Because the map-reduce operation may call a reduce multiple times for
the same key, and won't call a reduce for single instances of a key in the working set, the reduce function must
return a value of the same type as the value emitted from the map function. You can test that the reduce function
processes reduced values without affecting the final value.
1. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects.
valuesCountObjects is an array of documents that contain two fields count and qty:
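var reduceFunction2 = function(keySKU, valuesCountObjects) {
    var reducedValue = { count: 0, qty: 0 };
    for (var idx = 0; idx < valuesCountObjects.length; idx++) {
        reducedValue.count += valuesCountObjects[idx].count;
        reducedValue.qty += valuesCountObjects[idx].qty;
    }
    return reducedValue;
};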
3. Define a sample valuesIdempotent array that contains an element that is a call to the reduceFunction2
function:
var valuesIdempotent = [
{ count: 1, qty: 5 },
{ count: 2, qty: 10 },
reduceFunction2(myKey, [ { count:3, qty: 15 } ] )
];
4. Define a sample values1 array that combines the values passed to reduceFunction2:
var values1 = [
{ count: 1, qty: 5 },
{ count: 2, qty: 10 },
{ count: 3, qty: 15 }
];
5. Invoke the reduceFunction2 first with myKey and valuesIdempotent and then with myKey and
values1:
reduceFunction2(myKey, valuesIdempotent);
reduceFunction2(myKey, values1);
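The whole idempotence check can be run as one script. The sketch below assumes that reduceFunction2 sums the count and qty fields and that myKey holds a sample key string. The value "myKey" and the names resultIdempotent, resultDirect, and idempotent are assumptions made for this illustration.

```javascript
// Assumed reduce function: sums the count and qty fields of its inputs.
var reduceFunction2 = function(keySKU, valuesCountObjects) {
    var reducedValue = { count: 0, qty: 0 };
    for (var idx = 0; idx < valuesCountObjects.length; idx++) {
        reducedValue.count += valuesCountObjects[idx].count;
        reducedValue.qty += valuesCountObjects[idx].qty;
    }
    return reducedValue;
};

// Assumed sample key.
var myKey = "myKey";

// One element is itself the output of an earlier reduce call.
var valuesIdempotent = [
    { count: 1, qty: 5 },
    { count: 2, qty: 10 },
    reduceFunction2(myKey, [ { count: 3, qty: 15 } ])
];

// The same values passed directly, without the intermediate reduce.
var values1 = [ { count: 1, qty: 5 }, { count: 2, qty: 10 }, { count: 3, qty: 15 } ];

var resultIdempotent = reduceFunction2(myKey, valuesIdempotent);
var resultDirect = reduceFunction2(myKey, values1);

var idempotent = resultIdempotent.count === resultDirect.count &&
                 resultIdempotent.qty === resultDirect.qty;
// idempotent is true; both results are { count: 6, qty: 30 }
```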
Aggregation Pipeline Quick Reference (page 473) Quick reference card for aggregation pipeline.
Aggregation Commands (page 479) The reference for the data aggregation commands, which provide the interfaces
to MongoDB's aggregation capability.
Aggregation Commands Comparison (page 480) A comparison of group, mapReduce and aggregate that
explores the strengths and limitations of each aggregation modality.
https://docs.mongodb.org/manual/reference/operator/aggregation Aggregation pipeline
operations have a collection of operators available to define and manipulate documents in pipeline stages.
Variables in Aggregation Expressions (page 482) Use of variables in aggregation pipeline expressions.
SQL to Aggregation Mapping Chart (page 482) An overview of common aggregation operations in SQL and
MongoDB using the aggregation pipeline and operators in MongoDB and common SQL statements.
Stages
In the db.collection.aggregate method, pipeline stages appear in an array. Documents pass through the
stages in sequence. All except the $out and $geoNear stages can appear multiple times in a pipeline.
Name Description
$project Reshapes each document in the stream, such as by adding new fields or removing existing fields. For
each input document, outputs one document.
$match Filters the document stream to allow only matching documents to pass unmodified into the next
pipeline stage. $match uses standard MongoDB queries. For each input document, outputs either
one document (a match) or zero documents (no match).
$redact Reshapes each document in the stream by restricting the content for each document based on
information stored in the documents themselves. Incorporates the functionality of $project and
$match. Can be used to implement field level redaction. For each input document, outputs either
one or zero documents.
$limit Passes the first n documents unmodified to the pipeline where n is the specified limit. For each input
document, outputs either one document (for the first n documents) or zero documents (after the first n
documents).
$skip Skips the first n documents where n is the specified skip number and passes the remaining documents
unmodified to the pipeline. For each input document, outputs either zero documents (for the first n
documents) or one document (if after the first n documents).
$unwind Deconstructs an array field from the input documents to output a document for each element. Each
output document replaces the array with an element value. For each input document, outputs n
documents where n is the number of array elements and can be zero for an empty array.
$group Groups input documents by a specified identifier expression and applies the accumulator
expression(s), if specified, to each group. Consumes all input documents and outputs one document
per each distinct group. The output documents only contain the identifier field and, if specified,
accumulated fields.
$sample Randomly selects the specified number of documents from its input.
$sort Reorders the document stream by a specified sort key. Only the order changes; the documents remain
unmodified. For each input document, outputs one document.
$geoNear Returns an ordered stream of documents based on the proximity to a geospatial point. Incorporates
the functionality of $match, $sort, and $limit for geospatial data. The output documents
include an additional distance field and can include a location identifier field.
$lookup Performs a left outer join to another collection in the same database to filter in documents from the
joined collection for processing.
$out Writes the resulting documents of the aggregation pipeline to a collection. To use the $out stage, it
must be the last stage in the pipeline.
$indexStats Returns statistics regarding the use of each index for the collection.
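As an illustration of how stages are chained, the sketch below builds a pipeline array from several of the stages listed above. It assumes an orders collection whose documents have status and items fields, as in the orders prototype used elsewhere in this manual. The guard around the db call is only there so the script can also be loaded outside the mongo shell.

```javascript
// Stages are listed in the order documents should pass through them.
var pipeline = [
    // Keep only documents with status "A".
    { $match: { status: "A" } },
    // Output one document per element of the items array.
    { $unwind: "$items" },
    // Total the quantity per SKU.
    { $group: { _id: "$items.sku", qty: { $sum: "$items.qty" } } },
    // Largest totals first.
    { $sort: { qty: -1 } },
    // Pass on only the first five documents.
    { $limit: 5 }
];

// Run the pipeline only where a database handle exists (for example, the mongo shell).
if (typeof db !== "undefined") {
    db.orders.aggregate(pipeline);
}
```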
Expressions
Expressions can include field paths and system variables (page 474), literals (page 475), expression objects (page 475),
and expression operators (page 475). Expressions can be nested.
Field Path and System Variables Aggregation expressions use field paths to access fields in the input documents.
To specify a field path, use a string that prefixes the field name, or the dotted field name if the field is in an
embedded document, with a dollar sign $. For example, "$user" to specify the field path for the user field or "$user.name" to
specify the field path to the "user.name" field.
"$<field>" is equivalent to "$$CURRENT.<field>" where CURRENT (page 482) is a system variable that
defaults to the root of the current object in most stages, unless stated otherwise in specific stages. CURRENT
(page 482) can be rebound.
Along with the CURRENT (page 482) system variable, other system variables (page 482) are also available for use in
expressions. To use user-defined variables, use $let and $map expressions. To access variables in expressions, use
a string with the variable name prefixed with double dollar signs ($$).
Literals Literals can be of any type. However, MongoDB parses string literals that start with a dollar sign $ as a path
to a field and numeric/boolean literals in expression objects (page 475) as projection flags. To avoid parsing literals,
use the $literal expression.
If the expressions are numeric or boolean literals, MongoDB treats the literals as projection flags (e.g. 1 or true to
include the field), valid only in the $project stage. To avoid treating numeric or boolean literals as projection flags,
use the $literal expression to wrap the numeric or boolean literals.
Operator Expressions Operator expressions are similar to functions that take arguments. In general, these expres-
sions take an array of arguments and have the following form:
{ <operator>: [ <argument1>, <argument2> ... ] }
If operator accepts a single argument, you can omit the outer array designating the argument list:
{ <operator>: <argument> }
To avoid parsing ambiguity if the argument is a literal array, you must wrap the literal array in a $literal expression
or keep the outer array that designates the argument list.
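The forms described above can be seen side by side in a single $project stage. This is a sketch only: the field names name and price and the output field names are hypothetical.

```javascript
// Hypothetical $project stage showing the argument forms side by side.
var operatorForms = {
    $project: {
        // Single argument: the outer array may be omitted...
        upperName: { $toUpper: "$name" },
        // ...or kept; both forms are equivalent.
        sameName: { $toUpper: [ "$name" ] },
        // Several arguments require the argument array.
        total: { $add: [ "$price", 10 ] },
        // $literal keeps a string that starts with $ from being read as a field path.
        label: { $literal: "$1 off" },
        // A literal array argument is wrapped in $literal to avoid ambiguity
        // with the argument list.
        fixedSize: { $size: { $literal: [ 1, 2, 3 ] } }
    }
};
```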
Boolean Expressions Boolean expressions evaluate their argument expressions as booleans and return a boolean as
the result.
In addition to the false boolean value, Boolean expressions evaluate the following as false: null, 0, and
undefined values. Boolean expressions evaluate all other values as true, including non-zero numeric values
and arrays.
Name Description
$and Returns true only when all its expressions evaluate to true. Accepts any number of argument
expressions.
$or Returns true when any of its expressions evaluates to true. Accepts any number of argument
expressions.
$not Returns the boolean value that is the opposite of its argument expression. Accepts a single argument
expression.
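The coercion rule differs from JavaScript's own notion of truthiness, so it can help to spell it out. The helpers below are a plain-JavaScript emulation of the rule as stated in this section, applied to already-evaluated values. They are hypothetical names for illustration and are not part of any MongoDB API.

```javascript
// Emulates the rule above: only false, null, undefined and 0 count as false.
var toAggBoolean = function(value) {
    return !(value === false || value === null ||
             value === undefined || value === 0);
};

// Emulations of $and, $or and $not over already-evaluated argument values.
var aggAnd = function(values) { return values.every(toAggBoolean); };
var aggOr = function(values) { return values.some(toAggBoolean); };
var aggNot = function(value) { return !toAggBoolean(value); };

var r1 = toAggBoolean(0);            // false
var r2 = toAggBoolean([]);           // true: arrays count as true
var r3 = toAggBoolean("");           // true under this rule, unlike plain JavaScript
var r4 = aggAnd([ 1, "a", [ 0 ] ]);  // true
var r5 = aggOr([ null, 0 ]);         // false
var r6 = aggNot(null);               // true
```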
Set Expressions Set expressions perform set operations on arrays, treating arrays as sets. Set expressions ignore
the duplicate entries in each input array and the order of the elements.
If the set operation returns a set, the operation filters out duplicates in the result to output an array that contains only
unique entries. The order of the elements in the output array is unspecified.
If a set contains a nested array element, the set expression does not descend into the nested array but evaluates the
array at top-level.
Name Description
$setEquals Returns true if the input sets have the same distinct elements. Accepts two or more argument
expressions.
$setIntersection Returns a set with elements that appear in all of the input sets. Accepts any number of argument
expressions.
$setUnion Returns a set with elements that appear in any of the input sets. Accepts any number of argument
expressions.
$setDifference Returns a set with elements that appear in the first set but not in the second set; i.e. performs a
relative complement9 of the second set relative to the first. Accepts exactly two argument
expressions.
$setIsSubset Returns true if all elements of the first set appear in the second set, including when the first set
equals the second set; i.e. not a strict subset10. Accepts exactly two argument expressions.
$anyElementTrue Returns true if any elements of a set evaluate to true; otherwise, returns false. Accepts a
single argument expression.
$allElementsTrue Returns true if no element of a set evaluates to false, otherwise, returns false. Accepts a
single argument expression.
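The set semantics can be illustrated with a small plain-JavaScript emulation. It is a sketch for arrays of primitive values only, restricted to two inputs, and the function names merely mirror the operators. The emulation returns elements in order of first appearance, whereas MongoDB leaves the output order unspecified.

```javascript
// Drop duplicate entries, keeping the first appearance of each value.
var unique = function(arr) {
    return arr.filter(function(v, i) { return arr.indexOf(v) === i; });
};

var setUnion = function(a, b) { return unique(a.concat(b)); };
var setIntersection = function(a, b) {
    return unique(a).filter(function(v) { return b.indexOf(v) !== -1; });
};
var setDifference = function(a, b) {
    return unique(a).filter(function(v) { return b.indexOf(v) === -1; });
};
var setIsSubset = function(a, b) {
    return a.every(function(v) { return b.indexOf(v) !== -1; });
};
var setEquals = function(a, b) {
    return setIsSubset(a, b) && setIsSubset(b, a);
};

var A = [ "a", "b", "a" ];   // duplicates are ignored
var B = [ "b", "c" ];

var u = setUnion(A, B);               // [ "a", "b", "c" ]
var i = setIntersection(A, B);        // [ "b" ]
var d = setDifference(A, B);          // [ "a" ]
var s = setIsSubset([ "b" ], B);      // true
var e = setEquals(A, [ "b", "a" ]);   // true: duplicates and order do not matter
```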
Comparison Expressions Comparison expressions return a boolean except for $cmp which returns a number.
The comparison expressions take two argument expressions and compare both value and type, using the specified
BSON comparison order (page 195) for values of different types.
Name Description
$cmp Returns: 0 if the two values are equivalent, 1 if the first value is greater than the second, and -1 if the
first value is less than the second.
$eq Returns true if the values are equivalent.
$gt Returns true if the first value is greater than the second.
$gte Returns true if the first value is greater than or equal to the second.
$lt Returns true if the first value is less than the second.
$lte Returns true if the first value is less than or equal to the second.
$ne Returns true if the values are not equivalent.
Arithmetic Expressions Arithmetic expressions perform mathematical operations on numbers. Some arithmetic
expressions can also support date arithmetic.
9 http://en.wikipedia.org/wiki/Complement_(set_theory)
10 http://en.wikipedia.org/wiki/Subset
Name Description
$abs Returns the absolute value of a number.
$add Adds numbers to return the sum, or adds numbers and a date to return a new date. If adding numbers
and a date, treats the numbers as milliseconds. Accepts any number of argument expressions, but at
most, one expression can resolve to a date.
$ceil Returns the smallest integer greater than or equal to the specified number.
$divide Returns the result of dividing the first number by the second. Accepts two argument expressions.
$exp Raises e to the specified exponent.
$floor Returns the largest integer less than or equal to the specified number.
$ln Calculates the natural log of a number.
$log Calculates the log of a number in the specified base.
$log10 Calculates the log base 10 of a number.
$mod Returns the remainder of the first number divided by the second. Accepts two argument expressions.
$multiply Multiplies numbers to return the product. Accepts any number of argument expressions.
$pow Raises a number to the specified exponent.
$sqrt Calculates the square root.
$subtract Returns the result of subtracting the second value from the first. If the two values are numbers, return
the difference. If the two values are dates, return the difference in milliseconds. If the two values are a
date and a number in milliseconds, return the resulting date. Accepts two argument expressions. If the
two values are a date and a number, specify the date argument first as it is not meaningful to subtract a
date from a number.
$trunc Truncates a number to its integer.
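The date rules for $add and $subtract mirror ordinary millisecond arithmetic on JavaScript Date objects. The sketch below works through the three cases with hypothetical timestamps. The expression objects at the end show how the same operations might be written in a pipeline, assuming documents with hypothetical start and end date fields.

```javascript
var start = new Date("2014-01-01T00:00:00Z");
var end = new Date("2014-01-01T00:00:05Z");

// Date minus date: the difference in milliseconds.
var elapsedMs = end.getTime() - start.getTime();          // 5000

// Date plus number: the number is treated as milliseconds; the result is a date.
var later = new Date(start.getTime() + 3 * 60 * 1000);    // 00:03:00 on the same day

// Date minus number (date first): an earlier date.
var earlier = new Date(end.getTime() - 5000);             // equal to start

// Corresponding aggregation expressions for hypothetical start and end fields.
var expressions = {
    elapsedMs: { $subtract: [ "$end", "$start" ] },
    later: { $add: [ "$start", 3 * 60 * 1000 ] },
    earlier: { $subtract: [ "$end", 5000 ] }
};
```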
String Expressions String expressions, with the exception of $concat, only have a well-defined behavior for
strings of ASCII characters.
$concat behavior is well-defined regardless of the characters used.
Name Description
$concat Concatenates any number of strings.
$substr Returns a substring of a string, starting at a specified index position up to a specified length. Accepts
three expressions as arguments: the first argument must resolve to a string, and the second and third
arguments must resolve to integers.
$toLower Converts a string to lowercase. Accepts a single argument expression.
$toUpper Converts a string to uppercase. Accepts a single argument expression.
$strcasecmp Performs case-insensitive string comparison and returns: 0 if two strings are equivalent, 1 if the first
string is greater than the second, and -1 if the first string is less than the second.
Text Search Expressions
Name Description
$meta Access text search metadata.
Array Expressions
Name Description
$arrayElemAt Returns the element at the specified array index.
$concatArrays Concatenates arrays to return the concatenated array.
$filter Selects a subset of the array to return an array with only the elements that match the filter
condition.
$isArray Determines if the operand is an array. Returns a boolean.
$size Returns the number of elements in the array. Accepts a single expression as argument.
$slice Returns a subset of an array.
Variable Expressions
Name Description
$map Applies a subexpression to each element of an array and returns the array of resulting values in order.
Accepts named parameters.
$let Defines variables for use within the scope of a subexpression and returns the result of the subexpression.
Accepts named parameters.
Literal Expressions
Name Description
$literal Return a value without parsing. Use for values that the aggregation pipeline may interpret as an
expression. For example, apply a $literal expression to a string that starts with a $ to avoid parsing it as
a field path.
Date Expressions
Name Description
$dayOfYear Returns the day of the year for a date as a number between 1 and 366 (leap year).
$dayOfMonth Returns the day of the month for a date as a number between 1 and 31.
$dayOfWeek Returns the day of the week for a date as a number between 1 (Sunday) and 7 (Saturday).
$year Returns the year for a date as a number (e.g. 2014).
$month Returns the month for a date as a number between 1 (January) and 12 (December).
$week Returns the week number for a date as a number between 0 (the partial week that precedes the
first Sunday of the year) and 53 (leap year).
$hour Returns the hour for a date as a number between 0 and 23.
$minute Returns the minute for a date as a number between 0 and 59.
$second Returns the seconds for a date as a number between 0 and 60 (leap seconds).
$millisecond Returns the milliseconds of a date as a number between 0 and 999.
$dateToString Returns the date as a formatted string.
Conditional Expressions
Name Description
$cond A ternary operator that evaluates one expression, and depending on the result, returns the value of one of
the other two expressions. Accepts either three expressions in an ordered list or three named parameters.
$ifNull Returns either the non-null result of the first expression or the result of the second expression if the first
expression results in a null result. Null result encompasses instances of undefined values or missing
fields. Accepts two expressions as arguments. The result of the second expression can be null.
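A short sketch of both conditional operators in a $project stage follows. The field names qty and description, the threshold 250, and the result values are hypothetical and chosen only to show the two accepted forms of $cond.

```javascript
// Hypothetical $project stage using both conditional operators.
var conditionalProject = {
    $project: {
        // Ordered-list form: [ <condition>, <value if true>, <value if false> ].
        discount: { $cond: [ { $gte: [ "$qty", 250 ] }, 30, 20 ] },
        // Named-parameter form of the same expression.
        discountNamed: {
            $cond: { if: { $gte: [ "$qty", 250 ] }, then: 30, else: 20 }
        },
        // Falls back to a default when description is null or missing.
        description: { $ifNull: [ "$description", "Unspecified" ] }
    }
};
```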
Accumulators
Changed in version 3.2: Some accumulators are now available in the $project stage. In previous versions of
MongoDB, accumulators are available only for the $group stage.
Accumulators, when used in the $group stage, maintain their state (e.g. totals, maximums, minimums, and related
data) as documents progress through the pipeline.
When used in the $group stage, accumulators take as input a single expression, evaluating the expression once for
each input document, and maintain their state for the group of documents that share the same group key.
When used in the $project stage, the accumulators do not maintain their state. When used in the $project stage,
accumulators take as input either a single argument or multiple arguments.
Name Description
$sum Returns a sum of numerical values. Ignores non-numeric values.
Changed in version 3.2: Available in both $group and $project stages.
$avg Returns an average of numerical values. Ignores non-numeric values.
Changed in version 3.2: Available in both $group and $project stages.
$first Returns a value from the first document for each group. Order is only defined if the documents are
in a defined order.
Available in $group stage only.
$last Returns a value from the last document for each group. Order is only defined if the documents are in
a defined order.
Available in $group stage only.
$max Returns the highest expression value for each group.
Changed in version 3.2: Available in both $group and $project stages.
$min Returns the lowest expression value for each group.
Changed in version 3.2: Available in both $group and $project stages.
$push Returns an array of expression values for each group.
Available in $group stage only.
$addToSet Returns an array of unique expression values for each group. Order of the array elements is
undefined.
Available in $group stage only.
$stdDevPop Returns the population standard deviation of the input values.
Changed in version 3.2: Available in both $group and $project stages.
$stdDevSamp Returns the sample standard deviation of the input values.
Changed in version 3.2: Available in both $group and $project stages.
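The difference between the two stages can be sketched as follows, assuming documents shaped like the orders prototype used elsewhere in this manual (cust_id, status, price, and an items array whose elements have a qty field). The output field names are hypothetical.

```javascript
// $group: each accumulator takes one expression and keeps state per group key.
var groupStage = {
    $group: {
        _id: "$cust_id",
        total: { $sum: "$price" },
        average: { $avg: "$price" },
        statuses: { $addToSet: "$status" }
    }
};

// $project (3.2 and later): no state is kept between documents. Here the
// accumulators operate on the array of item quantities inside each document.
var projectStage = {
    $project: {
        totalQty: { $sum: "$items.qty" },
        maxQty: { $max: "$items.qty" }
    }
};
```

Accumulators such as $push, $addToSet, $first, and $last are listed above as available in the $group stage only, so they appear only in groupStage.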
Aggregation Commands
Aggregation Commands
Name Description
aggregate Performs aggregation tasks (page 447) such as group using the aggregation framework.
count Counts the number of documents in a collection.
distinct Displays the distinct values found for a specified key in a collection.
group Groups documents in a collection by the specified key and performs simple aggregation.
mapReduce Performs map-reduce (page 462) aggregation for large data sets.
Aggregation Methods
Name Description
db.collection.aggregate() Provides access to the aggregation pipeline (page 447).
db.collection.group() Groups documents in a collection by the specified key and performs simple
aggregation.
db.collection.mapReduce() Performs map-reduce (page 462) aggregation for large data sets.
The following table provides a brief overview of the features of the MongoDB aggregation commands.
Aggregation expressions (page 474) can use both user-defined and system variables.
Variables can hold any BSON type data (page 194). To access the value of the variable, use a string with the variable
name prefixed with double dollar signs ($$).
If the variable references an object, to access a specific field in the object, use the dot notation; i.e.
"$$<variable>.<field>".
User Variables
User variable names can contain the ASCII characters [_a-zA-Z0-9] and any non-ASCII character.
User variable names must begin with a lowercase ASCII letter [a-z] or a non-ASCII character.
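A minimal sketch of user variables in a $let expression follows. The field names price, tax, and applyDiscount and the variable names total and discounted are hypothetical. Both variable names begin with a lowercase letter, in line with the naming rules above.

```javascript
// Hypothetical $project stage that defines and reads two user variables.
var letProject = {
    $project: {
        finalTotal: {
            $let: {
                // Define the variables.
                vars: {
                    total: { $add: [ "$price", "$tax" ] },
                    discounted: {
                        $cond: { if: "$applyDiscount", then: 0.9, else: 1 }
                    }
                },
                // Read the variables with the double dollar sign prefix.
                in: { $multiply: [ "$$total", "$$discounted" ] }
            }
        }
    }
};
```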
System Variables
See also:
$let, $redact, $map
The aggregation pipeline (page 447) allows MongoDB to provide native aggregation capabilities that correspond to
many common data aggregation operations in SQL.
The following table provides an overview of common SQL aggregation terms, functions, and concepts and the
corresponding MongoDB aggregation operators:
SQL Terms, Functions, and Concepts MongoDB Aggregation Operators
WHERE $match
GROUP BY $group
HAVING $match
SELECT $project
ORDER BY $sort
LIMIT $limit
SUM() $sum
COUNT() $sum
join $lookup
New in version 3.2.
Examples
The following table presents a quick reference of SQL aggregation statements and the corresponding MongoDB
statements. The examples in the table assume the following conditions:
The SQL examples assume two tables, orders and order_lineitem that join by the
order_lineitem.order_id and the orders.id columns.
The MongoDB examples assume one collection orders that contains documents of the following prototype:
{
cust_id: "abc123",
ord_date: ISODate("2012-11-02T17:04:11.102Z"),
status: 'A',
price: 50,
items: [ { sku: "xxx", qty: 25, price: 1 },
{ sku: "yyy", qty: 25, price: 1 } ]
}
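As one worked instance of the mapping, the sketch below pairs a SQL statement with a pipeline over the orders collection described above. The threshold 250 and the output field name total are illustrative choices, and the guard around the db call only lets the script load outside the mongo shell.

```javascript
// SQL, for comparison:
//   SELECT cust_id, SUM(price) AS total
//   FROM orders
//   WHERE status = 'A'
//   GROUP BY cust_id
//   HAVING total > 250
//   ORDER BY total DESC
var sqlLikePipeline = [
    { $match: { status: "A" } },                                  // WHERE
    { $group: { _id: "$cust_id", total: { $sum: "$price" } } },   // GROUP BY, SUM()
    { $match: { total: { $gt: 250 } } },                          // HAVING
    { $sort: { total: -1 } }                                      // ORDER BY
];

// Run the pipeline only where a database handle exists (for example, the mongo shell).
if (typeof db !== "undefined") {
    db.orders.aggregate(sqlLikePipeline);
}
```

The same $match operator serves for both WHERE and HAVING; only its position relative to $group differs.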
Additional Resources
MongoDB Analytics: Learn Aggregation by Example: Exploratory Analytics and Visualization Using Flight
Data14
MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregation Framework and Hadoop15
The Aggregation Framework16
Webinar: Exploring the Aggregation Framework17
Quick Reference Cards18
11 http://www.mongodb.com/mongodb-and-mysql-compared?jmp=docs
12 https://www.mongodb.com/lp/misc/quick-reference-cards?jmp=docs
13 https://www.mongodb.com/products/consulting?jmp=docs#database_modernization
14 http://www.mongodb.com/presentations/mongodb-analytics-learn-aggregation-example-exploratory-analytics-and-visualization?jmp=docs
15 http://www.mongodb.com/presentations/mongodb-time-series-data-part-2-analyzing-time-series-data-using-aggregation-framework?jmp=docs
16 https://www.mongodb.com/presentations/aggregation-framework-0?jmp=docs
17 https://www.mongodb.com/webinar/exploring-the-aggregation-framework?jmp=docs
18 https://www.mongodb.com/lp/misc/quick-reference-cards?jmp=docs
Indexes
Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection
scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an
appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.
Indexes are special data structures 1 that store a small portion of the collection's data set in an easy to traverse form.
The index stores the value of a specific field or set of fields, ordered by the value of the field. The ordering of the index
entries supports efficient equality matches and range-based query operations. In addition, MongoDB can return sorted
results by using the ordering in the index.
The following diagram illustrates a query that selects and orders the matching documents using an index:
Fundamentally, indexes in MongoDB are similar to indexes in other database systems. MongoDB defines indexes at
the collection level and supports indexes on any field or sub-field of the documents in a MongoDB collection.
MongoDB provides a number of different index types to support specific types of data and queries.
Default _id
All MongoDB collections have an index on the _id field that exists by default. If applications do not specify a value
for _id the driver or the mongod will create an _id field with an ObjectId value.
The _id index is unique and prevents clients from inserting two documents with the same value for the _id field.
1 MongoDB indexes use a B-tree data structure.
Single Field
In addition to the MongoDB-defined _id index, MongoDB supports the creation of user-defined
ascending/descending indexes on a single field of a document (page 493).
For a single-field index and sort operations, the sort order (i.e. ascending or descending) of the index key does not
matter because MongoDB can traverse the index in either direction.
See Single Field Indexes (page 493) and Sort with a Single Field Index (page 575) for more information on single-field
indexes.
Compound Index
MongoDB also supports user-defined indexes on multiple fields, i.e. compound indexes (page 495).
The order of fields listed in a compound index has significance. For instance, if a compound index consists of {
userid: 1, score: -1 }, the index sorts first by userid and then, within each userid value, sorts by
score.
For compound indexes and sort operations, the sort order (i.e. ascending or descending) of the index keys can determine
whether the index can support a sort operation. See Sort Order (page 496) for more information on the impact
of index order on results in compound indexes.
See Compound Indexes (page 495) and Sort on Multiple Fields (page 576) for more information on compound indexes.
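The ordering implied by the { userid: 1, score: -1 } key pattern can be emulated with an ordinary sort. The sketch below uses hypothetical documents with string userid and numeric score values. The collection name events is also hypothetical, and the index is created only when a db handle is available.

```javascript
// Key pattern from the text: userid ascending, then score descending.
var keyPattern = { userid: 1, score: -1 };

// Hypothetical documents.
var docs = [
    { userid: "b", score: 10 },
    { userid: "a", score: 5 },
    { userid: "a", score: 30 }
];

// Emulate the order of the index entries: first by userid (ascending),
// then by score (descending) within each userid value.
var ordered = docs.slice().sort(function(x, y) {
    if (x.userid !== y.userid) return x.userid < y.userid ? -1 : 1;
    return y.score - x.score;
});
// ordered is: { a, 30 }, { a, 5 }, { b, 10 }

// Create the index only where a database handle exists (for example, the mongo shell).
if (typeof db !== "undefined") {
    db.events.createIndex(keyPattern);
}
```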
Multikey Index
MongoDB uses multikey indexes (page 497) to index the content stored in arrays. If you index a field that holds an
array value, MongoDB creates separate index entries for every element of the array. These multikey indexes (page 497)
allow queries to select documents that contain arrays by matching on element or elements of the arrays. MongoDB
automatically determines whether to create a multikey index if the indexed field contains an array value; you do not
need to explicitly specify the multikey type.
See Multikey Indexes (page 497) and Multikey Index Bounds (page 526) for more information on multikey indexes.
Geospatial Index
To support efficient queries of geospatial coordinate data, MongoDB provides two special indexes: 2d indexes
(page 505) that use planar geometry when returning results and 2sphere indexes (page 503) that use spherical
geometry to return results.
See 2d Index Internals (page 506) for a high level introduction to geospatial indexes.
Text Indexes
MongoDB provides a text index type that supports searching for string content in a collection. These text indexes
do not store language-specific stop words (e.g. the, a, or) and stem the words in a collection to only store root
words.
See Text Indexes (page 508) for more information on text indexes and search.
Hashed Indexes
To support hash based sharding (page 740), MongoDB provides a hashed index (page 512) type, which indexes the
hash of the value of a field. These indexes have a more random distribution of values along their range, but only
support equality matches and cannot support range-based queries.
Unique Indexes
The unique (page 514) property for an index causes MongoDB to reject duplicate values for the indexed field. Other
than the unique constraint, unique indexes are functionally interchangeable with other MongoDB indexes.
Partial Indexes
Sparse Indexes
The sparse (page 519) property of an index ensures that the index only contains entries for documents that have the
indexed field. The index skips documents that do not have the indexed field.
You can combine the sparse index option with the unique index option to reject documents that have duplicate values
for a field but ignore documents that do not have the indexed key.
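Combining the two options might look like the sketch below. The users collection and the email field are hypothetical, and the index is created only when a db handle is available.

```javascript
// Hypothetical key pattern and options for a unique, sparse index.
var sparseUniqueKeys = { email: 1 };
var sparseUniqueOptions = { unique: true, sparse: true };

// Create the index only where a database handle exists (for example, the mongo shell).
if (typeof db !== "undefined") {
    db.users.createIndex(sparseUniqueKeys, sparseUniqueOptions);
}
```

With these options, two documents with the same email value would be rejected as duplicates, while documents with no email field would not appear in the index.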
TTL Indexes
TTL indexes (page 512) are special indexes that MongoDB can use to automatically remove documents from a collection
after a certain amount of time. This is ideal for certain types of information like machine generated event data,
logs, and session information that only need to persist in a database for a finite amount of time.
See: Expire Data from Collections by Setting TTL (page 231) for implementation instructions.
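A TTL index is declared with the expireAfterSeconds option on a field that holds a date. In the sketch below, the log_events collection, the createdAt field, and the one-hour lifetime are hypothetical, and the index is created only when a db handle is available.

```javascript
// Hypothetical key pattern: the indexed field holds a date value.
var ttlKeys = { createdAt: 1 };

// Documents become eligible for removal 3600 seconds after their createdAt value.
var ttlOptions = { expireAfterSeconds: 3600 };

// Create the index only where a database handle exists (for example, the mongo shell).
if (typeof db !== "undefined") {
    db.log_events.createIndex(ttlKeys, ttlOptions);
}
```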
Indexes can improve the efficiency of read operations. The Analyze Query Performance (page 121) tutorial provides
an example of the execution statistics of a query with and without an index.
For information on how MongoDB chooses an index to use, see query optimizer (page 72).
When the query criteria and the projection of a query include only the indexed fields, MongoDB will return results
directly from the index without scanning any documents or bringing documents into memory. These covered queries
can be very efficient.
For more information on covered queries, see Covered Query (page 70).
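As a rough illustration of the covered-query condition, the following plain JavaScript sketch checks whether every field used by a query and its projection belongs to an index. The function name isCoveredBy and its arguments are hypothetical, and the check is simplified: it assumes the caller lists every field the projection returns (including _id, if it is not excluded) and it ignores other restrictions such as multikey indexes.

```javascript
// Hypothetical helper: a query is a candidate for being covered when all
// queried and projected fields are part of the index specification.
function isCoveredBy(indexSpec, queryFields, projectedFields) {
  const indexed = new Set(Object.keys(indexSpec));
  return [...queryFields, ...projectedFields].every((field) => indexed.has(field));
}

const itemStockIndex = { item: 1, stock: 1 };

// Query on item, returning only item and stock: every field is indexed.
console.log(isCoveredBy(itemStockIndex, ["item"], ["item", "stock"])); // true

// Returning the non-indexed field "type" forces MongoDB to read documents.
console.log(isCoveredBy(itemStockIndex, ["item"], ["item", "type"])); // false
```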
8.1.6 Restrictions
Certain restrictions apply to indexes, such as the length of the index keys or the number of indexes per collection. See
Index Limitations for details.
On this page
Additional Resources (page 531)
These documents describe and provide examples of the types, configuration options, and behavior of indexes in MongoDB.
For an overview of indexing, see Index Introduction (page 487). For operational instructions, see Indexing
Tutorials (page 531). The Indexing Reference (page 579) documents the commands and operations specific to index
construction, maintenance, and querying in MongoDB, including index types and creation options.
Index Types (page 492) MongoDB provides different types of indexes for different purposes and different types of
content.
Single Field Indexes (page 493) A single field index only includes data from a single field of the documents in
a collection. MongoDB supports single field indexes on fields at the top level of a document and on fields
in sub-documents.
Compound Indexes (page 495) A compound index includes more than one field of the documents in a collection.
Multikey Indexes (page 497) A multikey index is an index on an array field, adding an index key for each value
in the array.
Geospatial Indexes and Queries (page 500) Geospatial indexes support location-based searches on data that is
stored as either GeoJSON objects or legacy coordinate pairs.
Text Indexes (page 508) Text indexes support search of string content in documents.
Hashed Index (page 512) Hashed indexes maintain entries with hashes of the values of the indexed field and
are primarily used with sharded clusters to support hashed shard keys.
Index Properties (page 512) The properties you can specify when building indexes.
TTL Indexes (page 512) The TTL index is used for TTL collections, which expire data after a period of time.
Unique Indexes (page 514) A unique index causes MongoDB to reject all documents that contain a duplicate
value for the indexed field.
Partial Indexes (page 515) A partial index indexes only documents that meet specified filter criteria.
Sparse Indexes (page 519) A sparse index does not index documents that do not have the indexed field.
Index Creation (page 521) The options available when creating indexes.
Index Intersection (page 524) The use of index intersection to fulfill a query.
Multikey Index Bounds (page 526) The computation of bounds on a multikey index scan.
MongoDB provides a number of different index types. You can create indexes on any field or embedded field within
a document or embedded document.
In general, you should create indexes that support your common and user-facing queries. Having these indexes will
ensure that MongoDB scans the smallest possible number of documents.
In the mongo shell, you can create an index by calling the createIndex() method. For more detailed instructions
about building indexes, see the Indexing Tutorials (page 531) page.
Single Field Indexes (page 493) A single field index only includes data from a single field of the documents in a
collection. MongoDB supports single field indexes on fields at the top level of a document and on fields in
sub-documents.
Compound Indexes (page 495) A compound index includes more than one field of the documents in a collection.
Multikey Indexes (page 497) A multikey index is an index on an array field, adding an index key for each value in
the array.
Geospatial Indexes and Queries (page 500) Geospatial indexes support location-based searches on data that is stored
as either GeoJSON objects or legacy coordinate pairs.
Text Indexes (page 508) Text indexes support search of string content in documents.
Hashed Index (page 512) Hashed indexes maintain entries with hashes of the values of the indexed field and are
primarily used with sharded clusters to support hashed shard keys.
Single Field Indexes
MongoDB provides complete support for indexes on any field in a collection of documents. By default, all collections
have an index on the _id field (page 494), and applications and users may add additional indexes to support important
queries and operations.
MongoDB supports indexes that contain either a single field or multiple fields, depending on the operations that the
index needs to support. This document describes ascending/descending indexes that contain a single field.
See also:
Compound Indexes (page 495) for information about indexes that include multiple fields, and Index Introduction
(page 487) for a higher level introduction to indexing in MongoDB.
Example
{ "_id" : ObjectId(...),
"name" : "Alice",
"age" : 27
}
Cases
_id Field Index MongoDB creates the _id index, which is an ascending unique index (page 514) on the _id field,
for all collections when the collection is created. You cannot remove the index on the _id field.
Think of the _id field as the primary key for a collection. Every document must have a unique _id field. You may
store any unique value in the _id field. The default value of _id is an ObjectId which is generated when the client
inserts the document. An ObjectId is a 12-byte unique identifier suitable for use as the value of an _id field.
Note: In sharded clusters, if you do not use the _id field as the shard key, then your application must ensure the
uniqueness of the values in the _id field to prevent errors. This is most often done by using a standard auto-generated
ObjectId.
Before version 2.2, capped collections did not have an _id field. In version 2.2 and newer, capped collections do
have an _id field, except those in the local database. See Capped Collections Recommendations and Restrictions
(page 229) for more information.
Indexes on Embedded Fields You can create indexes on fields within embedded documents, just as you can index
top-level fields in documents. Indexes on embedded fields differ from indexes on embedded documents (page 494),
which include the full content of the embedded document in the index, up to the maximum index size. Instead,
indexes on embedded fields allow you to use dot notation to introspect into embedded documents.
Consider a collection named people that holds documents that resemble the following example document:
{"_id": ObjectId(...),
"name": "John Doe",
"address": {
"street": "Main",
"zipcode": "53511",
"state": "WI"
}
}
You can create an index on the address.zipcode field, using the following specification:
db.people.createIndex( { "address.zipcode": 1 } )
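To make the dot notation concrete, the following plain JavaScript sketch resolves a dotted path such as "address.zipcode" against a document like the one above. The helper name resolvePath is hypothetical; it only illustrates which value an index key on an embedded field is drawn from and is not a description of how MongoDB extracts keys internally.

```javascript
// Hypothetical helper: walk a dotted path through nested objects.
function resolvePath(doc, path) {
  return path.split(".").reduce(
    (value, key) =>
      value !== null && typeof value === "object" ? value[key] : undefined,
    doc
  );
}

const person = {
  name: "John Doe",
  address: { street: "Main", zipcode: "53511", state: "WI" }
};

// The value that an index on "address.zipcode" would be keyed on.
console.log(resolvePath(person, "address.zipcode")); // prints 53511
```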
Indexes on Embedded Documents You can also create indexes on embedded documents.
For example, the factories collection contains documents that contain a metro field, such as:
{
_id: ObjectId(...),
metro: {
city: "New York",
state: "NY"
},
name: "Giant Factory"
}
The metro field is an embedded document, containing the embedded fields city and state. The following
command creates an index on the metro field as a whole:
db.factories.createIndex( { metro: 1 } )
The following query can use the index on the metro field:
db.factories.find( { metro: { city: "New York", state: "NY" } } )
This query returns the above document. When performing equality matches on embedded documents, field order
matters and the embedded documents must match exactly. For example, the following query does not match the above
document:
db.factories.find( { metro: { state: "NY", city: "New York" } } )
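The effect of field order can be sketched outside the database. The plain JavaScript below uses JSON.stringify as a stand-in for an exact, order-sensitive comparison of embedded documents; this is an illustrative assumption, not MongoDB's actual comparison routine, and the function name sameEmbeddedDocument is hypothetical.

```javascript
// Stand-in for an exact match on an embedded document. JSON.stringify keeps
// string keys in insertion order, so two objects with the same fields in a
// different order serialize differently.
function sameEmbeddedDocument(a, b) {
  return JSON.stringify(a) === JSON.stringify(b);
}

const storedMetro = { city: "New York", state: "NY" };

console.log(sameEmbeddedDocument(storedMetro, { city: "New York", state: "NY" })); // true
console.log(sameEmbeddedDocument(storedMetro, { state: "NY", city: "New York" })); // false
```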
Compound Indexes
On this page
Sort Order (page 496)
Prefixes (page 496)
Index Intersection (page 497)
MongoDB supports compound indexes, where a single index structure holds references to multiple fields within a
collection's documents. MongoDB imposes a limit of 31 fields for any compound index.
Example
Consider a collection named products that holds documents that resemble the following document:
{
"_id": ObjectId(...),
"item": "Banana",
"category": ["food", "produce", "grocery"],
"location": "4th Street Store",
"stock": 4,
"type": "cases",
"arrival": Date(...)
}
If applications query on the item field alone as well as on both the item field and the stock field, you can specify
a single compound index to support both of these queries:
db.products.createIndex( { "item": 1, "stock": 1 } )
Important: You may not create compound indexes that have hashed index fields. You will receive an error if you
attempt to create a compound index that includes a hashed index (page 512).
The order of the fields in a compound index is very important. In the previous example, the index will contain
references to documents sorted first by the values of the item field and, within each value of the item field, sorted
by values of the stock field. See Sort Order (page 496) for more information.
In addition to supporting queries that match on all the index fields, compound indexes can support queries that match
on the prefix of the index fields. For details, see Prefixes (page 496).
Sort Order
Indexes store references to fields in either ascending (1) or descending (-1) sort order. For single-field indexes, the
sort order of keys does not matter because MongoDB can traverse the index in either direction. However, for compound
indexes (page 495), sort order can matter in determining whether the index can support a sort operation.
Consider a collection events that contains documents with the fields username and date. Applications can issue
queries that return results sorted first by ascending username values and then by descending (i.e. more recent to last)
date values, such as:
db.events.find().sort( { username: 1, date: -1 } )
or queries that return results sorted first by descending username values and then by ascending date values, such
as:
db.events.find().sort( { username: -1, date: 1 } )
However, an index on { "username": 1, "date": -1 }, which can support both of the sort operations above, cannot
support sorting by ascending username values and then by ascending date values, such as the following:
db.events.find().sort( { username: 1, date: 1 } )
For more information on sort order and compound indexes, see Use Indexes to Sort Query Results (page 575).
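The rule for the events examples can be sketched as a small check in plain JavaScript: an index such as { username: 1, date: -1 } can serve a sort whose directions all match the index or are all reversed, because the index can be traversed in either direction. The function name indexSupportsSort is hypothetical, and the check is deliberately simplified: it only compares sorts that name the same fields in the same order as the index.

```javascript
// Simplified check: the sort must list the same fields in the same order as
// the index, with every direction identical to the index or exactly reversed.
function indexSupportsSort(indexSpec, sortSpec) {
  const indexKeys = Object.entries(indexSpec);
  const sortKeys = Object.entries(sortSpec);
  if (indexKeys.length !== sortKeys.length) return false;
  const sameFields = sortKeys.every(([field], i) => field === indexKeys[i][0]);
  if (!sameFields) return false;
  const forward = sortKeys.every(([, dir], i) => dir === indexKeys[i][1]);
  const reverse = sortKeys.every(([, dir], i) => dir === -indexKeys[i][1]);
  return forward || reverse;
}

const eventsIndex = { username: 1, date: -1 };

console.log(indexSupportsSort(eventsIndex, { username: 1, date: -1 })); // true
console.log(indexSupportsSort(eventsIndex, { username: -1, date: 1 })); // true
console.log(indexSupportsSort(eventsIndex, { username: 1, date: 1 }));  // false
```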
Prefixes
Index prefixes are the beginning subsets of indexed fields. For example, the compound index { "item": 1, "stock": 1 }
created above has the single prefix { "item": 1 }.
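For indexes with more fields, the prefixes can be enumerated mechanically. The plain JavaScript sketch below lists the proper prefixes of an index specification; the function name indexPrefixes and the three-field index on item, location, and stock are illustrative assumptions based on the fields of the products document.

```javascript
// Hypothetical helper: list the beginning subsets of an index specification,
// excluding the full index itself.
function indexPrefixes(indexSpec) {
  const fields = Object.entries(indexSpec);
  const prefixes = [];
  for (let n = 1; n < fields.length; n++) {
    prefixes.push(Object.fromEntries(fields.slice(0, n)));
  }
  return prefixes;
}

console.log(indexPrefixes({ item: 1, location: 1, stock: 1 }));
// prints [ { item: 1 }, { item: 1, location: 1 } ]
```

Note that { location: 1 } and { item: 1, stock: 1 } are not prefixes of this index, because they skip a leading field.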
Index Intersection
Starting in version 2.6, MongoDB can use index intersection (page 524) to fulfill queries. The choice between creating
compound indexes that support your queries or relying on index intersection depends on the specifics of your system.
See Index Intersection and Compound Indexes (page 525) for more details.
Multikey Indexes
On this page
Create Multikey Index (page 497)
Index Bounds (page 498)
Limitations (page 498)
Examples (page 499)
To index a field that holds an array value, MongoDB creates an index key for each element in the array. These multikey
indexes support efficient queries against array fields. Multikey indexes can be constructed over arrays that hold both
scalar values (e.g. strings, numbers) and nested documents.
MongoDB automatically creates a multikey index if any indexed field is an array; you do not need to explicitly specify
the multikey type.
Index Bounds
If an index is multikey, then computation of the index bounds follows special rules. For details on multikey index
bounds, see Multikey Index Bounds (page 526).
Limitations
Compound Multikey Indexes For a compound (page 495) multikey index, each indexed document can have at most
one indexed field whose value is an array. As such, you cannot create a compound multikey index if more than one
to-be-indexed field of a document is an array. Or, if a compound multikey index already exists, you cannot insert a
document that would violate this restriction.
For example, consider a collection that contains the following document:
{ _id: 1, a: [ 1, 2 ], b: [ 1, 2 ], category: "AB - both arrays" }
You cannot create a compound multikey index { a: 1, b: 1 } on the collection since both the a and b fields
are arrays.
But consider a collection that contains the following documents:
{ _id: 1, a: [1, 2], b: 1, category: "A array" }
{ _id: 2, a: 1, b: [1, 2], category: "B array" }
A compound multikey index { a: 1, b: 1 } is permissible since for each document, only one field indexed
by the compound multikey index is an array; i.e. no document contains array values for both a and b fields. After
creating the compound multikey index, if you attempt to insert a document where both a and b fields are arrays,
MongoDB will fail the insert.
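The restriction can be expressed as a per-document check. The following plain JavaScript sketch reports whether a document has more than one array among the fields of a compound index; the function name violatesCompoundMultikeyLimit is hypothetical, and the sketch only looks at top-level fields.

```javascript
// Hypothetical check: a document violates the restriction when more than one
// of the indexed fields holds an array.
function violatesCompoundMultikeyLimit(doc, indexedFields) {
  const arrayFields = indexedFields.filter((field) => Array.isArray(doc[field]));
  return arrayFields.length > 1;
}

const indexedFields = ["a", "b"];

console.log(violatesCompoundMultikeyLimit({ _id: 1, a: [1, 2], b: [1, 2] }, indexedFields)); // true
console.log(violatesCompoundMultikeyLimit({ _id: 1, a: [1, 2], b: 1 }, indexedFields));      // false
console.log(violatesCompoundMultikeyLimit({ _id: 2, a: 1, b: [1, 2] }, indexedFields));      // false
```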
Shard Keys You cannot specify a multikey index as the shard key index.
Changed in version 2.6: However, if the shard key index is a prefix (page 496) of a compound index, the compound
index is allowed to become a compound multikey index if one of the other keys (i.e. keys that are not part of the shard
key) indexes an array. Compound multikey indexes can have an impact on performance.
Covered Queries A multikey index (page 497) cannot support a covered query (page 70).
Query on the Array Field as a Whole When a query filter specifies an exact match for an array as a whole
(page 106), MongoDB can use the multikey index to look up the first element of the query array but cannot use the
multikey index scan to find the whole array. Instead, after using the multikey index to look up the first element of the
query array, MongoDB retrieves the associated documents and filters for documents whose array matches the array in
the query.
For example, consider an inventory collection that contains the following documents:
{ _id: 5, type: "food", item: "aaa", ratings: [ 5, 8, 9 ] }
{ _id: 6, type: "food", item: "bbb", ratings: [ 5, 9 ] }
{ _id: 7, type: "food", item: "ccc", ratings: [ 9, 5, 8 ] }
{ _id: 8, type: "food", item: "ddd", ratings: [ 9, 5 ] }
{ _id: 9, type: "food", item: "eee", ratings: [ 5, 9, 5 ] }
The following query looks for documents where the ratings field is the array [ 5, 9 ]:
db.inventory.find( { ratings: [ 5, 9 ] } )
MongoDB can use the multikey index to find documents that have 5 at any position in the ratings array. Then,
MongoDB retrieves these documents and filters for documents whose ratings array equals the query array [ 5,
9 ].
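The two stages described above can be imitated in plain JavaScript against the same five documents. This is only a sketch of the described behavior: the function names are hypothetical, and a simple array filter stands in for the multikey index scan.

```javascript
const inventory = [
  { _id: 5, type: "food", item: "aaa", ratings: [5, 8, 9] },
  { _id: 6, type: "food", item: "bbb", ratings: [5, 9] },
  { _id: 7, type: "food", item: "ccc", ratings: [9, 5, 8] },
  { _id: 8, type: "food", item: "ddd", ratings: [9, 5] },
  { _id: 9, type: "food", item: "eee", ratings: [5, 9, 5] }
];

// Stage 1: stand-in for the index lookup on the first element of the query array.
function candidatesByFirstElement(docs, field, queryArray) {
  return docs.filter((doc) => doc[field].includes(queryArray[0]));
}

// Stage 2: keep only documents whose array equals the query array exactly.
function exactArrayMatch(docs, field, queryArray) {
  return docs.filter(
    (doc) =>
      doc[field].length === queryArray.length &&
      doc[field].every((value, i) => value === queryArray[i])
  );
}

const candidates = candidatesByFirstElement(inventory, "ratings", [5, 9]);
const matches = exactArrayMatch(candidates, "ratings", [5, 9]);

console.log(candidates.map((doc) => doc._id)); // prints [ 5, 6, 7, 8, 9 ]
console.log(matches.map((doc) => doc._id));    // prints [ 6 ]
```

Every document contains 5 somewhere in ratings, so the first stage narrows nothing here; only the filter stage isolates the document with _id 6.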
Examples
Index Basic Arrays Consider a survey collection with the following document:
{ _id: 1, item: "ABC", ratings: [ 2, 5, 9 ] }
Since the ratings field contains an array, the index on ratings is multikey. The multikey index contains the
following three index keys, each pointing to the same document:
2, 5, and 9.
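The expansion of an array into index keys can be sketched in plain JavaScript. The function name multikeyEntries and the { key, id } entry shape are hypothetical; a real index stores its entries in an internal format, so treat this only as a picture of one key per array element.

```javascript
// Hypothetical helper: produce one index entry per array element, each
// pointing back to the same document. Non-array values yield a single entry.
function multikeyEntries(doc, field) {
  const value = doc[field];
  const keys = Array.isArray(value) ? value : [value];
  return keys.map((key) => ({ key, id: doc._id }));
}

const surveyDoc = { _id: 1, item: "ABC", ratings: [2, 5, 9] };

// Three entries, with keys 2, 5, and 9, all referencing the document with _id 1.
console.log(multikeyEntries(surveyDoc, "ratings"));
```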
Index Arrays with Embedded Documents You can create multikey indexes on array fields that contain nested
objects.
Consider an inventory collection with documents of the following form:
{
_id: 1,
item: "abc",
stock: [
{ size: "S", color: "red", quantity: 25 },
{ size: "S", color: "blue", quantity: 10 },
{ size: "M", color: "blue", quantity: 50 }
]
}
{
_id: 2,
item: "def",
stock: [
{ size: "S", color: "blue", quantity: 20 },
{ size: "M", color: "blue", quantity: 5 },
{ size: "M", color: "black", quantity: 10 },
{ size: "L", color: "red", quantity: 2 }
]
}
{
_id: 3,
item: "ijk",
stock: [
{ size: "M", color: "blue", quantity: 15 },
{ size: "L", color: "blue", quantity: 100 },
{ size: "L", color: "red", quantity: 25 }
]
}
...
The following operation creates a multikey index on the stock.size and stock.quantity fields:
db.inventory.createIndex( { "stock.size": 1, "stock.quantity": 1 } )
The compound multikey index can support queries with predicates that include both indexed fields as well as predicates
that include only the index prefix "stock.size", as in the following examples:
db.inventory.find( { "stock.size": "M" } )
db.inventory.find( { "stock.size": "S", "stock.quantity": { $gt: 20 } } )
For details on how MongoDB can combine multikey index bounds, see Multikey Index Bounds (page 526). For more
information on behavior of compound indexes and prefixes, see compound indexes and prefixes (page 496).
The compound multikey index can also support sort operations, such as the following examples:
db.inventory.find( ).sort( { "stock.size": 1, "stock.quantity": 1 } )
db.inventory.find( { "stock.size": "M" } ).sort( { "stock.quantity": 1 } )
For more information on behavior of compound indexes and sort operations, see Use Indexes to Sort Query Results
(page 575).
Geospatial Indexes and Queries
MongoDB offers a number of indexes and query mechanisms to handle geospatial information. This section introduces
MongoDB's geospatial features. For complete examples of geospatial queries in MongoDB, see Geospatial Index
Tutorials (page 546).
Surfaces
Before storing your location data and writing queries, you must decide the type of surface to use to perform calculations.
The type you choose affects how you store data, what type of index to build, and the syntax of your queries.
MongoDB offers two surface types:
Spherical To calculate geometry over an Earth-like sphere, store your location data on a spherical surface and use a
2dsphere (page 503) index.
Store your location data as GeoJSON objects with this coordinate-axis order: longitude, latitude. The coordinate
reference system for GeoJSON uses the WGS84 datum.
Flat To calculate distances on a Euclidean plane, store your location data as legacy coordinate pairs and use a 2d
(page 505) index.
Location Data
If you choose spherical surface calculations, you store location data as either:
GeoJSON Objects Queries on GeoJSON objects always calculate on a sphere. The default coordinate reference
system for GeoJSON uses the WGS84 datum.
New in version 2.4: Support for GeoJSON storage and queries is new in version 2.4. Prior to version 2.4, all geospatial
data used coordinate pairs.
Changed in version 2.6: Support for additional GeoJSON types: MultiPoint, MultiLineString, MultiPolygon,
GeometryCollection.
MongoDB supports the following GeoJSON objects:
Point
LineString
Polygon
MultiPoint
MultiLineString
MultiPolygon
GeometryCollection
Legacy Coordinate Pairs MongoDB supports spherical surface calculations on legacy coordinate pairs using a
2dsphere index by converting the data to the GeoJSON Point type.
If you choose flat surface calculations via a 2d index, you can store data only as legacy coordinate pairs.
Query Operations
Inclusion MongoDB can query for locations contained entirely within a specified polygon. Inclusion queries use
the $geoWithin operator.
Both 2d and 2dsphere indexes can support inclusion queries. MongoDB does not require an index for inclusion
queries; however, such indexes will improve query performance.
Intersection MongoDB can query for locations that intersect with a specified geometry. These queries apply only
to data on a spherical surface. These queries use the $geoIntersects operator.
Only 2dsphere indexes support intersection.
Proximity MongoDB can query for the points nearest to another point. Proximity queries use the $near operator.
The $near operator requires a 2d or 2dsphere index.
Geospatial Indexes
MongoDB provides the following geospatial index types to support the geospatial queries.
Additional Resources
The following pages provide complete documentation for geospatial indexes and queries:
2dsphere Indexes (page 503) A 2dsphere index supports queries that calculate geometries on an earth-like sphere.
The index supports data stored as both GeoJSON objects and as legacy coordinate pairs.
2d Indexes (page 505) The 2d index supports data stored as legacy coordinate pairs and is intended for use in
MongoDB 2.2 and earlier.
geoHaystack Indexes (page 506) A haystack index is a special index optimized to return results over small areas. For
queries that use spherical geometry, a 2dsphere index is a better option than a haystack index.
2d Index Internals (page 506) Provides a more in-depth explanation of the internals of geospatial indexes. This
material is not necessary for normal operations but may be useful for troubleshooting and for further understanding.
2dsphere Indexes
Overview A 2dsphere index supports queries that calculate geometries on an earth-like sphere. The 2dsphere
index supports all MongoDB geospatial queries: queries for inclusion, intersection, and proximity. See
https://docs.mongodb.org/manual/reference/operator/query-geospatial for the query operators
that support geospatial queries.
The 2dsphere index supports data stored as GeoJSON (page 580) objects and as legacy coordinate pairs (See also
2dsphere Indexed Field Restrictions (page 504)). For legacy coordinate pairs, the index converts the data to GeoJSON
Point (page 581). For details on the supported GeoJSON objects, see GeoJSON Objects (page 580).
The default datum for an earth-like sphere is WGS84. Coordinate-axis order is longitude, latitude.
Additional GeoJSON Objects 2dsphere (Version 2) includes support for additional GeoJSON objects:
MultiPoint (page 583), MultiLineString (page 583), MultiPolygon (page 583), and GeometryCollection (page 583).
For details on all supported GeoJSON objects, see GeoJSON Objects (page 580).
Considerations
geoNear and $geoNear Restrictions The geoNear command and the $geoNear pipeline stage require that
a collection have at most one 2dsphere index and/or one 2d (page 505) index, whereas geospatial query
operators (e.g. $near and $geoWithin) permit collections to have multiple geospatial indexes.
The geospatial index restriction for the geoNear command and the $geoNear pipeline stage exists because neither
the geoNear command nor the $geoNear pipeline stage syntax includes the location field. As such, index selection
among multiple 2d indexes or 2dsphere indexes is ambiguous.
No such restriction applies for geospatial query operators since these operators take a location field, eliminating the
ambiguity.
Shard Key Restrictions You cannot use a 2dsphere index as a shard key when sharding a collection. However,
you can create and maintain a geospatial index on a sharded collection by using a different field as the shard key.
2dsphere Indexed Field Restrictions Fields with 2dsphere (page 503) indexes must hold geometry data in
the form of coordinate pairs or GeoJSON data. If you attempt to insert a document with non-geometry data in a
2dsphere indexed field, or build a 2dsphere index on a collection where the indexed field has non-geometry data,
the operation will fail.
Create a 2dsphere Index To create a 2dsphere index, use the db.collection.createIndex() method,
specifying the location field as the key and the string literal "2dsphere" as the index type:
db.collection.createIndex( { <location field> : "2dsphere" } )
Unlike a compound 2d (page 505) index which can reference one location field and one other field, a compound
(page 495) 2dsphere index can reference multiple location and non-location fields.
For more information on creating 2dsphere indexes, see Create a 2dsphere Index (page 554).
2d Indexes
Use a 2d index for data stored as points on a two-dimensional plane. The 2d index is intended for legacy coordinate
pairs used in MongoDB 2.2 and earlier.
Use a 2d index if:
your database has legacy location data from MongoDB 2.2 or earlier, and
you do not intend to store any location data as GeoJSON objects.
See https://docs.mongodb.org/manual/reference/operator/query-geospatial for the
query operators that support geospatial queries.
Considerations The geoNear command and the $geoNear pipeline stage require that a collection have at most
one 2d index and/or one 2dsphere index (page 503), whereas geospatial query operators (e.g. $near and
$geoWithin) permit collections to have multiple geospatial indexes.
The geospatial index restriction for the geoNear command and the $geoNear pipeline stage exists because neither
the geoNear command nor the $geoNear pipeline stage syntax includes the location field. As such, index selection
among multiple 2d indexes or 2dsphere indexes is ambiguous.
No such restriction applies for geospatial query operators since these operators take a location field, eliminating the
ambiguity.
Do not use a 2d index if your location data includes GeoJSON objects. To index on both legacy coordinate pairs and
GeoJSON objects, use a 2dsphere (page 503) index.
You cannot use a 2d index as a shard key when sharding a collection. However, you can create and maintain a
geospatial index on a sharded collection by using a different field as the shard key.
Behavior The 2d index supports calculations on a flat, Euclidean plane. The 2d index also supports distance-only
calculations on a sphere, but for geometric calculations (e.g. $geoWithin) on a sphere, store data as GeoJSON
objects and use the 2dsphere index type.
A 2d index can reference two fields. The first must be the location field. A 2d compound index constructs queries
that select first on the location field and then filter those results by the additional criteria. A compound 2d index can
cover queries.
Points on a 2D Plane To store location data as legacy coordinate pairs, use an array or an embedded document.
When possible, use the array format:
loc : [ <longitude> , <latitude> ]
Arrays are preferred as certain languages do not guarantee associative map ordering.
For all points, if you use longitude and latitude, store coordinates in longitude, latitude order.
sparse Property 2d indexes are sparse (page 519) by default and ignore the sparse: true (page 519) option. If
a document lacks a 2d index field (or the field is null or an empty array), MongoDB does not add an entry for the
document to the 2d index. For inserts, MongoDB inserts the document but does not add to the 2d index.
For a compound index that includes a 2d index key along with keys of other types, only the 2d index field determines
whether the index references a document.
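The sparse rule for 2d indexes can be summarized as a small predicate. The plain JavaScript sketch below decides whether a document would receive an entry in a 2d index on a given field; the function name has2dIndexEntry is hypothetical, and the sketch does not validate that the value is a well-formed coordinate pair.

```javascript
// Hypothetical predicate: a document gets no 2d index entry when the indexed
// field is missing, null, or an empty array.
function has2dIndexEntry(doc, field) {
  const value = doc[field];
  if (value === undefined || value === null) return false;
  if (Array.isArray(value) && value.length === 0) return false;
  return true;
}

console.log(has2dIndexEntry({ _id: 1, loc: [55.5, 42.3] }, "loc")); // true
console.log(has2dIndexEntry({ _id: 2, loc: null }, "loc"));         // false
console.log(has2dIndexEntry({ _id: 3, loc: [] }, "loc"));           // false
console.log(has2dIndexEntry({ _id: 4 }, "loc"));                    // false
```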
geoHaystack Indexes
A geoHaystack index is a special index that is optimized to return results over small areas. geoHaystack indexes
improve performance on queries that use flat geometry.
For queries that use spherical geometry, a 2dsphere index is a better option than a haystack index. 2dsphere
indexes (page 503) allow field reordering; geoHaystack indexes require the first field to be the location field. Also,
geoHaystack indexes are only usable via commands and so always return all results at once.
Behavior geoHaystack indexes create buckets of documents from the same geographic area in order to improve
performance for queries limited to that area. Each bucket in a geoHaystack index contains all the documents within
a specified proximity to a given longitude and latitude.
sparse Property geoHaystack indexes are sparse (page 519) by default and ignore the sparse: true (page 519)
option. If a document lacks a geoHaystack index field (or the field is null or an empty array), MongoDB does
not add an entry for the document to the geoHaystack index. For inserts, MongoDB inserts the document but does
not add to the geoHaystack index.
geoHaystack indexes include one geoHaystack index key and one non-geospatial index key; however, only the
geoHaystack index field determines whether the index references a document.
Create geoHaystack Index To create a geoHaystack index, see Create a Haystack Index (page 560). For
information and an example of querying a haystack index, see Query a Haystack Index (page 561).
2d Index Internals
This document provides a more in-depth explanation of the internals of MongoDB's 2d geospatial indexes. This
material is not necessary for normal operations or application development but may be useful for troubleshooting and
for further understanding.
Calculation of Geohash Values for 2d Indexes When you create a geospatial index on legacy coordinate pairs,
MongoDB computes geohash values for the coordinate pairs within the specified location range (page 558) and then
indexes the geohash values.
To calculate a geohash value, recursively divide a two-dimensional map into quadrants. Then assign each quadrant a
two-bit value. For example, a two-bit representation of four quadrants would be:
01 11
00 10
These two-bit values (00, 01, 10, and 11) represent each of the quadrants and all points within each quadrant. For
a geohash with two bits of resolution, all points in the bottom left quadrant would have a geohash of 00. The top
left quadrant would have the geohash of 01. The bottom right and top right would have a geohash of 10 and 11,
respectively.
To provide additional precision, continue dividing each quadrant into sub-quadrants. Each sub-quadrant would have
the geohash value of the containing quadrant concatenated with the value of the sub-quadrant. The geohash for the
upper-right quadrant is 11, and the geohash for the sub-quadrants would be (clockwise from the top left): 1101,
1111, 1110, and 1100, respectively.
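The subdivision described above can be sketched in plain JavaScript. The function below follows the quadrant layout shown earlier (first bit for the left/right half, second bit for the bottom/top half) and appends two bits per level. The name geohashBits, the default bounds of -180 and 180, and the treatment of points that fall exactly on a dividing line are assumptions made for illustration; this is not MongoDB's internal implementation.

```javascript
// Hypothetical sketch: compute a geohash bit string by repeatedly halving
// the x and y ranges and recording which half contains the point.
function geohashBits(x, y, levels, min = -180, max = 180) {
  let xLo = min, xHi = max, yLo = min, yHi = max;
  let bits = "";
  for (let i = 0; i < levels; i++) {
    const xMid = (xLo + xHi) / 2;
    const yMid = (yLo + yHi) / 2;
    // First bit of the pair: 1 for the right half, 0 for the left half.
    if (x >= xMid) { bits += "1"; xLo = xMid; } else { bits += "0"; xHi = xMid; }
    // Second bit of the pair: 1 for the top half, 0 for the bottom half.
    if (y >= yMid) { bits += "1"; yLo = yMid; } else { bits += "0"; yHi = yMid; }
  }
  return bits;
}

console.log(geohashBits(-90, -90, 1)); // prints 00 (bottom left)
console.log(geohashBits(-90, 90, 1));  // prints 01 (top left)
console.log(geohashBits(45, 135, 2));  // prints 1101 (top-left sub-quadrant of 11)
console.log(geohashBits(135, 45, 2));  // prints 1110 (bottom-right sub-quadrant of 11)
```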
Multi-location Documents for 2d Indexes While 2d geospatial indexes do not support more than one geospatial field in a document, you can use a multi-key
index (page 497) to index multiple coordinate pairs in a single document. In the simplest example you may have a
field (e.g. locs) that holds an array of coordinates, as in the following example:
db.places.save( {
locs : [ [ 55.5 , 42.3 ] ,
[ -74 , 44.74 ] ,
{ lng : 55.5 , lat : 42.3 } ]
} )
The values of the array may be either arrays, as in [ 55.5, 42.3 ], or embedded documents, as in { lng :
55.5 , lat : 42.3 }.
You could then create a geospatial index on the locs field, as in the following:
db.places.createIndex( { "locs": "2d" } )
You may also model the location data as a field inside of an embedded document. In this case, the document would
contain a field (e.g. addresses) that holds an array of documents where each document has a field (e.g. loc:) that
holds location coordinates. For example:
db.records.save( {
name : "John Smith",
addresses : [ {
context : "home" ,
loc : [ 55.5, 42.3 ]
} ,
{
context : "work",
loc : [ -74 , 44.74 ]
}
]
} )
You could then create the geospatial index on the addresses.loc field as in the following example:
db.records.createIndex( { "addresses.loc": "2d" } )
To include the location field with the distance field in multi-location document queries, specify includeLocs:
true in the geoNear command.
Text Indexes
On this page
Overview (page 508)
Create Text Index (page 508)
Case Insensitivity (page 509)
Diacritic Insensitivity (page 510)
Tokenization Delimiters (page 510)
Index Entries (page 510)
Supported Languages and Stop Words (page 510)
sparse Property (page 510)
Restrictions (page 511)
Storage Requirements and Performance Costs (page 511)
Text Search Support (page 511)
Overview
MongoDB provides text indexes to support query operations that perform a text search of string content. text
indexes can include any field whose value is a string or an array of string elements.
To create a text index, use the db.collection.createIndex() method. To index a field that contains a
string or an array of string elements, include the field and specify the string literal "text" in the index document, as
in the following example:
db.reviews.createIndex( { comments: "text" } )
You can index multiple fields for the text index. The following example creates a text index on the fields subject
and comments:
db.reviews.createIndex(
{
subject: "text",
comments: "text"
}
)
A compound index (page 495) can include text index keys in combination with ascending/descending index keys.
For more information, see Compound Index (page 511).
Specify Weights For a text index, the weight of an indexed field denotes the significance of the field relative to
the other indexed fields in terms of the text search score.
For each indexed field in the document, MongoDB multiplies the number of matches by the weight and sums the
results. Using this sum, MongoDB then calculates the score for the document. See $meta operator for details on
returning and sorting by text scores.
The default weight is 1 for the indexed fields. To adjust the weights for the indexed fields, include the weights
option in the db.collection.createIndex() method.
For more information using weights to control the results of a text search, see Control Search Results with Weights
(page 570).
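As a rough illustration of the weighted sum described above, the following plain JavaScript sketch multiplies per-field match counts by their weights and adds the results. The function name weightedMatchSum and the sample match counts are assumptions made for illustration only; MongoDB derives the final score from this sum in further steps that are not modeled here.

```javascript
// Illustrative only: weighted sum of matches across indexed fields.
// Fields without an explicit weight use the default weight of 1.
function weightedMatchSum(matchesByField, weights) {
  let sum = 0;
  for (const field of Object.keys(matchesByField)) {
    const hasWeight = Object.prototype.hasOwnProperty.call(weights, field);
    const weight = hasWeight ? weights[field] : 1;
    sum += matchesByField[field] * weight;
  }
  return sum;
}

// Hypothetical counts: 2 matches in "subject" (weight 5), 3 in "comments" (default weight 1).
const weighted = weightedMatchSum({ subject: 2, comments: 3 }, { subject: 5 });
console.log(weighted); // 13
```

In this sketch a single match in a heavily weighted field contributes as much as several matches in a field with the default weight, which is the effect the weights option is meant to produce.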
Wildcard Text Indexes When creating a text index on multiple fields, you can also use the wildcard specifier
($**). With a wildcard text index, MongoDB indexes every field that contains string data for each document in the
collection. The following example creates a text index using the wildcard specifier:
db.collection.createIndex( { "$**": "text" } )
This index allows for text search on all fields with string content. Such an index can be useful with highly unstructured
data if it is unclear which fields to include in the text index or for ad-hoc querying.
Wildcard text indexes are text indexes on multiple fields. As such, you can assign weights to specific fields during
index creation to control the ranking of the results. For more information using weights to control the results of a text
search, see Control Search Results with Weights (page 570).
Wildcard text indexes, as with all text indexes, can be part of a compound index. For example, the following creates
a compound index on the field a as well as the wildcard specifier:
db.collection.createIndex( { a: 1, "$**": "text" } )
As with all compound text indexes (page 511), since the key a precedes the text index key, in order to perform a $text
search with this index, the query predicate must include an equality match condition on a. For information on compound
text indexes, see Compound Text Indexes (page 511).
Case Insensitivity
Diacritic Insensitivity
Tokenization Delimiters
Index Entries
A text index tokenizes and stems the terms in the indexed fields for the index entries. It stores one index
entry for each unique stemmed term in each indexed field for each document in the collection. The index uses simple
language-specific (page 510) suffix stemming.
Supported Languages and Stop Words
MongoDB supports text search for various languages. text indexes drop language-specific stop words (e.g. in
English, the, an, a, and, etc.) and use simple language-specific suffix stemming. For a list of the supported
languages, see Text Search Languages (page 584).
If you specify a language value of "none", then the text index uses simple tokenization with no list of stop words
and no stemming.
To specify a language for the text index, see Specify a Language for Text Index (page 565).
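The effect of stop words and of storing one entry per unique term can be pictured with a toy tokenizer. The sketch below is an assumption-laden simplification: it splits on non-alphanumeric ASCII characters, lowercases, and performs no stemming (similar in spirit to a language value of "none" when no stop word list is passed). The function name uniqueTerms and the short stop word sample are hypothetical and do not reproduce MongoDB's actual tokenizer or stop word lists.

```javascript
// Toy model: returns the unique terms a field value would contribute as index entries.
// Splits on non-alphanumeric ASCII characters and lowercases; performs NO stemming.
function uniqueTerms(text, stopWords = []) {
  const stop = new Set(stopWords);
  const terms = text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0 && !stop.has(t));
  return [...new Set(terms)];
}

const sampleStopWords = ["the", "an", "a", "and"]; // small sample, not the full English list
console.log(uniqueTerms("The coffee and the cake", sampleStopWords)); // [ 'coffee', 'cake' ]
console.log(uniqueTerms("The coffee and the cake"));                  // [ 'the', 'coffee', 'and', 'cake' ]
```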
sparse Property
text indexes are sparse (page 519) by default and ignore the sparse: true (page 519) option. If a document lacks a
text index field (or the field is null or an empty array), MongoDB does not add an entry for the document to the
text index. For inserts, MongoDB inserts the document but does not add to the text index.
For a compound index that includes a text index key along with keys of other types, only the text index field
determines whether the index references a document. The other keys do not determine whether the index references
the documents or not.
4 http://www.unicode.org/Public/8.0.0/ucd/PropList.txt
5 http://www.unicode.org/Public/8.0.0/ucd/PropList.txt
Restrictions
One Text Index Per Collection A collection can have at most one text index.
Text Search and Hints You cannot use hint() if the query includes a $text query expression.
Text Index and Sort Sort operations cannot obtain sort order from a text index, even from a compound text index
(page 511); i.e. sort operations cannot use the ordering in the text index.
Compound Index A compound index (page 495) can include a text index key in combination with
ascending/descending index keys. However, these compound indexes have the following restrictions:
A compound text index cannot include any other special index types, such as multi-key (page 497) or
geospatial (page 502) index fields.
If the compound text index includes keys preceding the text index key, to perform a $text search, the
query predicate must include equality match conditions on the preceding keys.
See also Text Index and Sort (page 511) for additional limitations.
For an example of a compound text index, see Limit the Number of Entries Scanned (page 571).
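The equality requirement on keys that precede the text index key can be expressed as a small check. The following plain JavaScript sketch is only a simplified model under stated assumptions: the index is given as an ordered list of [field, type] pairs, a literal value or a lone $eq counts as an equality match, and the function name precedingKeysHaveEquality is hypothetical. The real query planner considers more than this.

```javascript
// Illustrative check: do all keys that precede the text key have equality conditions?
function precedingKeysHaveEquality(indexKeys, query) {
  for (const [field, type] of indexKeys) {
    if (type === "text") return true; // reached the text key; all preceding keys passed
    const cond = query[field];
    if (cond === undefined) return false; // no condition on a preceding key
    const isOperatorDoc =
      cond !== null &&
      typeof cond === "object" &&
      !Array.isArray(cond) &&
      Object.keys(cond).some((k) => k.startsWith("$"));
    // Only a literal value or a lone $eq is treated as an equality match here.
    if (isOperatorDoc && !(Object.keys(cond).length === 1 && "$eq" in cond)) return false;
  }
  return true;
}

const compoundKeys = [["a", 1], ["$**", "text"]];
console.log(precedingKeysHaveEquality(compoundKeys, { a: 5, $text: { $search: "coffee" } }));          // true
console.log(precedingKeysHaveEquality(compoundKeys, { a: { $gt: 3 }, $text: { $search: "coffee" } })); // false
```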
Drop a Text Index To drop a text index, pass the name of the index to the db.collection.dropIndex()
method. To get the name of the index, run the db.collection.getIndexes() method.
For information on the default naming scheme for text indexes as well as overriding the default name, see Specify
Name for text Index (page 568).
Storage Requirements and Performance Costs
text indexes have the following storage requirements and performance costs:
text indexes can be large. They contain one index entry for each unique post-stemmed word in each indexed
field for each document inserted.
Building a text index is very similar to building a large multi-key index and will take longer than building a
simple ordered (scalar) index on the same data.
When building a large text index on an existing collection, ensure that you have a sufficiently high limit on
open file descriptors. See the recommended settings (page 295).
text indexes will impact insertion throughput because MongoDB must add an index entry for each unique
post-stemmed word in each indexed field of each new source document.
Additionally, text indexes do not store phrases or information about the proximity of words in the documents.
As a result, phrase queries will run much more effectively when the entire collection fits in RAM.
Text Search Support
The text index supports $text query operations. For examples of text search, see the $text reference
page. For examples of $text operations in aggregation pipelines, see Text Search in the Aggregation Pipeline
(page 572).
Hashed Index
The following operation creates a hashed index for the active collection on the a field:
db.active.createIndex( { a: "hashed" } )
Index Properties
In addition to the numerous index types (page 492) MongoDB supports, indexes can also have various properties. The
following documents detail the index properties that you can select when building an index.
TTL Indexes (page 512) The TTL index is used for TTL collections, which expire data after a period of time.
Unique Indexes (page 514) A unique index causes MongoDB to reject all documents that contain a duplicate value
for the indexed field.
Partial Indexes (page 515) A partial index indexes only documents that meet specified filter criteria.
Sparse Indexes (page 519) A sparse index does not index documents that do not have the indexed field.
TTL Indexes
On this page
Behavior (page 513)
Restrictions (page 513)
Additional Information (page 514)
TTL indexes are special single-field indexes that MongoDB can use to automatically remove documents from a
collection after a certain amount of time. Data expiration is useful for certain types of information, such as
machine-generated event data, logs, and session information, that only need to persist in a database for a finite amount of time.
To create a TTL index, use the db.collection.createIndex() method with the expireAfterSeconds
option on a field whose value is either a date (page 197) or an array that contains date values (page 197).
For example, to create a TTL index on the lastModifiedDate field of the eventlog collection, use the following
operation in the mongo shell:
db.eventlog.createIndex( { "lastModifiedDate": 1 }, { expireAfterSeconds: 3600 } )
Behavior
Expiration of Data TTL indexes expire documents after the specified number of seconds has passed since the
indexed field value; i.e. the expiration threshold is the indexed field value plus the specified number of seconds.
If the field is an array, and there are multiple date values in the index, MongoDB uses the lowest (i.e. earliest) date value
in the array to calculate the expiration threshold.
If the indexed field in a document is not a date or an array that holds one or more date values, the document will not expire.
If a document does not contain the indexed field, the document will not expire.
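The expiration rules above can be summarized in a few lines of plain JavaScript. This is a minimal sketch of the rule as stated in this section, not MongoDB's implementation; the function name expirationThreshold is an assumption introduced for illustration.

```javascript
// Illustrative model of the TTL expiration rule.
// Returns the expiration threshold as a Date, or null if the document never expires.
function expirationThreshold(fieldValue, expireAfterSeconds) {
  const values = Array.isArray(fieldValue) ? fieldValue : [fieldValue];
  const dates = values.filter((v) => v instanceof Date);
  if (dates.length === 0) return null; // missing field, non-date value, or no dates in the array
  const earliest = Math.min(...dates.map((d) => d.getTime()));
  return new Date(earliest + expireAfterSeconds * 1000);
}

const threshold = expirationThreshold(new Date("2016-01-01T00:00:00Z"), 3600);
console.log(threshold.toISOString()); // 2016-01-01T01:00:00.000Z
```

For an array value, the sketch picks the earliest date before adding expireAfterSeconds, so a document with dates one day apart expires relative to the older one.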
Delete Operations A background thread in mongod reads the values in the index and removes expired documents
from the collection.
When the TTL thread is active, you will see delete (page 77) operations in the output of db.currentOp() or in the
data collected by the database profiler (page 249).
Timing of the Delete Operation When you build a TTL index in the background (page 522), the TTL thread can
begin deleting documents while the index is building. If you build a TTL index in the foreground, MongoDB begins
removing expired documents as soon as the index finishes building.
The TTL index does not guarantee that expired data will be deleted immediately upon expiration. There may be a
delay between the time a document expires and the time that MongoDB removes the document from the database.
The background task that removes expired documents runs every 60 seconds. As a result, documents may remain in a
collection during the period between the expiration of the document and the running of the background task.
Because the duration of the removal operation depends on the workload of your mongod instance, expired data may
exist for some time beyond the 60 second period between runs of the background task.
Replica Sets On replica sets, the TTL background thread only deletes documents on the primary. However, the TTL
background thread does run on secondaries. Secondary members replicate deletion operations from the primary.
Support for Queries A TTL index supports queries in the same way non-TTL indexes do.
Record Allocation A collection with a TTL index has usePowerOf2Sizes enabled, and you cannot modify this
setting for the collection. As a result of enabling usePowerOf2Sizes, MongoDB must allocate more disk space
relative to data size. This approach helps mitigate the possibility of storage fragmentation caused by frequent delete
operations and leads to more predictable storage use patterns.
Restrictions
TTL indexes are single-field indexes. Compound indexes (page 495) do not support TTL and ignore the
expireAfterSeconds option.
The _id field does not support TTL indexes.
You cannot create a TTL index on a capped collection (page 228) because MongoDB cannot remove documents
from a capped collection.
You cannot use createIndex() to change the value of expireAfterSeconds of an existing index.
Instead use the collMod database command in conjunction with the index collection flag. Otherwise, to
change the value of the option of an existing index, you must drop the index first and recreate.
If a non-TTL single-field index already exists for a field, you cannot create a TTL index on the same field
since you cannot create indexes that have the same key specification and differ only by the options. To
change a non-TTL single-field index to a TTL index, you must drop the index first and recreate with the
expireAfterSeconds option.
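To make the collMod restriction above concrete, the following sketch builds the command document that changes expireAfterSeconds on an existing TTL index. The helper name ttlCollModCommand is hypothetical, and the sketch assumes the index was created with an ascending key (as in the eventlog example earlier); it only constructs the document and does not contact a server.

```javascript
// Builds a collMod command document for changing expireAfterSeconds on an existing TTL index.
// Assumes the index key is ascending, e.g. { lastModifiedDate: 1 }.
function ttlCollModCommand(collectionName, fieldName, expireAfterSeconds) {
  return {
    collMod: collectionName,
    index: {
      keyPattern: { [fieldName]: 1 },
      expireAfterSeconds: expireAfterSeconds
    }
  };
}

const collModCmd = ttlCollModCommand("eventlog", "lastModifiedDate", 7200);
console.log(JSON.stringify(collModCmd));
// In the mongo shell, the document would be passed to db.runCommand( collModCmd ).
```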
Additional Information
For examples, see Expire Data from Collections by Setting TTL (page 231).
Unique Indexes
On this page
Behavior (page 514)
A unique index causes MongoDB to reject all documents that contain a duplicate value for the indexed field.
To create a unique index, use the db.collection.createIndex() method with the unique option set to
true. For example, to create a unique index on the user_id field of the members collection, use the following
operation in the mongo shell:
db.members.createIndex( { "user_id": 1 }, { unique: true } )
Behavior
Unique Constraint Across Separate Documents The unique constraint applies to separate documents in the
collection. That is, the unique index prevents separate documents from having the same value for the indexed key, but the
index does not prevent a single document from having multiple elements or embedded documents in an indexed array
with the same value. In the case of a single document with repeating values, the repeated value is inserted into the
index only once.
For example, a collection has a unique index on a.b:
db.collection.createIndex( { "a.b": 1 }, { unique: true } )
The unique index permits the insertion of the following document into the collection if no other document in the
collection has the a.b value of 5:
db.collection.insert( { a: [ { b: 5 }, { b: 5 } ] } )
Unique Index and Missing Field If a document does not have a value for the indexed field in a unique index, the
index will store a null value for this document. Because of the unique constraint, MongoDB will only permit one
document that lacks the indexed field. If more than one document has no value for the indexed field or is
missing the indexed field, the index build will fail with a duplicate key error.
For example, a collection has a unique index on x:
db.collection.createIndex( { "x": 1 }, { unique: true } )
The unique index allows the insertion of a document without the field x if the collection does not already contain a
document missing the field x:
db.collection.insert( { y: 1 } )
However, the unique index errors on the insertion of a document without the field x if the collection already contains
a document missing the field x:
db.collection.insert( { z: 1 } )
The operation fails to insert the document because of the violation of the unique constraint on the value of the field x:
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 11000,
"errmsg" : "E11000 duplicate key error index: test.collection.$x_1 dup key: { : null }"
}
})
You can combine the unique constraint with the sparse index (page 519) to filter these null values from the unique
index and avoid the error.
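Both behaviors described in this section (repeated values within one document count once, and a missing field is indexed as null) can be mimicked with a small in-memory model. The sketch below is an illustration under stated assumptions, not MongoDB's implementation; extractKeys and UniqueIndexModel are hypothetical names, and array values at the end of the path are not expanded.

```javascript
// Collects the index keys for one document along a (possibly dotted) field path.
function extractKeys(doc, path) {
  let values = [doc];
  for (const part of path.split(".")) {
    const next = [];
    for (const v of values) {
      const items = Array.isArray(v) ? v : [v];
      for (const item of items) {
        if (item !== null && typeof item === "object" && part in item) next.push(item[part]);
      }
    }
    values = next;
  }
  // A document without the field is indexed under null; repeated values collapse to one key.
  const keys = values.length === 0 ? [null] : values;
  return [...new Set(keys.map((k) => JSON.stringify(k)))];
}

// Simplified unique index: rejects a document if any of its keys belongs to another document.
class UniqueIndexModel {
  constructor(path) { this.path = path; this.keys = new Set(); }
  insert(doc) {
    const keys = extractKeys(doc, this.path);
    if (keys.some((k) => this.keys.has(k))) return false; // duplicate key error
    keys.forEach((k) => this.keys.add(k));
    return true;
  }
}

const abIndex = new UniqueIndexModel("a.b");
console.log(abIndex.insert({ a: [{ b: 5 }, { b: 5 }] })); // true
console.log(abIndex.insert({ a: { b: 5 } }));             // false

const xIndex = new UniqueIndexModel("x");
console.log(xIndex.insert({ y: 1 })); // true
console.log(xIndex.insert({ z: 1 })); // false
```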
Restrictions You may not specify a unique constraint on a hashed index (page 512).
See also:
Create a Unique Index (page 534)
Partial Indexes
On this page
Behavior (page 516)
Restrictions (page 517)
Examples (page 518)
You can specify a partialFilterExpression option for all MongoDB index types (page 492).
Behavior
Query Coverage MongoDB will not use the partial index for a query or sort operation if using the index results
in an incomplete result set. To use the partial index, a query must contain the filter expression (or a modified filter
expression that specifies a subset of the filter expression) as part of its query condition.
For example, given the following index:
db.restaurants.createIndex(
{ cuisine: 1 },
{ partialFilterExpression: { rating: { $gt: 5 } } }
)
The following query can use the index since the query predicate includes the condition rating: { $gte: 8
} that matches a subset of documents matched by the index filter expression rating: { $gt: 5 }:
db.restaurants.find( { cuisine: "Italian", rating: { $gte: 8 } } )
However, the following query cannot use the partial index on the cuisine field because using the index results in
an incomplete result set. Specifically, the query predicate includes the condition rating: { $lt: 8 } while
the index has the filter rating: { $gt: 5 }. That is, the query { cuisine: "Italian", rating:
{ $lt: 8 } } matches more documents (e.g. an Italian restaurant with a rating equal to 1) than are indexed.
db.restaurants.find( { cuisine: "Italian", rating: { $lt: 8 } } )
Similarly, the following query cannot use the partial index because the query predicate does not include the filter
expression and using the index would return an incomplete result set.
db.restaurants.find( { cuisine: "Italian" } )
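The three cases above follow from one question: does every value that satisfies the query condition on rating also satisfy the index filter rating: { $gt: 5 }? The plain JavaScript sketch below models that subset test for a handful of operators. It is a deliberate simplification with hypothetical function names (conditionImpliesGt, canUsePartialIndex) and does not reproduce the query planner's logic.

```javascript
// Illustrative subset check for a partial index whose filter is { field: { $gt: bound } }.
// Returns true when every value matching the query condition also satisfies the filter.
function conditionImpliesGt(cond, bound) {
  if (typeof cond === "number") return cond > bound;           // literal equality match
  if (cond === null || typeof cond !== "object") return false; // no usable condition
  if ("$eq" in cond) return cond.$eq > bound;
  if ("$gt" in cond) return cond.$gt >= bound;
  if ("$gte" in cond) return cond.$gte > bound;
  return false; // e.g. only $lt or $lte: the range is unbounded below
}

// Models the index { cuisine: 1 } with partialFilterExpression { rating: { $gt: 5 } }.
function canUsePartialIndex(query) {
  return conditionImpliesGt(query.rating, 5);
}

console.log(canUsePartialIndex({ cuisine: "Italian", rating: { $gte: 8 } })); // true
console.log(canUsePartialIndex({ cuisine: "Italian", rating: { $lt: 8 } }));  // false
console.log(canUsePartialIndex({ cuisine: "Italian" }));                      // false
```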
Partial indexes offer a more expressive mechanism than sparse indexes (page 519) to specify which documents
are indexed.
Sparse indexes select documents to index solely based on the existence of the indexed field, or for compound indexes,
the existence of the indexed fields.
Partial indexes determine the index entries based on the specified filter. The filter can include fields other than the
index keys and can specify conditions other than just an existence check. For example, a partial index can implement
the same behavior as a sparse index:
db.contacts.createIndex(
{ name: 1 },
{ partialFilterExpression: { name: { $exists: true } } }
)
This partial index supports the same queries as a sparse index on the name field.
However, a partial index can also specify filter expressions on fields other than the index key. For example, the
following operation creates a partial index, where the index is on the name field but the filter expression is on the
email field:
db.contacts.createIndex(
{ name: 1 },
{ partialFilterExpression: { email: { $exists: true } } }
)
For the query optimizer to choose this partial index, the query predicate must include a non-null match on the email
field as well as a condition on the name field.
For example, the following query can use the index:
db.contacts.find( { name: "xyz", email: { $regex: /\.org$/ } } )
Restrictions
In MongoDB, you cannot create multiple versions of an index that differ only in the options. As such, you cannot
create multiple partial indexes that differ only by the filter expression.
You cannot specify both the partialFilterExpression option and the sparse option.
Earlier versions of MongoDB do not support partial indexes. For sharded clusters or replica sets, all nodes must be
version 3.2.
_id indexes cannot be partial indexes.
Shard key indexes cannot be partial indexes.
Examples
Create a Partial Index On A Collection Consider a collection restaurants containing documents that resemble
the following:
{
"_id" : ObjectId("5641f6a7522545bc535b5dc9"),
"address" : {
"building" : "1007",
"coord" : [
-73.856077,
40.848447
],
"street" : "Morris Park Ave",
"zipcode" : "10462"
},
"borough" : "Bronx",
"cuisine" : "Bakery",
"rating" : {
"date" : ISODate("2014-03-03T00:00:00Z"),
"grade" : "A",
"score" : 2
},
"name" : "Morris Park Bake Shop",
"restaurant_id" : "30075445"
}
You could add a partial index on the borough and cuisine fields choosing only to index documents where the
rating.grade field is A:
db.restaurants.createIndex(
{ borough: 1, cuisine: 1 },
{ partialFilterExpression: { 'rating.grade': { $eq: "A" } } }
)
Then, the following query on the restaurants collection uses the partial index to return the restaurants in the
Bronx with rating.grade equal to A:
db.restaurants.find( { borough: "Bronx", 'rating.grade': "A" } )
However, the following query cannot use the partial index because the query expression does not include the
rating.grade field:
db.restaurants.find( { borough: "Bronx", cuisine: "Bakery" } )
Partial Index with Unique Constraint Partial indexes only index the documents in a collection that meet a specified
filter expression. If you specify both the partialFilterExpression and a unique constraint (page 514), the
unique constraint only applies to the documents that meet the filter expression. A partial index with a unique constraint
does not prevent the insertion of documents that do not meet the unique constraint if the documents do not meet the
filter criteria.
For example, a collection users contains the following documents:
{ "_id" : ObjectId("56424f1efa0358a27fa1f99a"), "username" : "david", "age" : 29 }
{ "_id" : ObjectId("56424f37fa0358a27fa1f99b"), "username" : "amanda", "age" : 35 }
{ "_id" : ObjectId("56424fe2fa0358a27fa1f99c"), "username" : "rajiv", "age" : 57 }
The following operation creates an index that specifies a unique constraint (page 514) on the username field and a
partial filter expression age: { $gte: 21 }.
db.users.createIndex(
{ username: 1 },
{ unique: true, partialFilterExpression: { age: { $gte: 21 } } }
)
The index prevents the insertion of the following documents since documents already exist with the specified
usernames and the age fields satisfy the filter expression (age greater than or equal to 21):
db.users.insert( { username: "david", age: 27 } )
db.users.insert( { username: "amanda", age: 25 } )
db.users.insert( { username: "rajiv", age: 32 } )
However, the following documents with duplicate usernames are allowed since the unique constraint only applies to
documents with age greater than or equal to 21.
db.users.insert( { username: "david", age: 20 } )
db.users.insert( { username: "amanda" } )
db.users.insert( { username: "rajiv", age: null } )
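The interaction between the filter and the unique constraint can be reproduced with a small in-memory model. The class name PartialUniqueModel and the predicate used for the filter are assumptions for illustration; the sketch mirrors the users example above rather than MongoDB's implementation.

```javascript
// Simplified model: a unique constraint that applies only to documents matching the filter.
class PartialUniqueModel {
  constructor(field, filter) { this.field = field; this.filter = filter; this.keys = new Set(); }
  insert(doc) {
    if (!this.filter(doc)) return true;    // not indexed, so the constraint does not apply
    const key = JSON.stringify(doc[this.field]);
    if (this.keys.has(key)) return false;  // duplicate among indexed documents
    this.keys.add(key);
    return true;
  }
}

// Predicate standing in for age: { $gte: 21 }; a missing or null age does not match.
const isAdult = (d) => typeof d.age === "number" && d.age >= 21;
const usersIndex = new PartialUniqueModel("username", isAdult);
const existingUsers = [
  { username: "david", age: 29 },
  { username: "amanda", age: 35 },
  { username: "rajiv", age: 57 }
];
existingUsers.forEach((d) => usersIndex.insert(d));

console.log(usersIndex.insert({ username: "david", age: 27 }));   // false
console.log(usersIndex.insert({ username: "david", age: 20 }));   // true
console.log(usersIndex.insert({ username: "amanda" }));           // true
console.log(usersIndex.insert({ username: "rajiv", age: null })); // true
```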
Sparse Indexes
On this page
Behavior (page 519)
Examples (page 520)
Sparse indexes only contain entries for documents that have the indexed field, even if the index field contains a null
value. The index skips over any document that is missing the indexed field. The index is sparse because it does not
include all documents of a collection. By contrast, non-sparse indexes contain all documents in a collection, storing
null values for those documents that do not contain the indexed field.
Important: Changed in version 3.2: Starting in MongoDB 3.2, MongoDB provides the option to create partial
indexes (page 515). Partial indexes offer a superset of the functionality of sparse indexes. If you are using MongoDB
3.2 or later, partial indexes (page 515) should be preferred over sparse indexes.
To create a sparse index, use the db.collection.createIndex() method with the sparse option set to
true. For example, the following operation in the mongo shell creates a sparse index on the xmpp_id field of the
addresses collection:
db.addresses.createIndex( { "xmpp_id": 1 }, { sparse: true } )
Note: Do not confuse sparse indexes in MongoDB with block-level indexes in other databases. Think of them as
dense indexes with a specific filter.
Behavior
For example, the query { x: { $exists: false } } will not use a sparse index on the x field unless
explicitly hinted. See Sparse Index On A Collection Cannot Return Complete Results (page 520) for an example that
details the behavior.
Indexes that are sparse by Default 2dsphere (version 2) (page 503), 2d (page 505), geoHaystack (page 506), and
text (page 508) indexes are always sparse.
sparse Compound Indexes Sparse compound indexes (page 495) that only contain ascending/descending index
keys will index a document as long as the document contains at least one of the keys.
For sparse compound indexes that contain a geospatial key (i.e. 2dsphere (page 503), 2d (page 505), or geoHaystack
(page 506) index keys) along with ascending/descending index key(s), only the existence of the geospatial field(s) in
a document determine whether the index references the document.
For sparse compound indexes that contain text (page 508) index keys along with ascending/descending index keys,
only the existence of the text index field(s) determine whether the index references a document.
sparse and unique Properties An index that is both sparse and unique (page 514) prevents a collection from
having documents with duplicate values for a field but allows multiple documents that omit the key.
Examples
Create a Sparse Index On A Collection Consider a collection scores that contains the following documents:
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
The collection has a sparse index on the score field:
db.scores.createIndex( { score: 1 } , { sparse: true } )
Then, the following query on the scores collection uses the sparse index to return the documents that have the
score field less than ($lt) 90:
db.scores.find( { score: { $lt: 90 } } )
Because the document for the userid "newbie" does not contain the score field and thus does not meet the query
criteria, the query can use the sparse index to return the results:
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
Sparse Index On A Collection Cannot Return Complete Results Consider a collection scores that contains the
following documents:
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
The collection has a sparse index on the score field:
db.scores.createIndex( { score: 1 } , { sparse: true } )
Because the document for the userid "newbie" does not contain the score field, the sparse index does not contain
an entry for that document.
Consider the following query to return all documents in the scores collection, sorted by the score field:
db.scores.find().sort( { score: -1 } )
Even though the sort is by the indexed field, MongoDB will not select the sparse index to fulfill the query in order to
return complete results:
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }
To use the sparse index, explicitly specify the index with hint():
db.scores.find().sort( { score: -1 } ).hint( { score: 1 } )
The use of the index results in the return of only those documents with the score field:
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
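Why the hinted query drops the "newbie" document can be seen in a toy model of the sparse index: only documents that contain the score field receive an entry, so a traversal of the index never reaches the others. The sketch below is plain JavaScript with a hypothetical helper name (buildSparseIndex) and is not how MongoDB stores indexes.

```javascript
// Toy model of a sparse index on "score": only documents that contain the field get an entry.
const scoreDocs = [
  { userid: "newbie" },
  { userid: "abby", score: 82 },
  { userid: "nina", score: 90 }
];

function buildSparseIndex(docs, field) {
  return docs
    .filter((d) => Object.prototype.hasOwnProperty.call(d, field))
    .map((d) => ({ key: d[field], doc: d }))
    .sort((a, b) => a.key - b.key); // ascending, as in { score: 1 }
}

// Walking the index in reverse yields descending order, but only for indexed documents.
const viaIndex = buildSparseIndex(scoreDocs, "score").reverse().map((e) => e.doc.userid);
console.log(viaIndex); // [ 'nina', 'abby' ]
```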
See also:
explain() and Analyze Query Performance (page 121)
Sparse Index with Unique Constraint Consider a collection scores that contains the following documents:
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
You could create an index with a unique constraint (page 514) and sparse filter on the score field using the following
operation:
db.scores.createIndex( { score: 1 } , { sparse: true, unique: true } )
This index would permit the insertion of documents that had unique values for the score field or did not include a
score field. As such, given the existing documents in the scores collection, the index permits the following insert
operations (page 99):
db.scores.insert( { "userid": "AAAAAAA", "score": 43 } )
db.scores.insert( { "userid": "BBBBBBB", "score": 34 } )
db.scores.insert( { "userid": "CCCCCCC" } )
db.scores.insert( { "userid": "DDDDDDD" } )
However, the index would not permit the addition of the following documents since documents already exist with
score values of 82 and 90:
db.scores.insert( { "userid": "AAAAAAA", "score": 82 } )
db.scores.insert( { "userid": "BBBBBBB", "score": 90 } )
On this page
Background Construction (page 522)
Index Names (page 523)
MongoDB provides several options that only affect the creation of the index. Specify these options in a document as
the second argument to the db.collection.createIndex() method. This section describes the uses of these
creation options and their behavior.
Related
Some options that you can specify to createIndex() control the properties of the index (page 512), which
are not index creation options. For example, the unique (page 514) option affects the behavior of the index after
creation.
For a detailed description of MongoDB's index types, see Index Types (page 492); for index properties, see
Index Properties (page 512).
Background Construction
By default, creating an index blocks all other operations on a database. When building an index on a collection, the
database that holds the collection is unavailable for read or write operations until the index build completes. Any
operation that requires a read or write lock on all databases (e.g. listDatabases) will wait for the foreground index
build to complete.
For potentially long running index building operations, consider building the index in the background so that the MongoDB
database remains available during the index building operation. For example, to create an index in the background on
the zipcode field of the people collection, issue the following:
db.people.createIndex( { zipcode: 1 }, { background: true } )
Behavior
As of MongoDB version 2.4, a mongod instance can build more than one index in the background concurrently.
Changed in version 2.4: Before 2.4, a mongod instance could only build one background index per database at a time.
Changed in version 2.2: Before 2.2, a single mongod instance could only build one index at a time.
Background indexing operations run in the background so that other database operations can run while creating the
index. However, the mongo shell session or connection where you are creating the index will block until the index
build is complete. To continue issuing commands to the database, open another connection or mongo instance.
Queries will not use partially-built indexes: the index will only be usable once the index build is complete.
Note: If MongoDB is building an index in the background, you cannot perform other administrative
operations involving that collection, including running repairDatabase, dropping the collection (i.e.
db.collection.drop()), and running compact. These operations will return an error during background
index builds.
Performance
The background index operation uses an incremental approach that is slower than the normal foreground index
builds. If the index is larger than the available RAM, then the incremental process can take much longer than the
foreground build.
If your application includes createIndex() operations, and an index doesn't exist for other operational concerns,
building the index can have a severe impact on the performance of the database.
To avoid performance issues, make sure that your application checks for the indexes at start up using the
getIndexes() method or the equivalent method for your driver (see footnote 7) and terminates if the proper indexes do not
exist. Always build indexes in production instances using separate application code, during designated maintenance
windows.
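A start-up check of the kind recommended above might compare the key specifications an application needs against the output of getIndexes(). The sketch below is a minimal illustration in plain JavaScript: missingIndexes is a hypothetical helper, the sample array only imitates the shape of getIndexes() output (documents with key and name fields), and the required specifications are invented for the example.

```javascript
// Returns the required key specifications that have no matching index.
// existingIndexes has the shape returned by getIndexes(): documents with a "key" field.
// Key order matters for indexes, so specifications are compared as ordered JSON strings.
function missingIndexes(existingIndexes, requiredKeys) {
  const existing = new Set(existingIndexes.map((ix) => JSON.stringify(ix.key)));
  return requiredKeys.filter((key) => !existing.has(JSON.stringify(key)));
}

// Hypothetical getIndexes() output for a "people" collection.
const peopleIndexes = [
  { v: 1, key: { _id: 1 }, name: "_id_", ns: "test.people" },
  { v: 1, key: { zipcode: 1 }, name: "zipcode_1", ns: "test.people" }
];
const missing = missingIndexes(peopleIndexes, [{ zipcode: 1 }, { lastName: 1, firstName: 1 }]);
console.log(JSON.stringify(missing)); // [{"lastName":1,"firstName":1}]
// An application could log the missing specifications and exit instead of building them itself.
```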
If a background index build is in progress when the mongod process terminates, when the instance restarts the index
build will restart as a foreground index build. If the index build encounters any errors, such as a duplicate key error, the
mongod will exit with an error.
To start the mongod after a failed index build, use the storage.indexBuildRetry setting or the
--noIndexBuildRetry option to skip the index build on start up.
Changed in version 2.6: Secondary members can now build indexes in the background. Previously all index builds on
secondaries were in the foreground.
Background index operations on replica set secondaries begin after the primary completes building the index. If
MongoDB builds an index in the background on the primary, the secondaries will then build that index in the
background.
To build large indexes on secondaries the best approach is to restart one secondary at a time in standalone mode and
build the index. After building the index, restart as a member of the replica set, allow it to catch up with the other
members of the set, and then build the index on the next secondary. When all the secondaries have the new index, step
down the primary, restart it as a standalone, and build the index on the former primary.
The amount of time required to build the index on a secondary must be within the window of the oplog, so that the
secondary can catch up with the primary.
Indexes on secondary members in recovering mode are always built in the foreground to allow them to catch up as
soon as possible.
See Build Indexes on Replica Sets (page 537) for a complete procedure for building indexes on secondaries.
Index Names
The default name for an index is the concatenation of the indexed keys and each key's direction in the index (1 or -1).
Example
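Issue the following command to create an index on item and quantity:
db.products.createIndex( { item: 1, quantity: -1 } )
The resulting index is named: item_1_quantity_-1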
7 https://api.mongodb.org/
Optionally, you can specify a name for an index instead of using the default name.
Example
Issue the following command to create an index on item and quantity and specify inventory as the index
name:
db.products.createIndex( { item: 1, quantity: -1 } , { name: "inventory" } )
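The default naming rule described above (each key followed by its direction, joined with underscores) is simple enough to sketch in plain JavaScript. The function name defaultIndexName is an assumption for illustration, and the sketch relies on JavaScript preserving the insertion order of non-numeric object keys.

```javascript
// Derives a default-style index name from a key document:
// each field and its direction, joined by underscores.
function defaultIndexName(keys) {
  return Object.entries(keys)
    .map(([field, direction]) => `${field}_${direction}`)
    .join("_");
}

console.log(defaultIndexName({ item: 1, quantity: -1 })); // item_1_quantity_-1
```

Passing the name option, as in the inventory example, replaces this derived name with the one you supply.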
On this page
Index Prefix Intersection (page 524)
Index Intersection and Compound Indexes (page 525)
Index Intersection and Sort (page 525)
MongoDB can use the intersection of the two indexes to support the following query:
db.orders.find( { item: "abc123", qty: { $gt: 15 } } )
To determine if MongoDB used index intersection, run explain(); the results of explain() will include either an
AND_SORTED stage or an AND_HASH stage.
With index intersection, MongoDB can use an intersection of either the entire index or the index prefix. An index
prefix is a subset of a compound index, consisting of one or more keys starting from the beginning of the index.
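As an illustration of the definition above, the following sketch lists the prefixes of a compound index specification. The helper indexPrefixes is hypothetical, and it assumes that the entire index is not counted as one of its own prefixes.

```javascript
// Sketch: enumerate the prefixes of a compound index specification.
// indexPrefixes is a hypothetical helper, not a MongoDB API.
// Assumption: the full index itself is not reported as a prefix.
function indexPrefixes(spec) {
  const entries = Object.entries(spec);
  const prefixes = [];
  for (let n = 1; n < entries.length; n++) {
    // Take the first n keys, preserving their order and direction.
    prefixes.push(Object.fromEntries(entries.slice(0, n)));
  }
  return prefixes;
}

const prefixes = indexPrefixes({ status: 1, ord_date: -1 });
// prefixes is [ { status: 1 } ]
```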
Consider a collection orders with the following indexes:
{ qty: 1 }
{ status: 1, ord_date: -1 }
8 In previous versions, MongoDB could use only a single index to fulfill most queries. The exception to this is queries with $or clauses, which
To fulfill the following query which specifies a condition on both the qty field and the status field, MongoDB can
use the intersection of the two indexes:
db.orders.find( { qty: { $gt: 10 } , status: "A" } )
Index intersection does not eliminate the need for creating compound indexes (page 495). However, because both the
list order (i.e. the order in which the keys are listed in the index) and the sort order (i.e. ascending or descending)
matter in compound indexes (page 495), a compound index may not support a query condition that does not include
the index prefix keys (page 496) or that specifies a different sort order.
For example, if a collection orders has the following compound index, with the status field listed before the
ord_date field:
{ status: 1, ord_date: -1 }
The two indexes can, either individually or through index intersection, support all four aforementioned queries.
The choice between creating compound indexes that support your queries or relying on index intersection depends on
the specifics of your system.
See also:
compound indexes (page 495), Create Compound Indexes to Support Several Different Queries (page 574)
Index intersection does not apply when the sort() operation requires an index completely separate from the query
predicate.
For example, the orders collection has the following indexes:
{ qty: 1 }
{ status: 1, ord_date: -1 }
{ status: 1 }
{ ord_date: -1 }
MongoDB cannot use index intersection for the following query with sort:
That is, MongoDB does not use the { qty: 1 } index for the query, and the separate { status: 1 } or the
{ status: 1, ord_date: -1 } index for the sort.
However, MongoDB can use index intersection for the following query with sort since the index { status: 1,
ord_date: -1 } can fulfill part of the query predicate.
db.orders.find( { qty: { $gt: 10 } , status: "A" } ).sort( { ord_date: -1 } )
On this page
Intersect Bounds for Multikey Index (page 526)
Compound Bounds for Multikey Index (page 527)
The bounds of an index scan define the portions of an index to search during a query. When multiple predicates over an
index exist, MongoDB will attempt to combine the bounds for these predicates by either intersection or compounding
in order to produce a scan with smaller bounds.
Bounds intersection refers to a logical conjunction (i.e. AND) of multiple bounds. For instance, given two bounds [ [
3, Infinity ] ] and [ [ -Infinity, 6 ] ], the intersection of the bounds results in [ [ 3, 6 ] ].
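The intersection of two bounds can be sketched as taking the larger lower limit and the smaller upper limit. The helper intersectBounds is hypothetical and handles only a single inclusive numeric interval per bound, which is a simplification of the bounds format shown above.

```javascript
// Sketch: intersect two inclusive numeric intervals [low, high].
// intersectBounds is a hypothetical helper, not a MongoDB API.
function intersectBounds(a, b) {
  const low = Math.max(a[0], b[0]);
  const high = Math.min(a[1], b[1]);
  // Return null when the intervals do not overlap.
  return low <= high ? [low, high] : null;
}

const combined = intersectBounds([3, Infinity], [-Infinity, 6]);
// combined is [ 3, 6 ]
```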
Given an indexed (page 497) array field, consider a query that specifies multiple predicates on the array and can use
a multikey index (page 497). MongoDB can intersect multikey index (page 497) bounds if an $elemMatch joins the
predicates.
For example, a collection survey contains documents with a field item and an array field ratings:
{ _id: 1, item: "ABC", ratings: [ 2, 9 ] }
{ _id: 2, item: "XYZ", ratings: [ 4, 3 ] }
The following query uses $elemMatch to require that the array contains at least one single element that matches
both conditions:
db.survey.find( { ratings : { $elemMatch: { $gte: 3, $lte: 6 } } } )
If the query does not join the conditions on the array field with $elemMatch, MongoDB cannot intersect the multikey
index bounds. Consider the following query:
The query searches the ratings array for at least one element greater than or equal to 3 and at least one element
less than or equal to 6. Because a single element does not need to meet both criteria, MongoDB does not intersect the
bounds and uses either [ [ 3, Infinity ] ] or [ [ -Infinity, 6 ] ]. MongoDB makes no guarantee
as to which of these two bounds it chooses.
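The difference between the two query forms can be simulated in plain JavaScript over the sample survey documents. This is only a sketch of the matching semantics, not of the index scan, and it assumes the query without $elemMatch has the form ratings: { $gte: 3, $lte: 6 }.

```javascript
// Sample documents from the survey collection above.
const survey = [
  { _id: 1, item: "ABC", ratings: [2, 9] },
  { _id: 2, item: "XYZ", ratings: [4, 3] }
];

const gte3 = (x) => x >= 3;
const lte6 = (x) => x <= 6;

// With $elemMatch: a single array element must satisfy both conditions.
const withElemMatch = survey
  .filter((doc) => doc.ratings.some((r) => gte3(r) && lte6(r)))
  .map((doc) => doc._id);
// withElemMatch is [ 2 ]

// Without $elemMatch: each condition may be satisfied by a different element.
const withoutElemMatch = survey
  .filter((doc) => doc.ratings.some(gte3) && doc.ratings.some(lte6))
  .map((doc) => doc._id);
// withoutElemMatch is [ 1, 2 ]
```

Document 1 matches only the second form, because 9 satisfies the lower limit and 2 satisfies the upper limit, which is why the two bounds cannot be intersected in that case.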
Compounding bounds refers to using bounds for multiple keys of a compound index (page 495). For instance, given a
compound index { a: 1, b: 1 } with bounds on field a of [ [ 3, Infinity ] ] and bounds on field
b of [ [ -Infinity, 6 ] ], compounding the bounds results in the use of both bounds:
{ a: [ [ 3, Infinity ] ], b: [ [ -Infinity, 6 ] ] }
If MongoDB cannot compound the two bounds, MongoDB always constrains the index scan by the bound on its
leading field, in this case, a: [ [ 3, Infinity ] ].
Consider a compound multikey index; i.e. a compound index (page 495) where one of the indexed fields is an array.
For example, a collection survey contains documents with a field item and an array field ratings:
{ _id: 1, item: "ABC", ratings: [ 2, 9 ] }
{ _id: 2, item: "XYZ", ratings: [ 4, 3 ] }
Create a compound index (page 495) on the item field and the ratings field:
db.survey.createIndex( { item: 1, ratings: 1 } )
If an array contains embedded documents, to index on fields contained in the embedded documents, use the dotted
field name (page 189) in the index specification. For instance, given the following array of embedded documents:
ratings: [ { score: 2, by: "mn" }, { score: 9, by: "anon" } ]
Compound Bounds of Non-array Field and Field from an Array
Consider a collection survey2 that contains documents with a field item and an array field ratings:
{
_id: 1,
item: "ABC",
ratings: [ { score: 2, by: "mn" }, { score: 9, by: "anon" } ]
}
{
_id: 2,
item: "XYZ",
ratings: [ { score: 5, by: "anon" }, { score: 7, by: "wv" } ]
}
Create a compound index (page 495) on the non-array field item as well as two fields from an array
ratings.score and ratings.by:
db.survey2.createIndex( { "item": 1, "ratings.score": 1, "ratings.by": 1 } )
Or, MongoDB may choose to compound the item bounds with "ratings.by" bounds:
{
"item" : [ [ "XYZ", "XYZ" ] ],
"ratings.score" : [ [ MinKey, MaxKey ] ],
"ratings.by" : [ [ "anon", "anon" ] ]
}
However, to compound the bounds for "ratings.score" with the bounds for "ratings.by", the query must
use $elemMatch. See Compound Bounds of Index Fields from an Array (page 528) for more information.
Compound Bounds of Index Fields from an Array
To compound together the bounds for index keys from the same array:
the index keys must share the same field path up to but excluding the field names, and
the query must specify predicates on the fields using $elemMatch on that path.
For a field in an embedded document, the dotted field name (page 189), such as "a.b.c.d", is the field path for
d. To compound the bounds for index keys from the same array, the $elemMatch must be on the path up to but
excluding the field name itself; i.e. "a.b.c".
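The path rule above can be sketched as follows. Both helpers are hypothetical: parentPath drops the final field name from a dotted field name, and sharedElemMatchPath returns the path on which $elemMatch would have to be specified, or null when the index keys do not share one.

```javascript
// Sketch: the field path "up to but excluding the field name itself".
// parentPath and sharedElemMatchPath are hypothetical helpers.
function parentPath(dottedField) {
  return dottedField.split(".").slice(0, -1).join(".");
}

// Returns the common parent path of all index keys, or null if they differ.
function sharedElemMatchPath(indexKeys) {
  const parents = indexKeys.map(parentPath);
  return parents.every((p) => p === parents[0]) ? parents[0] : null;
}

const pathForD = parentPath("a.b.c.d");
// pathForD is "a.b.c"
const pathForRatings = sharedElemMatchPath(["ratings.score", "ratings.by"]);
// pathForRatings is "ratings"
```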
For instance, create a compound index (page 495) on the ratings.score and the ratings.by fields:
db.survey2.createIndex( { "ratings.score": 1, "ratings.by": 1 } )
The fields "ratings.score" and "ratings.by" share the field path ratings. The following query uses
$elemMatch on the field ratings to require that the array contains at least one single element that matches both
conditions:
db.survey2.find( { ratings: { $elemMatch: { score: { $lte: 5 }, by: "anon" } } } )
Query Without $elemMatch
If the query does not join the conditions on the indexed array fields with
$elemMatch, MongoDB cannot compound their bounds. Consider the following query:
db.survey2.find( { "ratings.score": { $lte: 5 }, "ratings.by": "anon" } )
Because a single embedded document in the array does not need to meet both criteria, MongoDB does not compound
the bounds. When using a compound index, if MongoDB cannot constrain all the fields of the index, MongoDB
always constrains the leading field of the index, in this case "ratings.score":
{
"ratings.score": [ [ -Infinity, 5 ] ],
"ratings.by": [ [ MinKey, MaxKey ] ]
}
$elemMatch on Incomplete Path
If the query does not specify $elemMatch on the path of the embedded fields,
up to but excluding the field names, MongoDB cannot compound the bounds of index keys from the same array.
For example, a collection survey3 contains documents with a field item and an array field ratings:
{
_id: 1,
item: "ABC",
ratings: [ { score: { q1: 2, q2: 5 } }, { score: { q1: 8, q2: 4 } } ]
}
{
_id: 2,
item: "XYZ",
ratings: [ { score: { q1: 7, q2: 8 } }, { score: { q1: 9, q2: 5 } } ]
}
Create a compound index (page 495) on the ratings.score.q1 and the ratings.score.q2 fields:
db.survey3.createIndex( { "ratings.score.q1": 1, "ratings.score.q2": 1 } )
The fields "ratings.score.q1" and "ratings.score.q2" share the field path "ratings.score" and
the $elemMatch must be on that path.
The following query, however, uses an $elemMatch but not on the required path:
db.survey3.find( { ratings: { $elemMatch: { 'score.q1': 2, 'score.q2': 8 } } } )
As such, MongoDB cannot compound the bounds, and the "ratings.score.q2" field will be unconstrained
during the index scan. To compound the bounds, the query must use $elemMatch on the path "ratings.score":
db.survey3.find( { 'ratings.score': { $elemMatch: { 'q1': 2, 'q2': 8 } } } )
Compound $elemMatch Clauses
Consider a query that contains multiple $elemMatch clauses on different field
paths, for instance, "a.b": { $elemMatch: ... }, "a.c": { $elemMatch: ... }. MongoDB
cannot combine the bounds of "a.b" with the bounds of "a.c" since "a.b" and "a.c" also require
$elemMatch on the path a.
For example, a collection survey4 contains documents with a field item and an array field ratings:
{
_id: 1,
item: "ABC",
ratings: [
{ score: { q1: 2, q2: 5 }, certainty: { q1: 2, q2: 3 } },
{ score: { q1: 8, q2: 4 }, certainty: { q1: 10, q2: 10 } }
]
}
{
_id: 2,
item: "XYZ",
ratings: [
{ score: { q1: 7, q2: 8 }, certainty: { q1: 5, q2: 5 } },
{ score: { q1: 9, q2: 5 }, certainty: { q1: 7, q2: 7 } }
]
}
Create a compound index (page 495) on the ratings.score.q1, ratings.score.q2, ratings.certainty.q1, and
ratings.certainty.q2 fields:
db.survey4.createIndex( {
"ratings.score.q1": 1,
"ratings.score.q2": 1,
"ratings.certainty.q1": 1,
"ratings.certainty.q2": 1
} )
the bounds for the "ratings.certainty" predicate are the compound bounds:
{ "ratings.certainty.q1" : [ [ 7, 7 ] ], "ratings.certainty.q2" : [ [ 7, 7 ] ] }
However, MongoDB cannot compound the bounds for "ratings.score" and "ratings.certainty"
since $elemMatch does not join the two. Instead, MongoDB constrains the leading field of the index
"ratings.score.q1" which can be compounded with the bounds for "ratings.score.q2":
{
"ratings.score.q1" : [ [ 5, 5 ] ],
"ratings.score.q2" : [ [ 5, 5 ] ],
"ratings.certainty.q1" : [ [ MinKey, MaxKey ] ],
"ratings.certainty.q2" : [ [ MinKey, MaxKey ] ]
}
Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the
documents in a collection.
The documents in this section outline specific tasks related to building and maintaining indexes for data in MongoDB
collections and discuss strategies and practical approaches. For a conceptual overview of MongoDB indexing, see
the Index Concepts (page 492) document.
Index Creation Tutorials (page 531) Create and configure different types of indexes for different purposes.
Index Management Tutorials (page 541) Monitor and assess index performance and rebuild indexes as needed.
Geospatial Index Tutorials (page 546) Create indexes that support data stored as GeoJSON objects and legacy
coordinate pairs.
Text Search Tutorials (page 563) Build and configure indexes that support full-text searches.
Indexing Strategies (page 573) The factors that affect index performance and practical approaches to indexing in
MongoDB.
Instructions for creating and configuring indexes in MongoDB and building indexes on replica sets and sharded
clusters.
Create an Index (page 532) Build an index for any field on a collection.
Create a Compound Index (page 533) Build an index of multiple fields on a collection.
Create a Unique Index (page 534) Build an index that enforces unique values for the indexed field or fields.
Create a Partial Index (page 535) Build an index that only indexes documents that meet specified filter criteria. This
can reduce index size and improve performance.
Create a Sparse Index (page 536) Build an index that omits references to documents that do not include the indexed
field. This saves space when indexing fields that are present in only some documents.
Create a Hashed Index (page 537) Compute a hash of the value of a field in a collection and index the hashed value.
These indexes permit equality queries and may be suitable shard keys for some collections.
Build Indexes on Replica Sets (page 537) To build indexes on a replica set, you build the indexes separately on the
primary and the secondaries, as described here.
Build Indexes in the Background (page 539) Background index construction allows read and write operations to
continue while building the index, but takes longer to complete and results in a larger index.
Build Old Style Indexes (page 540) A {v : 0} index is necessary if you need to roll back from MongoDB version
2.0 (or later) to MongoDB version 1.8.
Create an Index
On this page
Create an Index on a Single Field (page 532)
Additional Considerations (page 533)
Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the
documents in a collection. Users can create indexes for any collection on any field in a document. By default,
MongoDB creates an index on the _id field of every collection.
This tutorial describes how to create an index on a single field. MongoDB also supports compound indexes (page 495),
which are indexes on multiple fields. See Create a Compound Index (page 533) for instructions on building compound
indexes.
To create an index, use createIndex() or a similar method from your driver [10]. The createIndex() method
only creates an index if an index of the same specification does not already exist.
For example, the following operation creates an index on the userid field of the records collection:
db.records.createIndex( { userid: 1 } )
The value of the field in the index specification describes the kind of index for that field. For example, a value of 1
specifies an index that orders items in ascending order. A value of -1 specifies an index that orders items in descending
order. For additional index types, see Index Types (page 492).
The created index will support queries that select on the field userid, such as the following:
db.records.find( { userid: 2 } )
db.records.find( { userid: { $gt: 10 } } )
But the created index does not support the following query on the profile_url field:
db.records.find( { profile_url: 2 } )
For queries that cannot use an index, MongoDB must scan all documents in a collection for documents that match the
query.
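As a deliberately simplified sketch of the distinction above, the following helper checks whether a query names the indexed field at all. The helper queryCanUseIndex is hypothetical; the actual query planner considers far more than this, so treat it only as an illustration of why the profile_url query falls back to a collection scan.

```javascript
// Sketch: a single-field index can only help a query that selects on
// the indexed field. queryCanUseIndex is a hypothetical helper and is
// a strong simplification of real index selection.
function queryCanUseIndex(indexedField, query) {
  return Object.prototype.hasOwnProperty.call(query, indexedField);
}

const equalityOnUserid = queryCanUseIndex("userid", { userid: 2 });
// equalityOnUserid is true
const rangeOnUserid = queryCanUseIndex("userid", { userid: { $gt: 10 } });
// rangeOnUserid is true
const onProfileUrl = queryCanUseIndex("userid", { profile_url: 2 });
// onProfileUrl is false
```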
10 https://api.mongodb.org/
Additional Considerations
Although indexes can improve query performance, indexes also present some operational considerations. See
Operational Considerations for Indexes (page 166) for more information.
If your collection holds a large amount of data, and your application needs to be able to access the data while building
the index, consider building the index in the background, as described in Background Construction (page 522). To
build indexes on replica sets, see the Build Indexes on Replica Sets (page 537) section for more information.
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 537).
Some drivers may specify indexes using NumberLong(1) rather than 1 as the specification. This does not have any
effect on the resulting index.
See also:
Create a Compound Index (page 533), Indexing Tutorials (page 531) and Index Concepts (page 492) for more
information.
On this page
Build a Compound Index (page 533)
Example (page 533)
Additional Considerations (page 534)
Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the
documents in a collection. MongoDB supports indexes that include content on a single field, as well as compound
indexes (page 495) that include content from multiple fields. Continue reading for instructions and examples of
building a compound index.
To create a compound index (page 495) use an operation that resembles the following prototype:
db.collection.createIndex( { a: 1, b: 1, c: 1 } )
The value of the field in the index specification describes the kind of index for that field. For example, a value of 1
specifies an index that orders items in ascending order. A value of -1 specifies an index that orders items in descending
order. For additional index types, see Index Types (page 492).
Example
The following operation will create an index on the item, category, and price fields of the products
collection:
db.products.createIndex( { item: 1, category: 1, price: 1 } )
Additional Considerations
If your collection holds a large amount of data, and your application needs to be able to access the data while building
the index, consider building the index in the background, as described in Background Construction (page 522). To
build indexes on replica sets, see the Build Indexes on Replica Sets (page 537) section for more information.
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 537).
Some drivers may specify indexes using NumberLong(1) rather than 1 as the specification. This does not have any
effect on the resulting index.
See also:
Create an Index (page 532), Indexing Tutorials (page 531) and Index Concepts (page 492) for more information.
On this page
Unique Index on a Single Field (page 534)
Unique Compound Index (page 534)
Unique Index and Missing Field (page 535)
MongoDB allows you to specify a unique constraint (page 514) on an index. These constraints prevent applications
from inserting documents that have duplicate values for the indexed fields.
MongoDB cannot create a unique index (page 514) on the specified index field(s) if the collection already contains
data that would violate the unique constraint for the index.
For example, you may want to create a unique index on the "tax-id" field of the accounts collection to prevent
storing multiple account records for the same legal entity:
db.accounts.createIndex( { "tax-id": 1 }, { unique: true } )
The _id index (page 494) is a unique index. In some situations you may consider using the _id field itself for this
kind of data rather than using a unique index on another field.
You can also enforce a unique constraint on compound indexes (page 495), as in the following prototype:
db.collection.createIndex( { a: 1, b: 1 }, { unique: true } )
These indexes enforce uniqueness for the combination of index keys and not for either key individually.
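The effect of enforcing uniqueness on the combination of keys can be simulated in plain JavaScript. The helper findDuplicateKeyViolations is hypothetical and the sample documents are made up for illustration; it is a sketch of the constraint, not of how MongoDB implements it.

```javascript
// Sketch: report documents that would violate a unique constraint on
// the given combination of fields. Hypothetical helper and sample data.
function findDuplicateKeyViolations(docs, fields) {
  const seen = new Set();
  const rejected = [];
  for (const doc of docs) {
    // Build one key from all indexed fields; a missing field counts as null.
    const key = JSON.stringify(fields.map((f) => (doc[f] === undefined ? null : doc[f])));
    if (seen.has(key)) {
      rejected.push(doc);
    } else {
      seen.add(key);
    }
  }
  return rejected;
}

const sampleDocs = [{ a: 1, b: 1 }, { a: 1, b: 2 }, { a: 2, b: 1 }, { a: 1, b: 1 }];

// Unique on the combination { a, b }: only the repeated pair is rejected.
const rejectedForCompound = findDuplicateKeyViolations(sampleDocs, ["a", "b"]);
// rejectedForCompound.length is 1

// Unique on a alone would be stricter and reject every repeated value of a.
const rejectedForSingle = findDuplicateKeyViolations(sampleDocs, ["a"]);
// rejectedForSingle.length is 2
```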
If a document does not have a value for a field, the index entry for that item will be null in any index that includes it.
Thus, in many situations you will want to combine the unique constraint with the sparse option. Sparse indexes
(page 519) skip over any document that is missing the indexed field, rather than storing null for the index entry.
Since unique indexes cannot have duplicate values for a field, without the sparse option, MongoDB will reject the
second document and all subsequent documents without the indexed field. Consider the following prototype.
db.collection.createIndex( { a: 1 }, { unique: true, sparse: true } )
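The interaction between the unique constraint and the sparse option can be sketched with a small simulation. The helper simulateUniqueInserts and the sample documents are hypothetical; the sketch only models the behavior described above, where a document without the indexed field gets a null index entry unless the index is sparse.

```javascript
// Sketch: simulate inserts against a unique index on one field, with
// and without the sparse option. Hypothetical helper and sample data.
function simulateUniqueInserts(docs, field, sparse) {
  const seen = new Set();
  const accepted = [];
  const rejected = [];
  for (const doc of docs) {
    const missing = !(field in doc);
    if (missing && sparse) {
      // A sparse index skips documents that lack the field entirely.
      accepted.push(doc._id);
      continue;
    }
    // Without sparse, a missing field is indexed as null.
    const key = JSON.stringify(missing ? null : doc[field]);
    if (seen.has(key)) {
      rejected.push(doc._id);
    } else {
      seen.add(key);
      accepted.push(doc._id);
    }
  }
  return { accepted, rejected };
}

const inserts = [{ _id: 1, a: "x" }, { _id: 2 }, { _id: 3 }, { _id: 4, a: "x" }];

const uniqueOnly = simulateUniqueInserts(inserts, "a", false);
// uniqueOnly.rejected is [ 3, 4 ]: the second document without a is rejected
const uniqueSparse = simulateUniqueInserts(inserts, "a", true);
// uniqueSparse.rejected is [ 4 ]: only the true duplicate is rejected
```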
On this page
Prototype (page 535)
Example (page 536)
Considerations (page 536)
Prototype
To create a partial index (page 515) on a field, use the partialFilterExpression option when creating the
index, as in the following:
db.collection.createIndex(
{ a: 1 },
{ partialFilterExpression: { b: { $gt: 5 } } }
)
The partialFilterExpression option accepts a document that specifies the filter condition using:
equality expressions (i.e. field: value or using the $eq operator),
$exists: true expression,
$gt, $gte, $lt, $lte expressions,
$type expressions,
$and operator at the top-level only
Example
The following operation creates a partial index on the users collection that only includes a document in the index if
the archived field is false.
db.users.createIndex( { username: 1 }, { partialFilterExpression: { archived: false } } )
The index only includes documents where the archived field is false.
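Which documents end up in a partial index can be sketched by applying the filter { archived: false } to sample documents. The helper matchesFilter and the sample users are hypothetical, and the sketch handles only equality conditions and a lone $gt condition, not the full set of supported expressions.

```javascript
// Sketch: decide whether a document satisfies a partial filter expression.
// matchesFilter is a hypothetical helper that supports only equality
// and a single $gt condition per field.
function matchesFilter(doc, filter) {
  return Object.entries(filter).every(([field, cond]) => {
    if (cond !== null && typeof cond === "object") {
      if ("$gt" in cond) return doc[field] > cond.$gt;
      throw new Error("operator not handled in this sketch");
    }
    // Equality: a missing field does not equal false.
    return doc[field] === cond;
  });
}

// Hypothetical sample documents for the users collection.
const users = [
  { username: "ann", archived: false },
  { username: "bob", archived: true },
  { username: "cat" }
];

const indexedUsernames = users
  .filter((user) => matchesFilter(user, { archived: false }))
  .map((user) => user.username);
// indexedUsernames is [ "ann" ]
```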
Considerations
Note: To use the partial index, a query must contain the filter expression (or a modified filter expression that specifies
a subset of the filter expression) as part of its query condition. As such, MongoDB will not use the partial index if the
index results in an incomplete result set for the query or sort operation.
On this page
Prototype (page 536)
Example (page 536)
Considerations (page 537)
Important: Changed in version 3.2: Partial indexes (page 515) offer a superset of the functionality of sparse indexes.
If you are using MongoDB 3.2 or later, you should use partial indexes (page 515) rather than sparse indexes.
Sparse indexes omit references to documents that do not include the indexed field. For fields that are only present
in some documents, sparse indexes may provide significant space savings. See Sparse Indexes (page 519) for more
information about sparse indexes and their use.
See also:
Index Concepts (page 492) and Indexing Tutorials (page 531) for more information.
Prototype
To create a sparse index (page 519) on a field, use an operation that resembles the following prototype:
db.collection.createIndex( { a: 1 }, { sparse: true } )
Example
The following operation creates a sparse index on the users collection that only includes a document in the index if
the twitter_name field exists in a document.
db.users.createIndex( { twitter_name: 1 }, { sparse: true } )
The index excludes all documents that do not include the twitter_name field.
Considerations
Note: Sparse indexes can affect the results returned by the query, particularly with respect to sorts on fields not
included in the index. See the sparse index (page 519) section for more information.
On this page
Procedure (page 537)
Considerations (page 537)
Tip
MongoDB automatically computes the hashes when resolving queries using hashed indexes. Applications do not need
to compute hashes.
See
Hashed Shard Keys (page 740) for more information about hashed indexes in sharded clusters, as well as
Index Concepts (page 492) and Indexing Tutorials (page 531) for more information about indexes.
Procedure
To create a hashed index (page 512), specify hashed as the value of the index key, as in the following example:
Example
Specify a hashed index on _id
db.collection.createIndex( { _id: "hashed" } )
Considerations
MongoDB supports hashed indexes of any single field. The hashing function collapses embedded documents and
computes the hash for the entire value, but does not support multi-key (i.e. arrays) indexes.
You may not create compound indexes that have hashed index fields.
On this page
Considerations (page 538)
Procedure (page 538)
For replica sets, secondaries will begin building indexes after the primary finishes building the index. In sharded
clusters, the mongos will send createIndex() to the primary members of the replica set for each shard, which
then replicate to the secondaries after the primary finishes building the index.
To minimize the impact of building an index on your replica set, use the following procedure to build indexes:
See
Indexing Tutorials (page 531) and Index Concepts (page 492) for more information.
Considerations
Ensure that your oplog is large enough to permit the indexing or re-indexing operation to complete without
falling too far behind to catch up. See the oplog sizing (page 647) documentation for additional information.
This procedure does take one member out of the replica set at a time. However, this procedure will only affect
one member of the set at a time rather than all secondaries at the same time.
Before version 2.6, background index creation operations (page 522) became foreground indexing operations
on secondary members of replica sets. In version 2.6 and later, background index builds replicate as background
index builds on the secondaries.
Procedure
Note: If you need to build an index in a sharded cluster, repeat the following procedure for each replica set that
provides each shard.
Stop One Secondary
Stop the mongod process on one secondary. Restart the mongod process without the
--replSet option and running on a different port. [11] This instance is now in standalone mode.
For example, if your mongod normally runs on the default port of 27017 with the --replSet option, you
would use the following invocation:
mongod --port 47017
Build the Index
Create the new index using the createIndex() method in the mongo shell, or a comparable method in
your driver. This operation will create or rebuild the index on this mongod instance.
For example, to create an ascending index on the username field of the records collection, use the following
mongo shell operation:
db.records.createIndex( { username: 1 } )
11 By running the mongod on a different port, you ensure that the other members of the replica set and all clients will not contact the member
See also:
Create an Index (page 532) and Create a Compound Index (page 533) for more information.
Restart the Program mongod
When the index build completes, start the mongod instance with the --replSet
option on its usual port:
mongod --port 27017 --replSet rs0
Modify the port number (e.g. 27017) or the replica set name (e.g. rs0) as needed.
Allow replication to catch up on this member.
Build Indexes on all Secondaries
Changed in version 2.6: Secondary members can now build indexes in the background (page 539).
Previously all index builds on secondaries were in the foreground.
For each secondary in the set, build an index according to the following steps:
1. Stop One Secondary (page 538)
2. Build the Index (page 538)
3. Restart the Program mongod (page 539)
Build the Index on the Primary
To build an index on the primary you can either:
1. Build the index in the background (page 539) on the primary.
2. Step down the primary using the rs.stepDown() method in the mongo shell to cause the current primary to
gracefully become a secondary and allow the set to elect another member as primary.
Then repeat the index building procedure, listed below, to build the index on the primary:
(a) Stop One Secondary (page 538)
(b) Build the Index (page 538)
(c) Restart the Program mongod (page 539)
Building the index in the background takes longer than a foreground index build and results in a less compact index
structure. Additionally, the background index build may impact write performance on the primary. However, building
the index in the background allows the set to be continuously up for write operations while MongoDB builds the index.
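The order of operations for the step-down variant of this procedure can be sketched as a plan generated from a list of members. The helper rollingIndexBuildPlan, the member document shape, and the host names are all hypothetical; the sketch only produces a textual checklist and does not contact any mongod instance.

```javascript
// Sketch: produce the ordered steps for building an index one member
// at a time. Hypothetical helper; host names and the member shape
// { host, state } are made up for illustration.
function rollingIndexBuildPlan(members) {
  const secondaries = members.filter((m) => m.state === "SECONDARY");
  const primary = members.find((m) => m.state === "PRIMARY");
  const steps = [];
  for (const member of secondaries) {
    // One secondary at a time: standalone restart, build, rejoin, catch up.
    steps.push(`restart ${member.host} as standalone, build index, rejoin set, wait for catch-up`);
  }
  if (primary) {
    // The former primary is handled last, after it steps down.
    steps.push(`step down ${primary.host} with rs.stepDown()`);
    steps.push(`restart ${primary.host} as standalone, build index, rejoin set, wait for catch-up`);
  }
  return steps;
}

const plan = rollingIndexBuildPlan([
  { host: "db0.example.net", state: "PRIMARY" },
  { host: "db1.example.net", state: "SECONDARY" },
  { host: "db2.example.net", state: "SECONDARY" }
]);
// plan has 4 steps: two secondaries first, then the step-down and the former primary
```

For a sharded cluster, the same plan would be generated once per replica set that provides a shard.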
On this page
Considerations (page 540)
Procedure (page 540)
By default, MongoDB builds indexes in the foreground, which prevents all read and write operations to the database
while the index builds. Also, no operation that requires a read or write lock on all databases (e.g. listDatabases) can
occur during a foreground index build.
Background index construction (page 522) allows read and write operations to continue while building the index.
See also:
Index Concepts (page 492) and Indexing Tutorials (page 531) for more information.
Considerations
Background index builds take longer to complete and result in an index that is initially larger, or less compact, than an
index built in the foreground. Over time, the compactness of indexes built in the background will approach foreground-
built indexes.
After MongoDB finishes building the index, background-built indexes are functionally identical to any other index.
Procedure
To create an index in the background, add the background argument to the createIndex() operation, as in the
following example:
db.collection.createIndex( { a: 1 }, { background: true } )
See the section on background index construction (page 522) for more information about these indexes and their
implications.
Important: Use this procedure only if you must have indexes that are compatible with a version of MongoDB earlier
than 2.0.
MongoDB version 2.0 introduced the {v:1} index format. MongoDB versions 2.0 and later support both the {v:1}
format and the earlier {v:0} format.
MongoDB versions prior to 2.0, however, support only the {v:0} format. If you need to roll back MongoDB to a
version prior to 2.0, you must drop and re-create your indexes.
To build pre-2.0 indexes, use the dropIndexes() and createIndex() methods. You cannot simply reindex the
collection. When you reindex on versions that only support {v:0} indexes, the v fields in the index definition still
hold values of 1, even though the indexes would now use the {v:0} format. If you were to upgrade again to version
2.0 or later, these indexes would not work.
Example
Suppose you rolled back from MongoDB 2.0 to MongoDB 1.8, and suppose you had the following index on the
items collection:
{ "v" : 1, "key" : { "name" : 1 }, "ns" : "mydb.items", "name" : "name_1" }
The v field tells you the index is a {v:1} index, which is incompatible with version 1.8.
To drop the index, issue the following command:
db.items.dropIndex( { name : 1 } )
See also:
Index Performance Enhancements (page 1047).
Instructions for managing indexes and assessing index performance and use.
Remove Indexes (page 541) Drop an index from a collection.
Modify an Index (page 542) Modify an existing index.
Rebuild Indexes (page 543) In a single operation, drop all indexes on a collection and then rebuild them.
Manage In-Progress Index Creation (page 543) Check the status of indexing progress, or terminate an ongoing
index build.
Return a List of All Indexes (page 544) Obtain a list of all indexes on a collection or of all indexes on all collections
in a database.
Measure Index Use (page 545) Study query operations and observe index use for your database.
Remove Indexes
On this page
Remove a Specific Index (page 541)
Remove All Indexes (page 541)
To remove an index from a collection use the dropIndex() method and the following procedure. If you simply
need to rebuild indexes you can use the process described in the Rebuild Indexes (page 543) document.
See also:
Indexing Tutorials (page 531) and Index Concepts (page 492) for more information about indexes and indexing
operations in MongoDB.
Where the value of nIndexesWas reflects the number of indexes before removing this index.
For text (page 508) indexes, pass the index name to the db.collection.dropIndex() method. See Use the
Index Name to Drop a text Index (page 569) for details.
You can also use the db.collection.dropIndexes() method to remove all indexes, except for the _id index (page 494),
from a collection.
These shell helpers provide wrappers around the dropIndexes database command. Your client library may
have a different or additional interface for these operations.
Modify an Index
To modify an existing index, you need to drop and recreate the index.
The method returns a document with the status of the results. The method only creates an index if the index does
not already exist. See Create an Index (page 532) and Index Creation Tutorials (page 531) for more information on
creating indexes.
To modify an existing index, you cannot just re-issue the createIndex() method with the updated specification
of the index.
For example, the following operation attempts to remove the unique constraint from the previously created index by
using the createIndex() method.
db.orders.createIndex(
{ "cust_id" : 1, "ord_date" : -1, "items" : 1 }
)
Instead, drop the index with the dropIndex() method, which returns a document with the status of the operation. Upon success, the ok field in the returned
document is 1. See Remove Indexes (page 541) for more information about dropping indexes.
Then recreate the index with createIndex(), this time with the updated specification. The method returns a document with the status of the results. Upon success, the returned document
shows numIndexesAfter to be greater than numIndexesBefore by one.
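The drop-and-recreate sequence can be sketched as a small helper. This is a minimal sketch under stated assumptions, not part of MongoDB: modifyIndex is a hypothetical name, and coll is assumed to be any object that exposes dropIndex() and createIndex() and returns status documents the way a mongo shell collection does.

```javascript
// Hypothetical helper: replace an index by dropping it and recreating it
// with new options. `coll` is assumed to expose dropIndex() and
// createIndex() like a mongo shell collection.
function modifyIndex(coll, keys, newOptions) {
  var dropped = coll.dropIndex(keys);
  if (!dropped || dropped.ok !== 1) {
    // Stop early so that a failed drop is not masked by the create step.
    return { dropped: dropped, created: null };
  }
  var created = coll.createIndex(keys, newOptions || {});
  return { dropped: dropped, created: created };
}
```

In the mongo shell, a call such as modifyIndex(db.orders, { "cust_id" : 1, "ord_date" : -1, "items" : 1 }, {}) would recreate the example index without the unique constraint. Between the two steps the index does not exist, so queries cannot use it.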
See also:
Index Introduction (page 487), Index Concepts (page 492).
Rebuild Indexes
On this page
Process (page 543)
Additional Considerations (page 543)
If you need to rebuild indexes for a collection you can use the db.collection.reIndex() method to rebuild all
indexes on a collection in a single operation. This operation drops all indexes, including the _id index (page 494), and
then rebuilds all indexes.
See also:
Index Concepts (page 492) and Indexing Tutorials (page 531).
Process
MongoDB will return the following document when the operation completes:
{
"nIndexesWas" : 2,
"msg" : "indexes dropped for collection",
"nIndexes" : 2,
"indexes" : [
{
"key" : {
"_id" : 1,
"tax-id" : 1
},
"ns" : "records.accounts",
"name" : "_id_"
}
],
"ok" : 1
}
This shell helper provides a wrapper around the reIndex database command. Your client library may have
a different or additional interface for this operation.
Additional Considerations
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 537).
On this page
View Index Creation Operations (page 544)
Terminate Index Creation (page 544)
To see the status of an indexing process, you can use the db.currentOp() method in the mongo shell. To filter
the current operations for index creation operations, see currentOp-index-creation for an example.
The msg field will include the percent of the build that is complete.
To terminate an ongoing index build, use the db.killOp() method in the mongo shell. For index builds, the effects
of db.killOp() may not be immediate and may occur well after much of the index build operation has completed.
You cannot terminate a replicated index build on secondary members of a replica set. To minimize the impact of
building an index on replica sets, see Build Indexes on Replica Sets (page 537).
Changed in version 2.4: Before MongoDB 2.4, you could only terminate background index builds. After 2.4, you can
terminate both background index builds and foreground index builds.
See also:
db.currentOp(), db.killOp()
On this page
List all Indexes on a Collection (page 544)
List all Indexes for a Database (page 545)
When performing maintenance you may want to check which indexes exist on a collection. In the mongo shell, you
can use the getIndexes() method to return a list of the indexes on a collection.
See also:
Index Concepts (page 492) and Indexing Tutorials (page 531) for more information about indexes in MongoDB and
common index management operations.
To return a list of all indexes on a collection, use the db.collection.getIndexes() method or a similar
method for your driver (https://api.mongodb.org/).
For example, to view all indexes on the people collection:
db.people.getIndexes()
To list all indexes on all collections in a database, you can use the following operation in the mongo shell:
db.getCollectionNames().forEach(function(collection) {
    // Declare the variable locally instead of creating an implicit global.
    var indexes = db[collection].getIndexes();
    print("Indexes for " + collection + ":");
    printjson(indexes);
});
On this page
Synopsis (page 545)
Operations (page 545)
Synopsis
Query performance is a good general indicator of index use; however, for more precise insight into index use,
MongoDB provides a number of tools that allow you to study query operations and observe index use for your database.
See also:
Index Concepts (page 492) and Indexing Tutorials (page 531) for more information.
Operations
Return Query Plan with explain() Use the db.collection.explain() or the cursor.explain()
method in executionStats mode to return statistics about the query process, including the index used, the number of
documents scanned, and the time the query takes to process in milliseconds.
Run db.collection.explain() or the cursor.explain() method in allPlansExecution mode to view
partial execution statistics collected during plan selection.
db.collection.explain() provides information on the execution of other operations, such as
db.collection.update(). See db.collection.explain() for details.
Control Index Use with hint() To force MongoDB to use a particular index for a db.collection.find()
operation, specify the index with the hint() method. Append the hint() method to the find() method. Consider
the following example:
db.people.find(
{ name: "John Doe", zipcode: { $gt: "63000" } }
).hint( { zipcode: 1 } )
To view the execution statistics for a specific index, append to the db.collection.find() the hint() method
followed by cursor.explain(), e.g.:
db.people.find(
{ name: "John Doe", zipcode: { $gt: "63000" } }
).hint( { zipcode: 1 } ).explain("executionStats")
Specify the $natural operator to the hint() method to prevent MongoDB from using any index:
db.people.find(
{ name: "John Doe", zipcode: { $gt: "63000" } }
).hint( { $natural: 1 } )
Instance Index Use Reporting MongoDB provides a number of metrics of index use and operation that you may
want to consider when analyzing index use for your database:
In the output of serverStatus:
metrics.queryExecutor.scanned
metrics.operation.scanAndOrder
In the output of collStats:
totalIndexSize
indexSizes
In the output of dbStats:
dbStats.indexes
dbStats.indexSize
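The fields listed above can be gathered into a single summary. The following is a minimal sketch: summarizeIndexMetrics is a hypothetical helper, and its three arguments are assumed to be the documents returned by serverStatus, collStats, and dbStats.

```javascript
// Collect the index-related fields named above into one summary object.
// The three arguments are assumed to be the documents returned by
// serverStatus, collStats, and dbStats respectively.
function summarizeIndexMetrics(serverStatus, collStats, dbStats) {
  var metrics = serverStatus.metrics || {};
  var queryExecutor = metrics.queryExecutor || {};
  var operation = metrics.operation || {};
  return {
    scanned: queryExecutor.scanned,
    scanAndOrder: operation.scanAndOrder,
    totalIndexSize: collStats.totalIndexSize,
    indexSizes: collStats.indexSizes,
    dbIndexes: dbStats.indexes,
    dbIndexSize: dbStats.indexSize
  };
}
```

In the mongo shell, the inputs could come from db.serverStatus(), db.people.stats(), and db.stats(); the people collection is only an example name.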
Instructions for creating and querying 2d, 2dsphere, and haystack indexes.
Find Restaurants with Geospatial Queries (page 547) Use Geospatial queries to find a user's current neighborhood
and list nearby restaurants.
Create a 2dsphere Index (page 554) A 2dsphere index supports data stored as both GeoJSON objects and as
legacy coordinate pairs.
Query a 2dsphere Index (page 555) Search for locations within, near, or intersected by a GeoJSON shape, or within
a circle as defined by coordinate points on a sphere.
Create a 2d Index (page 557) Create a 2d index to support queries on data stored as legacy coordinate pairs.
Query a 2d Index (page 558) Search for locations using legacy coordinate pairs.
Create a Haystack Index (page 560) A haystack index is optimized to return results over small areas. For queries
that use spherical geometry, a 2dsphere index is a better option.
Query a Haystack Index (page 561) Search based on location and non-location data within a small area.
Calculate Distance Using Spherical Geometry (page 561) Convert distances to radians and back again.
On this page
Overview (page 547)
Differences Between Flat and Spherical Geometry (page 547)
Distortion (page 548)
Searching for Restaurants (page 548)
Overview
MongoDB's geospatial indexing allows you to efficiently execute spatial queries on a collection that contains
geospatial shapes and points. This tutorial will briefly introduce the concepts of geospatial indexes, and then demonstrate
their use with $geoWithin, $geoIntersects, and geoNear.
To showcase the capabilities of geospatial features and compare different approaches, this tutorial will guide you
through the process of writing queries for a simple geospatial application.
Suppose you are designing a mobile application to help users find restaurants in New York City. The application must:
Determine the user's current neighborhood using $geoIntersects,
Show the number of restaurants in that neighborhood using $geoWithin, and
Find restaurants within a specified distance of the user using $nearSphere.
This tutorial will use a 2dsphere index to query for this data on spherical geometry.
Geospatial queries can use either flat or spherical geometries, depending on both the query and the type of index in use.
2dsphere indexes support only spherical geometries, while 2d indexes support both flat and spherical geometries.
However, queries using spherical geometries will be more performant and accurate with a 2dsphere index, so you
should always use 2dsphere indexes on geographical geospatial fields.
The following table shows what kind of geometry each geospatial operator will use:
Distortion
Spherical geometry will appear distorted when visualized on a map due to the nature of projecting a three-dimensional
sphere, such as the earth, onto a flat plane.
For example, take the specification of the spherical square defined by the longitude latitude points (0,0), (80,0),
(80,80), and (0,80). The following figure depicts the area covered by this region:
The geoNear command requires a geospatial index, and almost always improves performance of $geoWithin and
$geoIntersects queries.
Because this data is geographical, create a 2dsphere index on each collection using the mongo shell:
db.restaurants.createIndex({ location: "2dsphere" })
db.neighborhoods.createIndex({ coordinates: "2dsphere" })
Exploring the Data Inspect an entry in the newly-created restaurants collection from within the mongo shell:
db.restaurants.findOne()
This restaurant document corresponds to the location shown in the following figure:
Because the tutorial uses a 2dsphere index, the geometry data in the location field must follow the GeoJSON
format.
Now inspect an entry in the neighborhoods collection:
db.neighborhoods.findOne()
Find the Current Neighborhood Assuming the user's mobile device can give a reasonably accurate location for
the user, it is simple to find the user's current neighborhood with $geoIntersects.
Suppose the user is located at -73.93414657 longitude and 40.82302903 latitude. To find the current neighborhood,
you will specify a point using the special $geometry field in GeoJSON format:
db.neighborhoods.findOne({ geometry: { $geoIntersects: { $geometry: { type: "Point", coordinates: [ -
{
"_id" : ObjectId("55cb9c666c522cafdb053a68"),
"geometry" : {
"type" : "Polygon",
"coordinates" : [
[
[
-73.93383000695911,
40.81949109558767
],
...
]
]
},
"name" : "Central Harlem North-Polo Grounds"
}
Find all Restaurants in the Neighborhood You can also query to find all restaurants contained in a given
neighborhood. Run the following in the mongo shell to find the neighborhood containing the user, and then count the
restaurants within that neighborhood:
var neighborhood = db.neighborhoods.findOne( { geometry: { $geoIntersects: { $geometry: { type: "Poin
db.restaurants.find( { location: { $geoWithin: { $geometry: neighborhood.geometry } } } ).count()
This query will tell you that there are 127 restaurants in the requested neighborhood, visualized in the following figure:
Find Restaurants within a Distance To find restaurants within a specified distance of a point, you can use either
$geoWithin with $centerSphere to return results in unsorted order, or $nearSphere with $maxDistance
if you need results sorted by distance.
Unsorted with $geoWithin To find restaurants within a circular region, use $geoWithin with
$centerSphere. $centerSphere is a MongoDB-specific syntax to denote a circular region by specifying
the center and the radius in radians.
$geoWithin does not return the documents in any specific order, so it may show the user the furthest documents
first.
The following will find all restaurants within five miles of the user:
db.restaurants.find({ location:
{ $geoWithin:
{ $centerSphere: [ [ -73.93414657, 40.82302903 ], 5 / 3963.2 ] } } })
$centerSphere's second argument accepts the radius in radians, so you must divide the distance in miles by the radius of the earth
in miles. See Calculate Distance Using Spherical Geometry (page 561) for more information on converting between
distance units.
Sorted with $nearSphere You may also use $nearSphere and specify a $maxDistance term in meters.
This will return all restaurants within five miles of the user in sorted order from nearest to farthest:
var METERS_PER_MILE = 1609.34
db.restaurants.find({ location: { $nearSphere: { $geometry: { type: "Point", coordinates: [ -73.93414
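The shape of such a $nearSphere filter can be sketched with a small builder. This is a minimal sketch: nearSphereFilter is a hypothetical helper, and it assumes the location field holds GeoJSON points, so that $maxDistance is expressed in meters as described above.

```javascript
var METERS_PER_MILE = 1609.34;

// Build a $nearSphere filter for a GeoJSON point.
// `field` names the indexed location field; `miles` is converted to meters
// because $maxDistance is given in meters for GeoJSON points.
function nearSphereFilter(field, longitude, latitude, miles) {
  var filter = {};
  filter[field] = {
    $nearSphere: {
      $geometry: { type: "Point", coordinates: [ longitude, latitude ] },
      $maxDistance: miles * METERS_PER_MILE
    }
  };
  return filter;
}
```

In the mongo shell, db.restaurants.find( nearSphereFilter( "location", -73.93414657, 40.82302903, 5 ) ) would pass the resulting document to find(). Note that GeoJSON coordinates list longitude first, then latitude.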
On this page
Procedure (page 554)
Considerations (page 555)
The following procedure presents steps to populate a collection with documents that contain a GeoJSON data field
and create 2dsphere indexes (page 503). Although the procedure populates the collection first, you can also create the
indexes before populating the collection.
Procedure
First, populate a collection places with documents that store location data as GeoJSON Point (page 581) in a field
named loc. The coordinate order is longitude, then latitude.
db.places.insert(
{
loc : { type: "Point", coordinates: [ -73.97, 40.77 ] },
name: "Central Park",
category : "Parks"
}
)
db.places.insert(
{
loc : { type: "Point", coordinates: [ -73.88, 40.78 ] },
name: "La Guardia Airport",
category : "Airport"
}
)
Create a 2dsphere Index The following example creates a 2dsphere (page 503) index on the location field
loc:
db.places.createIndex( { loc : "2dsphere" } )
Create a Compound Index with 2dsphere Index Key A compound index (page 495) can include a 2dsphere
index key in combination with non-geospatial index keys. For example, the following operation creates a compound
index where the first key loc is a 2dsphere index key, and the remaining keys category and name are
non-geospatial index keys, specifically descending (-1) and ascending (1) keys respectively.
db.places.createIndex( { loc : "2dsphere" , category : -1, name: 1 } )
Unlike the 2d (page 505) index, a compound 2dsphere index does not require the location field to be the first field
indexed. For example:
db.places.createIndex( { category : 1 , loc : "2dsphere" } )
Considerations
Fields with 2dsphere (page 503) indexes must hold geometry data in the form of coordinate pairs or GeoJSON data. If
you attempt to insert a document with non-geometry data in a 2dsphere indexed field, or build a 2dsphere index
on a collection where the indexed field has non-geometry data, the operation will fail.
The geoNear command and the $geoNear pipeline stage require that a collection have at most only one 2dsphere
index and/or only one 2d (page 505) index whereas geospatial query operators (e.g. $near and $geoWithin)
permit collections to have multiple geospatial indexes.
The geospatial index restriction for the geoNear command and the $geoNear pipeline stage exists because neither
the geoNear command nor the $geoNear pipeline stage syntax includes the location field. As such, index selection
among multiple 2d indexes or 2dsphere indexes is ambiguous.
No such restriction applies for geospatial query operators since these operators take a location field, eliminating the
ambiguity.
As such, although this tutorial creates multiple 2dsphere indexes, to use the geoNear command or the $geoNear
pipeline stage against the example collection, you will need to drop all but one of the 2dsphere indexes.
To query using the 2dsphere index, see Query a 2dsphere Index (page 555).
On this page
GeoJSON Objects Bounded by a Polygon (page 555)
Intersections of GeoJSON Objects (page 556)
Proximity to a GeoJSON Point (page 556)
Points within a Circle Defined on a Sphere (page 557)
The $geoWithin operator queries for location data found within a GeoJSON polygon. Your location data must be
stored in GeoJSON format. Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $geometry :
{ type : "Polygon" ,
coordinates : [ <coordinates> ]
} } } } )
The following example selects all points and shapes that exist entirely within a GeoJSON polygon:
db.places.find( { loc :
{ $geoWithin :
{ $geometry :
{ type : "Polygon" ,
coordinates : [ [
[ 0 , 0 ] ,
[ 3 , 6 ] ,
[ 6 , 1 ] ,
[ 0 , 0 ]
] ]
} } } } )
The following example uses $geoIntersects to select all indexed points and shapes that intersect with the polygon
defined by the coordinates array.
db.places.find( { loc :
{ $geoIntersects :
{ $geometry :
{ type : "Polygon" ,
coordinates: [ [
[ 0 , 0 ] ,
[ 3 , 6 ] ,
[ 6 , 1 ] ,
[ 0 , 0 ]
] ]
} } } } )
Proximity queries return the points closest to the defined point and sort the results by distance. A proximity query on
GeoJSON data requires a 2dsphere index.
To query for proximity to a GeoJSON point, use either the $near operator or the geoNear command. Distance is in
meters.
The $near operator uses the following syntax:
db.<collection>.find( { <location field> :
{ $near :
{ $geometry :
{ type : "Point" ,
coordinates : [ <longitude> , <latitude> ] } ,
$maxDistance : <distance in meters>
} } } )
The geoNear command offers more options and returns more information than does the $near operator. To run the
command, see geoNear.
To select all grid coordinates in a spherical cap on a sphere, use $geoWithin with the $centerSphere operator.
Specify an array that contains:
The grid coordinates of the circle's center point
The circle's radius measured in radians. To calculate radians, see Calculate Distance Using Spherical Geometry
(page 561).
Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $centerSphere :
[ [ <x>, <y> ] , <radius> ] }
} } )
The following example queries grid coordinates and returns all documents within a 10 mile radius of longitude 88 W
and latitude 30 N. The example converts the distance, 10 miles, to radians by dividing by the approximate equatorial
radius of the earth, 3963.2 miles:
db.places.find( { loc :
{ $geoWithin :
{ $centerSphere :
[ [ -88 , 30 ] , 10 / 3963.2 ]
} } } )
Create a 2d Index
On this page
Define Location Range for a 2d Index (page 558)
Define Location Precision for a 2d Index (page 558)
To build a geospatial 2d index, use the db.collection.createIndex() method and specify 2d. Use the
following syntax:
By default, a 2d index assumes longitude and latitude and has boundaries of -180 inclusive and 180 non-inclusive. If
documents contain coordinate data outside of the specified range, MongoDB returns an error.
Important: The default boundaries allow applications to insert documents with invalid latitudes greater than 90 or
less than -90. The behavior of geospatial queries with such invalid points is not defined.
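Because the index bounds alone do not reject invalid latitudes, an application may want to check coordinate pairs before inserting them. The following is a minimal sketch of such a check; withinIndexBounds and hasValidLatitude are hypothetical helper names, and the pair is assumed to be ordered [ longitude, latitude ].

```javascript
// Default bounds of a 2d index: lower bound inclusive, upper bound exclusive.
var DEFAULT_MIN = -180;
var DEFAULT_MAX = 180;

// True if both values of a [ longitude, latitude ] pair fall inside the
// index bounds, which is the condition the 2d index itself enforces.
function withinIndexBounds(pair, min, max) {
  var lo = (min === undefined) ? DEFAULT_MIN : min;
  var hi = (max === undefined) ? DEFAULT_MAX : max;
  return pair.every(function (v) { return v >= lo && v < hi; });
}

// True only if the latitude is also geographically valid (-90 to 90).
// The default index bounds do not check this.
function hasValidLatitude(pair) {
  return pair[1] >= -90 && pair[1] <= 90;
}
```

For example, the pair [ 10, 120 ] passes withinIndexBounds() with the default bounds but fails hasValidLatitude(). Custom bounds passed as min and max here are assumed to mirror the range configured on the index.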
By default, a 2d index on legacy coordinate pairs uses 26 bits of precision, which is roughly equivalent to 2 feet or 60
centimeters of precision using the default range of -180 to 180. Precision is measured by the size in bits of the geohash
values used to store location data. You can configure geospatial indexes with up to 32 bits of precision.
Index precision does not affect query accuracy. The actual grid coordinates are always used in the final query
processing. Advantages to lower precision are a lower processing overhead for insert operations and use of less space. An
advantage to higher precision is that queries scan smaller portions of the index to return results.
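The "roughly 60 centimeters" figure for 26 bits can be reproduced with a short calculation. This is a sketch under a simplifying assumption, namely that n bits of precision divide the coordinate range into 2^n cells along one axis; it is not a description of MongoDB's internal geohash code. The helper names are hypothetical.

```javascript
// Approximate size of one grid cell along one axis, in coordinate units,
// assuming `bits` bits of precision split the range into 2^bits cells.
function cellSize(bits, min, max) {
  return (max - min) / Math.pow(2, bits);
}

// Rough conversion at the equator from degrees of arc to meters, using an
// equatorial radius of about 6,378.1 kilometers.
function degreesToMeters(degrees) {
  var EARTH_RADIUS_M = 6378100;
  return degrees * (Math.PI / 180) * EARTH_RADIUS_M;
}

// Cell size for the default precision and the default -180 to 180 range.
var defaultCellMeters = degreesToMeters(cellSize(26, -180, 180));
```

With these assumptions defaultCellMeters comes out at about 0.6, which agrees with the stated precision. Each additional bit halves the cell size.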
To configure a location precision other than the default, use the bits option when creating the index. Use the following
syntax:
db.<collection>.createIndex( {<location field> : "<index type>"} ,
{ bits : <bit precision> } )
For information on the internals of geohash values, see Calculation of Geohash Values for 2d Indexes (page 506).
Query a 2d Index
On this page
Points within a Shape Defined on a Flat Surface (page 559)
Points within a Circle Defined on a Sphere (page 559)
Proximity to a Point on a Flat Surface (page 560)
Exact Matches on a Flat Surface (page 560)
To select all legacy coordinate pairs found within a given shape on a flat surface, use the $geoWithin operator along
with a shape operator. Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $box|$polygon|$center : <coordinates>
} } } )
The following queries for documents within a rectangle defined by [ 0 , 0 ] at the bottom left corner and by [
100 , 100 ] at the top right corner.
db.places.find( { loc :
{ $geoWithin :
{ $box : [ [ 0 , 0 ] ,
[ 100 , 100 ] ]
} } } )
The following queries for documents that are within the circle centered on [ -74 , 40.74 ] and with a radius of
10:
db.places.find( { loc: { $geoWithin :
{ $center : [ [-74, 40.74 ] , 10 ]
} } } )
For syntax and examples for each shape, see the following:
$box
$polygon
$center (defines a circle)
MongoDB supports rudimentary spherical queries on flat 2d indexes for legacy reasons. In general, spherical
calculations should use a 2dsphere index, as described in 2dsphere Indexes (page 503).
To query for legacy coordinate pairs in a spherical cap on a sphere, use $geoWithin with the $centerSphere
operator. Specify an array that contains:
The grid coordinates of the circle's center point
The circle's radius measured in radians. To calculate radians, see Calculate Distance Using Spherical Geometry
(page 561).
Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $centerSphere : [ [ <x>, <y> ] , <radius> ] }
} } )
The following example query returns all documents within a 10-mile radius of longitude 88 W and latitude 30 N. The
example converts distance to radians by dividing distance by the approximate equatorial radius of the earth, 3963.2
miles:
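A minimal sketch of such a query document follows. The collection name places and the field name loc are assumptions carried over from the other examples in this section, and west longitude is written as a negative number.

```javascript
// 10 miles expressed in radians, using the approximate equatorial radius.
var EARTH_RADIUS_MILES = 3963.2;
var radiusInRadians = 10 / EARTH_RADIUS_MILES;

// Longitude 88 W is written as -88; latitude 30 N is written as 30.
var withinTenMiles = {
  loc: { $geoWithin: { $centerSphere: [ [ -88, 30 ], radiusInRadians ] } }
};
```

In the mongo shell, db.places.find( withinTenMiles ) would run the query against the assumed places collection.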
Proximity queries return the 100 legacy coordinate pairs closest to the defined point and sort the results by distance.
Use either the $near operator or geoNear command. Both require a 2d index.
The $near operator uses the following syntax:
db.<collection>.find( { <location field> :
{ $near : [ <x> , <y> ]
} } )
The geoNear command offers more options and returns more information than does the $near operator. To run the
command, see geoNear.
Changed in version 2.6: Previously, 2d indexes would support exact-match queries for coordinate pairs.
You cannot use a 2d index to return an exact match for a coordinate pair. Use a scalar, ascending or descending, index
on a field that stores coordinates to return exact matches.
In the following example, the find() operation will return an exact match on a location if you have a { loc:
1} index:
db.<collection>.find( { loc: [ <x> , <y> ] } )
This query will return any documents with the value of [ <x> , <y> ].
A haystack index must reference two fields: the location field and a second field. The second field is used for exact
matches. Haystack indexes return documents based on location and an exact match on a single additional criterion.
These indexes are not necessarily suited to returning the closest documents to a particular location.
To build a haystack index, use the following syntax:
db.coll.createIndex( { <location field> : "geoHaystack" ,
<additional field> : 1 } ,
{ bucketSize : <bucket value> } )
To build a haystack index, you must specify the bucketSize option when creating the index. A bucketSize
of 5 creates an index that groups location values that are within 5 units of the specified longitude and latitude. The
bucketSize also determines the granularity of the index. You can tune the parameter to the distribution of your
data so that in general you search only very small regions. The areas defined by buckets can overlap. A document can
exist in multiple buckets.
Example
If you have a collection with documents that contain fields similar to the following:
{ _id : 100, pos: { lng : 126.9, lat : 35.2 } , type : "restaurant"}
{ _id : 200, pos: { lng : 127.5, lat : 36.1 } , type : "restaurant"}
{ _id : 300, pos: { lng : 128.0, lat : 36.7 } , type : "national park"}
The following operations create a haystack index with buckets that store keys within 1 unit of longitude or latitude.
db.places.createIndex( { pos : "geoHaystack", type : 1 } ,
{ bucketSize : 1 } )
This index stores the document with an _id field that has the value 200 in two different buckets:
In a bucket that includes the document where the _id field has a value of 100
In a bucket that includes the document where the _id field has a value of 300
To query using a haystack index you use the geoSearch command. See Query a Haystack Index (page 561).
By default, queries that use a haystack index return 50 documents.
A haystack index is a special 2d geospatial index that is optimized to return results over small areas. To create a
haystack index see Create a Haystack Index (page 560).
To query a haystack index, use the geoSearch command. You must specify both the coordinates and the additional
field to geoSearch. For example, to return all documents with the value restaurant in the type field near the
example point, the command would resemble:
db.runCommand( { geoSearch : "places" ,
search : { type: "restaurant" } ,
near : [-74, 40.74] ,
maxDistance : 10 } )
Note: Haystack indexes are not suited to queries for the complete list of documents closest to a particular location.
The closest documents could be more distant compared to the bucket size.
Note: Spherical query operations (page 561) are not currently supported by haystack indexes.
The find() method and geoNear command cannot access the haystack index.
On this page
Distance Multiplier (page 563)
Note: While basic queries using spherical distance are supported by the 2d index, consider moving to a 2dsphere
index if your data is primarily longitude and latitude.
The 2d index supports queries that calculate distances on a Euclidean plane (flat surface). The index also supports the
following query operators and command that calculate distances using spherical geometry:
$nearSphere
$centerSphere
$near
geoNear command with the { spherical: true } option.
Important: These three queries use radians for distance. Other query types do not.
For spherical query operators to function properly, you must convert distances to radians, and convert from radians to
the distance units used by your application.
To convert:
distance to radians: divide the distance by the radius of the sphere (e.g. the Earth) in the same units as the
distance measurement.
radians to distance: multiply the radian measure by the radius of the sphere (e.g. the Earth) in the units system
that you want to convert the distance to.
The equatorial radius of the Earth is approximately 3,963.2 miles or 6,378.1 kilometers.
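The two conversions can be sketched as a pair of helpers. This is a minimal sketch; distanceToRadians, radiansToDistance, and the EARTH_RADIUS table are hypothetical names, and the radii are the approximate equatorial values given above.

```javascript
// Approximate equatorial radius of the Earth in each supported unit.
var EARTH_RADIUS = { miles: 3963.2, kilometers: 6378.1 };

// Divide a distance by the sphere's radius (same units) to get radians.
function distanceToRadians(distance, unit) {
  return distance / EARTH_RADIUS[unit];
}

// Multiply radians by the sphere's radius to get a distance in that unit.
function radiansToDistance(radians, unit) {
  return radians * EARTH_RADIUS[unit];
}
```

For instance, distanceToRadians( 100, "miles" ) yields the value that a spherical query would use for a 100 mile radius, and radiansToDistance() converts a returned distance back into application units.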
The following query would return documents from the places collection within the circle described by the center [
-74, 40.74 ] with a radius of 100 miles:
db.places.find( { loc: { $geoWithin: { $centerSphere: [ [ -74, 40.74 ] ,
100 / 3963.2 ] } } } )
You may also use the distanceMultiplier option of the geoNear command to convert radians in the mongod process,
rather than in your application code. See distance multiplier (page 563).
The following spherical query returns all documents in the collection places within 100 miles of the point [
-74, 40.74 ]:
db.runCommand( { geoNear: "places",
near: [ -74, 40.74 ],
spherical: true,
maxDistance: 100 / 3963.2
} )
"maxDistance" : 0.01853714811400047
},
"ok" : 1
}
Warning: Spherical queries that wrap around the poles or at the transition from -180 to 180 longitude raise an
error.
Note: While the default Earth-like bounds for geospatial indexes are between -180 inclusive, and 180, valid values
for latitude are between -90 and 90.
Distance Multiplier
The distanceMultiplier option of the geoNear command returns distances only after multiplying the results
by an assigned value. This allows MongoDB to return converted values, and removes the requirement to convert units
in application logic.
Using distanceMultiplier in spherical queries provides results from the geoNear command that do not need
radian-to-distance conversion. The following example uses distanceMultiplier in the geoNear command
with a spherical (page 561) example:
db.runCommand( { geoNear: "places",
near: [ -74, 40.74 ],
spherical: true,
distanceMultiplier: 3963.2
} )
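The effect of the multiplier on returned distances can be illustrated on the client side. This is only a sketch of the arithmetic, not of how mongod implements the option: scaleDistances is a hypothetical helper, the sample result is made up, and it assumes each geoNear result carries its distance in a dis field next to the matched document in obj.

```javascript
// Multiply the distance of each result by `multiplier`, leaving the
// matched document untouched.
function scaleDistances(results, multiplier) {
  return results.map(function (r) {
    return { dis: r.dis * multiplier, obj: r.obj };
  });
}

// Hypothetical result with a distance in radians.
var sample = [ { dis: 0.01853714811400047, obj: { _id: 1 } } ];

// With a multiplier of 3963.2 the distance is expressed in miles.
var inMiles = scaleDistances(sample, 3963.2);
```

Here the distance of about 0.0185 radians becomes roughly 73.5 miles, the same conversion that distanceMultiplier: 3963.2 asks the server to perform.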
Instructions for enabling MongoDB's text search feature, and for building and configuring text indexes.
Create a text Index (page 564) A text index allows searches on text strings in the index's specified fields.
Specify a Language for Text Index (page 565) The specified language determines the list of stop words and the rules
for Text Search's stemmer and tokenizer.
Text Search with Basis Technology Rosette Linguistics Platform (page 567) Enable text search support for Arabic,
Farsi (specifically Dari and Iranian Persian dialects), Urdu, Simplified Chinese, and Traditional Chinese.
MongoDB Enterprise feature only.
Specify Name for text Index (page 568) Override the text index name limit for long index names.
Control Search Results with Weights (page 570) Give priority to certain search values by denoting the significance
of an indexed field relative to other indexed fields.
Limit the Number of Entries Scanned (page 571) Create an index to support queries that include $text
expressions and equality conditions.
Text Search in the Aggregation Pipeline (page 572) Perform text searches in the aggregation pipeline.
On this page
Index Specific Fields (page 564)
Index All Fields (page 564)
You can create a text index on the field or fields whose value is a string or an array of string elements. When creating
a text index on multiple fields, you can specify the individual fields or you can use the wildcard specifier ($**).
The following example creates a text index on the fields subject and content:
db.collection.createIndex(
{
subject: "text",
content: "text"
}
)
This text index catalogs all string data in the subject field and the content field, where the field value is either
a string or an array of string elements.
To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields that contain
string content.
The following example indexes any string value in the data of every field of every document in collection and
names the index TextIndex:
db.collection.createIndex(
{ "$**": "text" },
{ name: "TextIndex" }
)
Note: In order to drop a text index, use the index name. See Use the Index Name to Drop a text Index (page 569)
for more information.
On this page
Specify the Default Language for a text Index (page 565)
Create a text Index for a Collection in Multiple Languages (page 565)
This tutorial describes how to specify the default language associated with the text index (page 565) and also how to
create text indexes for collections that contain documents in different languages (page 565).
The default language associated with the indexed data determines the rules to parse word roots (i.e. stemming) and
ignore stop words. The default language for the indexed data is english.
To specify a different language, use the default_language option when creating the text index. See Text Search
Languages (page 584) for the languages available for default_language.
The following example creates a text index on the content field of the quotes collection and sets the
default_language to spanish:
db.quotes.createIndex(
{ content : "text" },
{ default_language: "spanish" }
)
Changed in version 2.6: Added support for language overrides within embedded documents.
Specify the Index Language within the Document If a collection contains documents or embedded documents that
are in different languages, include a field named language in the documents or embedded documents and specify
as its value the language for that document or embedded document.
MongoDB will use the specified language for that document or embedded document when building the text index:
The specified language in the document overrides the default language for the text index.
The specified language in an embedded document overrides the language specified in an enclosing document or
the default language for the index.
See Text Search Languages (page 584) for a list of supported languages.
For example, a collection quotes contains multi-language documents that include the language field in the
document and/or the embedded document as needed:
{
_id: 1,
language: "portuguese",
Suppose you create a text index on the original and translation.quote fields with the default language of English:
db.quotes.createIndex( { original: "text", "translation.quote": "text" } )
Then, for the documents and embedded documents that contain the language field, the text index uses that
language to parse word stems and other linguistic characteristics.
For embedded documents that do not contain the language field:
If the enclosing document contains the language field, then the index uses the document's language for the
embedded document.
Otherwise, the index uses the default language for the embedded documents.
For documents that do not contain the language field, the index uses the default language, which is English.
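The override order described above can be sketched in plain JavaScript. This is only an illustration of the rules, not MongoDB's indexing code; the helper name effectiveLanguage and the sample documents are hypothetical.

```javascript
// Hypothetical helper: resolves which language applies to a document or
// embedded document, following the override order described in the text.
// Pass null as embeddedDoc to resolve the language of a top-level document.
function effectiveLanguage(embeddedDoc, enclosingDoc, indexDefault = "english", field = "language") {
  // 1. A language set on the embedded document wins.
  if (embeddedDoc && typeof embeddedDoc[field] === "string") {
    return embeddedDoc[field];
  }
  // 2. Otherwise fall back to the enclosing document's language.
  if (enclosingDoc && typeof enclosingDoc[field] === "string") {
    return enclosingDoc[field];
  }
  // 3. Otherwise use the default language of the index.
  return indexDefault;
}

// Made-up sample data for illustration only.
const withLanguage = { _id: 1, language: "portuguese" };
const withoutLanguage = { _id: 2 };

console.log(effectiveLanguage({ language: "spanish" }, withLanguage)); // embedded value wins
console.log(effectiveLanguage({}, withLanguage));                      // enclosing value
console.log(effectiveLanguage({}, withoutLanguage));                   // index default
```

The optional field argument stands in for a field name other than language, such as one configured through language_override.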
Use any Field to Specify the Language for a Document To use a field with a name other than language, include
the language_override option when creating the index.
For example, give the following command to use idioma as the field name instead of language:
db.quotes.createIndex( { quote : "text" }, { language_override: "idioma" } )
The documents of the quotes collection may specify a language with the idioma field:
{ _id: 1, idioma: "portuguese", quote: "A sorte protege os audazes" }
{ _id: 2, idioma: "spanish", quote: "Nada hay más surrealista que la realidad." }
{ _id: 3, idioma: "english", quote: "is this a dagger which I see before me" }
On this page
Overview (page 567)
Prerequisites (page 567)
Procedure (page 568)
Additional Information (page 568)
Enterprise Feature
Available in MongoDB Enterprise only.
Overview
Prerequisites
To use MongoDB with RLP, MongoDB requires a license for the Base Linguistics component of RLP and one or more
languages specified above. MongoDB does not require a license for all 6 languages listed above.
Support for any of the specified languages is conditional on having a valid RLP license for the language. For instance,
if there is only an RLP license provided for Arabic, then MongoDB will only enable support for Arabic and will
not enable support for any other RLP based languages. For any language which lacks a valid license, the MongoDB
log will contain a warning message. Additionally, you can set the MongoDB log verbosity level to 2 to log debug
messages that identify each supported language.
You do not need the Language Extension Pack as MongoDB does not support these RLP languages at this time.
Contact Basis Technology at info@basistech.com to get a copy of RLP and a license for one or more languages. For
more information on how to contact Basis Technology, see http://www.basistech.com/contact/.
Procedure
Step 1: Download Rosette Linguistics Platform from Basis Technology. From Basis Technology, obtain
the links to download the RLP C++ SDK package file, the documentation package file, and the license file
(rlp-license.xml) for Linux x64. Basis Technology provides the download links in an email.
Using the links, download the RLP C++ SDK package file, the documentation package file, and the license file
(rlp-license.xml) for Linux x64.
Step 2: Install the RLP binaries. Untar the RLP binaries and place them in a directory; this directory is referred to
as the installation directory or BT_ROOT. For this example, we will use /opt/basis as the BT_ROOT.
tar -zxvf rlp-7.11.1-sdk-amd64-glibc25-gcc41.tar.gz -C /opt/basis
Step 3: Move the RLP license into the RLP licenses directory. Move the RLP license file
rlp-license.xml to the <BT_ROOT>/rlp/rlp/licenses directory; in our example, move the file to the
/opt/basis/rlp/rlp/licenses/ directory.
mv rlp-license.xml /opt/basis/rlp/rlp/licenses/
Step 4: Run mongod with RLP support. To enable support for RLP, use the --basisTechRootDirectory
option to specify the BT_ROOT directory.
Include any additional settings as appropriate for your deployment.
mongod --basisTechRootDirectory=/opt/basis
Additional Information
For installation help, see the RLP Quick Start manual or Chapter 2 of the Rosette Linguistics Platform Application
Developer's Guide.
For debugging any RLP specific issues, you can set the rlpVerbose parameter to true (i.e. --setParameter
rlpVerbose=true) to view INFO messages from RLP.
Warning: Enabling rlpVerbose has a performance overhead and should only be enabled for troubleshooting
installation issues.
On this page
Specify a Name for text Index (page 569)
Use the Index Name to Drop a text Index (page 569)
The default name for the index consists of each indexed field name concatenated with _text. For example, the
following command creates a text index on the fields content, users.comments, and users.profiles:
db.collection.createIndex(
{
content: "text",
"users.comments": "text",
"users.profiles": "text"
}
)
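As a sketch of the naming rule, the following plain JavaScript function builds a default index name by joining each field name and its index type with underscores. The function name defaultIndexName is hypothetical and the code only mimics the naming convention; it does not call MongoDB.

```javascript
// Hypothetical helper: builds the default index name from an index key
// specification by joining every "<field>_<type>" pair with underscores.
function defaultIndexName(keySpec) {
  return Object.entries(keySpec)
    .map(([field, type]) => `${field}_${type}`)
    .join("_");
}

const textIndexName = defaultIndexName({
  content: "text",
  "users.comments": "text",
  "users.profiles": "text"
});

console.log(textIndexName);
// content_text_users.comments_text_users.profiles_text
```

Names built this way grow with every indexed field, which is why long key specifications can run into the index name length limit.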
The name of a text index, like the name of any other index, must fall within the index name length limit.
To avoid creating an index with a name that exceeds the index name length limit, you can pass the name
option to the db.collection.createIndex() method:
db.collection.createIndex(
{
content: "text",
"users.comments": "text",
"users.profiles": "text"
},
{
name: "MyTextIndex"
}
)
Whether the text (page 508) index has the default name or you specified a name for the text (page 508) index, to drop
the text (page 508) index, pass the index name to the db.collection.dropIndex() method.
For example, consider the index created by the following operation:
db.collection.createIndex(
{
content: "text",
"users.comments": "text",
"users.profiles": "text"
},
{
name: "MyTextIndex"
}
)
Then, to remove this text index, pass the name "MyTextIndex" to the db.collection.dropIndex()
method, as in the following:
db.collection.dropIndex("MyTextIndex")
Text search assigns a score to each document that contains the search term in the indexed fields. The score determines
the relevance of a document to a given search query.
For a text index, the weight of an indexed field denotes the significance of the field relative to the other indexed
fields in terms of the text search score.
For each indexed field in the document, MongoDB multiplies the number of matches by the weight and sums the
results. Using this sum, MongoDB then calculates the score for the document. See $meta operator for details on
returning and sorting by text scores.
The default weight is 1 for the indexed fields. To adjust the weights for the indexed fields, include the weights
option in the db.collection.createIndex() method.
Warning: Choose the weights carefully in order to prevent the need to reindex.
{
_id: 2,
content: "Who doesn't like cake?",
about: "food",
keywords: [ "cake", "food", "dessert" ]
}
To create a text index with different field weights for the content field and the keywords field, include the
weights option to the createIndex() method. For example, the following command creates an index on three
fields and assigns weights to two of the fields:
db.blog.createIndex(
{
content: "text",
keywords: "text",
about: "text"
},
{
weights: {
content: 10,
keywords: 5
},
name: "TextIndex"
}
)
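The effect of the weights can be sketched in plain JavaScript. The code below computes only the weighted sum of matches (number of matches multiplied by the field weight, summed over the indexed fields); the final score that MongoDB derives from this sum is not reproduced. The tokenizer is a deliberately naive assumption (lowercasing and splitting on non-alphanumeric characters, no stemming or stop words), and the names tokenize and weightedMatchSum are hypothetical.

```javascript
// Naive tokenizer (assumption): lowercases and splits on non-alphanumerics.
// Real text indexes also apply stemming and stop-word removal.
function tokenize(value) {
  const text = Array.isArray(value) ? value.join(" ") : (value == null ? "" : String(value));
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Sums (number of matches in field) * (weight of field) over all indexed
// fields. Fields without an explicit weight use the default weight of 1.
function weightedMatchSum(doc, term, indexedFields, weights) {
  const needle = term.toLowerCase();
  let sum = 0;
  for (const field of indexedFields) {
    const matches = tokenize(doc[field]).filter((token) => token === needle).length;
    const weight = Object.prototype.hasOwnProperty.call(weights, field) ? weights[field] : 1;
    sum += matches * weight;
  }
  return sum;
}

const blogDoc = {
  _id: 2,
  content: "Who doesn't like cake?",
  about: "food",
  keywords: ["cake", "food", "dessert"]
};
const indexedFields = ["content", "keywords", "about"];
const fieldWeights = { content: 10, keywords: 5 };

console.log(weightedMatchSum(blogDoc, "cake", indexedFields, fieldWeights)); // 1*10 + 1*5 + 0*1 = 15
console.log(weightedMatchSum(blogDoc, "food", indexedFields, fieldWeights)); // 0*10 + 1*5 + 1*1 = 6
```

With these weights, a match in content counts twice as much as a match in keywords and ten times as much as a match in about.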
This tutorial describes how to create indexes to limit the number of index entries scanned for queries that include a
$text expression and equality conditions.
A collection inventory contains the following documents:
{ _id: 1, dept: "tech", description: "lime green computer" }
{ _id: 2, dept: "tech", description: "wireless red mouse" }
{ _id: 3, dept: "kitchen", description: "green placemat" }
{ _id: 4, dept: "kitchen", description: "red peeler" }
{ _id: 5, dept: "food", description: "green apple" }
{ _id: 6, dept: "food", description: "red potato" }
Consider the common use case that performs text searches by individual departments, such as:
db.inventory.find( { dept: "kitchen", $text: { $search: "green" } } )
To limit the text search to scan only those documents within a specific dept, create a compound index that first
specifies an ascending/descending index key on the field dept and then a text index key on the field description:
db.inventory.createIndex(
{
dept: 1,
description: "text"
}
)
Then, the text search within a particular department will limit the scan of indexed documents. For example, the
following query scans only those documents with dept equal to kitchen:
db.inventory.find( { dept: "kitchen", $text: { $search: "green" } } )
Note:
A compound text index cannot include any other special index types, such as multi-key (page 497) or
geospatial (page 502) index fields.
If the compound text index includes keys preceding the text index key, to perform a $text search, the
query predicate must include equality match conditions on the preceding keys.
See also:
Text Indexes (page 508)
On this page
Restrictions (page 572)
Text Score (page 572)
Calculate the Total Views for Articles that Contains a Word (page 572)
Return Results Sorted by Text Search Score (page 573)
Match on Text Score (page 573)
Specify a Language for Text Search (page 573)
New in version 2.6. In the aggregation pipeline, text search is available via the use of the $text query operator in
the $match stage.
Restrictions
Text Score
The $text operator assigns a score to each document that contains the search term in the indexed fields. The score
represents the relevance of a document to a given text search query. The score can be part of a $sort pipeline
specification as well as part of the projection expression. The { $meta: "textScore" } expression provides
information on the processing of the $text operation. See $meta aggregation for details on accessing the score for
projection or sort.
The metadata is only available after the $match stage that includes the $text operation.
Examples The following examples assume a collection articles that has a text index on the field subject:
db.articles.createIndex( { subject: "text" } )
The following aggregation searches for the term cake in the $match stage and calculates the total views for the
matching documents in the $group stage.
db.articles.aggregate(
[
{ $match: { $text: { $search: "cake" } } },
{ $group: { _id: null, views: { $sum: "$views" } } }
]
)
To sort by the text search score, include a $meta expression in the $sort stage. The following example matches on
either the term cake or tea, sorts by the textScore in descending order, and returns only the title field in the
result set.
db.articles.aggregate(
[
{ $match: { $text: { $search: "cake tea" } } },
{ $sort: { score: { $meta: "textScore" } } },
{ $project: { title: 1, _id: 0 } }
]
)
The specified metadata determines the sort order. For example, the "textScore" metadata sorts in descending
order. See $meta for more information on metadata as well as an example of overriding the default sort order of the
metadata.
The "textScore" metadata is available for projections, sorts, and conditions subsequent to the $match stage that
includes the $text operation.
The following example matches on either the term cake or tea, projects the title and the score fields, and then
returns only those documents with a score greater than 1.0.
db.articles.aggregate(
[
{ $match: { $text: { $search: "cake tea" } } },
{ $project: { title: 1, _id: 0, score: { $meta: "textScore" } } },
{ $match: { score: { $gt: 1.0 } } }
]
)
The following aggregation searches in Spanish for documents that contain the term saber but not the term claro in
the $match stage and calculates the total views for the matching documents in the $group stage.
db.articles.aggregate(
[
{ $match: { $text: { $search: "saber -claro", $language: "es" } } },
{ $group: { _id: null, views: { $sum: "$views" } } }
]
)
The best indexes for your application must take a number of factors into account, including the kinds of queries you
expect, the ratio of reads to writes, and the amount of free memory on your system.
When developing your indexing strategy you should have a deep understanding of your application's queries. Before
you build indexes, map out the types of queries you will run so that you can build indexes that reference those fields.
Indexes come with a performance cost, but are more than worth the cost for frequent queries on large data sets. Consider
the relative frequency of each query in the application and whether the query justifies an index.
The best overall strategy for designing indexes is to profile a variety of index configurations with data sets similar to
the ones you'll be running in production to see which configurations perform best. Inspect the current indexes created
for your collections to ensure they are supporting your current and planned queries. If an index is no longer used, drop
the index.
Generally, MongoDB only uses one index to fulfill most queries. However, each clause of an $or query may use a
different index, and starting in 2.6, MongoDB can use an intersection (page 524) of multiple indexes.
The following documents introduce indexing strategies:
Create Indexes to Support Your Queries (page 574) An index supports a query when the index contains all the fields
scanned by the query. Creating indexes that support queries results in greatly increased query performance.
Use Indexes to Sort Query Results (page 575) To support efficient queries, use the strategies here when you specify
the sequential order and sort order of index fields.
Ensure Indexes Fit in RAM (page 577) When your index fits in RAM, the system can avoid reading the index from
disk and you get the fastest processing.
Create Queries that Ensure Selectivity (page 578) Selectivity is the ability of a query to narrow results using the
index. Selectivity allows MongoDB to use the index for a larger portion of the work associated with fulfilling
the query.
On this page
Create a Single-Key Index if All Queries Use the Same, Single Key (page 574)
Create Compound Indexes to Support Several Different Queries (page 574)
An index supports a query when the index contains all the fields scanned by the query. The query scans the index and
not the collection. Creating indexes that support queries results in greatly increased query performance.
This document describes strategies for creating indexes that support queries.
Create a Single-Key Index if All Queries Use the Same, Single Key
If you only ever query on a single key in a given collection, then you need to create just one single-key index for that
collection. For example, you might create an index on category in the products collection:
db.products.createIndex( { "category": 1 } )
If you sometimes query on only one key and at other times query on that key combined with a second key, then creating
a compound index is more efficient than creating a single-key index. MongoDB will use the compound index for both
queries. For example, you might create an index on both category and item.
db.products.createIndex( { "category": 1, "item": 1 } )
This allows you both options. You can query on just category, and you also can query on category combined
with item. A single compound index (page 495) on multiple fields can support all the queries that search a prefix
subset of those fields.
Example
The following index on a collection:
{ x: 1, y: 1, z: 1 }
can support queries that the following prefix indexes support:
{ x: 1 }
{ x: 1, y: 1 }
There are some situations where the prefix indexes may offer better query performance: for example, if z is a large
array.
The { x: 1, y: 1, z: 1 } index can also support many of the same queries as the following index:
{ x: 1, z: 1 }
The { x: 1, z: 1 } index has an additional use. Given the following query:
db.collection.find( { x: 5 } ).sort( { z: 1 } )
The { x: 1, z: 1 } index supports both the query and the sort operation, while the { x: 1, y: 1,
z: 1 } index only supports the query. For more information on sorting, see Use Indexes to Sort Query Results
(page 575).
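To make the notion of a prefix subset concrete, the following plain JavaScript sketch lists the key patterns that start at the first key of a compound index. The function name indexPrefixes is hypothetical, and the code relies on JavaScript preserving the insertion order of these string keys; it does not query MongoDB.

```javascript
// Hypothetical helper: returns every leading subset of a compound index key
// pattern, from the first key alone up to the full pattern.
function indexPrefixes(keyPattern) {
  const entries = Object.entries(keyPattern);
  return entries.map((_, i) => Object.fromEntries(entries.slice(0, i + 1)));
}

const prefixes = indexPrefixes({ x: 1, y: 1, z: 1 });
console.log(JSON.stringify(prefixes));
// [{"x":1},{"x":1,"y":1},{"x":1,"y":1,"z":1}]
```

Note that { x: 1, z: 1 } does not appear in the result: it skips y, so it is not a prefix of { x: 1, y: 1, z: 1 }.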
Starting in version 2.6, MongoDB can use index intersection (page 524) to fulfill queries. The choice between creating
compound indexes that support your queries or relying on index intersection depends on the specifics of your system.
See Index Intersection and Compound Indexes (page 525) for more details.
On this page
Sort with a Single Field Index (page 575)
Sort on Multiple Fields (page 576)
In MongoDB, sort operations can obtain the sort order by retrieving documents based on the ordering in an index. If
the query planner cannot obtain the sort order from an index, it will sort the results in memory. Sort operations that
use an index often have better performance than those that do not use an index. In addition, sort operations that do not
use an index will abort when they use 32 megabytes of memory.
If an ascending or a descending index is on a single field, the sort operation on the field can be in either direction.
For example, create an ascending index on the field a for a collection records:
db.records.createIndex( { a: 1 } )
This index can support an ascending sort on a:
db.records.find().sort( { a: 1 } )
The index can also support the following descending sort on a by traversing the index in reverse order:
db.records.find().sort( { a: -1 } )
Sort and Index Prefix If the sort keys correspond to the index keys or an index prefix, MongoDB can use the index
to sort the query results. A prefix of a compound index is a subset that consists of one or more keys at the start of the
index key pattern.
For example, create a compound index on the data collection:
db.data.createIndex( { a:1, b: 1, c: 1, d: 1 } )
The following query and sort operations use the index prefixes to sort the results. These operations do not need to sort
the result set in memory.
Example                                                    Index Prefix
db.data.find().sort( { a: 1 } )                            { a: 1 }
db.data.find().sort( { a: -1 } )                           { a: 1 }
db.data.find().sort( { a: 1, b: 1 } )                      { a: 1, b: 1 }
db.data.find().sort( { a: -1, b: -1 } )                    { a: 1, b: 1 }
db.data.find().sort( { a: 1, b: 1, c: 1 } )                { a: 1, b: 1, c: 1 }
db.data.find( { a: { $gt: 4 } } ).sort( { a: 1, b: 1 } )   { a: 1, b: 1 }
Consider the following example in which the prefix keys of the index appear in both the query predicate and the sort:
db.data.find( { a: { $gt: 4 } } ).sort( { a: 1, b: 1 } )
In such cases, MongoDB can use the index to retrieve the documents in the order specified by the sort. As the example
shows, the index prefix in the query predicate can be different from the prefix in the sort.
Sort and Non-prefix Subset of an Index An index can support sort operations on a non-prefix subset of the index
key pattern. To do so, the query must include equality conditions on all the prefix keys that precede the sort keys.
For example, the collection data has the following index:
{ a: 1, b: 1, c: 1, d: 1 }
The following operations can use the index to get the sort order:
Example                                                      Index Prefix
db.data.find( { a: 5 } ).sort( { b: 1, c: 1 } )              { a: 1, b: 1, c: 1 }
db.data.find( { b: 3, a: 4 } ).sort( { c: 1 } )              { a: 1, b: 1, c: 1 }
db.data.find( { a: 5, b: { $lt: 3 } } ).sort( { b: 1 } )     { a: 1, b: 1 }
As the last operation shows, only the index fields preceding the sort subset must have the equality conditions in the
query document; the other index fields may specify other conditions.
If the query does not specify an equality condition on an index prefix that precedes or overlaps with the sort
specification, the operation will not efficiently use the index. For example, the following operations specify a sort document
of { c: 1 }, but the query documents do not contain equality matches on the preceding index fields a and b:
db.data.find( { a: { $gt: 2 } } ).sort( { c: 1 } )
db.data.find( { c: 5 } ).sort( { c: 1 } )
These operations will not efficiently use the index { a: 1, b: 1, c: 1, d: 1 } and may not even use
the index to retrieve the documents.
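The prefix and non-prefix rules above can be summarized in a small plain JavaScript check. This is a simplified model of the rules as stated in this section, not the query planner: the function name canUseIndexForSort is hypothetical, equality conditions are passed in as a list of field names, and the sort keys must follow the index keys either all in the same direction or all reversed.

```javascript
// Simplified model (assumption): decides whether an index can supply the sort
// order, given the fields that the query matches with equality conditions.
function canUseIndexForSort(indexPattern, equalityFields, sortSpec) {
  const indexKeys = Object.entries(indexPattern);
  const sortKeys = Object.entries(sortSpec);
  if (sortKeys.length === 0) return true;

  // Locate the first sort key inside the index key pattern.
  const start = indexKeys.findIndex(([field]) => field === sortKeys[0][0]);
  if (start === -1) return false;

  // Every index key that precedes the sort keys needs an equality condition.
  for (let i = 0; i < start; i++) {
    if (!equalityFields.includes(indexKeys[i][0])) return false;
  }

  // Sort keys must follow the index keys in order, all forward or all reversed.
  let orientation = 0;
  for (let i = 0; i < sortKeys.length; i++) {
    const indexKey = indexKeys[start + i];
    if (!indexKey || indexKey[0] !== sortKeys[i][0]) return false;
    const relative = Math.sign(indexKey[1]) * Math.sign(sortKeys[i][1]);
    if (orientation === 0) orientation = relative;
    if (relative !== orientation) return false;
  }
  return true;
}

const dataIndex = { a: 1, b: 1, c: 1, d: 1 };
console.log(canUseIndexForSort(dataIndex, [], { a: -1, b: -1 }));       // true: reversed prefix
console.log(canUseIndexForSort(dataIndex, ["a"], { b: 1, c: 1 }));      // true: equality on a
console.log(canUseIndexForSort(dataIndex, ["a", "b"], { c: 1 }));       // true: equality on a and b
console.log(canUseIndexForSort(dataIndex, [], { c: 1 }));               // false: no equality on a, b
console.log(canUseIndexForSort(dataIndex, ["c"], { c: 1 }));            // false: no equality on a, b
```

A range condition such as { a: { $gt: 2 } } is deliberately not listed in equalityFields, which is why the fourth call returns false.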
On this page
Indexes that Hold Only Recent Values in RAM (page 578)
For the fastest processing, ensure that your indexes fit entirely in RAM so that the system can avoid reading the index
from disk.
To check the size of your indexes, use the db.collection.totalIndexSize() helper, which returns data in
bytes:
> db.collection.totalIndexSize()
4294976499
The above example shows an index size of almost 4.3 gigabytes. To ensure this index fits in RAM, you must not only
have more than that much RAM available but also must have RAM available for the rest of the working set. Also
remember:
If you have and use multiple collections, you must consider the size of all indexes on all collections. The indexes and
the working set must be able to fit in memory at the same time.
There are some limited cases where indexes do not need to fit in memory. See Indexes that Hold Only Recent Values
in RAM (page 578).
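The sizing rule can be sketched as simple arithmetic in plain JavaScript. The function name indexesFitInRam and the RAM and working-set figures are hypothetical; in practice the index sizes would come from db.collection.totalIndexSize() for each collection.

```javascript
// Hypothetical helper: sums the index sizes of all collections and checks
// whether the indexes plus the rest of the working set fit in RAM.
function indexesFitInRam(indexSizesBytes, otherWorkingSetBytes, ramBytes) {
  const totalIndexBytes = indexSizesBytes.reduce((sum, size) => sum + size, 0);
  return {
    totalIndexBytes,
    totalIndexGB: totalIndexBytes / 1e9, // decimal gigabytes
    fits: totalIndexBytes + otherWorkingSetBytes <= ramBytes
  };
}

// The index size from the example above, with made-up RAM figures.
const sizing = indexesFitInRam([4294976499], 2e9, 8e9);
console.log(sizing.totalIndexGB.toFixed(2)); // 4.29
console.log(sizing.fits);                    // true
```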
See also:
collStats and db.collection.stats()
Indexes do not have to fit entirely into RAM in all cases. If the value of the indexed field increments with every insert,
and most queries select recently added documents, then MongoDB only needs to keep the parts of the index that hold
the most recent or right-most values in RAM. This allows for efficient index use for read and write operations and
minimizes the amount of RAM required to support the index.
Selectivity is the ability of a query to narrow results using the index. Effective indexes are more selective and allow
MongoDB to use the index for a larger portion of the work associated with fulfilling the query.
To ensure selectivity, write queries that limit the number of possible documents with the indexed field. Write queries
that are appropriately selective relative to your indexed data.
Example
Suppose you have a field called status where the possible values are new and processed. If you add an index
on status you've created a low-selectivity index. The index will be of little help in locating records.
A better strategy, depending on your queries, would be to create a compound index (page 495) that includes the
low-selectivity field and another field. For example, you could create a compound index on status and created_at.
Another option, again depending on your use case, might be to use separate collections, one for each status.
Example
Consider an index { a : 1 } (i.e. an index on the key a sorted in ascending order) on a collection where a has
three values evenly distributed across the collection:
{ _id: ObjectId(), a: 1, b: "ab" }
{ _id: ObjectId(), a: 1, b: "cd" }
{ _id: ObjectId(), a: 1, b: "ef" }
{ _id: ObjectId(), a: 2, b: "jk" }
{ _id: ObjectId(), a: 2, b: "lm" }
{ _id: ObjectId(), a: 2, b: "no" }
{ _id: ObjectId(), a: 3, b: "pq" }
{ _id: ObjectId(), a: 3, b: "rs" }
{ _id: ObjectId(), a: 3, b: "tv" }
If you query for { a: 2, b: "no" } MongoDB must scan 3 documents in the collection to return the one
matching result. Similarly, a query for { a: { $gt: 1}, b: "tv" } must scan 6 documents, also to
return one result.
Consider the same index on a collection where a has nine values evenly distributed across the collection:
{ _id: ObjectId(), a: 1, b: "ab" }
{ _id: ObjectId(), a: 2, b: "cd" }
{ _id: ObjectId(), a: 3, b: "ef" }
{ _id: ObjectId(), a: 4, b: "jk" }
{ _id: ObjectId(), a: 5, b: "lm" }
{ _id: ObjectId(), a: 6, b: "no" }
{ _id: ObjectId(), a: 7, b: "pq" }
{ _id: ObjectId(), a: 8, b: "rs" }
{ _id: ObjectId(), a: 9, b: "tv" }
If you query for { a: 2, b: "cd" }, MongoDB must scan only one document to fulfill the query. The index
and query are more selective because the values of a are evenly distributed and the query can select a specific document
using the index.
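The document counts in this example can be reproduced with a small plain JavaScript simulation. It models an index on a as a first filter that selects the documents to examine, followed by a check on b; the function name scanWithIndexOnA is hypothetical and the code does not involve MongoDB.

```javascript
// Simplified model (assumption): an index on `a` narrows the scan to the
// documents whose `a` value matches; `b` is then checked document by document.
function scanWithIndexOnA(docs, matchesA, matchesB) {
  const examined = docs.filter((doc) => matchesA(doc.a));
  const returned = examined.filter((doc) => matchesB(doc.b));
  return { examined: examined.length, returned: returned.length };
}

const bValues = ["ab", "cd", "ef", "jk", "lm", "no", "pq", "rs", "tv"];
// Three distinct values of a, as in the first collection.
const lowSelectivity = bValues.map((b, i) => ({ a: Math.floor(i / 3) + 1, b }));
// Nine distinct values of a, as in the second collection.
const highSelectivity = bValues.map((b, i) => ({ a: i + 1, b }));

console.log(scanWithIndexOnA(lowSelectivity, (a) => a === 2, (b) => b === "no"));  // examines 3
console.log(scanWithIndexOnA(lowSelectivity, (a) => a > 1, (b) => b === "tv"));    // examines 6
console.log(scanWithIndexOnA(highSelectivity, (a) => a === 2, (b) => b === "cd")); // examines 1
```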
If overall selectivity is low, and if MongoDB must read a number of documents to return results, then some queries
may perform faster without indexes. To determine performance, see Measure Index Use (page 545).
For a conceptual introduction to indexes in MongoDB see Index Concepts (page 492).
On this page
Indexing Methods in the mongo Shell (page 579)
Indexing Database Commands (page 580)
Geospatial Query Selectors (page 580)
Indexing Query Modifiers (page 580)
Other Index References (page 580)
Name                             Description
db.collection.createIndex()      Builds an index on a collection.
db.collection.dropIndex()        Removes a specified index on a collection.
db.collection.dropIndexes()      Removes all indexes on a collection.
db.collection.getIndexes()       Returns an array of documents that describe the existing indexes on a collection.
db.collection.reIndex()          Rebuilds all existing indexes on a collection.
db.collection.totalIndexSize()   Reports the total size used by the indexes on a collection. Provides a wrapper around
                                 the totalIndexSize field of the collStats output.
cursor.explain()                 Reports on the query execution plan for a cursor.
cursor.hint()                    Forces MongoDB to use a specific index for a query.
cursor.max()                     Specifies an exclusive upper index bound for a cursor. For use with cursor.hint().
cursor.min()                     Specifies an inclusive lower index bound for a cursor. For use with cursor.hint().
cursor.snapshot()                Forces the cursor to use the index on the _id field. Ensures that the cursor returns
                                 each document, with regards to the value of the _id field, only once.
Name Description
createIndexes Builds one or more indexes for a collection.
dropIndexes Removes indexes from a collection.
compact Defragments a collection and rebuilds the indexes.
reIndex Rebuilds all indexes on a collection.
validate Internal command that scans a collection's data and indexes for correctness.
geoNear Performs a geospatial query that returns the documents closest to a given point.
geoSearch Performs a geospatial query that uses MongoDB's haystack index functionality.
checkShardingIndex Internal command that validates index on shard key.
Name             Description
$geoWithin       Selects geometries within a bounding GeoJSON geometry (page 580). The 2dsphere (page 503)
                 and 2d (page 505) indexes support $geoWithin.
$geoIntersects   Selects geometries that intersect with a GeoJSON geometry. The 2dsphere (page 503) index
                 supports $geoIntersects.
$near            Returns geospatial objects in proximity to a point. Requires a geospatial index. The 2dsphere
                 (page 503) and 2d (page 505) indexes support $near.
$nearSphere      Returns geospatial objects in proximity to a point on a sphere. Requires a geospatial index. The
                 2dsphere (page 503) and 2d (page 505) indexes support $nearSphere.
Name Description
$explain Forces MongoDB to report on query execution plans. See explain().
$hint Forces MongoDB to use a specific index. See hint()
$max Specifies an exclusive upper limit for the index to use in a query. See max().
$min Specifies an inclusive lower limit for the index to use in a query. See min().
$returnKey Forces the cursor to only return fields included in the index.
$snapshot Guarantees that a query returns each document no more than once. See snapshot().
GeoJSON Objects
On this page
Overview (page 581)
Point (page 581)
LineString (page 581)
Polygon (page 581)
MultiPoint (page 583)
MultiLineString (page 583)
MultiPolygon (page 583)
GeometryCollection (page 583)
Overview
The default coordinate reference system for GeoJSON uses the WGS84 datum.
Point
LineString
Polygon
The line that joins two points on a curved surface may or may not contain the same set of co-ordinates that joins those
two points on a flat surface. The line that joins two points on a curved surface will be a geodesic. Carefully check
points to avoid errors with shared edges, as well as overlaps and other types of intersections.
Polygons with a Single Ring The following example specifies a GeoJSON Polygon with an exterior ring and no
interior rings (or holes). The first and last coordinates must match in order to close the polygon:
{
type: "Polygon",
coordinates: [ [ [ 0 , 0 ] , [ 3 , 6 ] , [ 6 , 1 ] , [ 0 , 0 ] ] ]
}
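The closing rule can be checked before a document is inserted. The following plain JavaScript sketch tests that every ring of a Polygon has matching first and last coordinates; the function names are hypothetical, and the minimum of four positions per ring is taken from the GeoJSON definition of a linear ring rather than from this section.

```javascript
// Hypothetical helper: a ring is closed when it has at least four positions
// and its first and last positions are identical.
function isClosedRing(ring) {
  if (!Array.isArray(ring) || ring.length < 4) return false;
  const first = ring[0];
  const last = ring[ring.length - 1];
  return first.length === last.length && first.every((value, i) => value === last[i]);
}

// Hypothetical helper: checks the type and every ring of a GeoJSON Polygon.
function hasClosedRings(polygon) {
  return polygon.type === "Polygon" &&
    Array.isArray(polygon.coordinates) &&
    polygon.coordinates.length > 0 &&
    polygon.coordinates.every(isClosedRing);
}

const triangle = {
  type: "Polygon",
  coordinates: [ [ [ 0, 0 ], [ 3, 6 ], [ 6, 1 ], [ 0, 0 ] ] ]
};
const openTriangle = {
  type: "Polygon",
  coordinates: [ [ [ 0, 0 ], [ 3, 6 ], [ 6, 1 ] ] ]
};

console.log(hasClosedRings(triangle));     // true
console.log(hasClosedRings(openTriangle)); // false
```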
MultiPoint
MultiLineString
MultiPolygon
GeometryCollection
{
type: "GeometryCollection",
geometries: [
{
type: "MultiPoint",
coordinates: [
[ -73.9580, 40.8003 ],
[ -73.9498, 40.7968 ],
[ -73.9737, 40.7648 ],
[ -73.9814, 40.7681 ]
]
},
{
type: "MultiLineString",
coordinates: [
[ [ -73.96943, 40.78519 ], [ -73.96082, 40.78095 ] ],
[ [ -73.96415, 40.79229 ], [ -73.95544, 40.78854 ] ],
[ [ -73.97162, 40.78205 ], [ -73.96374, 40.77715 ] ],
[ [ -73.97880, 40.77247 ], [ -73.97036, 40.76811 ] ]
]
}
]
}
The text index (page 508) and the $text operator support the following languages:
Changed in version 2.6: MongoDB introduces version 2 of the text search feature. With version 2, the text search feature
supports using the two-letter language codes defined in ISO 639-1. Version 1 of text search only supported the long
form of each language name.
Changed in version 3.2: MongoDB Enterprise includes support for Arabic, Farsi (specifically Dari and Iranian Persian
dialects), Urdu, Simplified Chinese, and Traditional Chinese. To support the new languages, the text search feature
uses the three-letter language codes defined in ISO 639-3. To enable support for these languages, see Text Search with
Basis Technology Rosette Linguistics Platform (page 567).
Language Name                  ISO 639-1             ISO 639-3               RLP names
                               (Two letter codes)    (Three letter codes)    (Three letter codes)
danish                         da
dutch                          nl
english                        en
finnish                        fi
french                         fr
german                         de
hungarian                      hu
italian                        it
norwegian                      nb
portuguese                     pt
romanian                       ro
russian                        ru
spanish                        es
swedish                        sv
turkish                        tr
arabic                                               ara
dari                                                 prs
iranian persian                                      pes
urdu                                                 urd
simplified chinese or hans                                                   zhs
traditional chinese or hant                                                  zht
Note: If you specify a language value of "none", then the text search uses simple tokenization with no list of stop
words and no stemming.
See also:
Specify a Language for Text Index (page 565)
Storage
The storage engine (page 587) is the primary component of MongoDB responsible for managing data. MongoDB
provides a variety of storage engines, allowing you to choose one most suited to your application.
The journal is a log that helps the database recover in the event of a hard shutdown. There are several configurable
options that allow the journal to strike a balance between performance and reliability that works for your particular
use case.
GridFS (page 603) is a versatile storage system that is suited to handling large files, such as those exceeding the 16
MB document size limit.
The storage engine is the component of the database that is responsible for managing how data is stored, both in
memory and on disk. MongoDB supports multiple storage engines, as different engines perform better for specific
workloads. Choosing the appropriate storage engine for your use case can significantly impact the performance of
your applications.
WiredTiger (page 587) is the default storage engine starting in MongoDB 3.2. It is well-suited for most workloads and
is recommended for new deployments. WiredTiger provides a document-level concurrency model, checkpointing, and
compression, among other features. In MongoDB Enterprise, WiredTiger also supports Encryption At Rest (page 338).
MMAPv1 (page 595) is the original MongoDB storage engine and is the default storage engine for MongoDB versions
before 3.2. It performs well on workloads with high volumes of reads and writes, as well as in-place updates.
The In-Memory Storage Engine (page 597) is available in MongoDB Enterprise. Rather than storing documents
on-disk, it retains them in-memory for more predictable data latencies. This storage engine is in beta. Do not use it in
production.
On this page
Document Level Concurrency (page 588)
Snapshots and Checkpoints (page 588)
Journal (page 588)
Compression (page 589)
Memory Use (page 589)
Starting in MongoDB 3.0, the WiredTiger storage engine is available in the 64-bit builds.
Changed in version 3.2: The WiredTiger storage engine is the default storage engine starting in MongoDB 3.2. For
existing deployments, if you do not specify the --storageEngine or the storage.engine setting, MongoDB 3.2
can automatically determine the storage engine used to create the data files in the --dbpath or storage.dbPath.
See Default Storage Engine Change (page 891).
WiredTiger uses document-level concurrency control for write operations. As a result, multiple clients can modify
different documents of a collection at the same time.
For most read and write operations, WiredTiger uses optimistic concurrency control. WiredTiger uses only intent locks
at the global, database and collection levels. When the storage engine detects conflicts between two operations, one
will incur a write conflict causing MongoDB to transparently retry that operation.
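The general pattern of optimistic concurrency control with a transparent retry can be sketched in plain JavaScript. This is a generic illustration with a version counter, not WiredTiger's implementation; the class VersionedDoc, the functions tryCommit and updateWithRetry, and the simulated concurrent writer are all hypothetical.

```javascript
// Hypothetical in-memory document with a version counter.
class VersionedDoc {
  constructor(value) {
    this.value = value;
    this.version = 0;
  }
}

// Commits only if nobody else changed the document since it was read.
// Returns false to signal a write conflict.
function tryCommit(doc, readVersion, newValue) {
  if (doc.version !== readVersion) return false;
  doc.value = newValue;
  doc.version += 1;
  return true;
}

// Reads, computes the new value, and retries until the commit succeeds.
// `interfere` simulates a concurrent writer during the first attempt only.
function updateWithRetry(doc, updateFn, interfere) {
  let attempts = 0;
  while (true) {
    attempts += 1;
    const readVersion = doc.version;
    const newValue = updateFn(doc.value);
    if (interfere && attempts === 1) interfere(doc);
    if (tryCommit(doc, readVersion, newValue)) return attempts;
  }
}

const counterDoc = new VersionedDoc(10);
const attemptsNeeded = updateWithRetry(
  counterDoc,
  (value) => value + 1,
  (doc) => tryCommit(doc, doc.version, doc.value + 5) // concurrent write wins first
);

console.log(attemptsNeeded);   // 2: the first attempt hit a conflict
console.log(counterDoc.value); // 16: both updates were applied
```

The caller of updateWithRetry never sees the conflict, which mirrors the transparent retry described above.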
Some global operations, typically short-lived operations involving multiple databases, still require a global
instance-wide lock. Some other operations, such as dropping a collection, still require an exclusive database lock.
WiredTiger uses MultiVersion Concurrency Control (MVCC). At the start of an operation, WiredTiger provides a
point-in-time snapshot of the data to the transaction. A snapshot presents a consistent view of the in-memory data.
When writing to disk, WiredTiger writes all the data in a snapshot to disk in a consistent way across all data files. The
now-durable data act as a checkpoint in the data files. The checkpoint ensures that the data files are consistent up to
and including the last checkpoint; i.e. checkpoints can act as recovery points.
MongoDB configures WiredTiger to create checkpoints (i.e. write the snapshot data to disk) at intervals of 60 seconds
or 2 gigabytes of journal data.
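The two checkpoint triggers can be read as a simple "whichever comes first" rule. The following is a minimal sketch of that rule only; the function name is hypothetical, and treating "2 gigabytes" as 2 * 1024^3 bytes is an assumption.

```javascript
// Checkpoint thresholds as stated in the text above.
const CHECKPOINT_INTERVAL_SECS = 60;
const CHECKPOINT_JOURNAL_BYTES = 2 * 1024 ** 3; // assumption: binary gigabytes

// Returns true when either threshold has been reached since the last checkpoint.
function checkpointDue(secsSinceLast, journalBytesSinceLast) {
  return (
    secsSinceLast >= CHECKPOINT_INTERVAL_SECS ||
    journalBytesSinceLast >= CHECKPOINT_JOURNAL_BYTES
  );
}

console.log(checkpointDue(10, 0));             // false
console.log(checkpointDue(60, 0));             // true
console.log(checkpointDue(5, 3 * 1024 ** 3));  // true
```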
During the write of a new checkpoint, the previous checkpoint is still valid. As such, even if MongoDB terminates or
encounters an error while writing a new checkpoint, upon restart, MongoDB can recover from the last valid checkpoint.
The new checkpoint becomes accessible and permanent when WiredTiger's metadata table is atomically updated to
reference the new checkpoint. Once the new checkpoint is accessible, WiredTiger frees pages from the old
checkpoints.
Using WiredTiger, even without journaling (page 588), MongoDB can recover from the last checkpoint; however, to
recover changes made after the last checkpoint, run with journaling (page 588).
Journal
WiredTiger uses a write-ahead transaction log in combination with checkpoints (page 588) to ensure data durability.
The WiredTiger journal persists all data modifications between checkpoints. If MongoDB exits between checkpoints,
it uses the journal to replay all data modified since the last checkpoint. For information on the frequency with which
MongoDB writes the journal data to disk, see Journaling Process (page 599).
The WiredTiger journal is compressed using the snappy compression library. To specify an alternate compression algorithm
or no compression, use the storage.wiredTiger.engineConfig.journalCompressor setting.
Note: The minimum log record size for WiredTiger is 128 bytes. If a log record is 128 bytes or smaller, WiredTiger does
not compress that record.
You can disable journaling by setting storage.journal.enabled to false, which can reduce the overhead of
maintaining the journal.
For standalone instances, not using the journal means that you will lose some data modifications when MongoDB
exits unexpectedly between checkpoints. For members of replica sets, the replication process may provide sufficient
durability guarantees.
See also:
Journaling with WiredTiger (page 598)
Compression
With WiredTiger, MongoDB supports compression for all collections and indexes. Compression minimizes storage
use at the expense of additional CPU.
By default, WiredTiger uses block compression with the snappy compression library for all collections and prefix
compression for all indexes.
For collections, block compression with zlib is also available. To specify an alternate compression algorithm or no
compression, use the storage.wiredTiger.collectionConfig.blockCompressor setting.
For indexes, to disable prefix compression, use the storage.wiredTiger.indexConfig.prefixCompression
setting.
Compression settings are also configurable on a per-collection and per-index basis during collection and index creation.
See the storageEngine option of db.createCollection() and of db.collection.createIndex().
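As an illustration of per-collection compression, the sketch below builds the options document that would be passed to db.createCollection() in the mongo shell. The helper name and the collection name "events" are hypothetical, and the block_compressor key in the WiredTiger configString is used here on the assumption that it matches your WiredTiger version; only the two compressors named above are accepted by the helper.

```javascript
// Builds a storageEngine options document for a WiredTiger collection.
function wiredTigerCollectionOptions(blockCompressor) {
  const allowed = ["snappy", "zlib"]; // compressors named in this section
  if (!allowed.includes(blockCompressor)) {
    throw new Error(`unsupported block compressor: ${blockCompressor}`);
  }
  return {
    storageEngine: {
      wiredTiger: { configString: `block_compressor=${blockCompressor}` },
    },
  };
}

const opts = wiredTigerCollectionOptions("zlib");
console.log(JSON.stringify(opts));
// In the mongo shell, the document would be used as:
//   db.createCollection("events", opts)
```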
For most workloads, the default compression settings balance storage efficiency and processing requirements.
The WiredTiger journal is also compressed by default. For information on journal compression, see Journal
(page 588).
Memory Use
With WiredTiger, MongoDB utilizes both the WiredTiger cache and the filesystem cache.
Changed in version 3.2: Starting in MongoDB 3.2, the WiredTiger cache, by default, will use the larger of either:
60% of RAM minus 1 GB, or
1 GB.
For systems with up to 10 GB of RAM, the new default setting is less than or equal to the 3.0 default setting. (In
MongoDB 3.0, the WiredTiger cache uses either 1 GB or half of the installed physical RAM, whichever is larger.)
For systems with more than 10 GB of RAM, the new default setting is greater than the 3.0 setting.
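The 10 GB crossover follows from the two formulas: 0.6 * RAM - 1 equals 0.5 * RAM exactly when RAM is 10 GB. A small sketch, with hypothetical function names, makes the comparison concrete:

```javascript
// Default WiredTiger cache size in GB for MongoDB 3.2:
// the larger of (60% of RAM minus 1 GB) and 1 GB.
function defaultCacheGB32(ramGB) {
  return Math.max((6 * ramGB) / 10 - 1, 1);
}

// Default WiredTiger cache size in GB for MongoDB 3.0:
// the larger of half of RAM and 1 GB.
function defaultCacheGB30(ramGB) {
  return Math.max(ramGB / 2, 1);
}

for (const ram of [2, 4, 10, 20, 64]) {
  console.log(`RAM ${ram} GB: 3.0 default ${defaultCacheGB30(ram)} GB, ` +
              `3.2 default ${defaultCacheGB32(ram)} GB`);
}
```

At 10 GB both versions default to 5 GB; below that the 3.2 default is smaller, and above it the 3.2 default is larger.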
Via the filesystem cache, MongoDB automatically uses all free memory that is not used by the WiredTiger cache or
by other processes. Data in the filesystem cache is compressed.
To adjust the size of the WiredTiger cache, see storage.wiredTiger.engineConfig.cacheSizeGB and
--wiredTigerCacheSizeGB. Avoid increasing the WiredTiger cache size above its default value.
See also:
http://wiredtiger.com
Change Standalone to WiredTiger
This tutorial gives an overview of changing the storage engine of a standalone MongoDB instance to WiredTiger
(page 587).
Considerations This tutorial uses the mongodump and mongorestore utilities to export and import data. Ensure
that these MongoDB package components are installed and updated on your system. In addition, make sure you have
sufficient drive space available for the mongodump export file and the data files of your new mongod instance running
with WiredTiger.
You must be using MongoDB version 3.0 or greater in order to use the WiredTiger storage engine. If upgrading from
an earlier version of MongoDB, see the guides on Upgrading to MongoDB 3.0 (page 945) or Upgrading to MongoDB
3.2 (page 894) before proceeding with changing your storage engine.
Procedure
Step 1: Start the mongod you wish to change to WiredTiger. If mongod is already running, you can skip this
step.
Step 2: Export data using mongodump.
mongodump --out <exportDataDestination>
Specify additional options as appropriate, such as username and password if running with authorization enabled. See
mongodump for available options.
Step 3: Create a data directory for the new mongod running with WiredTiger. Create a data directory for
the new mongod instance that will run with the WiredTiger storage engine. mongod must have read and write
permissions for this directory.
mongod with WiredTiger will not start with data files created with a different storage engine.
Step 4: Start mongod with WiredTiger. Start mongod, specifying wiredTiger as the --storageEngine
and the newly created data directory for WiredTiger as the --dbpath. Specify additional options as appropriate.
mongod --storageEngine wiredTiger --dbpath <newWiredTigerDBPath>
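You can also specify the options in a configuration file. To specify the storage engine, use the
storage.engine setting.
Step 5: Upload the exported data using mongorestore. Use mongorestore to upload the exported data to the new
mongod instance. Specify additional options as appropriate. See mongorestore for available options.
mongorestore <exportDataDestination>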
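The export, restart, and import sequence that this tutorial relies on can be summarized as an ordered list of commands. The helper below is only a sketch: the function name and the example paths are hypothetical, the mkdir command stands in for whatever you use to create the directory, and any authentication or port options your deployment needs are omitted.

```javascript
// Returns the command lines for moving a standalone to WiredTiger,
// in the order they would be run.
function standaloneMigrationPlan(exportDir, newDbPath) {
  return [
    `mongodump --out ${exportDir}`,                              // export from the running mongod
    `mkdir -p ${newDbPath}`,                                     // empty data directory for WiredTiger
    `mongod --storageEngine wiredTiger --dbpath ${newDbPath}`,   // start with WiredTiger
    `mongorestore ${exportDir}`,                                 // import into the new mongod
  ];
}

const plan = standaloneMigrationPlan("/backup/dump", "/data/wt");
plan.forEach((cmd, i) => console.log(`${i + 1}. ${cmd}`));
```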
Change Replica Set to WiredTiger
New in version 3.0: The WiredTiger storage engine is available. Also, replica sets may have members with different
storage engines.
Changed in version 3.2: WiredTiger is the new default storage engine for MongoDB.
This tutorial gives an overview of changing the storage engine of a member of a replica set to WiredTiger (page 587).
Considerations Replica sets can have members with different storage engines. As such, you can update members to
use the WiredTiger storage engine in a rolling fashion. Before changing all the members to use WiredTiger, you may
prefer to run with mixed storage engines for some period. However, performance can vary according to workload.
You must be using MongoDB version 3.0 or greater in order to use the WiredTiger storage engine. If upgrading from
an earlier version of MongoDB, see the guides on Upgrading to MongoDB 3.0 (page 945) or Upgrading to MongoDB
3.2 (page 894) before proceeding with changing your storage engine.
Before enabling the new WiredTiger storage engine, ensure that all replica set/sharded cluster members are running at
least MongoDB version 2.6.8, and preferably version 3.0.0 or newer.
Procedure This procedure completely removes a secondary replica set member's data, starts mongod with
WiredTiger, and performs an initial sync (page 690).
To update all members of the replica set to use WiredTiger, update the secondary members first. Then step down the
primary, and update the stepped-down member.
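The ordering rule (secondaries first, primary last) can be sketched as a small helper that works on the members array returned by rs.status(). The function name and the host names in the sample data are hypothetical; the name and stateStr fields are the ones rs.status() reports for each member.

```javascript
// Orders replica set members for a rolling change: secondaries first,
// then the primary, which must be stepped down before it is changed.
function rollingChangeOrder(members) {
  const byState = (state) =>
    members.filter((m) => m.stateStr === state).map((m) => m.name);
  return [...byState("SECONDARY"), ...byState("PRIMARY")];
}

// Sample data shaped like rs.status().members (hypothetical hosts).
const sampleMembers = [
  { name: "db1.example.net:27017", stateStr: "PRIMARY" },
  { name: "db2.example.net:27017", stateStr: "SECONDARY" },
  { name: "db3.example.net:27017", stateStr: "SECONDARY" },
];

console.log(rollingChangeOrder(sampleMembers));
```

In the mongo shell, the primary would be stepped down with rs.stepDown() once all secondaries have been changed, so that the final member is changed while it is a secondary.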
Step 1: Shut down the secondary member. In the mongo shell, shut down the secondary mongod instance you
wish to upgrade.
db.shutdownServer()
Step 2: Prepare a data directory for the new mongod running with WiredTiger. Prepare a data directory for
the new mongod instance that will run with the WiredTiger storage engine. mongod must have read and write
permissions for this directory. You can either delete the contents of the stopped secondary member's current data
directory or create a new directory entirely.
mongod with WiredTiger will not start with data files created with a different storage engine.
Step 3: Start mongod with WiredTiger. Start mongod, specifying wiredTiger as the --storageEngine
and the prepared data directory for WiredTiger as the --dbpath. Specify additional options as appropriate for this
replica set member.
mongod --storageEngine wiredTiger --dbpath <newWiredTigerDBPath> --replSet <replSetName>
Since no data exists in the --dbpath, the mongod will perform an initial sync (page 690). The length of the initial
sync process depends on the size of the database and network connection between members of the replica set.
You can also specify the options in a configuration file. To specify the storage engine, use the
storage.engine setting.
Step 4: Repeat the procedure for other replica set secondaries you wish to upgrade. Repeat this procedure
for each remaining secondary member that you want to run with the WiredTiger storage engine.
Change Sharded Cluster to WiredTiger
New in version 3.0: The WiredTiger storage engine is available. Also, sharded clusters may have individual shards
with different storage engine configurations.
Changed in version 3.2: WiredTiger is the new default storage engine for MongoDB.
This tutorial gives an overview of changing the storage engines of a component of a sharded cluster to WiredTiger
(page 587).
Considerations This procedure may involve downtime, especially if one or more of your shards is a standalone. If
you change the host or port of any shard, you must update the shard configuration as well.
You must be using MongoDB version 3.0 or greater in order to use the WiredTiger storage engine. If upgrading from
an earlier version of MongoDB, see the guides on Upgrading to MongoDB 3.0 (page 945) or Upgrading to MongoDB
3.2 (page 894) before proceeding with changing your storage engine.
Before enabling the new WiredTiger storage engine, ensure that all replica set/sharded cluster members are running at
least MongoDB version 2.6.8, and preferably version 3.0.0 or newer.
To change the storage engine for the shards to WiredTiger, refer to the appropriate procedure for each shard:
If the shard is a standalone, see Change Standalone to WiredTiger (page 589).
If the shard is a replica set, see Change Replica Set to WiredTiger (page 590).
Change Config Servers to WiredTiger To change the storage engines of the config servers of a sharded cluster, see
Change Config Servers to WiredTiger (page 592).
You may safely continue to use MMAPv1 (page 595) for the config servers even if the shards of the sharded cluster
are using the WiredTiger storage engine. If you do choose to update the config servers to use WiredTiger, you must
update all three.
See also:
Change Config Servers to WiredTiger (page 592)
Change Config Servers to WiredTiger New in version 3.0: The WiredTiger storage engine is available.
Changed in version 3.2: WiredTiger is the new default storage engine for MongoDB.
This tutorial gives an overview of changing the storage engine of the config servers in a sharded cluster to WiredTiger
(page 587).
Considerations You may safely continue to use MMAPv1 (page 595) for the config servers even if the shards of the
sharded cluster are using the WiredTiger storage engine. If you do choose to update the config servers to use WiredTiger,
you must update all three.
You must be using MongoDB version 3.0 or greater in order to use the WiredTiger storage engine. If upgrading from
an earlier version of MongoDB, see the guides on Upgrading to MongoDB 3.0 (page 945) or Upgrading to MongoDB
3.2 (page 894) before proceeding with changing your storage engine.
Procedure This tutorial assumes that you have three config servers for this sharded cluster. The three servers are
named first, second, and third, based on their position in the mongos configDB setting.
Important: During this process, at most two config servers will be running at any given time to ensure that the
sharded cluster's metadata is read-only.
Step 1: Disable the balancer. Turn off the balancer (page 750) in the sharded cluster, as described in Disable the
Balancer (page 794).
Step 2: Shut down the third config server to ensure read-only metadata. Connect a mongo shell to the third
config server and use db.shutdownServer() to shut down the third config server.
The third config server is the last one listed in the mongos configDB setting.
db.shutdownServer()
Step 3: Export the data of the second config server with mongodump. While the third config server is down to
ensure the config servers are read-only, prepare to upgrade the second config server to use WiredTiger. The second
config server is the second server listed in the mongos setting configDB.
Export the data of the second config server with mongodump.
mongodump --out <exportDataDestination>
Specify additional options as appropriate, such as username and password if running with authorization enabled. See
mongodump for available options.
Step 4: For the second config server, create a new data directory for use with WiredTiger. Create a data
directory in preparation for having the second config server run with WiredTiger. mongod will not start if the --dbpath
directory contains data files created with a different storage engine.
mongod must have read and write permissions for the new directory.
Step 5: Stop the second config server. Connect a mongo shell to the second config server and use
db.shutdownServer() to shut down the second config server.
db.shutdownServer()
Step 6: Start the second config server mongod with the WiredTiger storage engine option. Start mongod
as a config server, specifying wiredTiger as the --storageEngine and the newly created data directory for
WiredTiger as the --dbpath. Specify additional options as appropriate.
mongod --storageEngine wiredTiger --dbpath <newWiredTigerDBPath> --configsvr
You can also specify the options in a configuration file. To specify the storage engine, use the
storage.engine setting.
Step 7: Upload the exported data using mongorestore to the second config server. Use mongorestore to
upload the exported data. Specify additional options as appropriate. See mongorestore for available options.
mongorestore <exportDataDestination>
When the mongorestore finishes, the second config server upgrade to use WiredTiger is complete.
Step 8: Shut down the second config server to ensure read-only metadata. When the second config server
upgrade is complete, shut down the second config server in preparation to upgrade the other config servers. This is
necessary to maintain at most two active config servers and keep the sharded cluster's metadata read-only.
Connect a mongo shell to the second config server and use db.shutdownServer() to shut down the second
config server.
db.shutdownServer()
Step 9: Restart the third config server to prepare for its upgrade. Restart the third config server with its original
startup options. Do not change its options to use the WiredTiger storage engine at this point.
mongod --configsvr --dbpath <existingDBPath>
Include any other options in use for the third config server.
Step 10: Export the data of the third config server with mongodump.
mongodump --out <exportDataDestination>
Specify additional options as appropriate, such as username and password if running with authorization enabled. See
mongodump for available options.
Step 11: For the third config server, create a new data directory for use with WiredTiger. Create a data directory
in preparation for having the third config server run with WiredTiger. mongod will not start if the --dbpath
directory contains data files created with a different storage engine.
mongod must have read and write permissions for the new directory.
Step 12: Stop the third config server. Connect a mongo shell to the third config server and use
db.shutdownServer() to shut down the third config server.
db.shutdownServer()
Step 13: Start the third config server with the WiredTiger storage engine option. Start mongod as a config
server, specifying wiredTiger as the --storageEngine and the newly created data directory for WiredTiger as
the --dbpath. Specify additional options as appropriate.
mongod --storageEngine wiredTiger --dbpath <newWiredTigerDBPath> --configsvr
You can also specify the options in a configuration file. To specify the storage engine, use the
storage.engine setting.
Step 14: Upload the exported data using mongorestore to the third config server. Use mongorestore to
upload the exported data. Specify additional options as appropriate. See mongorestore for available options.
mongorestore <exportDataDestination>
When the mongorestore finishes, the third config server upgrade to use WiredTiger is complete.
Step 15: Export data of the first config server with mongodump. To prepare for the upgrade of the first config
server to use WiredTiger, export the data of the first config server with mongodump.
mongodump --out <exportDataDestination>
Specify additional options as appropriate, such as username and password if running with authorization enabled. See
mongodump for available options.
Step 16: For the first config server, create a new data directory for use with WiredTiger. Create a data directory
in preparation for having the first config server run with WiredTiger. mongod will not start if the --dbpath directory
contains data files created with a different storage engine.
mongod must have read and write permissions for the new directory.
Step 17: Stop the first config server. Connect a mongo shell to the first config server and use
db.shutdownServer() to shut down the first config server.
db.shutdownServer()
Step 18: Start the first config server with the WiredTiger storage engine option. Start mongod as a config
server, specifying wiredTiger as the --storageEngine and the newly created data directory for WiredTiger as
the --dbpath. Specify additional options as appropriate.
mongod --storageEngine wiredTiger --dbpath <newWiredTigerDBPath> --configsvr
You can also specify the options in a configuration file. To specify the storage engine, use the
storage.engine setting.
Step 19: Upload the exported data using mongorestore to the first config server. Use mongorestore to
upload the exported data. Specify additional options as appropriate. See mongorestore for available options.
mongorestore <exportDataDestination>
When the mongorestore finishes, the first config server upgrade to use WiredTiger is complete.
Step 20: Restart the second config server to enable writes to the sharded cluster's metadata. Restart the second
config server, specifying wiredTiger as the --storageEngine and the newly created WiredTiger data directory as
the --dbpath. Specify additional options as appropriate.
mongod --storageEngine wiredTiger --dbpath <newWiredTigerDBPath> --configsvr
You can also specify the options in a configuration file. To specify the storage engine, use the
storage.engine setting.
Once all three config servers are up, the sharded cluster's metadata is available for writes.
Step 21: Re-enable the balancer. Once all three config servers are up and running with WiredTiger, re-enable the
balancer (page 795).
sh.startBalancer()
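The order of shutdowns and restarts in this procedure is what keeps the metadata read-only. The simulation below replays only the steps that stop or start a config server and records which servers are running after each one. It is a sketch for checking the ordering, with hypothetical names, and it assumes all three servers start on MMAPv1.

```javascript
// Replays the stop/start steps of the procedure and records the set of
// running config servers after each step.
function simulateConfigServerUpgrade() {
  const running = new Set(["first", "second", "third"]);
  const engine = { first: "mmapv1", second: "mmapv1", third: "mmapv1" }; // assumed start state
  const history = [];
  const record = (step) => history.push({ step, running: [...running].sort() });
  const stop = (name, step) => { running.delete(name); record(step); };
  const start = (name, step, newEngine) => {
    if (newEngine) engine[name] = newEngine;
    running.add(name);
    record(step);
  };

  stop("third", 2);                   // metadata becomes read-only
  stop("second", 5);
  start("second", 6, "wiredTiger");
  stop("second", 8);                  // keep at most two servers running
  start("third", 9);                  // original options, old engine
  stop("third", 12);
  start("third", 13, "wiredTiger");
  stop("first", 17);
  start("first", 18, "wiredTiger");
  start("second", 20);                // all three up: metadata writable again
  return { engine, history };
}

const upgrade = simulateConfigServerUpgrade();
for (const { step, running } of upgrade.history) {
  console.log(`after step ${step}: ${running.join(", ")}`);
}
```

Every state before Step 20 has at most two running servers, and all three servers end on wiredTiger.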
On this page
Journal (page 596)
Record Storage Characteristics (page 596)
Record Allocation Strategies (page 596)
Memory Use (page 597)
MMAPv1 is MongoDB's original storage engine based on memory mapped files. It excels at workloads with high
volume inserts, reads, and in-place updates.
Changed in version 3.2: Starting in MongoDB 3.2, MMAPv1 is no longer the default storage engine; instead, the
WiredTiger (page 587) storage engine is the default storage engine. See Default Storage Engine Change (page 891).
Journal
In order to ensure that all modifications to a MongoDB data set are durably written to disk, MongoDB, by default,
records all modifications to an on-disk journal. MongoDB writes more frequently to the journal than it writes to the
data files.
In the default configuration for the MMAPv1 storage engine (page 595), MongoDB writes to the data files on disk
every 60 seconds and writes to the journal files roughly every 100 milliseconds.
To change the interval for writing to the data files, use the storage.syncPeriodSecs setting. For the journal
files, see storage.journal.commitIntervalMs setting.
These values represent the maximum amount of time between the completion of a write operation and when MongoDB
writes to the data files or to the journal files. In many cases MongoDB and the operating system flush data to disk
more frequently, so that the above values represent a theoretical maximum.
The journal allows MongoDB to successfully recover data from data files after a mongod instance exits without
flushing all changes. See Journaling (page 598) for more information about the journal in MongoDB.
Record Storage Characteristics
All records are contiguously located on disk, and when a document becomes larger than the allocated record,
MongoDB must allocate a new record. New allocations require MongoDB to move a document and update all indexes that
refer to the document, which takes more time than in-place updates and leads to storage fragmentation.
Changed in version 3.0.0.
By default, MongoDB uses Power of 2 Sized Allocations (page 596) so that every document in MongoDB is stored in
a record which contains the document itself and extra space, or padding. Padding allows the document to grow as the
result of updates while minimizing the likelihood of reallocations.
Record Allocation Strategies
MongoDB supports multiple record allocation strategies that determine how mongod adds padding to a document
when creating a record. Because documents in MongoDB may grow after insertion and all records are contiguous on
disk, the padding can reduce the need to relocate documents on disk following updates. Relocations are less efficient
than in-place updates and can lead to storage fragmentation. As a result, all padding strategies trade additional space
for increased efficiency and decreased fragmentation.
Different allocation strategies support different kinds of workloads: the power of 2 allocations (page 596) are more
efficient for insert/update/delete workloads, while exact fit allocations (page 597) are ideal for collections without
update and delete workloads.
Power of 2 sized allocations can reduce moves. The added padding space gives a document room to grow without
requiring a move. In addition to saving the cost of moving, this results in fewer updates to indexes. Although the
power of 2 sizes strategy can minimize moves, it does not eliminate them entirely.
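To make the padding concrete, the sketch below rounds a document size up to a power of 2. It is a simplified illustration, not the MMAPv1 allocator: the function name is hypothetical, record header overhead is ignored, and two details that this section does not state are assumed, namely a smallest record size of 32 bytes and rounding to the next multiple of 2 MB for documents larger than 2 MB.

```javascript
const MIN_RECORD_BYTES = 32;           // assumption: smallest record size
const TWO_MB = 2 * 1024 * 1024;        // assumption: cutoff for power-of-2 rounding

// Returns the record size allocated for a document of docBytes bytes.
function powerOf2RecordSize(docBytes) {
  if (docBytes > TWO_MB) {
    // Assumed rule for large documents: next multiple of 2 MB.
    return Math.ceil(docBytes / TWO_MB) * TWO_MB;
  }
  let size = MIN_RECORD_BYTES;
  while (size < docBytes) size *= 2;   // smallest power of 2 that fits
  return size;
}

for (const bytes of [1, 100, 128, 129, 3 * 1024 * 1024]) {
  const record = powerOf2RecordSize(bytes);
  console.log(`${bytes} B document -> ${record} B record, ${record - bytes} B padding`);
}
```

A 100-byte document gets a 128-byte record, so it can grow by 28 bytes in place; growing past 128 bytes forces a move into a 256-byte record.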
Memory Use
With MMAPv1, MongoDB automatically uses all free memory on the machine as its cache. System resource monitors
show that MongoDB uses a lot of memory, but its usage is dynamic. If another process suddenly needs half the server's
RAM, MongoDB will yield cached memory to the other process.
Technically, the operating system's virtual memory subsystem manages MongoDB's memory. This means that
MongoDB will use as much free memory as it can, swapping to disk as needed. Deployments with enough memory to fit
the application's working data set in RAM will achieve the best performance.
On this page
Specify In-Memory Storage Engine (page 597)
Concurrency (page 598)
Durability (page 598)
Warning: The in-memory storage engine is currently in beta. Do not use in production.
Starting in MongoDB Enterprise 3.2, an in-memory storage engine is available in the 64-bit builds for beta-testing
purposes. Other than some metadata and diagnostic data, the in-memory storage engine does not maintain any on-disk
data. By avoiding disk I/O, the in-memory storage engine allows for more predictable latency of database operations.
Warning: The in-memory storage engine does not persist data after process shutdown.
Concurrency
The in-memory storage engine uses document-level concurrency control for write operations. As a result, multiple
clients can modify different documents of a collection at the same time.
Durability
The in-memory storage engine is non-persistent and does not write data to persistent storage. As such, the concept
of a journal or of waiting for data to become durable does not apply to the in-memory storage engine.
Write operations that specify a write concern journaled (page 143) are acknowledged immediately. When a
mongod instance shuts down, either as a result of the shutdown command or due to a system error, recovery of
in-memory data is impossible.
9.2 Journaling
On this page
Journaling and the WiredTiger Storage Engine (page 598)
Journaling and the MMAPv1 Storage Engine (page 599)
Journaling and the In-Memory Storage Engine (page 601)
To provide durability in the event of a failure, MongoDB uses write ahead logging to on-disk journal files.
Journaling and the WiredTiger Storage Engine
Important: The log mentioned in this section refers to the WiredTiger write-ahead log (i.e. the journal) and not the
MongoDB log file.
WiredTiger (page 587) uses checkpoints (page 588) to provide a consistent view of data on disk and allow MongoDB
to recover from the last checkpoint. However, if MongoDB exits unexpectedly in between checkpoints, journaling is
required to recover writes that occurred after the last checkpoint.
With journaling, the recovery process:
1. Looks in the data files to find the identifier of the last checkpoint.
2. Searches in the journal files for the record that matches the identifier of the last checkpoint.
3. Applies the operations in the journal files since the last checkpoint.
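The three recovery steps above can be sketched over toy data structures. This is a simplified model, not WiredTiger's recovery code: the function name, the shape of the dataFiles object, and the journal record fields (id, key, value) are all hypothetical.

```javascript
// Rebuilds the latest state from the last checkpoint plus the journal.
function recoverFromJournal(dataFiles, journal) {
  // 1. Find the identifier of the last checkpoint in the data files.
  const checkpointId = dataFiles.lastCheckpointId;

  // 2. Find the journal record that matches that identifier.
  const start = journal.findIndex((record) => record.id === checkpointId);
  if (start === -1) {
    throw new Error("checkpoint record not found in journal");
  }

  // 3. Apply every operation recorded after that point.
  const state = { ...dataFiles.documents };
  for (const record of journal.slice(start + 1)) {
    state[record.key] = record.value;
  }
  return state;
}

const dataFiles = { lastCheckpointId: 2, documents: { a: 1 } };
const journal = [
  { id: 1, key: "a", value: 0 },
  { id: 2, key: "a", value: 1 }, // last operation included in the checkpoint
  { id: 3, key: "b", value: 5 },
  { id: 4, key: "a", value: 9 },
];
console.log(recoverFromJournal(dataFiles, journal)); // { a: 9, b: 5 }
```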
Journaling Process
Important: In between write operations, while the journal records remain in the WiredTiger buffers, updates can be
lost following a hard shutdown of mongod.
See also:
The serverStatus command returns information on the WiredTiger journal statistics in the wiredTiger.log
field.
Journal Files
For the journal files, MongoDB creates a subdirectory named journal under the dbPath directory. WiredTiger
journal files have names with the following format: WiredTigerLog.<sequence>, where <sequence> is a
zero-padded number starting from 0000000001.
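As a small illustration of the naming format, the helper below (a hypothetical name) pads the sequence number to the ten digits shown in the example 0000000001.

```javascript
// Formats a WiredTiger journal file name from its sequence number.
function wiredTigerJournalFileName(sequence) {
  if (!Number.isInteger(sequence) || sequence < 1) {
    throw new Error("sequence must be a positive integer");
  }
  return `WiredTigerLog.${String(sequence).padStart(10, "0")}`;
}

console.log(wiredTigerJournalFileName(1));  // WiredTigerLog.0000000001
console.log(wiredTigerJournalFileName(42)); // WiredTigerLog.0000000042
```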
Journal files contain a record for each write operation. Each record has a unique identifier.
MongoDB configures WiredTiger to use snappy compression for the journaling data.
The minimum log record size for WiredTiger is 128 bytes. If a log record is 128 bytes or smaller, WiredTiger does not
compress that record.
WiredTiger journal files for MongoDB have a maximum size limit of approximately 100 MB. Once the file exceeds
that limit, WiredTiger creates a new journal file.
WiredTiger automatically removes old journal files to maintain only the files needed to recover from the last checkpoint.
WiredTiger will pre-allocate journal files.
Journaling and the MMAPv1 Storage Engine
With MMAPv1 (page 595), when a write operation occurs, MongoDB updates the in-memory view. With journaling
enabled, MongoDB writes the in-memory changes first to on-disk journal files. If MongoDB should terminate or
encounter an error before committing the changes to the data files, MongoDB can use the journal files to apply the
write operation to the data files and maintain a consistent state.
Journaling Process
With journaling, MongoDB's storage layer has two internal views of the data set: the private view, used to write to the
journal files, and the shared view, used to write to the data files:
1. MongoDB first applies write operations to the private view.
2. MongoDB then applies the changes in the private view to the on-disk journal files (page 600) in the journal
directory roughly every 100 milliseconds. MongoDB records the write operations to the on-disk journal files
in batches called group commits. Grouping the commits helps minimize the performance impact of journaling
since these commits must block all writers during the commit. Writes to the journal are atomic, ensuring
the consistency of the on-disk journal files. For information on the frequency of the commit interval, see
storage.journal.commitIntervalMs.
3. Upon a journal commit, MongoDB applies the changes from the journal to the shared view.
4. Finally, MongoDB applies the changes in the shared view to the data files. More precisely, at default intervals of
60 seconds, MongoDB asks the operating system to flush the shared view to the data files. The operating system
may choose to flush the shared view to disk at a higher frequency than 60 seconds, particularly if the system is
low on free memory. To change the interval for writing to the data files, use the storage.syncPeriodSecs
setting.
If the mongod instance were to crash without having applied the writes to the data files, the journal could replay the
writes to the shared view for eventual write to the data files.
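The four steps and the crash behavior can be modeled with a toy class. This is a deliberately simplified sketch with hypothetical names: it ignores timing, memory mapping, and journal file recycling, and only shows which writes survive a crash.

```javascript
// Toy model of the MMAPv1 journaling views (not the real storage layer).
class ToyMmapv1 {
  constructor() {
    this.privateView = []; // writes not yet journaled (lost on crash)
    this.journal = [];     // on-disk journal (survives a crash)
    this.sharedView = {};  // in-memory view behind the data files (lost on crash)
    this.dataFiles = {};   // on-disk data files (survive a crash)
  }
  write(key, value) {      // step 1: apply the write to the private view
    this.privateView.push({ key, value });
  }
  groupCommit() {          // steps 2 and 3: journal the batch, then update the shared view
    const batch = this.privateView.splice(0);
    this.journal.push(...batch);
    for (const op of batch) this.sharedView[op.key] = op.value;
  }
  flush() {                // step 4: flush the shared view to the data files
    Object.assign(this.dataFiles, this.sharedView);
    this.journal.length = 0; // flushed entries are no longer needed for recovery
  }
  crashAndRecover() {
    this.privateView = [];                    // in-memory state is gone
    this.sharedView = { ...this.dataFiles };
    for (const op of this.journal) {          // replay journaled writes
      this.sharedView[op.key] = op.value;
    }
    this.flush();
  }
}

const engine = new ToyMmapv1();
engine.write("a", 1);
engine.groupCommit();
engine.flush();          // "a" is in the data files
engine.write("b", 2);
engine.groupCommit();    // "b" is journaled but not yet flushed
engine.write("c", 3);    // "c" never reaches the journal
engine.crashAndRecover();
console.log(engine.dataFiles); // { a: 1, b: 2 }
```

The write to "b" is recovered from the journal, while the write to "c" is lost because it was still in the private view when the crash occurred.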
When MongoDB flushes write operations to the data files, MongoDB notes which journal writes have been flushed.
Once a journal file contains only flushed writes, it is no longer needed for recovery and MongoDB can recycle it for a
new journal file.
Once the journal operations have been applied to the shared view and flushed to disk (i.e. pages in the shared view and
private view are in sync), MongoDB asks the operating system to remap the shared view to the private view in order
to save physical RAM. MongoDB performs this remapping routinely. Upon a new remapping, the operating system
knows that physical memory pages can be shared between the shared view and the private view mappings.
Note: The interaction between the shared view and the on-disk data files is similar to how MongoDB works without
journaling. Without journaling, MongoDB asks the operating system to flush in-memory changes to the data files
every 60 seconds.
Journal Files
With journaling enabled, MongoDB creates a subdirectory named journal under the dbPath directory. The
journal directory contains journal files named j._<sequence> where <sequence> is an integer starting from
0 and a last sequence number file lsn.
Journal files contain the write ahead logs; each journal entry describes the bytes the write operation changed in the
data files. Journal files are append-only files. When a journal file holds 1 gigabyte of data, MongoDB creates a new
journal file. If you use the storage.smallFiles option when starting mongod, you limit the size of each journal
file to 128 megabytes.
The lsn file contains the last time MongoDB flushed the changes to the data files.
Once MongoDB applies all the write operations in a particular journal file to the data files, MongoDB can recycle it
for a new journal file.
Unless you write many bytes of data per second, the journal directory should contain only two or three journal files.
A clean shutdown removes all the files in the journal directory. A dirty shutdown (crash) leaves files in the journal
directory; these are used to automatically recover the database to a consistent state when the mongod process is
restarted.
Journal Directory
To speed the frequent sequential writes that occur to the current journal file, you can ensure that the journal directory
is on a different filesystem from the database data files.
Important: If you place the journal on a different filesystem from your data files, you cannot use a filesystem
snapshot alone to capture valid backups of a dbPath directory. In this case, use fsyncLock() to ensure that
database files are consistent before the snapshot and fsyncUnlock() once the snapshot is complete.
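One way to make sure the lock is always released is to wrap the snapshot in try/finally. The sketch below assumes a db object that exposes fsyncLock() and fsyncUnlock() as the mongo shell does; the helper name, the takeSnapshot callback, and the stub used for demonstration are hypothetical.

```javascript
// Runs takeSnapshot() while writes are locked, and always unlocks afterwards.
function snapshotWithFsyncLock(db, takeSnapshot) {
  db.fsyncLock();
  try {
    return takeSnapshot();
  } finally {
    db.fsyncUnlock(); // runs even if the snapshot step throws
  }
}

// Stub standing in for the mongo shell's db object, for demonstration only.
const calls = [];
const stubDb = {
  fsyncLock: () => calls.push("lock"),
  fsyncUnlock: () => calls.push("unlock"),
};

const result = snapshotWithFsyncLock(stubDb, () => {
  calls.push("snapshot"); // the filesystem snapshot would be triggered here
  return "done";
});
console.log(calls.join(" -> "), result); // lock -> snapshot -> unlock done
```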
Preallocation Lag
MongoDB may preallocate journal files if the mongod process determines that it is more efficient to preallocate
journal files than create new journal files as needed.
Depending on your filesystem, you might experience a preallocation lag the first time you start a mongod instance
with journaling enabled. The amount of time required to pre-allocate files might last several minutes; during this
time, you will not be able to connect to the database. This is a one-time preallocation and does not occur with future
invocations.
To avoid preallocation lag, see Avoid Preallocation Lag for MMAPv1 (page 602).
Journaling and the In-Memory Storage Engine
Warning: The in-memory storage engine is currently in beta. Do not use in production.
The In-Memory Storage Engine (page 597) is available in MongoDB Enterprise 3.2 and later. Because its data is
kept in memory, there is no separate journal. Write operations with a write concern of j: true (page 143) are
immediately acknowledged.
See also:
In-Memory Storage Engine: Durability (page 598)
Manage Journaling
On this page
Procedures (page 602)
MongoDB uses write ahead logging to an on-disk journal to guarantee write operation (page 77) durability. The
MMAPv1 storage engine also requires the journal in order to provide crash resiliency.
The WiredTiger storage engine does not require journaling to guarantee a consistent state after a crash. The database
will be restored to the last consistent checkpoint (page 588) during recovery. However, if MongoDB exits unexpectedly
in between checkpoints, journaling is required to recover writes that occurred after the last checkpoint.
With journaling enabled, if mongod stops unexpectedly, the program can recover everything written to the journal.
MongoDB will re-apply the write operations on restart and maintain a consistent state. By default, the writes that can
be lost, i.e., those not yet made to the journal, are those made in the last 100 milliseconds, plus the time it takes to
perform the actual journal writes. See commitIntervalMs for more information on the default.
Procedures
Enable Journaling
Changed in version 2.0: For 64-bit builds of mongod, journaling is enabled by default.
To enable journaling, start mongod with the --journal command line option.
Disable Journaling
Warning: Do not disable journaling on production systems. When using the MMAPv1 storage engine without a
journal, if your mongod instance stops unexpectedly without shutting down cleanly for any reason (e.g. power
failure), then you must recover from an unaffected replica set member or
backup, as described in repair (page 289).
To disable journaling, start mongod with the --nojournal command line option.
Get Commit Acknowledgment
You can get commit acknowledgment with the Write Concern (page 141) and the
j (page 143) option. For details, see Write Concern (page 141).
Avoid Preallocation Lag for MMAPv1
With the MMAPv1 storage engine (page 595), MongoDB may preallocate
journal files if the mongod process determines that it is more efficient to preallocate journal files than create new
journal files as needed.
Depending on your filesystem, you might experience a preallocation lag the first time you start a mongod instance
with journaling enabled. The amount of time required to pre-allocate files might last several minutes; during this
time, you will not be able to connect to the database. This is a one-time preallocation and does not occur with future
invocations.
To avoid preallocation lag (page 600), you can preallocate files in the journal directory by copying them from another
instance of mongod.
Preallocated files do not contain data. It is safe to later remove them. But if you restart mongod with journaling,
mongod will create them again.
Example
The following sequence preallocates journal files for an instance of mongod running on port 27017 with a database
path of /data/db.
For demonstration purposes, the sequence starts by creating a set of journal files in the usual way.
1. Create a temporary directory in which to create a set of journal files:
mkdir ~/tmpDbpath
2. Create a set of journal files by starting a mongod instance that uses the temporary directory:
mongod --port 10000 --dbpath ~/tmpDbpath --journal
3. When the log output indicates that mongod has created the files, press CONTROL+C to stop the
mongod instance.
4. Preallocate journal files for the new instance of mongod by moving the journal files from the data directory of
the existing instance to the data directory of the new instance:
mv ~/tmpDbpath/journal /data/db/
Monitor Journal Status
Use the following commands and methods to monitor journal status:
serverStatus
The serverStatus command returns database status information that is useful for assessing performance.
journalLatencyTest
Use journalLatencyTest to measure how long it takes on your volume to write to the disk in an append-
only fashion. You can run this command on an idle system to get a baseline sync time for journaling. You can
also run this command on a busy system to see the sync time on a busy system, which may be higher if the
journal directory is on the same volume as the data files.
The journalLatencyTest command also provides a way to check if your disk drive is buffering writes in
its local cache. If the number is very low (i.e., less than 2 milliseconds) and the drive is non-SSD, the drive
is probably buffering writes. In that case, enable cache write-through for the device in your operating system,
unless you have a disk controller card with battery backed RAM.
Change the Group Commit Interval for MMAPv1
For the MMAPv1 storage engine (page 595), you can set the
group commit interval using the --journalCommitInterval command line option. The allowed range is 2 to
300 milliseconds.
Lower values increase the durability of the journal at the expense of disk performance.
Recover Data After Unexpected Shutdown
On a restart after a crash, MongoDB replays all journal files in the
journal directory before the server becomes available. If MongoDB must replay journal files, mongod notes these
events in the log output.
There is no reason to run repairDatabase in these situations.
9.3 GridFS
On this page
Use GridFS (page 604)
GridFS Collections (page 604)
GridFS Indexes (page 606)
Additional Resources (page 606)
GridFS is a specification for storing and retrieving files that exceed the BSON-document size limit of 16 MB.
Instead of storing a file in a single document, GridFS divides the file into parts, or chunks, and stores each chunk as
a separate document. By default, GridFS uses a chunk size of 255 kB; that is, GridFS divides a file into chunks of 255
kB with the exception of the last chunk. The last chunk is only as large as necessary. Similarly, files that are no larger
than the chunk size only have a final chunk, using only as much space as needed plus some additional metadata.
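As a rough illustration of the chunking rule above, the following sketch computes how many chunks a file occupies and how large the last chunk is. The helper name chunkLayout is hypothetical and not part of any driver. The sketch assumes that 255 kB means 255 * 1024 bytes and that an empty file has no chunks.

```javascript
// Default GridFS chunk size from the text: 255 kB, taken here as 255 * 1024 bytes.
const DEFAULT_CHUNK_SIZE = 255 * 1024;

// chunkLayout is a hypothetical helper for illustration; it is not a driver API.
function chunkLayout(length, chunkSize = DEFAULT_CHUNK_SIZE) {
  if (!Number.isInteger(length) || length < 0) {
    throw new RangeError("length must be a non-negative integer");
  }
  // Assumption: an empty file has no chunks at all.
  if (length === 0) {
    return { chunkCount: 0, lastChunkSize: 0 };
  }
  const chunkCount = Math.ceil(length / chunkSize);
  // Every chunk except the last is full; the last holds the remainder.
  const lastChunkSize = length - (chunkCount - 1) * chunkSize;
  return { chunkCount, lastChunkSize };
}

// A 16 MiB file and a file smaller than one chunk.
console.log(chunkLayout(16 * 1024 * 1024));
console.log(chunkLayout(1000));
```

Under these assumptions, a file of 16 * 1024 * 1024 bytes occupies 65 chunks, the last of which holds 65536 bytes, and a 1000-byte file occupies a single 1000-byte chunk.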
GridFS uses two collections to store files. One collection stores the file chunks, and the other stores file metadata. The
section GridFS Collections (page 604) describes each collection in detail.
When you query GridFS for a file, the driver will reassemble the chunks as needed. You can perform range queries on
files stored through GridFS. You can also access information from arbitrary sections of files, such as to skip to the
middle of a video or audio file.
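To make the idea of reading an arbitrary section concrete, the following sketch maps a byte range of a file onto the chunk sequence numbers that hold it. The helper chunksForRange and its parameter names are hypothetical. The default chunk size is again taken as 255 * 1024 bytes, and drivers may implement range access differently.

```javascript
// chunksForRange is a hypothetical helper, not part of any GridFS driver.
// It maps a byte range of a file onto the chunk sequence numbers (n) that hold it.
function chunksForRange(offset, byteCount, chunkSize = 255 * 1024) {
  if (!Number.isInteger(offset) || offset < 0) {
    throw new RangeError("offset must be a non-negative integer");
  }
  if (!Number.isInteger(byteCount) || byteCount < 1) {
    throw new RangeError("byteCount must be a positive integer");
  }
  // Chunks are numbered from 0, so integer division gives the chunk index.
  const firstChunk = Math.floor(offset / chunkSize);
  const lastChunk = Math.floor((offset + byteCount - 1) / chunkSize);
  // Bytes to discard at the start of the first chunk before the range begins.
  const skipInFirstChunk = offset % chunkSize;
  return { firstChunk, lastChunk, skipInFirstChunk };
}

// Read 500000 bytes starting at byte 1000000 of a file.
console.log(chunksForRange(1000000, 500000));
```

The returned firstChunk and lastChunk values could then bound a query on the n field of the chunks collection, for example with $gte and $lte, so that only the needed chunks are fetched.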
GridFS is useful not only for storing files that exceed 16 MB but also for storing any files for which you want access
without having to load the entire file into memory. See also When should I use GridFS? (page 828).
Changed in version 2.4.10: The default chunk size changed from 256 kB to 255 kB.
To store and retrieve files using GridFS, use either of the following:
A MongoDB driver. See the drivers documentation for information on using GridFS with your driver.
The mongofiles command-line tool. See the mongofiles reference for documentation.
Each document in the chunks collection represents a distinct chunk of a file as represented in GridFS. Documents
in this collection have the following form:
{
"_id" : <ObjectId>,
"files_id" : <ObjectId>,
"n" : <num>,
"data" : <binary>
}
chunks._id
The unique ObjectId of the chunk.
chunks.files_id
The _id of the parent document, as specified in the files collection.
chunks.n
The sequence number of the chunk. GridFS numbers all chunks, starting with 0.
chunks.data
The chunk's payload as a BSON Binary type.
Each document in the files collection represents a file in GridFS. Consider a document in the files collection,
which has the following form:
{
"_id" : <ObjectId>,
"length" : <num>,
"chunkSize" : <num>,
"uploadDate" : <timestamp>,
"md5" : <hash>,
"filename" : <string>,
"contentType" : <string>,
"aliases" : <string array>,
"metadata" : <dataObject>,
}
Documents in the files collection contain some or all of the following fields. Applications may create additional
arbitrary fields:
files._id
The unique identifier for this document. The _id is of the data type you chose for the original document. The
default type for MongoDB documents is BSON ObjectId.
files.length
The size of the document in bytes.
files.chunkSize
The size of each chunk in bytes. GridFS divides the document into chunks of size chunkSize, except for the
last, which is only as large as needed. The default size is 255 kilobytes (kB).
Changed in version 2.4.10: The default chunk size changed from 256 kB to 255 kB.
files.uploadDate
The date the document was first stored by GridFS. This value has the Date type.
files.md5
An MD5 hash of the complete file returned by the filemd5 command. This value has the String type.
files.filename
Optional. A human-readable name for the GridFS file.
files.contentType
Optional. A valid MIME type for the GridFS file.
files.aliases
Optional. An array of alias strings.
files.metadata
Optional. Any additional information you want to store.
GridFS uses indexes on each of the chunks and files collections for efficiency. Drivers that conform to the
GridFS specification automatically create these indexes for convenience. You can also create any additional indexes
as desired to suit your application's needs.
GridFS uses a unique, compound index on the chunks collection using the files_id and n fields. This allows for
efficient retrieval of chunks, as demonstrated in the following example:
db.fs.chunks.find( { files_id: myFileID } ).sort( { n: 1 } )
Drivers that conform to the GridFS specification will automatically ensure that this index exists before read and
write operations. See the relevant driver documentation for the specific behavior of your GridFS application.
If this index does not exist, you can issue the following operation to create it using the mongo shell:
db.fs.chunks.createIndex( { files_id: 1, n: 1 }, { unique: true } );
GridFS uses an index on the files collection using the filename and uploadDate fields. This index allows for
efficient retrieval of files, as shown in this example:
db.fs.files.find( { filename: myFileName } ).sort( { uploadDate: 1 } )
Drivers that conform to the GridFS specification will automatically ensure that this index exists before read and
write operations. See the relevant driver documentation for the specific behavior of your GridFS application.
If this index does not exist, you can issue the following operation to create it using the mongo shell:
db.fs.files.createIndex( { filename: 1, uploadDate: 1 } );
On this page
Storage Engine Fundamentals (page 850)
Can you mix storage engines in a replica set? (page 851)
WiredTiger Storage Engine (page 851)
MMAPv1 Storage Engine (page 852)
Data Storage Diagnostics (page 855)
A storage engine is the part of a database that is responsible for managing how data is stored, both in memory and on
disk. Many databases support multiple storage engines, where different engines perform better for specific workloads.
For example, one storage engine might offer better performance for read-heavy workloads, and another might support
higher throughput for write operations.
See also:
Storage Engines (page 587)
Can you mix storage engines in a replica set?
Yes. You can have replica set members that use different storage engines.
When designing these multi-storage engine deployments consider the following:
the oplog on each member may need to be sized differently to account for differences in throughput between
different storage engines.
recovery from backups may become more complex if your backup captures data files from MongoDB: you may
need to maintain backups for each storage engine.
Yes. See:
Change Standalone to WiredTiger (page 589)
Change Replica Set to WiredTiger (page 590)
Change Sharded Cluster to WiredTiger (page 591)
The ratio of compressed data to uncompressed data depends on your data and the compression library used. By default,
collection data in WiredTiger use Snappy block compression; zlib compression is also available. Index data use prefix
compression by default.
With WiredTiger, MongoDB utilizes both the WiredTiger cache and the filesystem cache.
Changed in version 3.2: Starting in MongoDB 3.2, the WiredTiger cache, by default, will use the larger of either:
60% of RAM minus 1 GB, or
1 GB.
For systems with up to 10 GB of RAM, the new default setting is less than or equal to the 3.0 default setting (For
MongoDB 3.0, the WiredTiger cache uses either 1 GB or half of the installed physical RAM, whichever is larger).
For systems with more than 10 GB of RAM, the new default setting is greater than the 3.0 setting.
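The comparison between the 3.2 and 3.0 defaults can be checked with a short calculation. The following is a minimal sketch of the two rules as stated above. The function names are hypothetical, and sizes are in GB.

```javascript
// Default WiredTiger cache size in MongoDB 3.2, per the rule above:
// the larger of (60% of RAM minus 1 GB) and 1 GB.
// Both function names are hypothetical and used only for illustration.
function defaultCacheGB32(ramGB) {
  return Math.max((ramGB * 6) / 10 - 1, 1);
}

// Default in MongoDB 3.0: the larger of 1 GB and half of physical RAM.
function defaultCacheGB30(ramGB) {
  return Math.max(ramGB / 2, 1);
}

// Compare the two defaults for a few RAM sizes.
[2, 4, 10, 16, 64].forEach(function (ramGB) {
  console.log(ramGB + " GB RAM: 3.2 default = " + defaultCacheGB32(ramGB) +
    " GB, 3.0 default = " + defaultCacheGB30(ramGB) + " GB");
});
```

The two rules give the same result at 10 GB of RAM (5 GB). Below that point the 3.2 default is the same or smaller, and above it the 3.2 default is larger.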
Via the filesystem cache, MongoDB automatically uses all free memory that is not used by the WiredTiger cache or
by other processes. Data in the filesystem cache is compressed.
To adjust the size of the WiredTiger cache, see storage.wiredTiger.engineConfig.cacheSizeGB and
--wiredTigerCacheSizeGB. Avoid increasing the WiredTiger cache size above its default value.
The default WiredTiger cache size value assumes that there is a single mongod instance per node. If a single node
contains multiple instances, then you should decrease the setting to accommodate the other mongod instances.
If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM
available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less
than the amount of RAM available in the container. The exact amount depends on the other processes running in the
container.
To view statistics on the cache and eviction rate, see the wiredTiger.cache field returned from the
serverStatus command.
MongoDB configures WiredTiger to create checkpoints (i.e. write the snapshot data to disk) at intervals of 60 seconds
or 2 gigabytes of journal data.
For journal data, MongoDB writes to disk according to the following intervals or conditions:
New in version 3.2: Every 50 milliseconds.
MongoDB sets checkpoints to occur in WiredTiger on user data at an interval of 60 seconds or when 2 GB of
journal data has been written, whichever occurs first.
If the write operation includes a write concern of j: true (page 143), WiredTiger forces a sync of the
WiredTiger journal files.
Because MongoDB uses a journal file size limit of 100 MB, WiredTiger creates a new journal file approximately
every 100 MB of data. When WiredTiger creates a new journal file, WiredTiger syncs the previous journal file.
A memory-mapped file is a file with data that the operating system places in memory by way of the mmap() system
call. mmap() thus maps the file to a region of virtual memory. Memory-mapped files are the critical piece of the
MMAPv1 storage engine in MongoDB. By using memory mapped files, MongoDB can treat the contents of its data
files as if they were in memory. This provides MongoDB with an extremely fast and simple method for accessing and
manipulating data.
MongoDB uses memory mapped files for managing and interacting with all data.
Memory mapping assigns files to a block of virtual memory with a direct byte-for-byte correlation. MongoDB memory
maps data files to memory as it accesses documents. Unaccessed data is not mapped to memory.
Once mapped, the relationship between file and memory allows MongoDB to interact with the data in the file as if it
were memory.
In the default configuration for the MMAPv1 storage engine (page 595), MongoDB writes to the data files on disk
every 60 seconds and writes to the journal files roughly every 100 milliseconds.
To change the interval for writing to the data files, use the storage.syncPeriodSecs setting. For the journal
files, see storage.journal.commitIntervalMs setting.
These values represent the maximum amount of time between the completion of a write operation and when MongoDB
writes to the data files or to the journal files. In many cases MongoDB and the operating system flush data to disk
more frequently, so that the above values represent a theoretical maximum.
Why are the files in my data directory larger than the data in my database?
The data files in your data directory, which is the /data/db directory in default configurations, might be larger than
the data set inserted into the database. Consider the following possible causes:
MongoDB preallocates its data files to avoid filesystem fragmentation, and because of this, the size of these files does
not necessarily reflect the size of your data.
The storage.mmapv1.smallFiles option will reduce the size of these files, which may be useful if you have
many small databases on disk.
The oplog
If this mongod is a member of a replica set, the data directory includes the oplog.rs file, which is a preallocated
capped collection in the local database.
The default allocation is approximately 5% of disk space on 64-bit installations. In most cases, you should not need
to resize the oplog. See Oplog Sizing (page 647) for more information.
The journal
The data directory contains the journal files, which store write operations on disk before MongoDB applies them to
databases. See Journaling (page 598).
Empty records
MongoDB maintains lists of empty records in data files as it deletes documents and collections. MongoDB can reuse
this space, but will not, by default, return this space to the operating system.
To allow MongoDB to more effectively reuse the space, you can de-fragment your data. To de-fragment, use the
compact command. The compact command requires up to 2 gigabytes of extra disk space to run. Do not use compact if
you are critically low on disk space. For more information on its behavior and other considerations, see compact.
compact only removes fragmentation from MongoDB data files within a collection and does not return any disk space
to the operating system. To return disk space to the operating system, see How do I reclaim disk space? (page 853).
How do I reclaim disk space?
The following are some options to consider when reclaiming disk space.
Note: You do not need to reclaim disk space for MongoDB to reuse freed space. See Empty records (page 853) for
information on reuse of freed space.
repairDatabase
You can use repairDatabase on a database to rebuild the database, de-fragmenting the associated storage in the
process.
repairDatabase requires free disk space equal to the size of your current data set plus 2 gigabytes. If the volume
that holds dbpath lacks sufficient space, you can mount a separate volume and use that for the repair. For additional
information and considerations, see repairDatabase.
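The space requirement above lends itself to a quick pre-check before running a repair. The following sketch is only an illustration. The helper names are hypothetical, a gigabyte is assumed to be 2^30 bytes, and how you obtain the data set size and the free space on the volume is left to your environment.

```javascript
// Assumption: "gigabyte" is taken as 2^30 bytes.
const GIB = 1024 * 1024 * 1024;

// Free space needed by repairDatabase per the rule above:
// the size of the current data set plus 2 gigabytes.
// Both helpers are hypothetical and not part of the mongo shell.
function repairSpaceNeeded(dataSetBytes) {
  return dataSetBytes + 2 * GIB;
}

// True if the volume has enough free space to run the repair.
function canRepairOnVolume(dataSetBytes, freeBytes) {
  return freeBytes >= repairSpaceNeeded(dataSetBytes);
}

// A 10 GiB data set on a volume with 11 GiB free, then with 12 GiB free.
console.log(canRepairOnVolume(10 * GIB, 11 * GIB));
console.log(canRepairOnVolume(10 * GIB, 12 * GIB));
```

If the check fails for the volume that holds the data files, the text above suggests mounting a separate volume for the repair.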
Warning: Do not use repairDatabase if you are critically low on disk space.
repairDatabase will block all other operations and may take a long time to complete.
For a secondary member of a replica set, you can perform a resync of the member (page 690) by: stopping the
secondary member to resync, deleting all data and subdirectories from the member's data directory, and restarting.
For details, see Resync a Member of a Replica Set (page 690).
Working set represents the total body of data that the application uses in the course of normal operation. Often this is
a subset of the total data size, but the specific size of the working set depends on actual moment-to-moment use of the
database.
If you run a query that requires MongoDB to scan every document in a collection, the working set will expand to
include every document. Depending on physical memory size, this may cause documents in the working set to page
out, or to be removed from physical memory by the operating system. The next time MongoDB needs to access these
documents, MongoDB may incur a hard page fault.
For best performance, the majority of your active set should fit in RAM.
With the MMAPv1 storage engine, page faults can occur as MongoDB reads from or writes data to parts of its data
files that are not currently located in physical memory. In contrast, operating system page faults happen when physical
memory is exhausted and pages of physical memory are swapped to disk.
If there is free memory, then the operating system can find the page on disk and load it to memory directly. However,
if there is no free memory, the operating system must:
find a page in memory that is stale or no longer needed, and write the page to disk.
read the requested page from disk and load it into memory.
This process, on an active system, can take a long time, particularly in comparison to reading a page that is already in
memory.
See Page Faults (page 234) for more information.
Page faults occur when MongoDB, with the MMAPv1 storage engine, needs access to data that isn't currently in active
memory. A hard page fault refers to situations when MongoDB must access a disk to access the data. A soft page
fault, by contrast, merely moves memory pages from one list to another, such as from an operating system file cache.
See Page Faults (page 234) for more information.
To view the statistics for a collection, including the data size, use the db.collection.stats() method from the
mongo shell. The following example issues db.collection.stats() for the orders collection:
db.orders.stats();
MongoDB also provides the following methods to return specific sizes for the collection:
db.collection.dataSize() to return data size in bytes for the collection.
db.collection.storageSize() to return allocation size in bytes, including unused space.
db.collection.totalSize() to return the data size plus the index size in bytes.
db.collection.totalIndexSize() to return the index size in bytes.
The following script prints the statistics for each collection in each database:
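// Print collection statistics for every collection in every database.
db.adminCommand({ listDatabases: 1 }).databases.forEach(function (d) {
  // Get a handle to each database by name without changing the shell's current database.
  var mdb = db.getSiblingDB(d.name);
  mdb.getCollectionNames().forEach(function (c) {
    var s = mdb.getCollection(c).stats();
    printjson(s);
  });
});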
To view the size of the data allocated for an index, use the db.collection.stats() method and check the
indexSizes field in the returned document.
The db.stats() method in the mongo shell returns the current state of the active database. For the description
of the returned fields, see dbStats Output.
Replication
A replica set in MongoDB is a group of mongod processes that maintain the same data set. Replica sets provide
redundancy and high availability, and are the basis for all production deployments. This section introduces replication
in MongoDB as well as the components and architecture of replica sets. The section also provides tutorials for common
tasks related to replica sets.
Replication Introduction (page 613) An introduction to replica sets, their behavior, operation, and use.
Replication Concepts (page 617) The core documentation of replica set operations, configurations, architectures and
behaviors.
Replica Set Members (page 618) Introduces the components of replica sets.
Replica Set Deployment Architectures (page 626) Introduces architectural considerations related to replica
sets deployment planning.
Replica Set High Availability (page 635) Presents the details of the automatic failover and recovery process
with replica sets.
Replica Set Read and Write Semantics (page 639) Presents the semantics for targeting read and write
operations to the replica set, with an awareness of location and set configuration.
Replica Set Tutorials (page 655) Tutorials for common tasks related to the use and maintenance of replica sets.
Replication Reference (page 708) Reference for functions and operations related to replica sets.
On this page
Redundancy and Data Availability (page 613)
Replication in MongoDB (page 614)
Additional Resources (page 617)
Replication provides redundancy and increases data availability. With multiple copies of data on different database
servers, replication provides a level of fault tolerance against the loss of a single database server.
In some cases, replication can provide increased read capacity as clients can send read operations to different servers.
Maintaining copies of data in different data centers can increase data locality and availability for distributed
applications. You can also maintain additional copies for dedicated purposes, such as disaster recovery, reporting, or backup.
A replica set is a group of mongod instances that maintain the same data set. A replica set contains several data
bearing nodes and optionally one arbiter node. Of the data bearing nodes, one and only one member is deemed the
primary node, while the other nodes are deemed secondary nodes.
The primary node (page 618) receives all write operations. A replica set can have only one primary capable of
confirming writes with { w: "majority" } (page 142) write concern; although in some circumstances, another
mongod instance may transiently believe itself to also be primary. 1 The primary records all changes to its data sets
in its operation log, i.e. oplog (page 647). For more information on primary node operation, see Replica Set Primary
(page 618).
1 In some circumstances (page 722), two nodes in a replica set may transiently believe that they are the primary, but at most, one of them
will be able to complete writes with { w: "majority" } (page 142) write concern. The node that can complete { w: "majority" }
(page 142) writes is the current primary, and the other node is a former primary that has not yet recognized its demotion, typically due to a network
partition. When this occurs, clients that connect to the former primary may observe stale data despite having requested read preference primary
(page 721), and new writes to the former primary will eventually roll back.
The secondaries (page 618) replicate the primary's oplog and apply the operations to their data sets such that the
secondaries' data sets reflect the primary's data set. If the primary is unavailable, an eligible secondary will hold
an election to elect itself the new primary. For more information on secondary members, see Replica Set Secondary
Members (page 618).
You may add an extra mongod instance to a replica set as an arbiter (page 625). Arbiters do not maintain a data set.
The purpose of an arbiter is to maintain a quorum in a replica set by responding to heartbeat and election requests
by other replica set members. Because they do not store a data set, arbiters can be a good way to provide replica set
quorum functionality with a cheaper resource cost than a fully functional replica set member with a data set. If your
replica set has an even number of members, add an arbiter to obtain a majority of votes in an election for primary.
Arbiters do not require dedicated hardware. For more information on arbiters, see Replica Set Arbiter (page 625).
An arbiter (page 625) will always be an arbiter whereas a primary (page 618) may step down and become a secondary
(page 618) and a secondary (page 618) may become the primary during an election.
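The advice to add an arbiter to a set with an even number of members follows from how a majority is counted. The following sketch assumes that every member has one vote and that a majority is more than half of the voting members. The function names are hypothetical.

```javascript
// Votes needed to elect a primary: more than half of the voting members.
// Both function names are hypothetical and used only for illustration.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members can be unavailable while a majority remains.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

// A four-member set, and the same set with an arbiter added as a fifth voter.
console.log("4 voters: majority " + majority(4) + ", tolerates " + faultTolerance(4) + " failure(s)");
console.log("5 voters: majority " + majority(5) + ", tolerates " + faultTolerance(5) + " failure(s)");
```

Under these assumptions, four voters need three votes and tolerate one failure, while five voters also need three votes but tolerate two failures.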
Asynchronous Replication
Secondaries apply operations from the primary asynchronously. By applying operations after the primary, sets can
continue to function despite the failure of one or more members. For more information on replication mechanics, see
Replica Set Oplog (page 647) and Replica Set Data Synchronization (page 648).
Automatic Failover
When a primary does not communicate with the other members of the set for more than 10 seconds, an eligible
secondary will hold an election to elect itself the new primary. The first secondary to hold an election and receive a
majority of the members' votes becomes primary.
New in version 3.2: MongoDB introduces version 1 of the replication protocol (protocolVersion: 1
(page 711)) to reduce replica set failover time and accelerate the detection of multiple simultaneous primaries. New
replica sets will, by default, use protocolVersion: 1 (page 711). Previous versions of MongoDB use version
0 of the protocol.
See Replica Set Elections (page 635) and Rollbacks During Replica Set Failover (page 638) for more information.
Read Operations
By default, clients read from the primary 1 ; however, clients can specify a read preference (page 641) to send read
operations to secondaries. Asynchronous replication (page 615) to secondaries means that reads from secondaries may
return data that does not reflect the state of the data on the primary. For information on reading from replica sets, see
Read Preference (page 641).
In MongoDB, clients can see the results of writes before the writes are durable:
Regardless of write concern (page 141), other clients using "local" (page 144) (i.e. the default) readConcern
can see the result of a write operation before the write operation is acknowledged to the issuing client.
Clients using "local" (page 144) (i.e. the default) readConcern can read data which may be subsequently
rolled back (page 638).
For more information on read isolations, consistency and recency for MongoDB, see Read Isolation, Consistency, and
Recency (page 96).
Additional Features
Replica sets provide a number of options to support application needs. For example, you may deploy a replica
set with members in multiple data centers (page 634), or control the outcome of elections by adjusting the
members[n].priority (page 713) of some members. Replica sets also support dedicated members for reporting,
disaster recovery, or backup functions.
See Priority 0 Replica Set Members (page 621), Hidden Replica Set Members (page 623) and Delayed Replica Set
Members (page 624) for more information.
These documents describe and provide examples of replica set operation, configuration, and behavior. For an overview
of replication, see Replication Introduction (page 613). For documentation of the administration of replica sets, see
Replica Set Tutorials (page 655). The Replication Reference (page 708) documents commands and operations specific
to replica sets.
Replica Set Members (page 618) Introduces the components of replica sets.
Replica Set Primary (page 618) The primary is the only member of a replica set that accepts write operations.
Replica Set Secondary Members (page 618) Secondary members replicate the primary's data set and accept
read operations. If the set has no primary, a secondary can become primary.
Priority 0 Replica Set Members (page 621) Priority 0 members are secondaries that cannot become the
primary.
Hidden Replica Set Members (page 623) Hidden members are secondaries that are invisible to applications.
These members support dedicated workloads, such as reporting or backup.
Replica Set Arbiter (page 625) An arbiter does not maintain a copy of the data set but participates in elections.
Replica Set Deployment Architectures (page 626) Introduces architectural considerations related to replica sets
deployment planning.
Replica Set High Availability (page 635) Presents the details of the automatic failover and recovery process with
replica sets.
Replica Set Elections (page 635) Elections occur when the primary becomes unavailable and the replica set
members autonomously select a new primary.
Read Preference (page 641) Read preference specifies where (i.e. which members of the replica set) the drivers
should direct the read operations.
Replication Processes (page 646) Mechanics of the replication process and related topics.
Master Slave Replication (page 649) Master-slave replication provided redundancy in early versions of MongoDB.
Replica sets replace master-slave for most use cases.
A replica set in MongoDB is a group of mongod processes that provide redundancy and high availability. The
members of a replica set are:
Primary. The primary receives all write operations.
Secondaries. Secondaries replicate operations from the primary to maintain an identical data set. Secondaries
may have additional configurations for special usage profiles. For example, secondaries may be
non-voting (page 637) or priority 0 (page 621).
You can also maintain an arbiter as part of a replica set. Arbiters do not keep a copy of the data. However,
arbiters play a role in the elections that select a primary if the current primary is unavailable.
The minimum requirements for a replica set are: a primary, a secondary, and an arbiter.
Most deployments, however, will keep three members that store data: a primary and two secondary members.
Changed in version 3.0.0: A replica set can have up to 50 members (page 934) but only 7 voting members. 4 In
previous versions, replica sets could have up to 12 members.
The primary is the only member in the replica set that receives write operations. MongoDB applies write operations
on the primary and then records the operations on the primary's oplog (page 647). Secondary members
replicate this log and apply the operations to their data sets.
In the following three-member replica set, the primary accepts all write operations. Then the secondaries replicate the
oplog to apply to their data sets.
All members of the replica set can accept read operations. However, by default, an application directs its read opera-
tions to the primary member. See Read Preference (page 641) for details on changing the default read behavior.
The replica set can have at most one primary. 5 If the current primary becomes unavailable, an election determines the
new primary. See Replica Set Elections (page 635) for more details.
In the following 3-member replica set, the primary becomes unavailable. This triggers an election which selects one
of the remaining secondaries as the new primary.
A secondary maintains a copy of the primary's data set. To replicate data, a secondary applies operations from the
primary's oplog (page 647) to its own data set in an asynchronous process. A replica set can have one or more
secondaries.
4 While replica sets are the recommended solution for production, a replica set can support up to 50 members in total. If your deployment
requires more than 50 members, you'll need to use master-slave (page 649) replication. However, master-slave replication lacks the automatic
failover capabilities.
5 In some circumstances (page 722), two nodes in a replica set may transiently believe that they are the primary, but at most, one of them
will be able to complete writes with { w: "majority" } (page 142) write concern. The node that can complete { w: "majority" }
(page 142) writes is the current primary, and the other node is a former primary that has not yet recognized its demotion, typically due to a network
partition. When this occurs, clients that connect to the former primary may observe stale data despite having requested read preference primary
(page 721), and new writes to the former primary will eventually roll back.
The following three-member replica set has two secondary members. The secondaries replicate the primary's oplog
and apply the operations to their data sets.
Although clients cannot write data to secondaries, clients can read data from secondary members. See Read Preference
(page 641) for more information on how clients direct read operations to replica sets.
A secondary can become a primary. If the current primary becomes unavailable, the replica set holds an election to
choose which of the secondaries becomes the new primary.
In the following three-member replica set, the primary becomes unavailable. This triggers an election where one of
the remaining secondaries becomes the new primary.
See Replica Set Elections (page 635) for more details.
You can configure a secondary member for a specific purpose. You can configure a secondary to:
Prevent it from becoming a primary in an election, which allows it to reside in a secondary data center or to
serve as a cold standby. See Priority 0 Replica Set Members (page 621).
Prevent applications from reading from it, which allows it to run applications that require separation from normal
traffic. See Hidden Replica Set Members (page 623).
Keep a running historical snapshot for use in recovery from certain errors, such as unintentionally deleted
databases. See Delayed Replica Set Members (page 624).
On this page
Priority 0 Members as Standbys (page 623)
Priority 0 Members and Failover (page 623)
Configuration (page 623)
A priority 0 member is a secondary that cannot become primary. Priority 0 members cannot trigger elections.
Otherwise these members function as normal secondaries. A priority 0 member maintains a copy of the data set,
accepts read operations, and votes in elections. Configure a priority 0 member to prevent secondaries from becoming
primary, which is particularly useful in multi-data center deployments.
In a three-member replica set, one data center hosts the primary and a secondary. A second data center hosts one
priority 0 member that cannot become primary.
Priority 0 Members as Standbys A priority 0 member can function as a standby. In some replica sets, it might not
be possible to add a new member in a reasonable amount of time. A standby member keeps a current copy of the data
to be able to replace an unavailable member.
In many cases, you need not set a standby to priority 0. However, in sets with varied hardware or geographic distribution
(page 634), a priority 0 standby ensures that only qualified members become primary.
A priority 0 standby may also be valuable for some members of a set with different hardware or workload profiles.
In these cases, deploy a member with priority 0 so that it cannot become primary. Also consider using a hidden member
(page 623) for this purpose.
If your set already has seven voting members, also configure the member as non-voting (page 637).
Priority 0 Members and Failover When configuring a priority 0 member, consider potential failover patterns,
including all possible network partitions. Always ensure that your main data center contains both a quorum of voting
members and members that are eligible to be primary.
Configuration To configure a priority 0 member, see Prevent Secondary from Becoming Primary (page 677).
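As a minimal sketch of what this configuration change amounts to, the following helper sets members[n].priority to 0
on a plain configuration document. The helper name setPriorityZero, the set name, and the host names are hypothetical
and exist only for illustration. In the mongo shell, the document would typically come from rs.conf() and be applied
with rs.reconfig().

```javascript
// Hypothetical helper (not part of the mongo shell): mark one member of a
// replica set configuration document as priority 0.
function setPriorityZero(cfg, memberIndex) {
  if (memberIndex < 0 || memberIndex >= cfg.members.length) {
    throw new Error("no member at index " + memberIndex);
  }
  cfg.members[memberIndex].priority = 0;
  return cfg;
}

// Illustrative configuration document; set name and host names are placeholders.
var standbyCfg = {
  _id: "rs0",
  members: [
    { _id: 0, host: "dc1-a.example.net:27017" },
    { _id: 1, host: "dc1-b.example.net:27017" },
    { _id: 2, host: "dc2-a.example.net:27017" }
  ]
};

// The member in the second data center can no longer become primary.
setPriorityZero(standbyCfg, 2);
```

The other members are left untouched, so they keep their default priority and remain eligible to become primary.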
On this page
Behavior (page 623)
Further Reading (page 624)
A hidden member maintains a copy of the primary's data set but is invisible to client applications. Hidden members
are good for workloads with different usage patterns from the other members in the replica set. Hidden members must
always be priority 0 members (page 621) and so cannot become primary. The db.isMaster() method does not
display hidden members. Hidden members, however, may vote in elections (page 635).
In the following five-member replica set, all four secondary members have copies of the primary's data set, but one of
the secondary members is hidden.
Behavior
Read Operations Clients will not distribute reads with the appropriate read preference (page 641) to hidden mem-
bers. As a result, these members receive no traffic other than basic replication. Use hidden members for dedicated
tasks such as reporting and backups. Delayed members (page 624) should be hidden.
In a sharded cluster, mongos instances do not interact with hidden members.
Voting Hidden members may vote in replica set elections. If you stop a voting hidden member, ensure that the set
has an active majority or the primary will step down.
For the purposes of backups,
If using the MMAPv1 storage engine, you can avoid stopping a hidden member with the db.fsyncLock()
and db.fsyncUnlock() operations to flush all writes and lock the mongod instance for the duration of the
backup operation.
Changed in version 3.2: Starting in MongoDB 3.2, db.fsyncLock() can ensure that the data files do not
change for MongoDB instances using either the MMAPv1 or the WiredTiger storage engine, thus providing
consistency for the purposes of creating backups.
In previous MongoDB versions, db.fsyncLock() cannot guarantee a consistent set of files for low-level
backups (e.g. via file copy cp, scp, tar) for WiredTiger.
Further Reading For more information about backing up MongoDB databases, see MongoDB Backup Methods
(page 200). To configure a hidden member, see Configure a Hidden Replica Set Member (page 678).
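The rule that hidden members must always be priority 0 members can be checked before a configuration is applied. The
following is a sketch only: invalidHiddenMembers is a hypothetical helper, the host names are placeholders, and an
omitted priority is assumed to take the default value of 1.

```javascript
// Hypothetical validator: returns the _id of every member that is marked
// hidden but does not have priority 0. An omitted priority is treated as
// the default of 1.
function invalidHiddenMembers(cfg) {
  return cfg.members
    .filter(function (m) {
      var priority = m.priority === undefined ? 1 : m.priority;
      return m.hidden === true && priority !== 0;
    })
    .map(function (m) { return m._id; });
}

// Illustrative configuration document; host names are placeholders.
var hiddenCfg = {
  _id: "rs0",
  members: [
    { _id: 0, host: "a.example.net:27017" },
    { _id: 1, host: "b.example.net:27017", priority: 0, hidden: true },
    // Invalid: hidden, but the priority is left at its default.
    { _id: 2, host: "c.example.net:27017", hidden: true }
  ]
};

var offendingIds = invalidHiddenMembers(hiddenCfg);
```

Here offendingIds contains only the _id of the third member, because the second member combines hidden with priority 0.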
On this page
Considerations (page 624)
Example (page 625)
Configuration (page 625)
Delayed members contain copies of a replica set's data set. However, a delayed member's data set reflects an earlier,
or delayed, state of the set. For example, if the current time is 09:52 and a member has a delay of an hour, the delayed
member has no operation more recent than 08:52.
Because delayed members are a rolling backup or a running historical snapshot of the data set, they may help
you recover from various kinds of human error. For example, a delayed member can make it possible to recover from
unsuccessful application upgrades and operator errors including dropped databases and collections.
Considerations
Behavior Delayed members apply operations from the oplog on a delay. When choosing the amount of delay,
consider that the amount of delay:
must be equal to or greater than your maintenance windows.
must be smaller than the capacity of the oplog. For more information on oplog size, see Oplog Size (page 647).
Sharding In sharded clusters, delayed members have limited utility when the balancer is enabled. Because delayed
members replicate chunk migrations with a delay, the state of delayed members in a sharded cluster is not useful for
recovering to a previous state of the sharded cluster if any migrations occur during the delay window.
Example In the following 5-member replica set, the primary and all secondaries have copies of the data set. One
member applies operations with a delay of 3600 seconds, or an hour. This delayed member is also hidden and is a
priority 0 member.
To configure a delayed member, see Configure a Delayed Replica Set Member (page 680).
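The following sketch illustrates the delay arithmetic and the two constraints on the amount of delay described above.
The helper names and the host name are hypothetical. Expressing the oplog capacity as a time window in seconds is an
assumption made for the comparison, since the oplog itself is sized in bytes. The member document uses the
slaveDelay field, together with priority 0 and hidden, to match the example.

```javascript
// Hypothetical check of the two constraints on the delay. All values are in
// seconds. Treating the oplog capacity as a time window is an assumption.
function isDelayAcceptable(delaySecs, maintenanceWindowSecs, oplogWindowSecs) {
  return delaySecs >= maintenanceWindowSecs && delaySecs < oplogWindowSecs;
}

// Most recent point in time that a member with the given delay can reflect.
function latestVisibleTime(now, delaySecs) {
  return new Date(now.getTime() - delaySecs * 1000);
}

// Illustrative member document for a delayed, hidden, priority 0 member.
var delayedMember = {
  _id: 4,
  host: "delayed.example.net:27017", // placeholder host name
  priority: 0,
  hidden: true,
  slaveDelay: 3600
};

// With a current time of 09:52 and a one-hour delay, the member has no
// operation more recent than 08:52.
var currentTime = new Date(Date.UTC(2016, 0, 1, 9, 52, 0));
var visibleUntil = latestVisibleTime(currentTime, delayedMember.slaveDelay);
```

For instance, isDelayAcceptable(3600, 1800, 86400) holds for a 30-minute maintenance window and a 24-hour oplog
window, while a two-hour maintenance window would make a one-hour delay too short.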
On this page
Example (page 626)
Security (page 626)
An arbiter does not have a copy of the data set and cannot become a primary. Replica sets may have arbiters to add a
vote in elections for primary (page 635). Arbiters always have exactly 1 vote in elections, and thus allow replica sets
to have an uneven number of members, without the overhead of a member that replicates data.
Important: Do not run an arbiter on systems that also host the primary or the secondary members of the replica set.
Only add an arbiter to sets with even numbers of members. If you add an arbiter to a set with an odd number of
members, the set may suffer from tied elections. To add an arbiter, see Add an Arbiter to Replica Set (page 668).
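As a rough sketch of this rule, the helper below counts the votes in a configuration document and adds an arbiter
entry only when the count is even. The names countVotes and shouldAddArbiter and the host names are hypothetical,
counting votes rather than members is an interpretation, and an omitted votes field is assumed to default to 1. In
the mongo shell, rs.addArb() adds an arbiter to a running set.

```javascript
// Sums the votes in a configuration document. An omitted "votes" field is
// treated as the default of 1.
function countVotes(cfg) {
  return cfg.members.reduce(function (sum, m) {
    return sum + (m.votes === undefined ? 1 : m.votes);
  }, 0);
}

// An arbiter is only worth adding when the number of votes is even.
function shouldAddArbiter(cfg) {
  return countVotes(cfg) % 2 === 0;
}

// Illustrative two-member set; host names are placeholders.
var arbiterCfg = {
  _id: "rs0",
  members: [
    { _id: 0, host: "a.example.net:27017" },
    { _id: 1, host: "b.example.net:27017" }
  ]
};

if (shouldAddArbiter(arbiterCfg)) {
  // The arbiter runs on a host that holds neither the primary nor a secondary.
  arbiterCfg.members.push({ _id: 2, host: "arbiter.example.net:27017", arbiterOnly: true });
}
```

After the arbiter is added, the set has three votes, so a second call to shouldAddArbiter reports that no further
arbiter is needed.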
Example
For example, in the following replica set, an arbiter allows the set to have an odd number of votes for elections:
Security
Authentication When running with authorization, arbiters exchange credentials with other members of the
set to authenticate. MongoDB encrypts the authentication process. The MongoDB authentication exchange is crypto-
graphically secure.
Arbiters use keyfiles to authenticate to the replica set.
Communication The only communication between arbiters and other set members is: votes during elections,
heartbeats, and configuration data. These exchanges are not encrypted.
However, if your MongoDB deployment uses TLS/SSL, MongoDB will encrypt all communication between replica
set members. See Configure mongod and mongos for TLS/SSL (page 382) for more information.
As with all MongoDB components, run arbiters in trusted network environments.
On this page
Strategies (page 627)
Replica Set Naming (page 628)
Deployment Patterns (page 628)
The architecture of a replica set affects the set's capacity and capability. This document provides strategies for replica
set deployments and describes common architectures.
The standard replica set deployment for a production system is a three-member replica set. These sets provide
redundancy and fault tolerance. Avoid complexity when possible, but let your application requirements dictate the
architecture.
Strategies
Deploy an Odd Number of Members An odd number of members ensures that the replica set is always able to
elect a primary. If you have an even number of members, add an arbiter to get an odd number. Arbiters do not store
a copy of the data and require fewer resources. As a result, you may run an arbiter on an application server or other
shared process.
Consider Fault Tolerance Fault tolerance for a replica set is the number of members that can become unavailable
and still leave enough members in the set to elect a primary. In other words, it is the difference between the number
of members in the set and the majority needed to elect a primary. Without a primary, a replica set cannot accept write
operations. Fault tolerance is an effect of replica set size, but the relationship is not direct. See the following table:
Number of Members    Majority Required to Elect a New Primary    Fault Tolerance
3                    2                                           1
4                    3                                           1
5                    3                                           2
6                    4                                           2
Adding a member to the replica set does not always increase the fault tolerance. However, in these cases, additional
members can provide support for dedicated functions, such as backups or reporting.
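The values in the table follow from simple arithmetic: the majority is more than half of the members, and the fault
tolerance is the difference between the set size and that majority. The sketch below reproduces the table rows. The
function names are illustrative only.

```javascript
// Smallest number of members that forms a majority of the set.
function majorityRequired(members) {
  return Math.floor(members / 2) + 1;
}

// Number of members that can become unavailable while the remaining
// members can still elect a primary.
function faultTolerance(members) {
  return members - majorityRequired(members);
}

// Rebuild the rows of the table for sets of 3 to 6 members.
var toleranceRows = [3, 4, 5, 6].map(function (n) {
  return { members: n, majority: majorityRequired(n), faultTolerance: faultTolerance(n) };
});
```

Going from 3 to 4 members raises the required majority from 2 to 3, so the fault tolerance stays at 1.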
Use Hidden and Delayed Members for Dedicated Functions Add hidden (page 623) or delayed (page 624) mem-
bers to support dedicated functions, such as backup or reporting.
Load Balance on Read-Heavy Deployments In a deployment with very high read traffic, you can improve read
throughput by distributing reads to secondary members. As your deployment grows, add or move members to alternate
data centers to improve redundancy and availability.
Always ensure that the main facility is able to elect a primary.
Add Capacity Ahead of Demand The existing members of a replica set must have spare capacity to support adding
a new member. Always add new members before the current demand saturates the capacity of the set.
Distribute Members Geographically To protect your data if your main data center fails, keep at least one member
in an alternate data center. Set members[n].priority (page 713) to 0 for these members to prevent them from
becoming primary.
Keep a Majority of Members in One Location When a replica set has members in multiple data centers, network
partitions can prevent communication between data centers. To replicate data, members must be able to communicate
with other members.
In an election, members must see each other to create a majority. To ensure that the replica set members can confirm
a majority and elect a primary, keep a majority of the set's members in one location.
Use replica set tag sets (page 691) to ensure that operations replicate to specific data centers. Tag sets also allow the
routing of read operations to specific machines.
See also:
Data Center Awareness (page 226) and Operational Segregation in MongoDB Deployments (page 227).
Enable journaling to protect data against service interruptions. Without journaling, MongoDB cannot recover data after
unexpected shutdowns, including power failures and unexpected reboots.
All 64-bit versions of MongoDB after version 2.0 have journaling enabled by default.
If your application connects to more than one replica set, each set should have a distinct name. Some drivers group
replica set connections by replica set name.
Deployment Patterns
The following documents describe common replica set deployment patterns. Other patterns are possible and effective
depending on the application's requirements. If needed, combine features of each architecture in your own deployment:
Three Member Replica Sets (page 629) Three-member replica sets provide the minimum recommended architecture
for a replica set.
Replica Sets with Four or More Members (page 632) Four or more member replica sets provide greater redundancy
and can support greater distribution of read operations and dedicated functionality.
Geographically Distributed Replica Sets (page 634) Geographically distributed sets include members in multiple lo-
cations to protect against facility-specific failures, such as power outages.
On this page
Primary with Two Secondary Members (page 629)
Primary with a Secondary and an Arbiter (page 629)
The minimum architecture of a replica set has three members. A three member replica set can have either three
members that hold data, or two members that hold data and an arbiter.
Primary with Two Secondary Members A replica set with three members that store data has:
One primary (page 618).
Two secondary (page 618) members. Both secondaries can become the primary in an election (page 635).
These deployments provide two complete copies of the data set at all times in addition to the primary. These replica
sets provide additional fault tolerance and high availability (page 635). If the primary is unavailable, the replica set
elects a secondary to be primary and continues normal operation. The old primary rejoins the set when available.
Primary with a Secondary and an Arbiter A three-member replica set with two members that store data has:
One primary (page 618).
One secondary (page 618) member. The secondary can become primary in an election (page 635).
One arbiter (page 625). The arbiter only votes in elections.
Since the arbiter does not hold a copy of the data, these deployments provide only one complete copy of the data
in addition to the primary. Arbiters require fewer resources, at the expense of more limited redundancy and fault tolerance.
However, a deployment with a primary, secondary, and an arbiter ensures that a replica set remains available if the
primary or the secondary is unavailable. If the primary is unavailable, the replica set will elect the secondary to be
primary.
See also:
Deploy a Replica Set (page 657).
On this page
Overview (page 632)
Considerations (page 632)
Overview Although the standard replica set configuration has three members, you can deploy larger sets. Add
additional members to a set to increase redundancy or to add capacity for distributing secondary read operations.
Considerations As you add new members to a replica set, consider the following:
Odd Number of Voting Members Ensure that the replica set has an odd number of voting members. If you have
an even number of voting members, deploy an arbiter so that the set has an odd number.
For example, the following replica set includes an arbiter to ensure an odd number of voting members.
Maximum Number of Voting Members A replica set can have up to 50 members, but only 7 voting
members. 6 If the replica set already has 7 voting members, additional members must be non-voting members
(page 637).
For example, the following 9 member replica set has 7 voting members and 2 non-voting members.
See Non-Voting Members (page 637) for more information.
Location of the Members A majority of the replica set's members should be in your application's main data center.
For example, the following 5 member replica set has the majority, 3, of its members in its main data center, Data
Center 1.
6 While replica sets are the recommended solution for production, a replica set can support up to 50 members in total. If your deployment
requires more than 50 members, you'll need to use master-slave (page 649) replication. However, master-slave replication lacks the automatic
failover capabilities.
Electability of Members Some members of the replica set, such as members that have networking constraints or
limited resources, should not be able to become primary in a failover. Configure members that should not become
primary to have priority 0 (page 621).
For example, the secondary member in the third data center with a priority of 0 cannot become primary:
See also:
Deploy a Replica Set (page 657), Add an Arbiter to Replica Set (page 668), and Add Members to a Replica Set
(page 670).
On this page
Additional Resource (page 635)
Adding members to a replica set in multiple data centers adds redundancy and provides fault tolerance if one data
center is unavailable. Members in additional data centers should have a priority of 0 (page 621) to prevent them from
becoming primary.
For example, the architecture of a geographically distributed replica set may be:
One primary in the main data center.
One secondary member in the main data center. This member can become primary at any time.
One priority 0 (page 621) member in a second data center. This member cannot become primary.
In the following replica set, the primary and one secondary are in Data Center 1, while Data Center 2 has a priority 0
(page 621) secondary that cannot become a primary.
If the primary is unavailable, the replica set will elect a new primary from Data Center 1. If the data centers cannot
connect to each other, the member in Data Center 2 will not become the primary.
If Data Center 1 becomes unavailable, you can manually recover the data set from Data Center 2 with minimal
downtime.
To facilitate elections, the main data center should hold a majority of members. Also ensure that the set has an odd
number of members. If adding a member in another data center results in a set with an even number of members,
deploy an arbiter. For more information on elections, see Replica Set Elections (page 635).
See also:
Deploy a Geographically Redundant Replica Set (page 662).
Additional Resource
Whitepaper: MongoDB Multi-Data Center Deployments 7
Webinar: Multi-Data Center Deployment 8
Replica sets provide high availability using automatic failover. Failover allows a secondary member to become pri-
mary if the current primary becomes unavailable.
Changed in version 3.2: MongoDB introduces version 1 of the replication protocol (protocolVersion: 1
(page 711)) to reduce replica set failover time and accelerate the detection of multiple simultaneous primaries. New
replica sets will, by default, use protocolVersion: 1 (page 711). Previous versions of MongoDB use version
0 of the protocol. To upgrade existing replica sets to use protocolVersion: 1 (page 711), see Upgrade a
Replica Set to 3.2 (page 895).
Replica set members keep the same data set but are otherwise independent. If the primary becomes unavailable, an
eligible secondary holds an election (page 635) to elect itself as a new primary. In some situations, the failover process
may undertake a rollback (page 638). 9
The deployment of a replica set affects the outcome of failover situations. To support effective failover, ensure that one
facility can elect a primary if needed. Choose the facility that hosts the core application systems to host the majority
of the replica set. Place a majority of voting members and all the members that can become primary in this facility.
Otherwise, network partitions could prevent the set from being able to form a majority.
7 http://www.mongodb.com/lp/white-paper/multi-dc?jmp=docs
8 https://www.mongodb.com/presentations/webinar-multi-data-center-deployment?jmp=docs
9 Replica sets remove rollback data when needed without intervention. Administrators must apply or discard rollback data manually.
On this page
Factors and Conditions that Affect Elections (page 637)
Non-Voting Members (page 637)
Replica sets use elections to determine which set member will become primary. Elections occur after initiating a
replica set, and also any time the primary becomes unavailable. The primary is the only member in the set that can
accept write operations. If a primary becomes unavailable, elections allow the set to recover normal operations without
manual intervention. Elections are part of the failover process (page 635).
In the following three-member replica set, the primary is unavailable. One of the remaining secondaries holds an
election to elect itself as a new primary.
Elections are essential for independent operation of a replica set; however, elections take time to complete. While
an election is in process, the replica set has no primary and cannot accept writes, and all remaining members become
read-only. MongoDB avoids elections unless necessary.
If a majority of the replica set is inaccessible or unavailable to the current primary, the primary will step down and
become a secondary. The replica set cannot accept writes after this occurs, but remaining members can continue to
serve read queries if such queries are configured to run on secondaries.
Replication Election Protocol New in version 3.2: MongoDB introduces version 1 of the replication protocol
(protocolVersion: 1 (page 711)) to reduce replica set failover time and accelerate the detection of multiple
simultaneous primaries. New replica sets will, by default, use protocolVersion: 1 (page 711). Previous
versions of MongoDB use version 0 of the protocol.
Heartbeats Replica set members send heartbeats (pings) to each other every two seconds. If a heartbeat does not
return within 10 seconds, the other members mark the delinquent member as inaccessible.
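The timeout rule can be pictured with a small sketch. This is a simplified model rather than the server's
implementation: the function name and the idea of tracking a single outstanding heartbeat per member are assumptions,
and times are plain millisecond counters.

```javascript
// Values taken from the text: heartbeats every two seconds, 10-second timeout.
var HEARTBEAT_INTERVAL_MS = 2 * 1000;
var HEARTBEAT_TIMEOUT_MS = 10 * 1000;

// Simplified model: a member is considered inaccessible when a heartbeat
// sent at sentAtMs has received no reply by nowMs and the timeout has passed.
// repliedAtMs is null while the heartbeat is still outstanding.
function isInaccessible(sentAtMs, repliedAtMs, nowMs) {
  if (repliedAtMs !== null) {
    return repliedAtMs - sentAtMs > HEARTBEAT_TIMEOUT_MS;
  }
  return nowMs - sentAtMs > HEARTBEAT_TIMEOUT_MS;
}

// A heartbeat sent at t=0 with no reply is still tolerated at t=9s,
// but the member is marked inaccessible at t=11s.
var stillWaiting = isInaccessible(0, null, 9000);
var markedDown = isInaccessible(0, null, 11000);
```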
Member Priority After a replica set has a stable primary, the election algorithm will make a best-effort attempt
to have the secondary with the highest priority (page 713) available call an election. Higher priority secondaries
call elections relatively sooner than lower priority secondaries; however, a lower priority node can still be elected as
primary for brief periods of time, even if a higher priority secondary is available. Replica set members will continue
to call elections until the highest priority available member becomes primary.
Members with a priority value of 0 cannot become primary and do not seek election. For details, see Priority 0 Replica
Set Members (page 621).
Network Partitions Network partitions affect the formation of a majority for an election. If a primary steps down
and neither portion of the replica set has a majority, the set will not elect a new primary. The replica set becomes
read-only.
To avoid this situation, place a majority of instances in one data center and a minority of instances in any other data
centers combined.
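To see why the placement matters, the following sketch checks which side of a partition still holds a majority of
votes. It is a simplified model with hypothetical names: it counts votes only and does not consider whether the
members on a side are eligible to become primary (for example, priority 0 members).

```javascript
// Votes needed for a majority of the whole set.
function majorityOf(totalVotes) {
  return Math.floor(totalVotes / 2) + 1;
}

// votesPerSide lists the votes reachable within each side of a partition.
// Returns, for each side, whether it still holds a majority of all votes.
function sidesWithMajority(votesPerSide) {
  var total = votesPerSide.reduce(function (sum, v) { return sum + v; }, 0);
  var needed = majorityOf(total);
  return votesPerSide.map(function (v) { return v >= needed; });
}

// Five voting members: three in the main data center, two elsewhere.
var unevenSplit = sidesWithMajority([3, 2]);

// Four voting members split evenly: neither side can elect a primary.
var evenSplit = sidesWithMajority([2, 2]);
```

In the first case only the main data center can elect a primary. In the second case no side reaches the required
three votes, so the set becomes read-only until the partition heals.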
Non-Voting Members
Although non-voting members do not vote in elections, these members hold copies of the replica set's data and can
accept read operations from client applications.
Because a replica set can have up to 50 members, but only 7 voting members, non-voting members allow a
replica set to have more than seven members.
For instance, the following nine-member replica set has seven voting members and two non-voting members.
A non-voting member has a members[n].votes (page 713) setting equal to 0 in its member configuration:
{
  "_id" : <num>,
  "host" : <hostname:port>,
  "votes" : 0
}
Important: Do not alter the number of votes to control which members will become primary. Instead, modify the
members[n].priority (page 713) option. Only alter the number of votes in exceptional cases, for example, to
permit more than seven members.
To configure a non-voting member, see Configure Non-Voting Replica Set Member (page 681).
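The member limits can be checked against a configuration document before it is applied. The sketch below is
illustrative only: checkMemberLimits is a hypothetical helper, the host names are placeholders, and an omitted votes
field is assumed to default to 1.

```javascript
// Limits stated in the text: up to 50 members, of which at most 7 may vote.
var MAX_MEMBERS = 50;
var MAX_VOTING_MEMBERS = 7;

// Hypothetical check: counts members and voting members and reports whether
// the configuration stays within both limits.
function checkMemberLimits(cfg) {
  var voting = cfg.members.filter(function (m) {
    return (m.votes === undefined ? 1 : m.votes) > 0;
  }).length;
  return {
    members: cfg.members.length,
    voting: voting,
    ok: cfg.members.length <= MAX_MEMBERS && voting <= MAX_VOTING_MEMBERS
  };
}

// Nine-member set: the first seven members vote, the last two do not.
var nineMemberCfg = { _id: "rs0", members: [] };
for (var i = 0; i < 9; i++) {
  var member = { _id: i, host: "node" + i + ".example.net:27017" };
  if (i >= 7) {
    member.votes = 0;
  }
  nineMemberCfg.members.push(member);
}

var limitReport = checkMemberLimits(nineMemberCfg);
```

For this configuration the report shows nine members, seven of them voting, which is within both limits.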
On this page
Collect Rollback Data (page 638)
Avoid Replica Set Rollbacks (page 638)
Rollback Limitations (page 639)
A rollback reverts write operations on a former primary when the member rejoins its replica set after a failover.
A rollback is necessary only if the primary had accepted write operations that the secondaries had not successfully
replicated before the primary stepped down. When the primary rejoins the set as a secondary, it reverts, or rolls back,
its write operations to maintain database consistency with the other members.
MongoDB attempts to avoid rollbacks, which should be rare. When a rollback does occur, it is often the result of a
network partition. Secondaries that cannot keep up with the throughput of operations on the former primary increase
the size and impact of the rollback.
A rollback does not occur if the write operations replicate to another member of the replica set before the primary
steps down and if that member remains available and accessible to a majority of the replica set.
When a rollback does occur, administrators must decide whether to apply or ignore the rollback data. MongoDB
writes the rollback data to BSON files in the rollback/ folder under the database's dbPath directory. The names
of rollback files have the following form:
<database>.<collection>.<timestamp>.bson
For example:
records.accounts.2011-05-09T18-10-04.0.bson
Administrators must apply rollback data manually after the member completes the rollback and returns to secondary
status. Use bsondump to read the contents of the rollback files. Then use mongorestore to apply the changes to
the new primary.
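When sorting through rollback files, it can help to split a file name back into its parts. The following sketch does
so with a regular expression. The function name is hypothetical, and the timestamp pattern (date, time with hyphens,
and a trailing numeric counter) is inferred from the single example above rather than from a specification.

```javascript
// Pattern for <database>.<collection>.<timestamp>.bson. The database name is
// taken up to the first dot; the collection name may itself contain dots.
// The timestamp layout is an assumption based on the example file name.
var ROLLBACK_FILE_PATTERN =
  /^([^.]+)\.(.+)\.(\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}\.\d+)\.bson$/;

// Hypothetical helper: returns the parts of a rollback file name, or null
// if the name does not match the expected form.
function parseRollbackFileName(fileName) {
  var match = ROLLBACK_FILE_PATTERN.exec(fileName);
  if (match === null) {
    return null;
  }
  return { database: match[1], collection: match[2], timestamp: match[3] };
}

var rollbackParts = parseRollbackFileName("records.accounts.2011-05-09T18-10-04.0.bson");
```

For the example file, the database is records, the collection is accounts, and the timestamp part is
2011-05-09T18-10-04.0.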
For replica sets, the default write concern {w: 1} (page 141) only provides acknowledgement of write operations
on the primary. With the default write concern, data may be rolled back if the primary steps down before the write
operations have replicated to any of the secondaries.
To prevent rollbacks of data that have been acknowledged to the client, use w: majority write concern (page 141) to
guarantee that the write operations propagate to a majority of the replica set nodes before returning with acknowledge-
ment to the issuing client.
Note:
Regardless of write concern (page 141), other clients using "local" (page 144) (i.e. the default) readConcern
can see the result of a write operation before the write operation is acknowledged to the issuing client.
Clients using "local" (page 144) (i.e. the default) readConcern can read data which may be subsequently
rolled back (page 638).
Rollback Limitations
A mongod instance will not roll back more than 300 megabytes of data. If your system must roll back more than 300
megabytes, you must manually intervene to recover the data. If this is the case, the following line will appear in your
mongod log:
[replica set sync] replSet syncThread: 13410 replSet too much data to roll back
In this situation, save the data directly or force the member to perform an initial sync. To force initial sync, sync from
a current member of the set by deleting the content of the dbPath directory for the member that requires a larger
rollback.
See also:
Replica Set High Availability (page 635) and Replica Set Elections (page 635).
From the perspective of a client application, whether a MongoDB instance is running as a single server (i.e. stan-
dalone) or a replica set is transparent. However, MongoDB provides additional read and write configurations for
replica sets.
Note: Sharded clusters where the shards are also replica sets provide the same operational semantics with regards to
write and read operations.
Write Concern for Replica Sets (page 639) Write concern describes the level of acknowledgement requested from
MongoDB for write operations.
Read Preference (page 641) Read preference specifies where (i.e. which members of the replica set) the drivers
should direct the read operations.
Read Preference Processes (page 644) Describes the mechanics of read preference.
On this page
Verify Write Operations to Replica Sets (page 640)
Modify Default Write Concern (page 641)
Custom Write Concerns (page 641)
From the perspective of a client application, whether a MongoDB instance is running as a single server (i.e.
standalone) or a replica set is transparent. However, replica sets offer some configuration options for writes. 10
For a replica set, the default write concern (page 141) requests acknowledgement only from the primary. You can,
however, override this default write concern, such as to confirm write operations on a specified number of the replica
set members.
To override the default write concern, specify a write concern with each write operation. For example, the following
method includes a write concern that specifies that the method return only after the write propagates to the primary
and at least one secondary or the method times out after 5 seconds.
10 Sharded clusters where the shards are also replica sets provide the same configuration options with regards to write and read operations.
db.products.insert(
{ item: "envelopes", qty : 100, type: "Clasp" },
{ writeConcern: { w: 2, wtimeout: 5000 } }
)
You can include a timeout threshold for a write concern. This prevents write operations from blocking indefinitely
if the write concern is unachievable. For example, if the write concern requires acknowledgement from 4 members
of the replica set and the replica set has only 3 available members, the operation blocks until those members become
available. See wtimeout (page 143).
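The interaction between w and wtimeout can be modeled with a short sketch. This is a hypothetical model and not a
driver or server API: it only compares the number of members that can acknowledge a write with the number the write
concern requires, and it does not handle tag-based write concerns.

```javascript
// Hypothetical model: reports how a write would resolve when `available`
// data-bearing members can acknowledge it. `votingMembers` is only used to
// compute the threshold for w: "majority". An omitted w defaults to 1.
function writeConcernOutcome(writeConcern, available, votingMembers) {
  var w = writeConcern.w === undefined ? 1 : writeConcern.w;
  var needed = w === "majority" ? Math.floor(votingMembers / 2) + 1 : w;
  if (available >= needed) {
    return "acknowledged";
  }
  // An omitted wtimeout, or a wtimeout of 0, means the write waits indefinitely.
  return writeConcern.wtimeout > 0 ? "timed out" : "blocked";
}

// Four acknowledgements requested, only three members available.
var withoutTimeout = writeConcernOutcome({ w: 4 }, 3, 5);
var withTimeout = writeConcernOutcome({ w: 4, wtimeout: 5000 }, 3, 5);

// The write concern from the insert example, on a healthy three-member set.
var healthySet = writeConcernOutcome({ w: 2, wtimeout: 5000 }, 3, 3);
```

Without a timeout the first write stays blocked, with a timeout it returns an error after 5 seconds, and the third
write is acknowledged because two members are enough.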
See also:
Write Method Acknowledgements (page 992)
You can modify the default write concern for a replica set by setting the settings.getLastErrorDefaults
(page 714) setting in the replica set configuration (page 709). The following sequence of commands creates a
configuration that waits for the write operation to complete on a majority of the voting members before returning:
cfg = rs.conf()
cfg.settings = cfg.settings || {}
cfg.settings.getLastErrorDefaults = { w: "majority", wtimeout: 5000 }
rs.reconfig(cfg)
If you issue a write operation with a specific write concern, the write operation uses its own write concern instead of
the default.
See also:
Write Concern (page 141)
You can tag (page 691) the members of replica sets and use the resulting tag sets to create custom write concerns. See
Configure Replica Set Tag Sets (page 691) for information on configuring custom write concerns using tag sets.
Read Preference
On this page
Use Cases (page 642)
Read Preference Modes (page 643)
Tag Sets (page 644)
Read preference describes how MongoDB clients route read operations to the members of a replica set.
By default, an application directs its read operations to the primary member in a replica set.
In MongoDB, in a replica set with one primary member: 11
With "local" (page 144) readConcern, reads from the primary reflect the latest writes in the absence of a
failover;
With "majority" (page 144) readConcern, read operations from the primary or the secondaries have
eventual consistency.
11 In some circumstances (page 722), two nodes in a replica set may transiently believe that they are the primary, but at most, one of them
will be able to complete writes with { w: "majority" } (page 142) write concern. The node that can complete { w: "majority" }
(page 142) writes is the current primary, and the other node is a former primary that has not yet recognized its demotion, typically due to a network
partition. When this occurs, clients that connect to the former primary may observe stale data despite having requested read preference primary
(page 721), and new writes to the former primary will eventually roll back.
Important: Exercise care when specifying read preferences: Modes other than primary (page 721) may return
stale data because with asynchronous replication (page 615), data in the secondary may not reflect the most recent
write operations.
Note: The read preference does not affect the visibility of data; i.e., clients can see the results of writes before they
are acknowledged or have propagated to a majority of replica set members:
Regardless of write concern (page 141), other clients using "local" (page 144) (i.e. the default) readConcern
can see the result of a write operation before the write operation is acknowledged to the issuing client.
Clients using "local" (page 144) (i.e. the default) readConcern can read data which may be subsequently
rolled back (page 638).
Use Cases
Indications The following are common use cases for using non-primary (page 721) read preference modes:
Running systems operations that do not affect the front-end application.
Note: Read preferences aren't relevant to direct connections to a single mongod instance. However, in order
to perform read operations on a direct connection to a secondary member of a replica set, you must set a read
preference, such as secondary.
Counter-Indications In general, do not use secondary (page 721) and secondaryPreferred (page 721) to
provide extra capacity for reads, because:
All members of a replica set have roughly equivalent write traffic; as a result, secondaries will service reads at
roughly the same rate as the primary.
Replication is asynchronous and there is some amount of delay between a successful write operation and its
replication to secondaries. Reading from a secondary can return out-of-date data; reading from different
secondaries may result in non-monotonic reads.
Distributing read operations to secondaries can compromise availability if any members of the set become
unavailable because the remaining members of the set will need to be able to handle all application requests.
For queries of sharded collections, for clusters with the balancer (page 750) active, secondaries may return stale
results with missing or duplicated data because of incomplete or terminated chunk migrations.
Sharding (page 725) increases read and write capacity by distributing read and write operations across a group of
machines, and is often a better strategy for adding capacity.
See Read Preference Processes (page 644) for more information about the internal application of read preferences.
Important: All read preference modes except primary (page 721) may return stale data because secondaries
replicate operations from the primary with some delay. Ensure that your application can tolerate stale data if you
choose to use a non-primary (page 721) mode.
Read preference modes are also available to clients connecting to a sharded cluster through a mongos. The mongos
instance obeys specified read preferences when connecting to the replica set that provides each shard in the cluster.
In the mongo shell, the readPref() cursor method provides access to read preferences.
For more information, see read preference background (page 641) and read preference behavior (page 644). See also
the documentation for your driver.
Tag Sets
Tag sets allow you to target read operations to specific members of a replica set.
Custom read preferences and write concerns evaluate tag sets in different ways. Read preferences consider the value
of a tag when selecting a member to read from. Write concerns ignore the value of a tag when selecting a member,
except to consider whether or not the value is unique.
You can specify tag sets with the following read preference modes:
primaryPreferred (page 721)
secondary (page 721)
secondaryPreferred (page 721)
nearest (page 721)
Tags are not compatible with mode primary (page 721) and, in general, only apply when selecting (page 645) a
secondary member of a set for a read operation. However, the nearest (page 721) read mode, when combined with
a tag set, selects the matching member with the lowest network latency. This member may be a primary or secondary.
All interfaces use the same member selection logic (page 645) to choose the member to which to direct read operations,
basing the choice on read preference mode and tag sets.
For information on configuring tag sets, see the Configure Replica Set Tag Sets (page 691) tutorial.
For more information on how read preference modes (page 721) interact with tag sets, see the documentation for each
read preference mode (page 719).
On this page
Member Selection (page 645)
Request Association (page 645)
Auto-Retry (page 645)
Read Preference in Sharded Clusters (page 646)
Member Selection
Clients, by way of their drivers, and mongos instances for sharded clusters, periodically update their view of the
replica set's state.
When you select non-primary (page 721) read preference, the driver will determine which member to target using
the following process:
1. Assembles a list of suitable members, taking into account member type (i.e. secondary, primary, or all members).
2. Excludes members not matching the tag sets, if specified.
3. Determines which suitable member is the closest to the client in absolute terms.
4. Builds a list of members that are within a defined ping distance (in milliseconds) of the absolute nearest
member.
Applications can configure the threshold used in this stage. The default acceptable latency is 15 milliseconds,
which you can override in the drivers with their own secondaryAcceptableLatencyMS option. For
mongos you can use the --localThreshold or localPingThresholdMs runtime options to set this
value.
5. Selects a member from these hosts at random. The member receives the read operation.
Drivers can then associate the thread or connection with the selected member. This request association (page 645) is
configurable by the application. See your driver documentation about request association configuration and default
behavior.
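The selection steps above can be sketched in plain JavaScript. This is a simplified model, not driver code: the member objects, the pingMs field, and the functions eligibleMembers and selectMember are hypothetical, and the sketch accepts a single tag set rather than the list of tag sets that drivers accept.

```javascript
// Each member is modeled as { host, type: "primary" | "secondary", tags: {...}, pingMs }.
function matchesTagSet(member, tagSet) {
  return Object.keys(tagSet).every(function (key) {
    return member.tags[key] === tagSet[key];
  });
}

function eligibleMembers(members, allowedTypes, tagSet, thresholdMs) {
  // Step 1: keep only members of a suitable type.
  let candidates = members.filter(function (m) {
    return allowedTypes.includes(m.type);
  });
  // Step 2: exclude members that do not match the tag set, if one is specified.
  if (tagSet) {
    candidates = candidates.filter(function (m) {
      return matchesTagSet(m, tagSet);
    });
  }
  if (candidates.length === 0) {
    return [];
  }
  // Step 3: find the ping time of the nearest suitable member.
  const nearest = Math.min.apply(null, candidates.map(function (m) { return m.pingMs; }));
  // Step 4: keep members within the latency threshold of the nearest member.
  return candidates.filter(function (m) {
    return m.pingMs <= nearest + thresholdMs;
  });
}

// Step 5: pick one of the remaining members at random.
// The random source is injectable so that the choice can be made deterministic.
function selectMember(members, allowedTypes, tagSet, thresholdMs, random) {
  const pool = eligibleMembers(members, allowedTypes, tagSet, thresholdMs === undefined ? 15 : thresholdMs);
  if (pool.length === 0) {
    return null;
  }
  const pick = random || Math.random;
  return pool[Math.floor(pick() * pool.length)];
}
```

With a 15 millisecond threshold, a secondary at 20 ms stays in the pool when the nearest suitable member is at 10 ms, while a secondary at 40 ms is excluded.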
Request Association
Important: Request association is configurable by the application. See your driver documentation about request
association configuration and default behavior.
Because secondary members of a replica set may lag behind the current primary by different amounts, reads from
secondary members may reflect data at different points in time. To prevent sequential reads from jumping around in
time, the driver can associate application threads to a specific member of the set after the first read, thereby preventing
reads from other members. The thread will continue to read from the same member until:
The application performs a read with a different read preference,
The thread terminates, or
The client receives a socket exception, as is the case when there's a network error or when the mongod closes
connections during a failover. This triggers a retry (page 645), which may be transparent to the application.
When using request association, if the client detects that the set has elected a new primary, the driver will discard all
associations between threads and members.
Auto-Retry
Connections between MongoDB drivers and mongod instances in a replica set must balance two concerns:
1. The client should attempt to prefer current results, and any connection should read from the same member of
the replica set as much as possible. Requests should prefer request association (page 645) (e.g. pinning).
2. The client should minimize the amount of time that the database is inaccessible as the result of a connection
issue, networking problem, or failover in a replica set.
As a result, MongoDB drivers:
Reuse a connection to a specific mongod for as long as possible after establishing a connection to that instance.
This connection is pinned to this mongod.
Attempt to reconnect to a new member, obeying existing read preference modes (page 721), if the connection to
mongod is lost.
Reconnections are transparent to the application itself. If the connection permits reads from secondary members,
after reconnecting, the application can receive two sequential reads returning from different secondaries.
Depending on the state of the individual secondary members' replication, the documents can reflect the state of
your database at different moments.
Return an error only after attempting to connect to three members of the set that match the read preference mode
(page 721) and tag set (page 644). If there are fewer than three members of the set, the client will error after
connecting to all existing members of the set.
After this error, the driver selects a new member using the specified read preference mode. In the absence of a
specified read preference, the driver uses primary (page 721).
After detecting a failover situation, 14 the driver attempts to refresh the state of the replica set as quickly as
possible.
Changed in version 3.0.0: mongos instances take a slightly different approach. mongos instances return connections
to secondaries to the connection pool after every request. As a result, the mongos reevaluates read preference for
every operation.
Changed in version 2.2: Before version 2.2, mongos did not support the read preference mode semantics (page 721).
In most sharded clusters, each shard consists of a replica set. As such, read preferences are also applicable. With
regard to read preference, read operations in a sharded cluster are identical to unsharded replica sets.
Unlike simple replica sets, in sharded clusters, all interactions with the shards pass from the clients to the mongos
instances that are actually connected to the set members. mongos is then responsible for the application of read
preferences, which is transparent to applications.
There are no configuration changes required for full support of read preference modes in sharded environments, as long
as the mongos is at least version 2.2. All mongos maintain their own connection pool to the replica set members.
As a result:
A request without a specified preference has primary (page 721), the default, unless the mongos reuses an
existing connection that has a different mode set.
To prevent confusion, always explicitly set your read preference mode.
All nearest (page 721) and latency calculations reflect the connection between the mongos and the mongod
instances, not the client and the mongod instances.
This produces the desired result, because all results must pass through the mongos before returning to the
client.
Members of a replica set replicate data continuously. First, a member uses initial sync to capture the data set. Then the
member continuously records and applies every operation that modifies the data set. Every member records operations
in its oplog (page 647), which is a capped collection.
14 When a failover occurs, all members of the set close all client connections that produce a socket error in the driver. This behavior prevents or
minimizes rollback.
Replica Set Oplog (page 647) The oplog records all operations that modify the data in the replica set.
Replica Set Data Synchronization (page 648) Secondaries must replicate all changes accepted by the primary. This
process is the basis of replica set operations.
On this page
Oplog Size (page 647)
Workloads that Might Require a Larger Oplog Size (page 648)
Oplog Status (page 648)
The oplog (operations log) is a special capped collection that keeps a rolling record of all operations that modify the
data stored in your databases. MongoDB applies database operations on the primary and then records the operations
on the primary's oplog. The secondary members then copy and apply these operations in an asynchronous process.
All replica set members contain a copy of the oplog, in the local.oplog.rs (page 717) collection, which allows
them to maintain the current state of the database.
To facilitate replication, all replica set members send heartbeats (pings) to all other members. Any member can import
oplog entries from any other member.
Whether applied once or multiple times to the target dataset, each operation in the oplog produces the same results, i.e.
each operation in the oplog is idempotent. For proper replication operations, entries in the oplog must be idempotent:
initial sync
post-rollback catch-up
sharding chunk migrations
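The following plain JavaScript sketch illustrates why idempotent entries matter when an operation may be applied more than once. It is a simplified model with hypothetical helper names (applyInc, applySet). It assumes that a counter increment is recorded as the resulting field value rather than as the increment itself.

```javascript
// Hypothetical helpers that return an updated copy of a document.
function applyInc(doc, field, amount) {
  const copy = Object.assign({}, doc);
  copy[field] = doc[field] + amount;
  return copy;
}

function applySet(doc, fields) {
  return Object.assign({}, doc, fields);
}

const original = { _id: 1, qty: 100 };

// Replaying an increment changes the result each time, so it is not idempotent.
const incOnce = applyInc(original, "qty", 1);
const incTwice = applyInc(incOnce, "qty", 1);

// Replaying "set qty to 101" always yields the same document, so it is idempotent.
const setOnce = applySet(original, { qty: 101 });
const setTwice = applySet(setOnce, { qty: 101 });

console.log(incOnce.qty, incTwice.qty); // the two values differ
console.log(setOnce.qty, setTwice.qty); // the two values are equal
```

Because a replayed entry in the second form leaves the document unchanged, a member can safely reapply a range of oplog entries that overlaps with operations it has already applied.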
Oplog Size
When you start a replica set member for the first time, MongoDB creates an oplog of a default size. The size depends
on the architectural details of your operating system.
In most cases, the default oplog size is sufficient. For example, if an oplog is 5% of free disk space and fills up in 24
hours of operations, then secondaries can stop copying entries from the oplog for up to 24 hours without becoming
too stale to continue replicating. However, most replica sets have much lower operation volumes, and their oplogs can
hold much higher numbers of operations.
Before mongod creates an oplog, you can specify its size with the oplogSizeMB option. However, after you have
started a replica set member for the first time, you can only change the size of the oplog using the Change the Size of
the Oplog (page 684) procedure.
By default, the size of the oplog is as follows:
For 64-bit Linux, Solaris, FreeBSD, and Windows systems, MongoDB allocates 5% of the available free disk
space, but will always allocate at least 1 gigabyte and never more than 50 gigabytes.
For 64-bit OS X systems, MongoDB allocates 183 megabytes of space to the oplog.
For 32-bit systems, MongoDB allocates about 48 megabytes of space to the oplog.
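As a worked example of the defaults listed above, the following plain JavaScript sketch computes the default oplog size. The function name defaultOplogSizeBytes and the platform labels are hypothetical, and the sketch assumes binary units (1 gigabyte = 1024 megabytes).

```javascript
const MB = 1024 * 1024;
const GB = 1024 * MB;

// platform: "32-bit", "osx-64", or any other label for the remaining 64-bit systems.
function defaultOplogSizeBytes(platform, freeDiskBytes) {
  if (platform === "32-bit") {
    return 48 * MB;
  }
  if (platform === "osx-64") {
    return 183 * MB;
  }
  // 5% of free disk space, but at least 1 gigabyte and at most 50 gigabytes.
  const fivePercent = freeDiskBytes / 20;
  return Math.min(Math.max(fivePercent, 1 * GB), 50 * GB);
}

console.log(defaultOplogSizeBytes("linux-64", 200 * GB) / GB); // 5% of 200 GB
console.log(defaultOplogSizeBytes("linux-64", 10 * GB) / GB);  // raised to the 1 GB minimum
```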
If you can predict your replica set's workload to resemble one of the following patterns, then you might want to create
an oplog that is larger than the default. Conversely, if your application predominantly performs reads with a minimal
amount of write operations, a smaller oplog may be sufficient.
The following workloads might require a larger oplog size.
Updates to Multiple Documents at Once The oplog must translate multi-updates into individual operations in order
to maintain idempotency. This can use a great deal of oplog space without a corresponding increase in data size or
disk use.
Deletions Equal the Same Amount of Data as Inserts If you delete roughly the same amount of data as you insert,
the database will not grow significantly in disk use, but the size of the operation log can be quite large.
Significant Number of In-Place Updates If a significant portion of the workload is updates that do not increase the
size of the documents, the database records a large number of operations but does not change the quantity of data on
disk.
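To illustrate the first pattern, the following plain JavaScript sketch shows how one multi-document update can turn into one oplog entry per matching document. The function oplogEntriesForMultiUpdate is hypothetical, and the entry shape (op, ns, o2, o) is a simplified assumption about the oplog format.

```javascript
// Hypothetical model: one update statement that matches N documents
// produces N individual entries, each targeting a single _id.
function oplogEntriesForMultiUpdate(ns, docs, predicate, setFields) {
  return docs.filter(predicate).map(function (doc) {
    return { op: "u", ns: ns, o2: { _id: doc._id }, o: { $set: setFields } };
  });
}

const products = [
  { _id: 1, type: "Clasp", qty: 100 },
  { _id: 2, type: "Clasp", qty: 50 },
  { _id: 3, type: "Window", qty: 75 }
];

const entries = oplogEntriesForMultiUpdate(
  "test.products",
  products,
  function (doc) { return doc.type === "Clasp"; },
  { discontinued: true }
);

console.log(entries.length); // one entry per matching document
```

The data set grows only by the new field on each matching document, while the oplog grows by one entry per document.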
Oplog Status
To view oplog status, including the size and the time range of operations, issue the
rs.printReplicationInfo() method. For more information on oplog status, see Check the Size of the
Oplog (page 707).
Under various exceptional situations, updates to a secondary's oplog might lag behind the desired performance time.
Use db.getReplicationInfo() from a secondary member and the replication status output to assess
the current state of replication and determine if there is any unintended replication delay.
See Replication Lag (page 704) for more information.
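The quantities involved in assessing replication delay can be sketched in plain JavaScript. The functions below are hypothetical and operate on ordinary Date values that you would read from the replication status output; they do not call any MongoDB API.

```javascript
// Time range covered by an oplog, from its first to its last entry.
function oplogWindowHours(firstEntryTime, lastEntryTime) {
  return (lastEntryTime.getTime() - firstEntryTime.getTime()) / (60 * 60 * 1000);
}

// Delay between the last write on the primary and the last operation
// that a secondary has applied.
function replicationLagSeconds(primaryLastWrite, secondaryLastApplied) {
  return (primaryLastWrite.getTime() - secondaryLastApplied.getTime()) / 1000;
}

// A secondary can no longer catch up from a source whose oldest oplog
// entry is newer than the last operation the secondary applied.
function isTooStale(secondaryLastApplied, sourceFirstEntryTime) {
  return secondaryLastApplied.getTime() < sourceFirstEntryTime.getTime();
}

const first = new Date("2016-03-01T00:00:00Z");
const last = new Date("2016-03-02T00:00:00Z");
const secondaryApplied = new Date("2016-03-01T23:59:30Z");

console.log(oplogWindowHours(first, last));
console.log(replicationLagSeconds(last, secondaryApplied));
console.log(isTooStale(secondaryApplied, first));
```

A lag that approaches the oplog window is the situation in which a secondary risks becoming too stale to continue replicating.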
On this page
Initial Sync (page 648)
Replication (page 649)
In order to maintain up-to-date copies of the shared data set, secondary members of a replica set sync or replicate
data from other members. MongoDB uses two forms of data synchronization: initial sync (page 648) to populate new
members with the full data set, and replication to apply ongoing changes to the entire data set.
Initial Sync
Initial sync copies all the data from one member of the replica set to another member. A member uses initial sync
when the member has no data, such as when the member is new, or when the member has data but is missing a history
of the set's replication.
When you perform an initial sync, MongoDB:
1. Clones all databases. To clone, the mongod queries every collection in each source database and inserts all data
into its own copies of these collections. At this time, _id indexes are also built. The clone process only copies
valid data, omitting invalid documents.
2. Applies all changes to the data set. Using the oplog from the source, the mongod updates its data set to reflect
the current state of the replica set.
3. Builds all indexes on all collections (except _id indexes, which were already completed).
When the mongod finishes building all indexes, the member can transition to a normal state, i.e. secondary.
Changed in version 3.0: When the clone process omits an invalid document from the sync, MongoDB writes a message
to the logs that begins with Cloner: found corrupt document in <collection>.
To perform an initial sync, see Resync a Member of a Replica Set (page 690).
Replication
Secondary members replicate data continuously after the initial sync. Secondary members copy the oplog (page 647)
from their sync from source and apply these operations in an asynchronous process.
In most cases, secondaries sync from the primary. Secondaries may automatically change their sync from source if
needed based on changes in the ping time and state of other members' replication.
For a member to sync from another, both members must have the same value for the members[n].buildIndexes
(page 712) setting.
Secondaries avoid syncing from delayed members (page 624) and hidden members (page 623).
Multithreaded Replication MongoDB applies write operations in batches using multiple threads to improve
concurrency. MongoDB groups batches by namespace (MMAPv1 (page 595)) or by document id (WiredTiger (page 587))
and simultaneously applies each group of operations using a different thread. MongoDB always applies write
operations to a given document in their original write order.
While applying a batch, MongoDB blocks all read operations. As a result, secondary read queries can never return
data that reflect a state that never existed on the primary.
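The grouping described above can be modeled in plain JavaScript. This is an illustrative sketch, not the server's implementation: the function groupBatch and the fields seq, ns, and id are hypothetical, and combining the namespace with the document id for the WiredTiger key is an assumption made to keep documents from different collections apart.

```javascript
// Splits a batch into groups that could be applied by different threads.
// Operations keep their original order inside each group.
function groupBatch(ops, storageEngine) {
  const keyOf = storageEngine === "mmapv1"
    ? function (op) { return op.ns; }
    : function (op) { return op.ns + ":" + JSON.stringify(op.id); };
  const groups = new Map();
  ops.forEach(function (op) {
    const key = keyOf(op);
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(op);
  });
  return groups;
}

const batch = [
  { seq: 1, ns: "test.a", id: 1 },
  { seq: 2, ns: "test.a", id: 2 },
  { seq: 3, ns: "test.b", id: 1 },
  { seq: 4, ns: "test.a", id: 1 }
];

console.log(groupBatch(batch, "mmapv1").size);     // one group per namespace
console.log(groupBatch(batch, "wiredTiger").size); // one group per document
```

Because every operation on a given document lands in the same group, and each group preserves batch order, writes to that document are applied in their original order regardless of how the groups are scheduled.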
With the MMAPv1 (page 595) storage engine, MongoDB fetches memory pages that hold affected data and indexes to
help improve the performance of applying oplog entries. This pre-fetch stage minimizes the amount of time MongoDB
holds write locks while applying oplog entries. By default, secondaries will pre-fetch all Indexes (page 487).
Optionally, you can disable all pre-fetching or only pre-fetch the index on the _id field. See the
secondaryIndexPrefetch setting for more information.
On this page
Fundamental Operations (page 650)
Run time Master-Slave Configuration (page 651)
Security (page 651)
Ongoing Administration and Operation of Master-Slave Deployments (page 652)
Important: Replica sets (page 617) replace master-slave replication for most use cases. If possible, use replica
sets rather than master-slave replication for all new production deployments. This documentation remains to support
legacy deployments and for archival purposes only.
In addition to providing all the functionality of master-slave deployments, replica sets are also more robust for
production use. Master-slave replication preceded replica sets and made it possible to have a large number of non-master
(i.e. slave) nodes, as well as to restrict replicated operations to only a single database; however, master-slave
replication provides less redundancy and does not automate failover. See Deploy Master-Slave Equivalent using Replica
Sets (page 652) for a replica set configuration that is equivalent to master-slave replication. If you wish to convert an
existing master-slave deployment to a replica set, see Convert a Master-Slave Deployment to a Replica Set (page 652).
Fundamental Operations
Initial Deployment
To configure a master-slave deployment, start two mongod instances: one in master mode, and the other in slave
mode.
To start a mongod instance in master mode, invoke mongod as follows:
mongod --master --dbpath /data/masterdb/
With the --master option, the mongod will create a local.oplog.$main (page 717) collection, which is the
operation log that queues operations that the slaves will apply to replicate operations from the master. The --dbpath
option is optional.
To start a mongod instance in slave mode, invoke mongod as follows:
mongod --slave --source <masterhostname><:<port>> --dbpath /data/slavedb/
Specify the hostname and port of the master instance to the --source argument. The --dbpath is optional.
For slave instances, MongoDB stores data about the source server in the local.sources (page 717) collection.
As an alternative to specifying the --source run-time option, you can add a document to local.sources (page 717)
specifying the master instance, as in the following operation in the mongo shell:
use local
db.sources.find()
db.sources.insert( { host: <masterhostname> <,only: <databasename>> } );
In line 1, you switch context to the local database. In line 2, the find() operation should return no documents, to
ensure that there are no documents in the sources collection. Finally, line 3 uses db.collection.insert()
to insert the source document into the local.sources (page 717) collection. The model of the local.sources
(page 717) document is as follows:
host
The host field specifies the master mongod instance, and holds a resolvable hostname, i.e. IP address, or a name
from a host file, or preferably a fully qualified domain name.
You can append <:port> to the host name if the mongod is not running on the default 27017 port.
only
Optional. Specify a name of a database. When specified, MongoDB will only replicate the indicated database.
Master instances store operations in an oplog which is a capped collection (page 228). As a result, if a slave falls too
far behind the state of the master, it cannot catch up and must re-sync from scratch. A slave may become out of sync
with a master if:
The slave falls far behind the data updates available from that master.
The slave stops (i.e. shuts down) and restarts later after the master has overwritten the relevant operations in
its oplog.
When slaves are out of sync, replication stops. Administrators must intervene manually to restart replication. Use the
resync command. Alternatively, the --autoresync option allows a slave to restart replication automatically, after a
ten second pause, when the slave falls out of sync with the master. With --autoresync specified, the slave will only
attempt to re-sync once in a ten minute period.
To prevent these situations you should specify a larger oplog when you start the master instance, by adding the
--oplogSize option when starting mongod. If you do not specify --oplogSize, mongod will allocate 5%
of available disk space on start up to the oplog, with a minimum of 1 GB for 64-bit machines and 50 MB for 32-bit
machines.
MongoDB provides a number of command line options for mongod instances in master-slave deployments. See the
Master-Slave Replication Command Line Options for options.
Diagnostics
On a master instance, issue the following operation in the mongo shell to return replication status from the perspective
of the master:
rs.printReplicationInfo()
See server status repl fields for documentation of the relevant section of output.
Security
When running with authorization enabled, in master-slave deployments configure a keyFile so that slave
mongod instances can authenticate and communicate with the master mongod instance.
To enable authentication and configure the keyFile add the following option to your configuration file:
keyFile = /srv/mongodb/keyfile
Note: You may choose to set this run-time configuration option using the --keyFile option on the command line.
Setting keyFile enables authentication and specifies a key file for the mongod instances to use when authenticating
to each other. The content of the key file is arbitrary but must be the same on all members of the deployment that
connect to each other.
The key file must be less than one kilobyte in size and may only contain characters in the base64 set. The key file must not
have group or world permissions on UNIX systems. Use the following command to use the OpenSSL package to
generate random content for use in a key file:
openssl rand -base64 741
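The size and character constraints on the key file content can be checked with a short sketch in plain JavaScript for Node.js. The function isValidKeyFileContent is hypothetical and is not part of MongoDB. It assumes that whitespace, such as the line breaks that openssl inserts into its output, is allowed between base64 characters, and it does not check file permissions.

```javascript
// Returns true if the content is under one kilobyte and, ignoring whitespace,
// consists only of characters from the base64 set.
function isValidKeyFileContent(content) {
  const stripped = content.replace(/\s+/g, "");
  const base64Only = /^[A-Za-z0-9+\/=]+$/.test(stripped);
  const underOneKilobyte = Buffer.byteLength(content, "utf8") < 1024;
  return base64Only && underOneKilobyte;
}

// 741 random bytes encode to 988 base64 characters, which stays under the limit.
const sample = require("crypto").randomBytes(741).toString("base64");
console.log(sample.length, isValidKeyFileContent(sample));
```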
See also:
Security (page 315) for more information about security in MongoDB
If you want a replication configuration that resembles master-slave replication using replica sets, consider
the following replica set configuration document. In this deployment, hosts <master> and <slave> provide
replication that is roughly equivalent to a two-instance master-slave deployment:
{
_id : 'setName',
members : [
{ _id : 0, host : "<master>", priority : 1 },
{ _id : 1, host : "<slave>", priority : 0, votes : 0 }
]
}
See Replica Set Configuration (page 709) for more information about replica set configurations.
To convert a master-slave deployment to a replica set, restart the current master as a one-member replica set. Then
remove the data directories from the previous slaves and add them as new secondaries to the new replica set.
1. To confirm that the current instance is master, run:
db.isMaster()
2. Shut down the mongod processes on the master and all slave(s), using the following command while connected
to each instance:
db.adminCommand({shutdown : 1, force : true})
3. Back up your /data/db directories, in case you need to revert to the master-slave deployment.
4. Start the former master with the --replSet option, as in the following:
mongod --replSet <setname>
5. Connect to the mongod with the mongo shell, and initiate the replica set with the following command:
rs.initiate()
When the command returns, you will have successfully deployed a one-member replica set. You can check the
status of your replica set at any time by running the following command:
rs.status()
You can now follow the convert a standalone to a replica set (page 669) tutorial to deploy your replica set, picking up
from the Expand the Replica Set (page 670) section.
To permanently fail over from an unavailable or damaged master (A in the following example) to a slave (B):
1. Shut down A.
2. Stop mongod on B.
3. Back up and move all data files that begin with local on B from the dbPath.
Warning: Removing local.* is irrevocable and cannot be undone. Perform this step with extreme
caution.
Note: This is a one time operation, and is not reversible. A cannot become a slave of B until it completes a full resync.
If you have a master (A) and a slave (B) and you would like to reverse their roles, follow this procedure. The procedure
assumes A is healthy, up-to-date and available.
If A is not healthy but the hardware is okay (power outage, server crash, etc.), skip steps 1 and 2 and in step 8 replace
all of A's files with B's files.
If A is not healthy and the hardware is not okay, replace A with a new machine. Also follow the instructions in the
previous paragraph.
To invert the master and slave in a deployment:
1. Halt writes on A using the fsync command.
2. Make sure B is up to date with the state of A.
3. Shut down B.
4. Back up and move all data files that begin with local on B from the dbPath to remove the existing
local.sources data.
Warning: Removing local.* is irrevocable and cannot be undone. Perform this step with extreme
caution.
If you can stop write operations to the master for an indefinite period, you can copy the data files from the master to
the new slave and then start the slave with --fastsync.
Warning: Be careful with --fastsync. If the data on both instances is not identical, a discrepancy will exist
forever.
fastsync is a way to start a slave by starting with an existing master disk image/backup. This option declares that
the administrator guarantees the image is correct and completely up-to-date with that of the master. If you have a full
and complete copy of data from a master you can use this option to avoid a full synchronization upon starting the
slave.
You can just copy the other slave's data file snapshot without any special options. Only take data snapshots when:
a mongod process is down, or
when the mongod is locked using db.fsyncLock() for MMAPv1 or WiredTiger storage engine.
Changed in version 3.2: Starting in MongoDB 3.2, db.fsyncLock() can ensure that the data files do not change
for MongoDB instances using either the MMAPv1 or the WiredTiger storage engine, thus providing consistency for
the purposes of creating backups.
In previous MongoDB versions, db.fsyncLock() could not guarantee a consistent set of files for low-level backups
(e.g. via file copy cp, scp, tar) for WiredTiger.
Slaves asynchronously apply write operations from the master that the slaves poll from the master's oplog. The oplog
is finite in length, and if a slave is too far behind, a full resync will be necessary. To resync the slave, connect to a
slave using the mongo shell and issue the resync command:
use admin
db.runCommand( { resync: 1 } )
This forces a full resync of all data (which will be very slow on a large database). You can achieve the same effect by
stopping mongod on the slave, deleting the entire content of the dbPath on the slave, and restarting the mongod.
Slave Chaining
Slaves cannot be chained. They must all connect to the master directly.
If a slave attempts to slave from another slave, you will see the following line in the mongod log or the shell:
assertion 13051 tailable cursor requested on non capped collection ns:local.oplog.$main
To change a slave's source, manually modify the slave's local.sources (page 717) collection.
Example
Consider the following: If you accidentally set an incorrect hostname for the slave's source, as in the following
example:
mongod --slave --source prod.mississippi
You can correct this by restarting the slave without the --slave and --source arguments:
mongod
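Connect to this mongod instance using the mongo shell and update the local.sources (page 717) collection,
with the following operation sequence:
use local
db.sources.update( { host : "prod.mississippi" }, { $set : { host : "prod.mississippi.example.net" } } )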
Restart the slave with the correct command line arguments or with no --source option. After configuring
local.sources (page 717) the first time, the --source option has no subsequent effect. Therefore, both of
the following invocations are correct:
mongod --slave --source prod.mississippi.example.net
or
mongod --slave
The administration of replica sets includes the initial deployment of the set, adding members to and removing members from a set,
and configuring the operational parameters and properties of the set. Administrators generally need not intervene in
failover or replication processes because MongoDB automates these functions. In the exceptional situations that require
manual intervention, the tutorials in these sections describe processes such as resyncing a member. The tutorials in
this section form the basis for all replica set administration.
Replica Set Deployment Tutorials (page 656) Instructions for deploying replica sets, as well as adding and removing
members from an existing replica set.
Deploy a Replica Set (page 657) Configure a three-member replica set for production systems.
Convert a Standalone to a Replica Set (page 669) Convert an existing standalone mongod instance into a
three-member replica set.
Add Members to a Replica Set (page 670) Add a new member to an existing replica set.
Remove Members from Replica Set (page 673) Remove a member from a replica set.
Continue reading from Replica Set Deployment Tutorials (page 656) for additional tutorials related to setting
up replica set deployments.
Member Configuration Tutorials (page 675) Tutorials that describe the process for configuring replica set members.
Adjust Priority for Replica Set Member (page 676) Change the precedence given to a replica set member in
an election for primary.
Prevent Secondary from Becoming Primary (page 677) Make a secondary member ineligible for election as
primary.
Configure a Hidden Replica Set Member (page 678) Configure a secondary member to be invisible to applications
in order to support significantly different usage, such as dedicated backups.
Continue reading from Member Configuration Tutorials (page 675) for more tutorials that describe replica set
configuration.
Replica Set Maintenance Tutorials (page 684) Procedures and tasks for common operations on active replica set
deployments.
Change the Size of the Oplog (page 684) Increase the size of the oplog which logs operations. In most cases,
the default oplog size is sufficient.
Resync a Member of a Replica Set (page 690) Sync the data on a member. Either perform initial sync on a
new member or resync the data on an existing member that has fallen too far behind to catch up by way of
normal replication.
Force a Member to Become Primary (page 688) Force a replica set member to become primary.
Change Hostnames in a Replica Set (page 699) Update the replica set configuration to reflect changes in
members' hostnames.
Continue reading from Replica Set Maintenance Tutorials (page 684) for descriptions of additional replica set
maintenance procedures.
Troubleshoot Replica Sets (page 704) Describes common issues and operational challenges for replica sets. For additional
diagnostic information, see FAQ: MongoDB Diagnostics (page 857).
Add an Arbiter to Replica Set (page 668) Add an arbiter to give a replica set an odd number of voting members and
prevent election ties.
Convert a Standalone to a Replica Set (page 669) Convert an existing standalone mongod instance into a three-
member replica set.
Add Members to a Replica Set (page 670) Add a new member to an existing replica set.
Remove Members from Replica Set (page 673) Remove a member from a replica set.
Replace a Replica Set Member (page 674) Update the replica set configuration when the hostname of a member's
corresponding mongod instance has changed.
This tutorial describes how to create a three-member replica set from three existing mongod instances running with
access control (page 331) disabled.
To deploy a replica set with enabled access control (page 331), see Deploy New Replica Set with Access Control
(page 349). If you wish to deploy a replica set from a single MongoDB instance, see Convert a Standalone to a
Replica Set (page 669). For more information on replica set deployments, see the Replication (page 613) and Replica
Set Deployment Architectures (page 626) documentation.
Overview
Three-member replica sets provide enough redundancy to survive most network partitions and other system failures.
These sets also have sufficient capacity for many distributed read operations. Replica sets should always have an odd
number of members. This ensures that elections (page 635) will proceed smoothly. For more about designing replica
sets, see the Replication overview (page 613).
The basic procedure is to start the mongod instances that will become members of the replica set, configure the replica
set itself, and then add the mongod instances to it.
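The preference for an odd number of members follows from how a majority is counted. The following is a minimal sketch in plain JavaScript, not part of any MongoDB API: the helper names majorityNeeded and faultTolerance are illustrative, and the code assumes that every member holds exactly one vote.

```javascript
// Illustrative helpers (not MongoDB APIs). Assumes each member has one vote.

// Number of votes required to elect a primary among n voting members.
function majorityNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Number of voting members that can be unavailable while the remaining
// members still form a majority.
function faultTolerance(votingMembers) {
  return votingMembers - majorityNeeded(votingMembers);
}
```

Under these assumptions a three-member set needs 2 votes and tolerates 1 failure, while a four-member set needs 3 votes and still tolerates only 1 failure, so the fourth member adds no fault tolerance.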
Requirements
For production deployments, you should maintain as much separation between members as possible by hosting the
mongod instances on separate machines. When using virtual machines for production deployments, you should place
each mongod instance on a separate host server serviced by redundant power circuits and redundant network paths.
Before you can deploy a replica set, you must install MongoDB on each system that will be part of your replica set. If
you have not already installed MongoDB, see the installation tutorials (page 5).
Before creating your replica set, you should verify that your network configuration allows all possible connections
between each member. For a successful replica set deployment, every member must be able to connect to every other
member. For instructions on how to check your connection, see Test Connections Between all Members (page 706).
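Because every member must reach every other member, the connection checks can be enumerated as ordered pairs. The sketch below is plain JavaScript; the function name connectionChecks and the example hostnames are assumptions for illustration, and each generated command line simply starts the mongo shell with its --host and --port options.

```javascript
// Illustrative helper: list the connection tests needed so that every
// member is verified to reach every other member.
function connectionChecks(members) {
  var checks = [];
  members.forEach(function (from) {
    members.forEach(function (to) {
      // Skip the pairing of a member with itself.
      if (from.host !== to.host || from.port !== to.port) {
        checks.push({
          runOn: from.host,
          command: "mongo --host " + to.host + " --port " + to.port
        });
      }
    });
  });
  return checks;
}

// Hypothetical three-member deployment.
var checks = connectionChecks([
  { host: "mongodb0.example.net", port: 27017 },
  { host: "mongodb1.example.net", port: 27017 },
  { host: "mongodb2.example.net", port: 27017 }
]);
```

A set of n members yields n × (n − 1) checks, so the three hypothetical members above produce six entries, each naming the host to run the command on and the command itself.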
Architecture In production, deploy each member of the replica set to its own machine and, if possible, bind to the
standard MongoDB port of 27017. Use the bind_ip option to ensure that MongoDB listens for connections from
applications on configured addresses.
For geographically distributed replica sets, ensure that the majority of the set's mongod instances reside in the
primary site.
See Replica Set Deployment Architectures (page 626) for more information.
Connectivity Ensure that network traffic can pass between all members of the set and all clients in the network
securely and efficiently. Consider the following:
Establish a virtual private network. Ensure that your network topology routes all traffic between members within
a single site over the local area network.
Configure access control to prevent connections from unknown clients to the replica set.
Configure networking and firewall rules so that incoming and outgoing packets are permitted only on the default
MongoDB port and only from within your deployment.
Finally, ensure that each member of a replica set is accessible by way of resolvable DNS or hostnames. You should
either configure your DNS names appropriately or set up your systems' /etc/hosts file to reflect this configuration.
Configuration Specify the run time configuration on each system in a configuration file stored in
/etc/mongod.conf or a related location. Create the directory where MongoDB stores data files before deploying
MongoDB.
For more information about the run time options used above and other configuration options, see
https://docs.mongodb.org/manual/reference/configuration-options.
Procedure
The following procedure outlines the steps to deploy a replica set when access control is disabled.
Step 1: Start each member of the replica set with the appropriate options. For each member, start a mongod and
specify the replica set name through the replSet option. Specify any other parameters specific to your deployment.
For replication-specific parameters, see cli-mongod-replica-set.
If your application connects to more than one replica set, each set should have a distinct name. Some drivers group
replica set connections by replica set name.
The following example specifies the replica set name through the --replSet command-line option:
mongod --replSet "rs0"
You can also specify the replica set name in the configuration file. To start mongod with a configuration
file, specify the file with the --config option:
mongod --config $HOME/.mongodb/config
In production deployments, you can configure an init script to manage this process. Init scripts are beyond the scope of
this document.
Step 2: Connect a mongo shell to a replica set member. For example, to connect to a mongod running on
localhost on the default port of 27017, simply issue:
mongo
Step 3: Initiate the replica set. Use rs.initiate() on one and only one member of the replica set:
rs.initiate()
MongoDB initiates a set that consists of the current member and that uses the default replica set configuration.
Step 4: Verify the initial replica set configuration. Use rs.conf() to display the replica set configuration object
(page 709):
rs.conf()
Step 5: Add the remaining members to the replica set. Add the remaining members with the rs.add() method.
You must be connected to the primary to add members to a replica set.
rs.add() can, in some cases, trigger an election. If the mongod you are connected to becomes a secondary, you
need to connect the mongo shell to the new primary to continue adding new replica set members. Use rs.status()
to identify the primary in the replica set.
The following example adds two members:
rs.add("mongodb1.example.net")
rs.add("mongodb2.example.net")
When complete, you have a fully functional replica set. The new replica set will elect a primary.
Step 6: Check the status of the replica set. Use the rs.status() operation:
rs.status()
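When adding members, it helps to pick the current primary out of the rs.status() output. The following sketch assumes the usual shape of that document, in which each entry of the members array carries name and stateStr fields; the helper name primaryName and the sample document are illustrative, not taken from a real deployment.

```javascript
// Return the name ("host:port") of the member that reports itself as
// PRIMARY in an rs.status() document, or null if there is no primary.
function primaryName(status) {
  var members = status.members || [];
  for (var i = 0; i < members.length; i++) {
    if (members[i].stateStr === "PRIMARY") {
      return members[i].name;
    }
  }
  return null;
}

// Trimmed, hand-written stand-in for rs.status() output (illustrative only).
var sampleStatus = {
  set: "rs0",
  members: [
    { _id: 0, name: "mongodb0.example.net:27017", stateStr: "SECONDARY" },
    { _id: 1, name: "mongodb1.example.net:27017", stateStr: "PRIMARY" },
    { _id: 2, name: "mongodb2.example.net:27017", stateStr: "SECONDARY" }
  ]
};

var primary = primaryName(sampleStatus);
```

In the mongo shell, the same helper could be called as primaryName(rs.status()) to decide which member to connect to before issuing rs.add().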
See also:
Deploy New Replica Set with Access Control (page 349)
This procedure describes deploying a replica set in a development or test environment. For a production deployment,
refer to the Deploy a Replica Set (page 657) tutorial.
This tutorial describes how to create a three-member replica set from three existing mongod instances running with
access control (page 331) disabled.
To deploy a replica set with enabled access control (page 331), see Deploy New Replica Set with Access Control
(page 349). If you wish to deploy a replica set from a single MongoDB instance, see Convert a Standalone to a
Replica Set (page 669). For more information on replica set deployments, see the Replication (page 613) and Replica
Set Deployment Architectures (page 626) documentation.
Overview
Three-member replica sets provide enough redundancy to survive most network partitions and other system failures.
These sets also have sufficient capacity for many distributed read operations. Replica sets should always have an odd
number of members. This ensures that elections (page 635) will proceed smoothly. For more about designing replica
sets, see the Replication overview (page 613).
The basic procedure is to start the mongod instances that will become members of the replica set, configure the replica
set itself, and then add the mongod instances to it.
Requirements
For test and development systems, you can run your mongod instances on a local system, or within a virtual instance.
Before you can deploy a replica set, you must install MongoDB on each system that will be part of your replica set. If
you have not already installed MongoDB, see the installation tutorials (page 5).
Before creating your replica set, you should verify that your network configuration allows all possible connections
between each member. For a successful replica set deployment, every member must be able to connect to every other
member. For instructions on how to check your connection, see Test Connections Between all Members (page 706).
Considerations
The examples in this procedure create a new replica set named rs0.
If your application connects to more than one replica set, each set should have a distinct name. Some drivers group
replica set connections by replica set name.
You will begin by starting three mongod instances as members of a replica set named rs0.
Procedure
1. Create the necessary data directories for each member by issuing a command similar to the following:
mkdir -p /srv/mongodb/rs0-0 /srv/mongodb/rs0-1 /srv/mongodb/rs0-2
This will create directories called rs0-0, rs0-1, and rs0-2, which will contain the instances' database files.
2. Start your mongod instances in their own shell windows by issuing the following commands:
First member:
mongod --port 27017 --dbpath /srv/mongodb/rs0-0 --replSet rs0 --smallfiles --oplogSize 128
Second member:
mongod --port 27018 --dbpath /srv/mongodb/rs0-1 --replSet rs0 --smallfiles --oplogSize 128
Third member:
mongod --port 27019 --dbpath /srv/mongodb/rs0-2 --replSet rs0 --smallfiles --oplogSize 128
This starts each instance as a member of a replica set named rs0, each running on a distinct port, and specifies
the path to your data directory with the --dbpath setting. If you are already using the suggested ports, select
different ports.
The --smallfiles and --oplogSize settings reduce the disk space that each mongod
instance uses. This is ideal for testing and development deployments as it prevents over-
loading your machine. For more information on these and other configuration options, see
https://docs.mongodb.org/manual/reference/configuration-options.
3. Connect to one of your mongod instances through the mongo shell. You will need to indicate which instance
by specifying its port number. For the sake of simplicity and clarity, you may want to choose the first one, as in
the following command:
mongo --port 27017
4. In the mongo shell, use rs.initiate() to initiate the replica set. You can create a replica set configuration
object in the mongo shell environment, as in the following example:
rsconf = {
_id: "rs0",
members: [
{
_id: 0,
host: "<hostname>:27017"
}
]
}
Replace <hostname> with your system's hostname, and then pass the rsconf object to rs.initiate() as
follows:
rs.initiate( rsconf )
5. Display the current replica configuration (page 709) by issuing the following command:
rs.conf()
The replica set configuration object resembles the following:
{
"_id" : "rs0",
"version" : 4,
"members" : [
{
"_id" : 1,
"host" : "localhost:27017"
}
]
}
6. In the mongo shell connected to the primary, add the second and third mongod instances to the replica set
using the rs.add() method. Replace <hostname> with your system's hostname in the following examples:
rs.add("<hostname>:27018")
rs.add("<hostname>:27019")
When complete, you should have a fully functional replica set. The new replica set will elect a primary.
Check the status of your replica set at any time with the rs.status() operation.
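As a variation on steps 4 and 6, the configuration document passed to rs.initiate() can list all three members at once. The helper below is a sketch under that assumption; buildLocalConfig is a hypothetical name, and the host and ports mirror the test deployment in this procedure.

```javascript
// Build a replica set configuration document with one member per port,
// all on the same host. Member _id values follow the array order.
function buildLocalConfig(setName, hostname, ports) {
  return {
    _id: setName,
    members: ports.map(function (port, index) {
      return { _id: index, host: hostname + ":" + port };
    })
  };
}

// Same set name and ports as the mongod instances started in step 2.
var rsconf = buildLocalConfig("rs0", "localhost", [27017, 27018, 27019]);
```

In the mongo shell, passing this rsconf object to rs.initiate() would take the place of initiating a single-member set and then calling rs.add() twice.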
See also:
The documentation of the following shell functions for more information:
rs.initiate()
rs.conf()
rs.reconfig()
rs.add()
You may also consider the simple setup script (see footnote 16) as an example of a basic automatically-configured replica set.
Refer to Replica Set Read and Write Semantics (page 639) for a detailed explanation of read and write semantics in
MongoDB.
Overview
This tutorial outlines the process for deploying a replica set with members in multiple locations. The tutorial addresses
three-member sets, four-member sets, and sets with more than four members.
For appropriate background, see Replication (page 613) and Replica Set Deployment Architectures (page 626). For
related tutorials, see Deploy a Replica Set (page 657) and Add Members to a Replica Set (page 670).
16 https://github.com/mongodb/mongo-snippets/blob/master/replication/simple-setup.py
Considerations
While replica sets provide basic protection against single-instance failure, replica sets whose members are all located
in a single facility are susceptible to errors in that facility. Power outages, network interruptions, and natural disasters
are all issues that can affect replica sets whose members are colocated. To protect against these classes of failures,
deploy a replica set with one or more members in a geographically distinct facility or data center to provide redundancy.
Prerequisites
In general, the requirements for any geographically redundant replica set are as follows:
Ensure that a majority of the voting members (page 637) are within a primary facility, Site A. This includes
priority 0 members (page 621) and arbiters (page 625). Deploy other members in secondary facilities, Site B,
Site C, etc., to provide additional copies of the data. See Determine the Distribution of Members (page 628)
for more information on the voting requirements for geographically redundant replica sets.
If you deploy a replica set with an even number of members, deploy an arbiter (page 625) on Site A. The arbiter
must be on site A to keep the majority there.
For instance, for a three-member replica set you need two members in Site A, and one member in a secondary facility,
Site B. Site A should be the same facility as, or very close to, your primary application infrastructure (i.e. application
servers, caching layer, users, etc.).
A four-member replica set should have at least two members in Site A, with the remaining members in one or more
secondary sites, as well as a single arbiter in Site A.
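The majority requirement above can be checked against a planned layout before any member is started. The following is a planning sketch only: siteHasMajority is a hypothetical helper, the site field is a label chosen by the administrator rather than a replica set configuration setting, and votes is assumed to default to 1 when omitted.

```javascript
// Illustrative check (not a MongoDB API): does `site` hold a strict
// majority of the votes in the planned layout?
function siteHasMajority(members, site) {
  var total = 0;
  var atSite = 0;
  members.forEach(function (m) {
    var votes = (m.votes === undefined) ? 1 : m.votes;
    total += votes;
    if (m.site === site) {
      atSite += votes;
    }
  });
  return atSite > total / 2;
}

// Three-member set: two members in Site A, one in Site B.
var threeMember = [
  { host: "mongodb0.example.net", site: "A" },
  { host: "mongodb1.example.net", site: "A" },
  { host: "mongodb2.example.net", site: "B" }
];

// Four data-bearing members split evenly, plus an arbiter in Site A.
var fourMemberWithArbiter = [
  { host: "mongodb0.example.net", site: "A" },
  { host: "mongodb1.example.net", site: "A" },
  { host: "mongodb2.example.net", site: "B" },
  { host: "mongodb3.example.net", site: "B" },
  { host: "mongodb4.example.net", site: "A", arbiterOnly: true }
];
```

With these sample layouts, Site A holds the majority in both cases. Dropping the arbiter from the second layout leaves a 2-2 split, in which neither site holds a majority.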
For all configurations in this tutorial, deploy each replica set member on a separate system. Although you may deploy
more than one replica set member on a single system, doing so reduces the redundancy and capacity of the replica set.
Such deployments are typically for testing purposes and beyond the scope of this tutorial.
This tutorial assumes you have installed MongoDB on each system that will be part of your replica set. If you have
not already installed MongoDB, see the installation tutorials (page 5).
Procedures
General Considerations
Architecture In production, deploy each member of the replica set to its own machine and, if possible, bind to the
standard MongoDB port of 27017. Use the bind_ip option to ensure that MongoDB listens for connections from
applications on configured addresses.
For geographically distributed replica sets, ensure that the majority of the set's mongod instances reside in the
primary site.
See Replica Set Deployment Architectures (page 626) for more information.
Connectivity Ensure that network traffic can pass between all members of the set and all clients in the network
securely and efficiently. Consider the following:
Establish a virtual private network. Ensure that your network topology routes all traffic between members within
a single site over the local area network.
Configure access control to prevent connections from unknown clients to the replica set.
Configure networking and firewall rules so that incoming and outgoing packets are permitted only on the default
MongoDB port and only from within your deployment.
Finally, ensure that each member of a replica set is accessible by way of resolvable DNS or hostnames. You should
either configure your DNS names appropriately or set up your systems' /etc/hosts file to reflect this configuration.
Configuration Specify the run time configuration on each system in a configuration file stored in
/etc/mongod.conf or a related location. Create the directory where MongoDB stores data files before deploying
MongoDB.
For more information about the run time options used above and other configuration options, see
https://docs.mongodb.org/manual/reference/configuration-options.
Deploy a Geographically Redundant Three-Member Replica Set
Step 1: Start each member of the replica set with the appropriate options. For each member, start a mongod and
specify the replica set name through the replSet option. Specify any other parameters specific to your deployment.
For replication-specific parameters, see cli-mongod-replica-set.
If your application connects to more than one replica set, each set should have a distinct name. Some drivers group
replica set connections by replica set name.
The following example specifies the replica set name through the --replSet command-line option:
mongod --replSet "rs0"
You can also specify the replica set name in the configuration file. To start mongod with a configuration
file, specify the file with the --config option:
mongod --config $HOME/.mongodb/config
In production deployments, you can configure an init script to manage this process. Init scripts are beyond the scope of
this document.
Step 2: Connect a mongo shell to a replica set member. For example, to connect to a mongod running on
localhost on the default port of 27017, simply issue:
mongo
Step 3: Initiate the replica set. Use rs.initiate() on one and only one member of the replica set:
rs.initiate()
MongoDB initiates a set that consists of the current member and that uses the default replica set configuration.
Step 4: Verify the initial replica set configuration. Use rs.conf() to display the replica set configuration object
(page 709):
rs.conf()
Step 5: Add the remaining members to the replica set. Add the remaining members with the rs.add() method.
You must be connected to the primary to add members to a replica set.
rs.add() can, in some cases, trigger an election. If the mongod you are connected to becomes a secondary, you
need to connect the mongo shell to the new primary to continue adding new replica set members. Use rs.status()
to identify the primary in the replica set.
The following example adds two members:
rs.add("mongodb1.example.net")
rs.add("mongodb2.example.net")
When complete, you have a fully functional replica set. The new replica set will elect a primary.
Step 6: Configure the outside member as a priority 0 member. Configure the member located in Site B (in this
example, mongodb2.example.net) as a priority 0 member (page 621).
1. View the replica set configuration to determine the members (page 711) array position for the member. Keep
in mind the array position is not the same as the _id:
rs.conf()
2. Copy the replica set configuration object to a variable (to cfg in the example below). Then, in the variable,
set the correct priority for the member. Then pass the variable to rs.reconfig() to update the replica set
configuration.
For example, to set priority for the third member in the array (i.e., the member at position 2), issue the following
sequence of commands:
cfg = rs.conf()
cfg.members[2].priority = 0
rs.reconfig(cfg)
Note: The rs.reconfig() shell method can force the current primary to step down, causing an election.
When the primary steps down, all clients will disconnect. This is the intended behavior. While most elections
complete within a minute, always make sure any replica configuration changes occur during scheduled
maintenance periods.
After these commands return, you have a geographically redundant three-member replica set.
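Because the array position of a member is not necessarily its _id, it can be safer to look the position up by hostname than to hard-code an index. The sketch below assumes the document shape returned by rs.conf(); memberIndex is a hypothetical helper, and the sample configuration is hand-written for illustration.

```javascript
// Return the position in cfg.members of the member whose host matches,
// or -1 if no member matches. The position can differ from the member's _id.
function memberIndex(cfg, host) {
  for (var i = 0; i < cfg.members.length; i++) {
    if (cfg.members[i].host === host) {
      return i;
    }
  }
  return -1;
}

// Hand-written stand-in for rs.conf() output in which _id values and
// array positions do not line up (illustrative only).
var cfg = {
  _id: "rs0",
  version: 3,
  members: [
    { _id: 0, host: "mongodb0.example.net:27017" },
    { _id: 3, host: "mongodb1.example.net:27017" },
    { _id: 4, host: "mongodb2.example.net:27017" }
  ]
};

// Locate the Site B member by host, then mark it as a priority 0 member.
var position = memberIndex(cfg, "mongodb2.example.net:27017");
cfg.members[position].priority = 0;
```

In the mongo shell, cfg would come from rs.conf() and the modified object would then be passed to rs.reconfig(), as in the command sequence above.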
Step 7: Check the status of the replica set. Use the rs.status() operation:
rs.status()
Deploy a Geographically Redundant Four-Member Replica Set
A geographically redundant four-member deployment has two additional considerations:
One host (e.g. mongodb4.example.net) must be an arbiter. This host can run on a system that is also used
for an application server or on the same machine as another MongoDB process.
You must decide how to distribute your systems. There are three possible architectures for the four-member
replica set:
Three members in Site A, one priority 0 member (page 621) in Site B, and an arbiter in Site A.
Two members in Site A, two priority 0 members (page 621) in Site B, and an arbiter in Site A.
Two members in Site A, one priority 0 member in Site B, one priority 0 member in Site C, and an arbiter
in Site A.
In most cases, the first architecture is preferable because it is the least complex.
Step 1: Start each member of the replica set with the appropriate options. For each member, start a mongod and
specify the replica set name through the replSet option. Specify any other parameters specific to your deployment.
For replication-specific parameters, see cli-mongod-replica-set.
If your application connects to more than one replica set, each set should have a distinct name. Some drivers group
replica set connections by replica set name.
The following example specifies the replica set name through the --replSet command-line option:
mongod --replSet "rs0"
You can also specify the replica set name in the configuration file. To start mongod with a configuration
file, specify the file with the --config option:
mongod --config $HOME/.mongodb/config
In production deployments, you can configure an init script to manage this process. Init scripts are beyond the scope of
this document.
Step 2: Connect a mongo shell to a replica set member. For example, to connect to a mongod running on
localhost on the default port of 27017, simply issue:
mongo
Step 3: Initiate the replica set. Use rs.initiate() on one and only one member of the replica set:
rs.initiate()
MongoDB initiates a set that consists of the current member and that uses the default replica set configuration.
Step 4: Verify the initial replica set configuration. Use rs.conf() to display the replica set configuration object
(page 709):
rs.conf()
Step 5: Add the remaining members to the replica set. Use rs.add() in a mongo shell connected to the current
primary. The commands should resemble the following:
rs.add("mongodb1.example.net")
rs.add("mongodb2.example.net")
rs.add("mongodb3.example.net")
When complete, you should have a fully functional replica set. The new replica set will elect a primary.
Step 6: Add the arbiter. In the same shell session, issue the following command to add the arbiter (e.g.
mongodb4.example.net):
rs.addArb("mongodb4.example.net")
Step 7: Configure outside members as priority 0 members. Configure each member located outside of Site A (e.g.
mongodb3.example.net) as a priority 0 member (page 621).
1. View the replica set configuration to determine the members (page 711) array position for the member. Keep
in mind the array position is not the same as the _id:
rs.conf()
2. Copy the replica set configuration object to a variable (to cfg in the example below). Then, in the variable,
set the correct priority for the member. Then pass the variable to rs.reconfig() to update the replica set
configuration.
For example, to set priority for the third member in the array (i.e., the member at position 2), issue the following
sequence of commands:
cfg = rs.conf()
cfg.members[2].priority = 0
rs.reconfig(cfg)
Note: The rs.reconfig() shell method can force the current primary to step down, causing an election.
When the primary steps down, all clients will disconnect. This is the intended behavior. While most elections
complete within a minute, always make sure any replica configuration changes occur during scheduled
maintenance periods.
After these commands return, you have a geographically redundant four-member replica set.
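After reconfiguring, one way to confirm the result is to list which members can still become primary. The following sketch assumes the rs.conf() document shape, treats a missing priority as the default of 1, and excludes arbiters; electableHosts is a hypothetical helper and the sample configuration is hand-written.

```javascript
// List the hosts that remain eligible to become primary: members that are
// not arbiters and whose priority is greater than 0.
function electableHosts(cfg) {
  return cfg.members.filter(function (m) {
    var priority = (m.priority === undefined) ? 1 : m.priority;
    return !m.arbiterOnly && priority > 0;
  }).map(function (m) {
    return m.host;
  });
}

// Hand-written stand-in for rs.conf() after the reconfiguration
// (illustrative only): one priority 0 member and one arbiter.
var fourMemberCfg = {
  _id: "rs0",
  members: [
    { _id: 0, host: "mongodb0.example.net:27017" },
    { _id: 1, host: "mongodb1.example.net:27017" },
    { _id: 2, host: "mongodb2.example.net:27017" },
    { _id: 3, host: "mongodb3.example.net:27017", priority: 0 },
    { _id: 4, host: "mongodb4.example.net:27017", arbiterOnly: true }
  ]
};

var electable = electableHosts(fourMemberCfg);
```

For the sample above, only the first three hosts are returned. In the mongo shell, electableHosts(rs.conf()) would apply the same filter to the live configuration.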
Step 8: Check the status of the replica set. Use the rs.status() operation:
rs.status()
Deploy a Geographically Redundant Set with More than Four Members
The above procedures detail the steps necessary for deploying a geographically redundant replica set. Larger replica
set deployments follow the same steps, but have additional considerations:
Never deploy more than seven voting members.
If you have an even number of members, use the procedure for a four-member set (page 666). Ensure that
a single facility, Site A, always has a majority of the members by deploying the arbiter in that site. For
example, if a set has six members, deploy at least three voting members in addition to the arbiter in Site A, and
the remaining members in alternate sites.
If you have an odd number of members, use the procedure for a three-member set (page 664). Ensure that a
single facility, Site A, always has a majority of the members of the set. For example, if a set has five members,
deploy three members within Site A and two members in other facilities.
If you have a majority of the members of the set outside of Site A and the network partitions to prevent communication
between sites, the current primary in Site A will step down, even if none of the members outside of
Site A are eligible to become primary.
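The considerations above reduce to a small amount of arithmetic. The sketch below is a planning aid only; votingPlan is a hypothetical helper, and it assumes that every data-bearing member votes and that an arbiter is added whenever their count is even.

```javascript
// Illustrative planning helper (not a MongoDB API). Given the number of
// data-bearing voting members, report whether an arbiter is suggested,
// how many votes Site A needs for a majority, and whether the total
// stays within the seven-voting-member limit.
function votingPlan(dataBearingVoters) {
  var needsArbiter = dataBearingVoters % 2 === 0;
  var totalVoters = dataBearingVoters + (needsArbiter ? 1 : 0);
  return {
    needsArbiter: needsArbiter,
    totalVoters: totalVoters,
    majority: Math.floor(totalVoters / 2) + 1,
    withinLimit: totalVoters <= 7
  };
}

var sixMembers = votingPlan(6);
var fiveMembers = votingPlan(5);
```

For six members the plan suggests an arbiter, giving seven voters and a majority of four, which matches three voting members plus the arbiter in Site A. For five members no arbiter is suggested and the majority is three.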
Arbiters are mongod instances that are part of a replica set but do not hold data. Arbiters participate in elections
(page 635) in order to break ties. If a replica set has an even number of members, add an arbiter.
Arbiters have minimal resource requirements and do not require dedicated hardware. You can deploy an arbiter on an
application server or a monitoring host.
Important: Do not run an arbiter on the same system as a member of the replica set.
Considerations
An arbiter does not store data, but until the arbiter's mongod process is added to the replica set, the arbiter will act
like any other mongod process and start up with a set of data files and with a full-sized journal.
To minimize the default creation of data, set the following in the arbiter's configuration file:
journal.enabled to false
smallFiles to true
These settings are specific to arbiters. Do not set journal.enabled to false on a data-bearing node. Similarly,
do not set smallFiles unless specifically indicated.
Add an Arbiter
1. Create a data directory (e.g. dbPath) for the arbiter. The mongod instance uses the directory for configuration
data. The directory will not hold the data set. For example, create the /data/arb directory:
mkdir /data/arb
2. Start the arbiter. Specify the data directory and the replica set name. The following starts an arbiter using the
/data/arb dbPath for the rs replica set:
mongod --port 30000 --dbpath /data/arb --replSet rs
3. Connect to the primary and add the arbiter to the replica set. Use the rs.addArb() method, as in the following
example:
rs.addArb("m1.example.net:30000")
This operation adds the arbiter running on port 30000 on the m1.example.net host.
This tutorial describes the process for converting a standalone mongod instance into a three-member replica set. Use
standalone instances for testing and development, but always use replica sets in production. To install a standalone
instance, see the installation tutorials (page 5).
To deploy a replica set without using a pre-existing mongod instance, see Deploy a Replica Set (page 657).
Procedure
If your application connects to more than one replica set, each set should have a distinct name. Some drivers
group replica set connections by replica set name.
Expand the Replica Set Add additional replica set members by doing the following:
1. On two distinct systems, start two new standalone mongod instances. For information on starting a standalone
instance, see the installation tutorial (page 5) specific to your environment.
2. On your connection to the original mongod instance (the former standalone instance), issue a command in the
following form for each new instance to add to the replica set:
rs.add("<hostname><:port>")
Replace <hostname> and <port> with the resolvable hostname and port of the mongod instance to add to
the set. For more information on adding a host to a replica set, see Add Members to a Replica Set (page 670).
Sharding Considerations If the new replica set is part of a sharded cluster, change the shard host information in
the config database by doing the following:
1. Connect to one of the sharded cluster's mongos instances and issue a command in the following form:
db.getSiblingDB("config").shards.save( {_id: "<name>", host: "<replica-set>/<member,><member,><.
Replace <name> with the name of the shard. Replace <replica-set> with the name of the replica set.
Replace <member,><member,><...> with the list of the members of the replica set.
2. Restart all mongos instances. If possible, restart all components of the replica sets (i.e., all mongos and all
shard mongod instances).
Overview
This tutorial explains how to add an additional member to an existing replica set. For background on replication
deployment patterns, see the Replica Set Deployment Architectures (page 626) document.
Maximum Voting Members A replica set can have a maximum of seven voting members (page 635). To add a
member to a replica set that already has seven voting members, you must either add the member as a non-voting
member (page 637) or remove a vote from an existing member.
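The seven-vote limit can be checked before adding a member. The following is a minimal sketch in plain JavaScript, assuming a configuration document shaped like the output of rs.conf(). The helper names countVotingMembers and votesForNewMember are hypothetical, and a member without an explicit votes field is assumed to have the default of one vote.

```javascript
// Hypothetical helpers; cfg is assumed to look like the document returned by rs.conf().
var MAX_VOTING_MEMBERS = 7;

// Counts the members that currently hold a vote.
function countVotingMembers(cfg) {
  return cfg.members.filter(function (m) {
    // Assumption: a missing votes field means the default of 1 vote.
    return (m.votes === undefined ? 1 : m.votes) > 0;
  }).length;
}

// Returns the votes value a new member can be given:
// 1 if the set has room for another voter, otherwise 0 (non-voting).
function votesForNewMember(cfg) {
  return countVotingMembers(cfg) < MAX_VOTING_MEMBERS ? 1 : 0;
}
```

With a set that already has seven voting members, votesForNewMember returns 0, which corresponds to adding the new member as a non-voting member.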
Init Scripts In production deployments you can configure an init script to manage member processes.
Existing Members You can use these procedures to add new members to an existing set. You can also use the same
procedure to re-add a removed member. If the removed member's data is still relatively recent, it can recover and
catch up easily.
Data Files If you have a backup or snapshot of an existing member, you can move the data files (e.g. the dbPath
directory) to a new system and use them to quickly initiate a new member. The files must be:
A valid copy of the data files from a member of the same replica set. See the Backup and Restore with Filesystem
Snapshots (page 266) document for more information.
Important: Always use filesystem snapshots to create a copy of a member of the existing replica set. Do not
use mongodump and mongorestore to seed a new replica set member.
More recent than the oldest operation in the primary's oplog. The new member must be able to become current
by applying operations from the primary's oplog.
Requirements
Procedures
Prepare the Data Directory Before adding a new member to an existing replica set, prepare the new member's data
directory using one of the following strategies:
Make sure the new member's data directory does not contain data. The new member will copy the data from an
existing member.
If the new member is in a recovering state, it must exit and become a secondary before MongoDB can copy all
data as part of the replication process. This process takes time but does not require administrator intervention.
Manually copy the data directory from an existing member. The new member becomes a secondary member
and will catch up to the current state of the replica set. Copying the data over may shorten the amount of time
for the new member to become current.
Ensure that you can copy the data directory to the new member and begin replication within the window allowed
by the oplog (page 647). Otherwise, the new instance will have to perform an initial sync, which completely
resynchronizes the data, as described in Resync a Member of a Replica Set (page 690).
Use rs.printReplicationInfo() to check the current state of replica set members with regard to the
oplog.
For background on replication deployment patterns, see the Replica Set Deployment Architectures (page 626) document.
Take note of the host name and port information for the new mongod instance.
For more information on configuration options, see the mongod manual page.
Optional
You can specify the data directory and replica set in the mongod.conf configuration file, and start
the mongod with the following command:
mongod --config /etc/mongod.conf
4. Verify that the member is now part of the replica set. Call the rs.conf() method, which displays the replica
set configuration (page 709):
rs.conf()
To view replica set status, issue the rs.status() method. For a description of the status fields, see
https://docs.mongodb.org/manual/reference/command/replSetGetStatus.
Configure and Add a Member You can add a member to a replica set by passing a members (page 711) document to
the rs.add() method. These documents define a replica set member in the same form as the replica set configuration
document (page 710).
Important: Specify a value for the _id field of the members (page 711) document. MongoDB does not automatically
populate the _id field in this case. Finally, the members (page 711) document must declare the host value.
All other fields are optional.
Example
To add a member with the following configuration:
an _id of 1.
a hostname and port number of mongodb3.example.net:27017.
a priority value within the replica set of 0.
a configuration as hidden,
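The member document described by this list can be written out as a plain JavaScript object. This is a sketch only: the variable name newMember is hypothetical, and the field names are the members fields named in this section (_id, host, priority, and hidden).

```javascript
// Member document matching the configuration listed above.
// The variable name newMember is hypothetical.
var newMember = {
  _id: 1,                              // must be set explicitly; not populated automatically
  host: "mongodb3.example.net:27017",  // required: hostname and port of the new mongod
  priority: 0,                         // the member cannot become primary
  hidden: true                         // the member is invisible to client applications
};

// In a mongo shell connected to the primary, the document would then be
// passed to rs.add(), for example: rs.add(newMember)
```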
On this page
Remove a Member Using rs.remove() (page 673)
Remove a Member Using rs.reconfig() (page 673)
1. Shut down the mongod instance for the member you wish to remove. To shut down the instance, connect using
the mongo shell and the db.shutdownServer() method.
2. Connect to the replica set's current primary. To determine the current primary, use db.isMaster() while
connected to any member of the replica set.
3. Use rs.remove() in either of the following forms to remove the member:
rs.remove("mongod3.example.net:27017")
rs.remove("mongod3.example.net")
MongoDB disconnects the shell briefly as the replica set elects a new primary. The shell then automatically
reconnects. The shell displays a DBClientCursor::init call() failed error even though the command
succeeds.
To remove a member you can manually edit the replica set configuration document (page 709), as described here.
1. Shut down the mongod instance for the member you wish to remove. To shut down the instance, connect using
the mongo shell and the db.shutdownServer() method.
2. Connect to the replica set's current primary. To determine the current primary, use db.isMaster() while
connected to any member of the replica set.
3. Issue the rs.conf() method to view the current configuration document and determine the position in the
members array of the member to remove:
Example
mongod_C.example.net is in position 2 of the following configuration file:
{
  "_id" : "rs",
  "version" : 7,
  "members" : [
    {
      "_id" : 0,
      "host" : "mongod_A.example.net:27017"
    },
    {
      "_id" : 1,
      "host" : "mongod_B.example.net:27017"
    },
    {
      "_id" : 2,
      "host" : "mongod_C.example.net:27017"
    }
  ]
}
Example
To remove mongod_C.example.net:27017 use the following JavaScript operation:
cfg.members.splice(2,1)
6. Overwrite the replica set configuration document with the new configuration by issuing the following:
rs.reconfig(cfg)
As a result of rs.reconfig() the shell will disconnect while the replica set renegotiates which member is
primary. The shell displays a DBClientCursor::init call() failed error even though the command
succeeds, and will automatically reconnect.
7. To confirm the new configuration, issue rs.conf().
For the example above the output would be:
{
  "_id" : "rs",
  "version" : 8,
  "members" : [
    {
      "_id" : 0,
      "host" : "mongod_A.example.net:27017"
    },
    {
      "_id" : 1,
      "host" : "mongod_B.example.net:27017"
    }
  ]
}
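Instead of reading the array position off the rs.conf() output by eye, the position can be looked up by host name. The following is a minimal sketch in plain JavaScript that operates on a configuration document of the shape shown above. The helper names memberIndexByHost and removeMemberByHost are hypothetical and are not part of the mongo shell.

```javascript
// Hypothetical helper: returns the position of a host in cfg.members, or -1.
function memberIndexByHost(cfg, host) {
  for (var i = 0; i < cfg.members.length; i++) {
    if (cfg.members[i].host === host) {
      return i;
    }
  }
  return -1;
}

// Hypothetical helper: removes the member with the given host from cfg in place.
// Throws instead of silently removing the wrong member when the host is unknown.
function removeMemberByHost(cfg, host) {
  var index = memberIndexByHost(cfg, host);
  if (index === -1) {
    throw new Error("no member with host " + host);
  }
  cfg.members.splice(index, 1);
  return cfg;
}
```

In the mongo shell, the modified document would still be applied with rs.reconfig(cfg) after assigning cfg = rs.conf() and calling removeMemberByHost(cfg, "mongod_C.example.net:27017").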
On this page
Operation (page 675)
Example (page 675)
If you need to change the hostname of a replica set member without changing the configuration of that member or the
set, you can use the operation outlined in this tutorial. For example, if you must re-provision systems or rename hosts,
you can use this pattern to minimize the scope of that change.
Operation
To change the hostname for a replica set member, modify the members[n].host (page 711) field. The value of
the members[n]._id (page 711) field will not change when you reconfigure the set.
See Replica Set Configuration (page 709) and rs.reconfig() for more information.
Note: Any replica set configuration change can trigger the current primary to step down, which forces an election
(page 635). During the election, the current shell session and clients connected to this replica set disconnect, which
produces an error even when the operation succeeds.
Example
To change the hostname to mongo2.example.net for the replica set member configured at members[0], issue
the following sequence of commands:
cfg = rs.conf()
cfg.members[0].host = "mongo2.example.net"
rs.reconfig(cfg)
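When the array position of the member is not known in advance, the member can be located by its current host value. This is a minimal sketch in plain JavaScript; the helper name renameMemberHost and the host names in the usage note are hypothetical. It changes only the host field and leaves _id untouched.

```javascript
// Hypothetical helper: replaces oldHost with newHost in cfg.members, in place.
// Only the host field changes; the member's _id is left as it is.
function renameMemberHost(cfg, oldHost, newHost) {
  var matches = cfg.members.filter(function (m) {
    return m.host === oldHost;
  });
  if (matches.length !== 1) {
    throw new Error("expected exactly one member with host " + oldHost);
  }
  matches[0].host = newHost;
  return cfg;
}
```

For example, after cfg = rs.conf(), a call such as renameMemberHost(cfg, "mongo1.example.net", "mongo2.example.net") would be followed by rs.reconfig(cfg).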
The following tutorials provide information on configuring replica set members to support specific operations, such as
to provide dedicated backups, to support reporting, or to act as a cold standby.
Adjust Priority for Replica Set Member (page 676) Change the precedence given to a replica set member in an
election for primary.
Prevent Secondary from Becoming Primary (page 677) Make a secondary member ineligible for election as
primary.
Configure a Hidden Replica Set Member (page 678) Configure a secondary member to be invisible to applications
in order to support significantly different usage, such as dedicated backups.
Configure a Delayed Replica Set Member (page 680) Configure a secondary member to keep a delayed copy of the
data set in order to provide a rolling backup.
Configure Non-Voting Replica Set Member (page 681) Create a secondary member that keeps a copy of the data set
but does not vote in an election.
Convert a Secondary to an Arbiter (page 682) Convert a secondary to an arbiter.
On this page
Overview (page 676)
Considerations (page 676)
Procedure (page 676)
Overview
The priority settings of replica set members affect the outcomes of elections (page 635) for primary. Use this setting
to ensure that some members are more likely to become primary and that others can never become primary.
The value of the member's members[n].priority (page 713) setting determines the member's priority in
elections. The higher the number, the higher the priority.
Considerations
To modify priorities, you update the members (page 711) array in the replica configuration object. The array index
begins with 0. Do not confuse this index value with the value of the replica set member's members[n]._id
(page 711) field in the array.
The value of members[n].priority (page 713) can be any floating point (i.e. decimal) number between 0 and
1000. The default value for the members[n].priority (page 713) field is 1.
To block a member from seeking election as primary, assign it a priority of 0. Hidden members (page 623), delayed
members (page 624), and arbiters all have members[n].priority (page 713) set to 0.
Adjust priority during a scheduled maintenance window. Reconfiguring priority can force the current primary to step
down, leading to an election. Before an election the primary closes all open client connections.
Procedure
Step 1: Copy the replica set configuration to a variable. In the mongo shell, use rs.conf() to retrieve the
replica set configuration and assign it to a variable. For example:
cfg = rs.conf()
Step 2: Change each member's priority value. Change each member's members[n].priority (page 713)
value, as configured in the members (page 711) array.
cfg.members[0].priority = 0.5
cfg.members[1].priority = 2
cfg.members[2].priority = 2
This sequence of operations modifies the value of cfg to set the priority for the first three members defined in the
members (page 711) array.
Step 3: Assign the replica set the new configuration. Use rs.reconfig() to apply the new configuration.
rs.reconfig(cfg)
This operation updates the configuration of the replica set using the configuration defined by the value of cfg.
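Because priority must be a number between 0 and 1000 and members are addressed by array index rather than by _id, both can be checked before the configuration is changed. The following is a minimal sketch in plain JavaScript; the helper name setMemberPriority is hypothetical.

```javascript
// Hypothetical helper: sets the priority of the member at the given array
// index of cfg.members, after checking the index and the allowed range.
function setMemberPriority(cfg, index, priority) {
  if (index < 0 || index >= cfg.members.length) {
    throw new Error("no member at array index " + index);
  }
  if (typeof priority !== "number" || isNaN(priority) ||
      priority < 0 || priority > 1000) {
    throw new Error("priority must be a number between 0 and 1000");
  }
  cfg.members[index].priority = priority;
  return cfg;
}
```

The three assignments in Step 2 would then read setMemberPriority(cfg, 0, 0.5), setMemberPriority(cfg, 1, 2), and setMemberPriority(cfg, 2, 2), followed by rs.reconfig(cfg).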
On this page
Overview (page 677)
Considerations (page 677)
Procedure (page 677)
Related Documents (page 678)
Overview
In a replica set, by default all secondary members are eligible to become primary through the election process. You
can use the priority to affect the outcome of these elections by making some members more likely to become
primary and other members less likely or unable to become primary.
Secondaries that cannot become primary are also unable to trigger elections. In all other respects these secondaries
are identical to other secondaries.
To prevent a secondary member from ever becoming a primary in a failover, assign the secondary a priority of 0, as
described here. For a detailed description of secondary-only members and their purposes, see Priority 0 Replica Set
Members (page 621).
Considerations
When updating the replica configuration object, access the replica set members in the members (page 711) array
with the array index. The array index begins with 0. Do not confuse this index value with the value of the
members[n]._id (page 711) field in each document in the members (page 711) array.
Note: MongoDB does not permit the current primary to have a priority of 0. To prevent the current primary from
again becoming a primary, you must first step down the current primary using rs.stepDown().
Procedure
Warning:
The rs.reconfig() shell method can force the current primary to step down, which causes an election
(page 635). When the primary steps down, the mongod closes all client connections. While this typically
takes 10-20 seconds, try to make these changes during scheduled maintenance periods.
To successfully reconfigure a replica set, a majority of the members must be accessible. If your replica set
has an even number of members, add an arbiter (page 668) to ensure that members can quickly obtain a
majority of votes in an election for primary.
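The majority requirement can be made concrete with a little arithmetic. The following is a minimal sketch in plain JavaScript that treats every member as a voting member; the helper names majorityOf and toleratedFailures are hypothetical.

```javascript
// Hypothetical helper: smallest number of members that forms a majority.
function majorityOf(memberCount) {
  return Math.floor(memberCount / 2) + 1;
}

// Hypothetical helper: how many members can be unreachable while a
// majority is still accessible.
function toleratedFailures(memberCount) {
  return memberCount - majorityOf(memberCount);
}
```

Under these assumptions a four-member set needs three accessible members and tolerates one unreachable member, the same as a three-member set, while a five-member set tolerates two. This is one way to see why an even-sized set benefits from an additional arbiter.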
Step 1: Retrieve the current replica set configuration. The rs.conf() method returns a replica set configura-
tion document (page 709) that contains the current configuration for a replica set.
In a mongo shell connected to a primary, run the rs.conf() method and assign the result to a variable:
cfg = rs.conf()
The returned document contains a members (page 711) field which contains an array of member configuration docu-
ments, one document for each member of the replica set.
Step 2: Assign priority value of 0. To prevent a secondary member from becoming a primary, update the secondary
member's members[n].priority (page 713) to 0.
To assign a priority value to a member of the replica set, access the member configuration document using the array
index. In this tutorial, the secondary member to change corresponds to the configuration document found at position
2 of the members (page 711) array.
cfg.members[2].priority = 0
The configuration change does not take effect until you reconfigure the replica set.
Step 3: Reconfigure the replica set. Use the rs.reconfig() method to reconfigure the replica set with the updated
replica set configuration document.
Pass the cfg variable to the rs.reconfig() method:
rs.reconfig(cfg)
Related Documents
On this page
Considerations (page 679)
Examples (page 679)
Related Documents (page 680)
Hidden members are part of a replica set but cannot become primary and are invisible to client applications. Hidden
members may vote in elections (page 635). For more information on hidden members and their uses, see Hidden
Replica Set Members (page 623).
Considerations
The most common use of hidden nodes is to support delayed members (page 624). If you only need to prevent a
member from becoming primary, configure a priority 0 member (page 621).
If the settings.chainingAllowed (page 714) setting allows secondary members to sync from other secondaries,
MongoDB by default prefers non-hidden members over hidden members when selecting a sync target. MongoDB
will only choose hidden members as a last resort. If you want a secondary to sync from a hidden member,
use the replSetSyncFrom database command to override the default sync target. See the documentation for
replSetSyncFrom before using the command.
See also:
Manage Chained Replication (page 698)
Changed in version 2.0: For sharded clusters running with replica sets before 2.0, if you reconfigured a member as
hidden, you had to restart mongos to prevent queries from reaching the hidden member.
Examples
Configuration Procedure The following example hides the secondary member currently at the index 0 in the
members (page 711) array. To configure a hidden member, use the following sequence of operations in a mongo
shell connected to the primary, specifying the member to configure by its array index in the members (page 711)
array:
cfg = rs.conf()
cfg.members[0].priority = 0
cfg.members[0].hidden = true
rs.reconfig(cfg)
After re-configuring the set, this secondary member has a priority of 0 so that it cannot become primary and is hidden.
The other members in the set will not advertise the hidden member in the isMaster or db.isMaster() output.
When updating the replica configuration object, access the replica set members in the members (page 711) array
with the array index. The array index begins with 0. Do not confuse this index value with the value of the
members[n]._id (page 711) field in each document in the members (page 711) array.
Warning:
The rs.reconfig() shell method can force the current primary to step down, which causes an election
(page 635). When the primary steps down, the mongod closes all client connections. While this typically
takes 10-20 seconds, try to make these changes during scheduled maintenance periods.
To successfully reconfigure a replica set, a majority of the members must be accessible. If your replica set
has an even number of members, add an arbiter (page 668) to ensure that members can quickly obtain a
majority of votes in an election for primary.
Related Documents
On this page
Example (page 680)
Related Documents (page 681)
To configure a delayed secondary member, set its members[n].priority (page 713) value to 0, its
members[n].hidden (page 712) value to true, and its members[n].slaveDelay (page 713) value to the
number of seconds to delay.
Important: The length of the secondary members[n].slaveDelay (page 713) must fit within the window of
the oplog. If the oplog is shorter than the members[n].slaveDelay (page 713) window, the delayed member
cannot successfully replicate operations.
When you configure a delayed member, the delay applies both to replication and to the member's oplog. For details
on delayed members and their uses, see Delayed Replica Set Members (page 624).
Example
The following example sets a 1-hour delay on a secondary member currently at the index 0 in the members (page 711)
array. To set the delay, issue the following sequence of operations in a mongo shell connected to the primary:
cfg = rs.conf()
cfg.members[0].priority = 0
cfg.members[0].hidden = true
cfg.members[0].slaveDelay = 3600
rs.reconfig(cfg)
After the replica set reconfigures, the delayed secondary member cannot become primary and is hidden from applications.
The members[n].slaveDelay (page 713) value delays both replication and the member's oplog by 3600
seconds (1 hour).
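The requirement that the delay fit within the oplog window can be checked before the configuration is changed. The following is a minimal sketch in plain JavaScript. The helper name configureDelayedMember is hypothetical, and oplogWindowSeconds is an assumed input that the operator supplies, for example after inspecting the output of rs.printReplicationInfo().

```javascript
// Hypothetical helper: marks the member at the given array index as a
// delayed member, refusing delays that do not fit within the oplog window.
// oplogWindowSeconds is an assumed, operator-supplied value.
function configureDelayedMember(cfg, index, delaySeconds, oplogWindowSeconds) {
  if (index < 0 || index >= cfg.members.length) {
    throw new Error("no member at array index " + index);
  }
  if (delaySeconds > oplogWindowSeconds) {
    throw new Error("slaveDelay of " + delaySeconds +
                    "s does not fit within the oplog window of " +
                    oplogWindowSeconds + "s");
  }
  var member = cfg.members[index];
  member.priority = 0;              // cannot become primary
  member.hidden = true;             // invisible to applications
  member.slaveDelay = delaySeconds; // delay in seconds
  return cfg;
}
```

With an oplog window of, say, 24 hours (86400 seconds), configureDelayedMember(cfg, 0, 3600, 86400) produces the same three settings as the example above, and the result is applied with rs.reconfig(cfg).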
When updating the replica configuration object, access the replica set members in the members (page 711) array
with the array index. The array index begins with 0. Do not confuse this index value with the value of the
members[n]._id (page 711) field in each document in the members (page 711) array.
Warning:
The rs.reconfig() shell method can force the current primary to step down, which causes an election
(page 635). When the primary steps down, the mongod closes all client connections. While this typically
takes 10-20 seconds, try to make these changes during scheduled maintenance periods.
To successfully reconfigure a replica set, a majority of the members must be accessible. If your replica set
has an even number of members, add an arbiter (page 668) to ensure that members can quickly obtain a
majority of votes in an election for primary.
Related Documents
On this page
Example (page 681)
Related Documents (page 682)
Non-voting members allow you to add additional members for read distribution beyond the maximum seven voting
members. To configure a member as non-voting, set its members[n].votes (page 713) value to 0.
Example
To disable the ability to vote in elections for the fourth, fifth, and sixth replica set members, use the following command
sequence in the mongo shell connected to the primary. You identify each replica set member by its array index in the
members (page 711) array:
cfg = rs.conf()
cfg.members[3].votes = 0
cfg.members[4].votes = 0
cfg.members[5].votes = 0
rs.reconfig(cfg)
This sequence gives 0 votes to the fourth, fifth, and sixth members of the set according to the order of the members
(page 711) array in the output of rs.conf(). This setting allows the set to elect these members as primary but does
not allow them to vote in elections. Place voting members so that your designated primary or primaries can reach a
majority of votes in the event of a network partition.
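The same change can be expressed as a small function that also reports how many voting members remain, which helps when planning where the voters sit. This is a minimal sketch in plain JavaScript; the helper name disableVotes is hypothetical, and a member without an explicit votes field is assumed to have the default of one vote.

```javascript
// Hypothetical helper: sets votes to 0 for the members at the given array
// indexes and returns the number of members that still hold a vote.
function disableVotes(cfg, indexes) {
  // Validate every index first so that a bad index leaves cfg unchanged.
  indexes.forEach(function (i) {
    if (i < 0 || i >= cfg.members.length) {
      throw new Error("no member at array index " + i);
    }
  });
  indexes.forEach(function (i) {
    cfg.members[i].votes = 0;
  });
  return cfg.members.filter(function (m) {
    // Assumption: a missing votes field means the default of 1 vote.
    return (m.votes === undefined ? 1 : m.votes) > 0;
  }).length;
}
```

For a six-member set, disableVotes(cfg, [3, 4, 5]) leaves three voting members, so two of them must be reachable to form a majority.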
When updating the replica configuration object, access the replica set members in the members (page 711) array
with the array index. The array index begins with 0. Do not confuse this index value with the value of the
members[n]._id (page 711) field in each document in the members (page 711) array.
Warning:
The rs.reconfig() shell method can force the current primary to step down, which causes an election
(page 635). When the primary steps down, the mongod closes all client connections. While this typically
takes 10-20 seconds, try to make these changes during scheduled maintenance periods.
To successfully reconfigure a replica set, a majority of the members must be accessible. If your replica set
has an even number of members, add an arbiter (page 668) to ensure that members can quickly obtain a
majority of votes in an election for primary.
In general and when possible, all members should have only 1 vote. This prevents intermittent ties, deadlocks, or the
wrong members from becoming primary. Use members[n].priority (page 713) to control which members are
more likely to become primary.
Related Documents
On this page
Convert Secondary to Arbiter and Reuse the Port Number (page 682)
Convert Secondary to Arbiter Running on a New Port Number (page 683)
If you have a secondary in a replica set that no longer needs to hold data but that needs to remain in the set to ensure that
the set can elect a primary (page 635), you may convert the secondary to an arbiter using either procedure
in this tutorial. Both procedures are operationally equivalent:
You may operate the arbiter on the same port as the former secondary. In this procedure, you must shut down
the secondary and remove its data before restarting and reconfiguring it as an arbiter.
For this procedure, see Convert Secondary to Arbiter and Reuse the Port Number (page 682).
Run the arbiter on a new port. In this procedure, you can reconfigure the server as an arbiter before shutting
down the instance running as a secondary.
For this procedure, see Convert Secondary to Arbiter Running on a New Port Number (page 683).
1. If your application is connecting directly to the secondary, modify the application so that MongoDB queries
don't reach the secondary.
2. Shut down the secondary.
3. Remove the secondary from the replica set by calling the rs.remove() method. Perform this operation while
connected to the current primary in the mongo shell:
rs.remove("<hostname><:port>")
4. Verify that the replica set no longer includes the secondary by calling the rs.conf() method in the mongo
shell:
rs.conf()
Optional
You may remove the data instead.
6. Create a new, empty data directory to point to when restarting the mongod instance. You can reuse the previous
name. For example:
mkdir /data/db
7. Restart the mongod instance for the secondary, specifying the port number, the empty data directory, and the
replica set. You can use the same port number you used before. Issue a command similar to the following:
mongod --port 27021 --dbpath /data/db --replSet rs
8. In the mongo shell convert the secondary to an arbiter using the rs.addArb() method:
rs.addArb("<hostname><:port>")
9. Verify the arbiter belongs to the replica set by calling the rs.conf() method in the mongo shell.
rs.conf()
1. If your application is connecting directly to the secondary or has a connection string referencing the secondary,
modify the application so that MongoDB queries don't reach the secondary.
2. Create a new, empty data directory to be used with the new port number. For example:
mkdir /data/db-temp
3. Start a new mongod instance on the new port number, specifying the new data directory and the existing replica
set. Issue a command similar to the following:
mongod --port 27021 --dbpath /data/db-temp --replSet rs
4. In the mongo shell connected to the current primary, convert the new mongod instance to an arbiter using the
rs.addArb() method:
rs.addArb("<hostname><:port>")
5. Verify the arbiter has been added to the replica set by calling the rs.conf() method in the mongo shell.
rs.conf()
8. Verify that the replica set no longer includes the old secondary by calling the rs.conf() method in the mongo
shell:
rs.conf()
mv /data/db /data/db-old
Optional
You may remove the data instead.
On this page
Overview (page 685)
Procedure (page 685)
The oplog exists internally as a capped collection, so you cannot modify its size in the course of normal operations. In
most cases the default oplog size (page 647) is an acceptable size; however, in some situations you may need a larger
or smaller oplog. For example, you might need to change the oplog size if your applications perform large numbers of
multi-updates or deletes in short periods of time.
This tutorial describes how to resize the oplog. For a detailed explanation of oplog sizing, see Oplog Size (page 647).
For details on how oplog size affects delayed members and replication lag, see Delayed Replica Set Members
(page 624).
Overview
To change the size of the oplog, you must perform maintenance on each member of the replica set in turn. The
procedure requires: stopping the mongod instance and starting as a standalone instance, modifying the oplog size,
and restarting the member.
Important: Always start rolling replica set maintenance with the secondaries, and finish with maintenance on the
primary member.
Procedure
Tip
Always use rs.stepDown() to force the primary to become a secondary, before stopping the server. This
facilitates a more efficient election process.
Recreate the oplog with the new size and with an old oplog entry as a seed.
Restart the mongod instance as a member of the replica set.
Restart a Secondary in Standalone Mode on a Different Port Shut down the mongod instance for one of the
non-primary members of your replica set. For example, to shut down, use the db.shutdownServer() method:
db.shutdownServer()
Restart this mongod as a standalone instance running on a different port and without the --replSet parameter. Use
a command similar to the following:
mongod --port 37017 --dbpath /srv/mongodb
Create a Backup of the Oplog (Optional) Optionally, back up the existing oplog on the standalone instance, as in
the following example:
mongodump --db local --collection 'oplog.rs' --port 37017
Recreate the Oplog with a New Size and a Seed Entry Save the last entry from the oplog. For example, connect
to the instance using the mongo shell, and enter the following command to switch to the local database:
use local
In mongo shell scripts you can use the following operation to set the db object:
db = db.getSiblingDB('local')
Ensure that the temp temporary collection is empty by dropping the collection:
db.temp.drop()
Use the db.collection.save() method and a sort on reverse natural order to find the last entry and save it to a
temporary collection:
Remove the Existing Oplog Collection Drop the old oplog.rs collection in the local database. Use the fol-
lowing command:
db = db.getSiblingDB('local')
db.oplog.rs.drop()
Create a New Oplog Use the create command to create a new oplog of a different size. Specify the size
argument in bytes. A value of 2 * 1024 * 1024 * 1024 will create a new oplog that's 2 gigabytes:
db.runCommand( { create: "oplog.rs", capped: true, size: (2 * 1024 * 1024 * 1024) } )
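Because the size argument is given in bytes, it can help to compute it from a size in gigabytes rather than typing the number by hand. The following is a minimal sketch in plain JavaScript; the helper names gigabytesToBytes and createOplogCommand are hypothetical, and the command document has the same fields as the example above.

```javascript
// Hypothetical helper: converts gigabytes to bytes (1 GB = 1024^3 bytes).
function gigabytesToBytes(gigabytes) {
  return gigabytes * 1024 * 1024 * 1024;
}

// Hypothetical helper: builds the document for the create command shown above.
function createOplogCommand(sizeInGigabytes) {
  return {
    create: "oplog.rs",
    capped: true,
    size: gigabytesToBytes(sizeInGigabytes)
  };
}
```

Here gigabytesToBytes(2) evaluates to 2147483648, and in the mongo shell the document would be passed on as db.runCommand(createOplogCommand(2)).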
Insert the Last Entry of the Old Oplog into the New Oplog Insert the previously saved last entry from the old
oplog into the new oplog. For example:
db.oplog.rs.save( db.temp.findOne() )
To confirm the entry is in the new oplog, use the following operation:
db.oplog.rs.find()
Restart the Member Restart the mongod as a member of the replica set on its usual port. For example:
db.shutdownServer()
mongod --replSet rs0 --dbpath /srv/mongodb
The replica set member will recover and catch up before it is eligible for election to primary.
Repeat Process for all Members that may become Primary Repeat this procedure for every member whose oplog size
you want to change. Repeat the procedure for the primary as part of the following step.
Change the Size of the Oplog on the Primary To finish the rolling maintenance operation, step down the primary
with the rs.stepDown() method and repeat the oplog resizing procedure above.
On this page
Overview (page 687)
Procedure (page 687)
Overview
Replica sets allow a MongoDB deployment to remain available during the majority of a maintenance window.
This document outlines the basic procedure for performing maintenance on each of the members of a replica set.
Furthermore, this particular sequence strives to minimize the amount of time that the primary is unavailable and
to control the impact on the entire deployment.
Use these steps as the basis for common replica set operations, particularly for procedures such as upgrading to the
latest version of MongoDB (page 258) and changing the size of the oplog (page 684).
Procedure
For each member of a replica set, starting with a secondary member, perform the following sequence of events, ending
with the primary:
Restart the mongod instance as a standalone.
Perform the task on the standalone instance.
Restart the mongod instance as a member of the replica set.
Step 1: Stop a secondary. In the mongo shell, shut down the mongod instance:
db.shutdownServer()
Step 2: Restart the secondary as a standalone on a different port. At the operating system shell prompt, restart
mongod as a standalone instance running on a different port and without the --replSet parameter:
mongod --port 37017 --dbpath /srv/mongodb
Always start mongod with the same user, even when restarting a replica set member as a standalone instance.
Step 3: Perform maintenance operations on the secondary. While the member is a standalone, use the mongo
shell to perform maintenance:
mongo --port 37017
Step 4: Restart mongod as a member of the replica set. After performing all maintenance tasks, use the following
procedure to restart the mongod as a member of the replica set on its usual port.
From the mongo shell, shut down the standalone server after completing the maintenance:
db.shutdownServer()
Restart the mongod instance as a member of the replica set using its normal command-line arguments or configuration
file.
The secondary takes time to catch up to the primary (page 648). From the mongo shell, use the following command
to verify that the member has caught up from the RECOVERING (page 719) state to the SECONDARY (page 718) state.
rs.status()
Step 5: Perform maintenance on the primary last. To perform maintenance on the primary after completing
maintenance tasks on all secondaries, use rs.stepDown() in the mongo shell to step down the primary and allow
one of the secondaries to be elected the new primary. Specify a 300 second waiting period to prevent the member from
being elected primary again for five minutes:
rs.stepDown(300)
After the primary steps down, the replica set will elect a new primary. See Replica Set Elections (page 635) for more
information about replica set elections.
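The order of the rolling maintenance (secondaries first, primary last) can be derived from a status document. The following is a minimal sketch in plain JavaScript, assuming an input shaped like the output of rs.status(), with a members array whose entries carry name and stateStr fields. The helper name maintenanceOrder is hypothetical.

```javascript
// Hypothetical helper: returns member names in the order in which to perform
// maintenance, with every non-primary member first and the primary last.
function maintenanceOrder(status) {
  var others = [];
  var primaries = [];
  status.members.forEach(function (m) {
    if (m.stateStr === "PRIMARY") {
      primaries.push(m.name);
    } else {
      others.push(m.name);
    }
  });
  return others.concat(primaries);
}
```

In the mongo shell this would be called as maintenanceOrder(rs.status()). The primary at the end of the list is the member to step down with rs.stepDown() before its own maintenance.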
On this page
Overview (page 688)
Consideration (page 688)
Procedures (page 688)
Overview
You can force a replica set member to become primary by giving it a higher members[n].priority (page 713)
value than any other member in the set.
Optionally, you also can force a member never to become primary by setting its members[n].priority
(page 713) value to 0, which means the member can never seek election (page 635) as primary. For more information,
see Priority 0 Replica Set Members (page 621).
For more information on priorities, see members[n].priority (page 713).
Consideration
A majority of the configured members of a replica set must be available for the set to reconfigure or elect a primary.
See Replica Set Elections (page 635) for more information.
Procedures
Force a Member to be Primary by Setting its Priority High This procedure assumes your current primary is
m1.example.net and that you'd like to instead make m3.example.net primary. The procedure also assumes
you have a three-member replica set with the configuration below. For more information on configurations, see Replica
Set Configuration Use.
This procedure assumes this configuration:
{
"_id" : "rs",
"version" : 7,
"members" : [
{
"_id" : 0,
"host" : "m1.example.net:27017"
},
{
"_id" : 1,
"host" : "m2.example.net:27017"
},
{
"_id" : 2,
"host" : "m3.example.net:27017"
}
]
}
1. In a mongo shell connected to the primary, use the following sequence of operations to make
m3.example.net the primary:
cfg = rs.conf()
cfg.members[0].priority = 0.5
cfg.members[1].priority = 0.5
cfg.members[2].priority = 1
rs.reconfig(cfg)
The last statement calls rs.reconfig() with the modified configuration document to configure
m3.example.net to have a higher members[n].priority (page 713) value than the other mongod
instances.
The following sequence of events occurs:
m3.example.net and m2.example.net sync with m1.example.net (typically within 10 seconds).
m1.example.net sees that it no longer has highest priority and, in most cases, steps down.
m1.example.net does not step down if m3.example.net's sync is far behind. In that case,
m1.example.net waits until m3.example.net is within 10 seconds of its optime and then steps
down. This minimizes the amount of time with no primary following failover.
The step down forces an election in which m3.example.net becomes primary based on its priority
setting.
2. Optionally, if m3.example.net is more than 10 seconds behind m1.example.net's optime, and if you
don't need to have a primary designated within 10 seconds, you can force m1.example.net to step down by
running:
db.adminCommand({replSetStepDown: 86400, force: 1})
This prevents m1.example.net from being primary for 86,400 seconds (24 hours), even if there is no other
member that can become primary. When m3.example.net catches up with m1.example.net it will
become primary.
If you later want to make m1.example.net primary again while it waits for m3.example.net to catch
up, issue the following command to make m1.example.net seek election again:
rs.freeze()
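The priority changes in step 1 can also be expressed as a small helper that looks members up by host name
instead of by array index. This is a minimal sketch, not part of the mongo shell: preferPrimary is a
hypothetical function name, and the 1 and 0.5 priority values are simply the ones used in the procedure above.

```javascript
// Sketch only: preferPrimary is a hypothetical helper, not a mongo shell method.
// It sets priority 1 on the member whose host matches preferredHost and 0.5 on
// every other member, then returns the same (modified) configuration document.
function preferPrimary(cfg, preferredHost) {
  var found = cfg.members.some(function (m) {
    return m.host === preferredHost;
  });
  if (!found) {
    // Refuse to lower every priority when the target host is misspelled.
    throw new Error("no member with host " + preferredHost);
  }
  cfg.members.forEach(function (m) {
    m.priority = (m.host === preferredHost) ? 1 : 0.5;
  });
  return cfg;
}

// In a mongo shell connected to the primary, usage might look like:
//   rs.reconfig(preferPrimary(rs.conf(), "m3.example.net:27017"))
```

Matching on the host string avoids depending on the order of the members array, which may differ between
deployments.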
mdb2.example.net - a secondary.
To force a member to become primary use the following procedure:
1. In a mongo shell, run rs.status() to ensure your replica set is running as expected.
2. In a mongo shell connected to the mongod instance running on mdb2.example.net, freeze
mdb2.example.net so that it does not attempt to become primary for 120 seconds.
rs.freeze(120)
3. In a mongo shell connected to the mongod running on mdb0.example.net, step down this instance so that the
mongod is not eligible to become primary for 120 seconds:
rs.stepDown(120)
Note: During the transition, there is a short window where the set does not have a primary.
For more information, consider the rs.freeze() and rs.stepDown() methods that wrap the
replSetFreeze and replSetStepDown commands.
On this page
Procedures (page 690)
A replica set member becomes stale when its replication process falls so far behind that the primary overwrites
oplog entries the member has not yet replicated. The member cannot catch up and becomes stale. When this occurs,
you must completely resynchronize the member by removing its data and performing an initial sync (page 648).
This tutorial addresses both resyncing a stale member and creating a new member using seed data from another
member. When syncing a member, choose a time when the system has the bandwidth to move a large amount of data.
Schedule the synchronization during a time of low usage or during a maintenance window.
MongoDB provides two options for performing an initial sync:
Restart the mongod with an empty data directory and let MongoDB's normal initial syncing feature restore the
data. This is the simpler option but may take longer to replace the data.
See Procedures (page 690).
Restart the machine with a copy of a recent data directory from another member in the replica set. This procedure
can replace the data more quickly but requires more manual steps.
See Sync by Copying Data Files from Another Member (page 691).
Procedures
Automatically Sync a Member
Warning: During initial sync, mongod will remove the content of the dbPath.
This procedure relies on MongoDB's regular process for initial sync (page 648). This will store the current data on the
member. For an overview of the MongoDB initial sync process, see the Replication Processes (page 646) section.
If the instance has no data, you can simply follow the Add Members to a Replica Set (page 670) or Replace a Replica
Set Member (page 674) procedure to add a new member to a replica set.
You can also force a mongod that is already a member of the set to perform an initial sync by restarting the instance
without the content of the dbPath as follows:
1. Stop the member's mongod instance. To ensure a clean shutdown, use the db.shutdownServer() method
from the mongo shell or, on Linux systems, the mongod --shutdown option.
2. Delete all data and sub-directories from the member's data directory. By removing the content of the dbPath,
you cause MongoDB to perform a complete resync. Consider making a backup first.
At this point, the mongod will perform an initial sync. The length of the initial sync process depends on the size of
the database and network connection between members of the replica set.
Initial sync operations can impact the other members of the set and create additional traffic to the primary and can only
occur if another member of the set is accessible and up to date.
Sync by Copying Data Files from Another Member
This approach seeds a new or stale member using the data files from an existing member of the replica set.
The data files must be sufficiently recent to allow the new member to catch up with the oplog. Otherwise
the member would need to perform an initial sync.
Copy the Data Files
You can capture the data files as either a snapshot or a direct copy. However, in most cases you
cannot copy data files from a running mongod instance to another because the data files will change during the file
copy operation.
Important: If copying data files, you must copy the content of the local database.
You cannot use a mongodump backup for the data files, only a snapshot backup. For approaches to capturing a
consistent snapshot of a running mongod instance, see the MongoDB Backup Methods (page 200) documentation.
Sync the Member
After you have copied the data files from the seed source, start the mongod instance and allow
it to apply all operations from the oplog until it reflects the current state of the replica set.
On this page
Differences Between Read Preferences and Write Concerns (page 692)
Add Tag Sets to a Replica Set (page 692)
Custom Multi-Datacenter Write Concerns (page 693)
Configure Tag Sets for Functional Segregation of Read and Write Operations (page 694)
Tag sets let you customize write concern and read preferences for a replica set. MongoDB stores tag sets in the replica
set configuration object, which is the document returned by rs.conf(), in the members[n].tags (page 713)
embedded document.
This section introduces the configuration of tag sets. For an overview on tag sets and their use, see w: <tag set>
and Tag Sets (page 644).
Custom read preferences and write concerns evaluate tag sets in different ways:
Read preferences consider the value of a tag when selecting a member to read from.
Write concerns do not use the value of a tag to select a member except to consider whether or not the value is
unique.
For example, a tag set for a read operation may resemble the following document:
{ "disk": "ssd", "use": "reporting" }
To fulfill such a read operation, a member would need to have both of these tags. Any of the following tag sets would
satisfy this requirement:
{ "disk": "ssd", "use": "reporting" }
{ "disk": "ssd", "use": "reporting", "rack": "a" }
{ "disk": "ssd", "use": "reporting", "rack": "d" }
{ "disk": "ssd", "use": "reporting", "mem": "r"}
The following tag sets would not be able to fulfill this query:
{ "disk": "ssd" }
{ "use": "reporting" }
{ "disk": "ssd", "use": "production" }
{ "disk": "ssd", "use": "production", "rack": "k" }
{ "disk": "spinning", "use": "reporting", "mem": "32" }
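The matching rule illustrated by these examples can be sketched in a few lines of JavaScript. This is an
illustration of the rule as described here, not MongoDB's own implementation; memberMatchesTagSet is a
hypothetical helper name.

```javascript
// Sketch only: memberMatchesTagSet is a hypothetical helper, not a MongoDB API.
// A member can serve a read when every tag in the requested tag set is present
// on the member with exactly the same value. Extra member tags are ignored.
function memberMatchesTagSet(memberTags, requestedTagSet) {
  var tags = memberTags || {};
  return Object.keys(requestedTagSet).every(function (key) {
    return tags[key] === requestedTagSet[key];
  });
}

// The read tag set from the example above.
var wanted = { "disk": "ssd", "use": "reporting" };

// A member with an extra "rack" tag still qualifies.
var qualifies = memberMatchesTagSet({ "disk": "ssd", "use": "reporting", "rack": "a" }, wanted);

// A member missing the "use" tag does not.
var rejected = memberMatchesTagSet({ "disk": "ssd" }, wanted);
```

Under this rule, qualifies is true and rejected is false, which agrees with the two lists of tag sets above.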
You could add tag sets to the members of this replica set with the following command sequence in the mongo shell:
conf = rs.conf()
conf.members[0].tags = { "dc": "east", "use": "production" }
conf.members[1].tags = { "dc": "east", "use": "reporting" }
conf.members[2].tags = { "use": "production" }
rs.reconfig(conf)
After this operation the output of rs.conf() would resemble the following:
{
"_id" : "rs0",
"version" : 2,
"members" : [
{
"_id" : 0,
"host" : "mongodb0.example.net:27017",
"tags" : {
"dc": "east",
"use": "production"
}
},
{
"_id" : 1,
"host" : "mongodb1.example.net:27017",
"tags" : {
"dc": "east",
"use": "reporting"
}
},
{
"_id" : 2,
"host" : "mongodb2.example.net:27017",
"tags" : {
"use": "production"
}
}
]
}
Given a five member replica set with members in two data centers:
1. a facility VA tagged dc_va
2. a facility GTO tagged dc_gto
Create a custom write concern to require confirmation from two data centers using replica set tags, using the following
sequence of operations in the mongo shell:
1. Create a replica set configuration JavaScript object conf:
conf = rs.conf()
3. Create a custom settings.getLastErrorModes (page 714) setting to ensure that the write operation will
propagate to at least one member of each facility:
conf.settings = { getLastErrorModes: { MultipleDC : { "dc_va": 1, "dc_gto": 1 } } }
4. Reconfigure the replica set using the modified conf configuration object:
rs.reconfig(conf)
To ensure that a write operation propagates to at least one member of the set in both data centers, use the MultipleDC
write concern mode as follows:
db.users.insert( { id: "xyz", status: "A" }, { writeConcern: { w: "MultipleDC" } } )
Alternatively, if you want to ensure that each write operation propagates to at least 2 racks in each facility, reconfigure
the replica set as follows in the mongo shell:
1. Create a replica set configuration object conf:
conf = rs.conf()
2. Redefine the settings.getLastErrorModes (page 714) value to require two different values of both
dc_va and dc_gto:
conf.settings = { getLastErrorModes: { MultipleDC : { "dc_va": 2, "dc_gto": 2 } } }
3. Reconfigure the replica set using the modified conf configuration object:
rs.reconfig(conf)
Now, the following write operation will only return after the write operation propagates to at least two different racks
in each facility:
Changed in version 2.6: A new protocol for write operations (page 986) integrates write concerns with the write
operations. Previous versions used the getLastError command to specify the write concerns.
db.users.insert( { id: "xyz", status: "A" }, { writeConcern: { w: "MultipleDC" } } )
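To make the counting rule concrete, the sketch below checks whether a custom write concern mode would be
satisfied by a given group of acknowledging members. It illustrates the rule as described in this section
(each number is a count of distinct tag values) and is not MongoDB's implementation. The function name
modeSatisfied and the tag values "rack1" and "rack2" are assumptions made for the example.

```javascript
// Sketch only: modeSatisfied is a hypothetical helper, not a MongoDB API.
// mode             e.g. { "dc_va": 2, "dc_gto": 2 }
// acknowledgedTags array with the tags document of each member that has
//                  acknowledged the write
// For every tag named in the mode, the acknowledging members must carry at
// least the required number of DISTINCT values for that tag.
function modeSatisfied(mode, acknowledgedTags) {
  return Object.keys(mode).every(function (tagName) {
    var distinctValues = {};
    acknowledgedTags.forEach(function (tags) {
      if (tags && Object.prototype.hasOwnProperty.call(tags, tagName)) {
        distinctValues[tags[tagName]] = true;
      }
    });
    return Object.keys(distinctValues).length >= mode[tagName];
  });
}

// Hypothetical rack values: two members in the same VA rack count only once.
var twoRacksEach = { "dc_va": 2, "dc_gto": 2 };
var sameRackTwice = modeSatisfied(twoRacksEach, [
  { "dc_va": "rack1" }, { "dc_va": "rack1" },
  { "dc_gto": "rack1" }, { "dc_gto": "rack2" }
]);
```

Here sameRackTwice is false, because only one distinct dc_va value has acknowledged the write.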
Configure Tag Sets for Functional Segregation of Read and Write Operations
To target a read operation to a member of the replica set with a disk type of ssd, you could use the following tag set:
{ disk: "ssd" }
Note: Since read preferences and write concerns use the value of fields in tag sets differently, larger deployments
may have some redundancy.
However, to create comparable write concern modes, you would specify a different set of
settings.getLastErrorModes (page 714) configuration. Consider the following sequence of operations in
the mongo shell:
1. Create a replica set configuration object conf:
conf = rs.conf()
2. Redefine the settings.getLastErrorModes (page 714) value to configure two write concern modes:
conf.settings = {
"getLastErrorModes" : {
"ssd" : {
"ssd" : 1
},
"MultipleDC" : {
"dc_va" : 1,
"dc_gto" : 1
}
}
}
3. Reconfigure the replica set using the modified conf configuration object:
rs.reconfig(conf)
Now you can specify the MultipleDC write concern mode, as in the following, to ensure that a write operation
propagates to each data center.
Changed in version 2.6: A new protocol for write operations (page 986) integrates write concerns with the write
operations. Previous versions used the getLastError command to specify the write concerns.
db.users.insert( { id: "xyz", status: "A" }, { writeConcern: { w: "MultipleDC" } } )
Additionally, you can specify the ssd write concern mode to ensure that a write operation propagates to at least one
instance with an SSD.
On this page
Reconfigure by Forcing the Reconfiguration (page 696)
Reconfigure by Replacing the Replica Set (page 696)
To reconfigure a replica set when a majority of members are available, use the rs.reconfig() operation on the
current primary, following the example in the Replica Set Reconfiguration Procedure.
This document provides the following options for re-configuring a replica set when only a minority of members are
accessible:
Reconfigure by Forcing the Reconfiguration (page 696)
Reconfigure by Replacing the Replica Set (page 696)
You may need to use one of these procedures, for example, in a geographically distributed replica set, where no local
group of members can reach a majority. See Replica Set Elections (page 635) for more information on this situation.
printjson(cfg)
3. On the same member, remove the down and unreachable members of the replica set from the members
(page 711) array by setting the array equal to the surviving members alone. Consider the following example,
which uses the cfg variable created in the previous step:
cfg.members = [cfg.members[0] , cfg.members[4] , cfg.members[7]]
4. On the same member, reconfigure the set by using the rs.reconfig() command with the force option set
to true:
rs.reconfig(cfg, {force : true})
This operation forces the secondary to use the new configuration. The configuration is then propagated to all the
surviving members listed in the members array. The replica set then elects a new primary.
Note: When you use force : true, the version number in the replica set configuration increases significantly,
by tens or hundreds of thousands. This is normal and designed to prevent set version collisions if you
accidentally force re-configurations on both sides of a network partition and then the network partitioning ends.
5. If the failure or partition was only temporary, shut down or decommission the removed members as soon as
possible.
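Selecting survivors by array index, as in step 3, depends on knowing each member's position in the members
array. As an alternative, the sketch below keeps members by host name. keepSurvivors is a hypothetical helper,
and the host names in the usage comment are placeholders.

```javascript
// Sketch only: keepSurvivors is a hypothetical helper, not a mongo shell method.
// It removes every member whose host is not listed in survivingHosts and
// returns the same (modified) configuration document.
function keepSurvivors(cfg, survivingHosts) {
  cfg.members = cfg.members.filter(function (m) {
    return survivingHosts.indexOf(m.host) !== -1;
  });
  if (cfg.members.length === 0) {
    // An empty members array is never a usable configuration.
    throw new Error("no surviving members matched");
  }
  return cfg;
}

// On a surviving member, usage might look like (placeholder host names):
//   cfg = keepSurvivors(rs.conf(), ["db0.example.net:27017", "db4.example.net:27017"])
//   rs.reconfig(cfg, {force : true})
```

Print the resulting document with printjson(cfg) and review it before forcing the reconfiguration.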
Use the following procedure only for versions of MongoDB prior to version 2.0. If you're running MongoDB 2.0 or
later, use the above procedure, Reconfigure by Forcing the Reconfiguration (page 696).
These procedures are for situations where a majority of the replica set members are down or unreachable. If a majority
is running, then skip these procedures and instead use the rs.reconfig() command according to the examples in
replica-set-reconfiguration-usage.
If you run a pre-2.0 version and a majority of your replica set is down, you have the two options described here. Both
involve replacing the replica set.
Reconfigure by Turning Off Replication
This option replaces the replica set with a standalone server.
1. Stop the surviving mongod instances. To ensure a clean shutdown, use an existing init script or use the
db.shutdownServer() method.
For example, to use the db.shutdownServer() method, connect to the server using the mongo shell and
issue the following sequence of commands:
use admin
db.shutdownServer()
2. Create a backup of the data directory (i.e. dbPath) of the surviving members of the set.
Optional
If you have a backup of the database you may instead remove this data.
Reconfigure by Breaking the Mirror
This option selects a surviving replica set member to be the new primary
and to seed a new replica set. In the following procedure, the new primary is db0.example.net. MongoDB
copies the data from db0.example.net to all the other members.
1. Stop the surviving mongod instances. To ensure a clean shutdown, use an existing init script or use the
db.shutdownServer() method.
For example, to use the db.shutdownServer() method, connect to the server using the mongo shell and
issue the following sequence of commands:
use admin
db.shutdownServer()
2. Move the data directories (i.e. dbPath) for all the members except db0.example.net, so that all the
members except db0.example.net have empty data directories. For example:
mv /data/db /data/db-old
3. Move the data files for the local database (i.e. local.*) so that db0.example.net has no local database.
For example:
mkdir /data/local-old
mv /data/db/local* /data/local-old/
MongoDB performs an initial sync on the added members by copying all data from db0.example.net to
the added members.
See also:
Resync a Member of a Replica Set (page 690)
On this page
Disable Chained Replication (page 698)
Re-enable Chained Replication (page 699)
Starting in version 2.0, MongoDB supports chained replication. A chained replication occurs when a secondary
member replicates from another secondary member instead of from the primary. This might be the case, for example,
if a secondary selects its replication target based on ping time and if the closest member is another secondary.
Chained replication can reduce load on the primary. But chained replication can also result in increased replication
lag, depending on the topology of the network.
New in version 2.2.2.
You can use the settings.chainingAllowed (page 714) setting in Replica Set Configuration (page 709) to
disable chained replication for situations where chained replication is causing lag.
MongoDB enables chained replication by default. This procedure describes how to disable it and how to re-enable it.
Note: If chained replication is disabled, you still can use replSetSyncFrom to specify that a secondary replicates
from another secondary. But that configuration will last only until the secondary recalculates which member to sync
from.
To disable chained replication, set the settings.chainingAllowed (page 714) field in Replica Set Configuration
(page 709) to false.
You can use the following sequence of commands to set settings.chainingAllowed (page 714) to false:
1. Copy the configuration settings into the cfg object:
cfg = rs.config()
2. Take note of whether the current configuration settings contain the settings embedded document. If they do,
skip this step.
Warning: To avoid data loss, skip this step if the configuration settings contain the settings embedded
document.
If the current configuration settings do not contain the settings embedded document, create the embedded
document by issuing the following command:
cfg.settings = { }
3. Issue the following sequence of commands to set settings.chainingAllowed (page 714) to false:
cfg.settings.chainingAllowed = false
rs.reconfig(cfg)
To re-enable chained replication, set settings.chainingAllowed (page 714) to true. You can use the
following sequence of commands:
cfg = rs.config()
cfg.settings.chainingAllowed = true
rs.reconfig(cfg)
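The warning about the settings embedded document can be built into a small helper so that existing settings are
never overwritten. This is a minimal sketch; setChaining is a hypothetical function name, not a mongo shell
method.

```javascript
// Sketch only: setChaining is a hypothetical helper, not a mongo shell method.
// It creates the settings embedded document only when it is missing, so any
// existing settings (for example getLastErrorModes) are left untouched.
function setChaining(cfg, allowed) {
  if (!cfg.settings) {
    cfg.settings = {};
  }
  cfg.settings.chainingAllowed = allowed;
  return cfg;
}

// In a mongo shell connected to the primary, usage might look like:
//   rs.reconfig(setChaining(rs.config(), false))   // disable
//   rs.reconfig(setChaining(rs.config(), true))    // re-enable
```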
On this page
Overview (page 699)
Assumptions (page 699)
Change Hostnames while Maintaining Replica Set Availability (page 700)
Change All Hostnames at the Same Time (page 701)
For most replica sets, the hostnames in the members[n].host (page 711) field never change. However, if
organizational needs change, you might need to migrate some or all hostnames.
Note: Always use resolvable hostnames for the value of the members[n].host (page 711) field in the replica set
configuration to avoid confusion and complexity.
Overview
This document provides two separate procedures for changing the hostnames in the members[n].host (page 711)
field. Use either of the following approaches:
Change hostnames without disrupting availability (page 700). This approach ensures your applications will
always be able to read and write data to the replica set, but the approach can take a long time and may incur
downtime at the application layer.
If you use the first procedure, you must configure your applications to connect to the replica set at both the old
and new locations, which often requires a restart and reconfiguration at the application layer and which may
affect the availability of your applications. Re-configuring applications is beyond the scope of this document.
Stop all members running on the old hostnames at once (page 701). This approach has a shorter maintenance
window, but the replica set will be unavailable during the operation.
See also:
Replica Set Reconfiguration Process, Deploy a Replica Set (page 657), and Add Members to a Replica Set (page 670).
Assumptions
(d) Use rs.reconfig() to update the replica set configuration document (page 709) with the new
hostname.
For example, the following sequence of commands updates the hostname for the secondary at the array
index 1 of the members array (i.e. members[1]) in the replica set configuration document:
cfg = rs.conf()
cfg.members[1].host = "mongodb1.example.net:27017"
rs.reconfig(cfg)
2. Open a mongo shell connected to the primary and step down the primary using the rs.stepDown() method:
rs.stepDown()
3. For each member of the replica set, perform the following sequence of operations:
(a) Open a mongo shell connected to the mongod running on the new, temporary port. For example, for a
member running on a temporary port of 37017, you would issue this command:
mongo --port 37017
(b) Edit the replica set configuration manually. The replica set configuration is the only document in the
system.replset collection in the local database. Edit the replica set configuration with the new
hostnames and correct ports for all the members of the replica set. Consider the following sequence of
commands to change the hostnames in a three-member set:
use local
cfg.members[0].host = "mongodb0.example.net:27017"
cfg.members[1].host = "mongodb1.example.net:27017"
cfg.members[2].host = "mongodb2.example.net:27017"
5. Connect to one of the mongod instances using the mongo shell. For example:
mongo --port 27017
On this page
Overview (page 703)
Considerations (page 703)
Procedure (page 704)
Overview
Secondaries capture data from the primary member to maintain an up-to-date copy of the set's data. However, by
default secondaries may automatically change their sync targets to secondary members based on changes in the ping
time between members and the state of other members' replication. See Replica Set Data Synchronization (page 648)
and Manage Chained Replication (page 698) for more information.
For some deployments, implementing a custom replication sync topology may be more effective than the default sync
target selection logic. MongoDB provides the ability to specify a host to use as a sync target.
To override the default sync target selection logic, you may manually configure a secondary member's sync target to
temporarily pull oplog entries. The following provide access to this functionality:
replSetSyncFrom command, or
rs.syncFrom() helper in the mongo shell
Considerations
Sync Logic
Only modify the default sync logic as needed, and always exercise caution. rs.syncFrom() will
not affect an in-progress initial sync operation. To affect the sync target for the initial sync, run the
rs.syncFrom() operation before initial sync.
If you run rs.syncFrom() during initial sync, MongoDB produces no error messages, but the sync target will not
change until after the initial sync operation.
Target
The member to sync from must be a valid source for data in the set. To sync from a member, the member
must:
Have data. It cannot be an arbiter, in startup or recovering mode, and must be able to answer data queries.
Be accessible.
Be a member of the same set in the replica set configuration.
Build indexes with the members[n].buildIndexes (page 712) setting.
Be a different member of the set, to prevent syncing from itself.
If you attempt to replicate from a member that is more than 10 seconds behind the current member, mongod will log
a warning but will still replicate from the lagging member.
If you run replSetSyncFrom during initial sync, MongoDB produces no error messages, but the sync target will
not change until after the initial sync operation.
Procedure
On this page
Check Replica Set Status (page 704)
Check the Replication Lag (page 704)
Test Connections Between all Members (page 706)
Socket Exceptions when Rebooting More than One Secondary (page 706)
Check the Size of the Oplog (page 707)
Oplog Entry Timestamp Error (page 707)
Duplicate Key Error on local.slaves (page 708)
This section describes common strategies for troubleshooting replica set deployments.
To display the current state of the replica set and current state of each member, run the rs.status() method in a
mongo shell connected to the replica set's primary. For descriptions of the information displayed by rs.status(),
see https://docs.mongodb.org/manual/reference/command/replSetGetStatus.
Note: The rs.status() method is a wrapper that runs the replSetGetStatus database command.
Replication lag is a delay between an operation on the primary and the application of that operation from the oplog to
the secondary. Replication lag can be a significant issue and can seriously affect MongoDB replica set deployments.
Excessive replication lag makes lagged members ineligible to quickly become primary and increases the possibility
that distributed read operations will be inconsistent.
To check the current length of replication lag:
In a mongo shell connected to the primary, call the rs.printSlaveReplicationInfo() method.
Returns the syncedTo value for each member, which shows the time when the last oplog entry was written to
the secondary, as shown in the following example:
source: m1.example.net:27017
syncedTo: Thu Apr 10 2014 10:27:47 GMT-0400 (EDT)
0 secs (0 hrs) behind the primary
source: m2.example.net:27017
syncedTo: Thu Apr 10 2014 10:27:47 GMT-0400 (EDT)
0 secs (0 hrs) behind the primary
A delayed member (page 624) may show as 0 seconds behind the primary when the inactivity period on the
primary is greater than the members[n].slaveDelay (page 713) value.
Note: The rs.status() method is a wrapper around the replSetGetStatus database command.
Monitor the rate of replication by watching the oplog time in the replica graph in the MongoDB Cloud Manager18
and in Ops Manager, an on-premise solution available in MongoDB Enterprise Advanced19. For more
information see the MongoDB Cloud Manager documentation20 and Ops Manager documentation21.
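If you prefer to compute lag yourself, the sketch below derives it from a document shaped like the output of
rs.status(). It assumes each entry in the members array carries the name, stateStr, and optimeDate fields;
secondsBehindPrimary is a hypothetical helper name, not a mongo shell method.

```javascript
// Sketch only: secondsBehindPrimary is a hypothetical helper.
// status is assumed to look like the document returned by rs.status(), with
// members[n].name, members[n].stateStr and members[n].optimeDate (a Date).
// Returns an object mapping each secondary's name to its lag in seconds,
// or null when the status document reports no primary.
function secondsBehindPrimary(status) {
  var primary = null;
  status.members.forEach(function (m) {
    if (m.stateStr === "PRIMARY") {
      primary = m;
    }
  });
  if (primary === null) {
    return null;
  }
  var lag = {};
  status.members.forEach(function (m) {
    if (m.stateStr === "SECONDARY") {
      lag[m.name] = (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
    }
  });
  return lag;
}

// In a mongo shell, usage might look like:
//   printjson(secondsBehindPrimary(rs.status()))
```

As with rs.printSlaveReplicationInfo(), the result reflects the last applied oplog entry, so an idle primary
can make a delayed member appear fully caught up.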
Possible causes of replication lag include:
Network Latency
Check the network routes between the members of your set to ensure that there is no packet loss or network
routing issue.
Use tools including ping to test latency between set members and traceroute to expose the routing of
packets between network endpoints.
Disk Throughput
If the file system and disk device on the secondary are unable to flush data to disk as quickly as the primary, then
the secondary will have difficulty keeping state. Disk-related issues are especially common on multi-tenant
systems, including virtualized instances, and can be transient if the system accesses disk devices over an IP
network (as is the case with Amazon's EBS system).
Use system-level tools to assess disk status, including iostat or vmstat.
Concurrency
In some cases, long-running operations on the primary can block replication on secondaries. For best results,
configure write concern (page 141) to require confirmation of replication to secondaries. This prevents write
operations from returning if replication cannot keep up with the write load.
Use the database profiler to see if there are slow queries or long-running operations that correspond to the
incidences of lag.
Appropriate Write Concern
If you are performing a large data ingestion or bulk load operation that requires a large number of writes to the
primary, particularly with unacknowledged write concern, the secondaries will not be able to read the
oplog fast enough to keep up with changes.
To prevent this, request write acknowledgment (see write concern, page 141) after every 100, 1,000, or another
interval of writes to provide an opportunity for secondaries to catch up with the primary.
For more information see:
Write Concern (page 141)
Replica Set Write Concern (page 90)
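One way to request acknowledgment at regular intervals during a bulk load is to split the documents into
fixed-size batches and acknowledge each batch. The sketch below shows only the batching; toBatches is a
hypothetical helper, and the batch size of 1,000, the users collection, and the "majority" write concern in
the usage comment are illustrative choices.

```javascript
// Sketch only: toBatches is a hypothetical helper, not a mongo shell method.
// It splits an array of documents into consecutive batches of at most
// batchSize documents each.
function toBatches(docs, batchSize) {
  if (batchSize < 1) {
    throw new Error("batchSize must be at least 1");
  }
  var batches = [];
  for (var i = 0; i < docs.length; i += batchSize) {
    batches.push(docs.slice(i, i + batchSize));
  }
  return batches;
}

// In a mongo shell, usage might look like:
//   toBatches(docs, 1000).forEach(function (batch) {
//     db.users.insertMany(batch, { writeConcern: { w: "majority" } });
//   });
```

Each acknowledged batch gives the secondaries a chance to apply the preceding writes before the next batch
begins.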
18 https://cloud.mongodb.com/?jmp=docs
19 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
20 https://docs.cloud.mongodb.com/
21 https://docs.opsmanager.mongodb.com/current/
All members of a replica set must be able to connect to every other member of the set to support replication. Always
verify connections in both directions. Networking topologies and firewall configurations can prevent normal and
required connectivity, which can block replication.
Consider the following example of a bidirectional test of networking:
Example
Given a replica set with three members running on three separate hosts:
m1.example.net
m2.example.net
m3.example.net
1. Test the connection from m1.example.net to the other hosts with the following operation set from
m1.example.net:
mongo --host m2.example.net --port 27017
2. Test the connection from m2.example.net to the other two hosts with the following operation set from
m2.example.net, as in:
mongo --host m1.example.net --port 27017
You have now tested the connection between m2.example.net and m1.example.net in both directions.
3. Test the connection from m3.example.net to the other two hosts with the following operation set from the
m3.example.net host, as in:
mongo --host m1.example.net --port 27017
If any connection, in any direction, fails, check your networking and firewall configuration and reconfigure your
environment to allow these connections.
When you reboot members of a replica set, ensure that the set is able to elect a primary during the maintenance. This
means ensuring that a majority of the set's members[n].votes (page 713) are available.
When a set's active members can no longer form a majority, the set's primary steps down and becomes a secondary.
The former primary closes all open connections to client applications. Clients attempting to write to the former primary
receive socket exceptions and Connection reset errors until the set can elect a primary.
Example
Given a three-member replica set where every member has one vote, the set can elect a primary if at least two members
can connect to each other. If you reboot the two secondaries at once, the primary steps down and becomes a secondary.
Until at least another secondary becomes available, i.e. at least one of the rebooted secondaries also becomes available,
the set has no primary and cannot elect a new primary.
For more information on votes, see Replica Set Elections (page 635). For related information on connection errors,
see Does TCP keepalive time affect MongoDB Deployments? (page 858).
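The majority rule from the example above can be checked before a maintenance window with a one-line
calculation. This is a minimal sketch; canElectPrimary is a hypothetical helper name.

```javascript
// Sketch only: canElectPrimary is a hypothetical helper, not a mongo shell method.
// A set can elect (or keep) a primary only while strictly more than half of
// its configured votes are reachable.
function canElectPrimary(totalVotes, reachableVotes) {
  return reachableVotes > totalVotes / 2;
}

// Three-member set, one vote each:
var oneSecondaryDown = canElectPrimary(3, 2);   // primary plus one secondary reachable
var bothSecondariesDown = canElectPrimary(3, 1); // only the primary reachable
```

For the three-member set, oneSecondaryDown is true and bothSecondariesDown is false, which matches the
example: rebooting both secondaries at once leaves the set without a primary.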
A larger oplog can give a replica set a greater tolerance for lag, and make the set more resilient.
To check the size of the oplog for a given replica set member, connect to the member in a mongo shell and run the
rs.printReplicationInfo() method.
The output displays the size of the oplog and the date ranges of the operations contained in the oplog. In the following
example, the oplog is about 10 MB and is able to fit about 26 hours (94400 seconds) of operations:
configured oplog size: 10.10546875MB
log length start to end: 94400 (26.22hrs)
oplog first event time: Mon Mar 19 2012 13:50:38 GMT-0400 (EDT)
oplog last event time: Wed Oct 03 2012 14:59:10 GMT-0400 (EDT)
now: Wed Oct 03 2012 15:00:21 GMT-0400 (EDT)
The oplog should be long enough to hold all transactions for the longest downtime you expect on a secondary. At a
minimum, an oplog should be able to hold 24 hours of operations; however, many users prefer to have 72
hours or even a week's worth of operations.
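To compare an oplog window against the downtime you expect, you can convert the first and last event times into
hours. The sketch below works on plain Date values; oplogWindowHours and coversDowntime are hypothetical helper
names, and the dates in the example are made up to be 94400 seconds apart.

```javascript
// Sketch only: both functions are hypothetical helpers, not mongo shell methods.
// Returns the span between the first and last oplog events, in hours.
function oplogWindowHours(firstEventTime, lastEventTime) {
  return (lastEventTime.getTime() - firstEventTime.getTime()) / (60 * 60 * 1000);
}

// True when the oplog window is at least as long as the expected downtime.
function coversDowntime(firstEventTime, lastEventTime, expectedDowntimeHours) {
  return oplogWindowHours(firstEventTime, lastEventTime) >= expectedDowntimeHours;
}

// Made-up event times that are 94400 seconds apart.
var first = new Date("2012-10-02T12:00:00Z");
var last = new Date(first.getTime() + 94400 * 1000);
var windowHours = oplogWindowHours(first, last);
```

With these inputs windowHours is about 26.22, so the window covers a 24-hour outage but not a 72-hour one.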
For more information on how oplog size affects operations, see:
Oplog Size (page 647),
Delayed Replica Set Members (page 624), and
Check the Replication Lag (page 704).
Note: You normally want the oplog to be the same size on all members. If you resize the oplog, resize it on all
members.
To change oplog size, see the Change the Size of the Oplog (page 684) tutorial.
Often, an incorrectly typed value in the ts field in the last oplog entry causes this error. The correct data type is
Timestamp.
Check the type of the ts value using the following two queries against the oplog collection:
db = db.getSiblingDB("local")
db.oplog.rs.find().sort({$natural:-1}).limit(1)
db.oplog.rs.find({ts:{$type:17}}).sort({$natural:-1}).limit(1)
The first query returns the last document in the oplog, while the second returns the last document in the oplog where
the ts value is a Timestamp. The $type operator allows you to select BSON type 17, which is the Timestamp data type.
If the queries don't return the same document, then the last document in the oplog has the wrong data type in the ts
field.
Example
If the first query returns this as the last oplog entry:
{ "ts" : {t: 1347982456000, i: 1},
"h" : NumberLong("8191276672478122996"),
"op" : "n",
"ns" : "",
"o" : { "msg" : "Reconfig set", "version" : 4 } }
And the second query returns this as the last entry where ts has the Timestamp type:
{ "ts" : Timestamp(1347982454000, 1),
"h" : NumberLong("6188469075153256465"),
"op" : "n",
"ns" : "",
"o" : { "msg" : "Reconfig set", "version" : 3 } }
Then the value for the ts field in the last oplog entry is of the wrong data type.
To set the proper type for this value and resolve this issue, use an update operation that resembles the following:
db.oplog.rs.update( { ts: { t:1347982456000, i:1 } },
{ $set: { ts: new Timestamp(1347982456000, 1)}})
Modify the timestamp values as needed based on your oplog entry. This operation may take some time to complete
because the update must scan and pull the entire oplog into memory.
On this page
Replication Methods in the mongo Shell (page 709)
Replication Database Commands (page 709)
Replica Set Reference Documentation (page 709)
Name                            Description
rs.add()                        Adds a member to a replica set.
rs.addArb()                     Adds an arbiter to a replica set.
rs.conf()                       Returns the replica set configuration document.
rs.freeze()                     Prevents the current member from seeking election as primary for a period of time.
rs.help()                       Returns basic help text for replica set functions.
rs.initiate()                   Initializes a new replica set.
rs.printReplicationInfo()       Prints a report of the status of the replica set from the perspective of the primary.
rs.printSlaveReplicationInfo()  Prints a report of the status of the replica set from the perspective of the secondaries.
rs.reconfig()                   Re-configures a replica set by applying a new replica set configuration object.
rs.remove()                     Removes a member from a replica set.
rs.slaveOk()                    Sets the slaveOk property for the current connection. Deprecated. Use readPref() and Mongo.setReadPref() to set read preference.
rs.status()                     Returns a document with information about the state of the replica set.
rs.stepDown()                   Causes the current primary to become a secondary, which forces an election.
rs.syncFrom()                   Sets the member that this replica set member will sync from, overriding the default sync target selection logic.
Name                Description
replSetFreeze       Prevents the current member from seeking election as primary for a period of time.
replSetGetStatus    Returns a document that reports on the status of the replica set.
replSetInitiate     Initializes a new replica set.
replSetMaintenance  Enables or disables a maintenance mode, which puts a secondary node in a RECOVERING state.
replSetReconfig     Applies a new configuration to an existing replica set.
replSetStepDown     Forces the current primary to step down and become a secondary, forcing an election.
replSetSyncFrom     Explicitly overrides the default logic for selecting a member to replicate from.
resync              Forces a mongod to re-synchronize from the master. For master-slave replication only.
applyOps            Internal command that applies oplog entries to the current data set.
isMaster            Displays information about this member's role in the replica set, including whether it is the master.
replSetGetConfig    Returns the replica set's configuration object.
Replica Set Configuration (page 709) Complete documentation of the replica set configuration object returned by
rs.conf().
The local Database (page 715) Complete documentation of the content of the local database that mongod
instances use to support replication.
Replica Set Member States (page 717) Reference for the replica set member states.
Read Preference Reference (page 719) Complete documentation of the five read preference modes that the
MongoDB drivers support.
On this page
Example Output (page 710)
Replica Set Configuration Fields (page 710)
You can access the configuration of a replica set using the rs.conf() method or the replSetGetConfig
command.
To modify the configuration for a replica set, use the rs.reconfig() method, passing a configuration document to
the method. See rs.reconfig() for more information.
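The usual pattern is to fetch the configuration, change a field, and pass the result to rs.reconfig(). The following is a minimal sketch of that pattern on a plain JavaScript object so that it can run outside the shell. The helper name withMemberPriority, the set name, and the host names are hypothetical.

```javascript
// Hypothetical helper: return a copy of a replica set configuration
// document with the priority of the member at arrayIndex changed.
function withMemberPriority(cfg, arrayIndex, priority) {
  // Deep copy; adequate for plain data such as this sample document.
  var copy = JSON.parse(JSON.stringify(cfg));
  if (arrayIndex < 0 || arrayIndex >= copy.members.length) {
    throw new Error("no member at array index " + arrayIndex);
  }
  copy.members[arrayIndex].priority = priority;
  return copy;
}

// Stand-in for the document that rs.conf() returns (made-up values).
var sampleConfig = {
  _id: "rs0",
  version: 1,
  members: [
    { _id: 0, host: "m1.example.net:27017" },
    { _id: 1, host: "m2.example.net:27017" }
  ]
};

var updatedConfig = withMemberPriority(sampleConfig, 1, 0);

// In a mongo shell connected to the primary, the equivalent steps are:
//   cfg = rs.conf()
//   cfg.members[1].priority = 0
//   rs.reconfig(cfg)
```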
Example Output
The following document provides a representation of a replica set configuration document. The configuration of your
replica set may include only a subset of these settings:
{
_id: <string>,
version: <int>,
protocolVersion: <number>,
members: [
{
_id: <int>,
host: <string>,
arbiterOnly: <boolean>,
buildIndexes: <boolean>,
hidden: <boolean>,
priority: <number>,
tags: <document>,
slaveDelay: <int>,
votes: <number>
},
...
],
settings: {
chainingAllowed : <boolean>,
heartbeatIntervalMillis : <int>,
heartbeatTimeoutSecs: <int>,
electionTimeoutMillis : <int>,
getLastErrorModes : <document>,
getLastErrorDefaults : <document>
}
}
_id
Type: string
The name of the replica set. Once set, you cannot change the name of a replica set.
_id (page 710) must be identical to the replication.replSetName or the value of replSet specified to
mongod on the command line.
See
version
Type: int
An incrementing number used to distinguish revisions of the replica set configuration object from previous
iterations of the configuration.
configsvr
New in version 3.2.
Type: boolean
Default: false
Indicates whether the replica set is used for a sharded cluster's config servers. Set to true if the replica set is
for a sharded cluster's config servers.
See also:
Sharded Cluster Enhancements (page 882)
protocolVersion
New in version 3.2.
Type: number
Default: 1 for new replica sets
Version of the replica set election protocol (page 881).
Set to 1 to enable the replication election enhancements (page 881) introduced in MongoDB 3.2.
By default, new replica sets in MongoDB 3.2 use protocolVersion: 1. Previous versions of
MongoDB use version 0 of the protocol and cannot run as members of a replica set configuration that specifies
protocolVersion 1.
members
members
Type: array
An array of member configuration documents, one for each member of the replica set. The members (page 711)
array is a zero-indexed array.
Each member-specific configuration document can contain the following fields:
members[n]._id
Type: integer
An integer identifier for each member in the replica set. Values must be between 0 and 255 inclusive. Each
replica set member must have a unique _id. Once set, you cannot change the _id
(page 711) of a member.
Note: When updating the replica configuration object, access the replica set members in the members
(page 711) array with the array index. The array index begins with 0. Do not confuse this index value
with the value of the members[n]._id (page 711) field in each document in the members (page 711)
array.
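To illustrate the difference between the array index and members[n]._id, the following sketch looks up the array position of a member by its _id. The helper name memberIndexById and the sample configuration are hypothetical.

```javascript
// Hypothetical helper: return the array index of the member whose
// members[n]._id equals memberId, or -1 if no member has that _id.
function memberIndexById(cfg, memberId) {
  for (var i = 0; i < cfg.members.length; i++) {
    if (cfg.members[i]._id === memberId) {
      return i;
    }
  }
  return -1;
}

// Made-up configuration in which the _id values do not match the array
// positions, as can happen after members are removed and added.
var cfgExample = {
  _id: "rs0",
  members: [
    { _id: 0, host: "m1.example.net:27017" },
    { _id: 2, host: "m3.example.net:27017" },
    { _id: 5, host: "m4.example.net:27017" }
  ]
};

// The member with _id 5 sits at array index 2, so updates would go
// through cfgExample.members[2], not cfgExample.members[5].
var indexOfFive = memberIndexById(cfgExample, 5);
```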
members[n].host
Type: string
The hostname and, if specified, the port number, of the set member.
The hostname must be resolvable for every host in the replica set.
Warning: members[n].host (page 711) cannot hold a value that resolves to localhost or the
local interface unless all members of the set are on hosts that resolve to localhost.
members[n].arbiterOnly
Optional.
Type: boolean
Default: false
A boolean that identifies an arbiter. A value of true indicates that the member is an arbiter.
When using the rs.addArb() method to add an arbiter, the method automatically sets
members[n].arbiterOnly (page 712) to true for the added member.
members[n].buildIndexes
Optional.
Type: boolean
Default: true
A boolean that indicates whether the mongod builds indexes on this member. You can only set this
value when adding a member to a replica set. You cannot change the members[n].buildIndexes
(page 712) field after the member has been added to the set. To add a member, see rs.add() and
rs.reconfig().
Do not set to false for mongod instances that receive queries from clients.
Setting buildIndexes to false may be useful if all the following conditions are true:
you are only using this instance to perform backups using mongodump, and
this member will receive no queries, and
index creation and maintenance overburdens the host system.
Even if set to false, secondaries will build indexes on the _id field in order to facilitate operations
required for replication.
Warning: If you set members[n].buildIndexes (page 712) to false, you must also set
members[n].priority (page 713) to 0. If members[n].priority (page 713) is not 0, Mon-
goDB will return an error when attempting to add a member with members[n].buildIndexes
(page 712) equal to false.
To ensure the member receives no queries, you should make all instances that do not build indexes
hidden.
Other secondaries cannot replicate from a member where members[n].buildIndexes
(page 712) is false.
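The constraints above can be sketched as a small check on a member configuration document before you pass it to rs.add(). The function name checkBuildIndexesRules is hypothetical, and the sketch assumes that an omitted priority takes the default value of 1.

```javascript
// Hypothetical checker for one member configuration document.
// Returns a list of problems based on the rules described above.
function checkBuildIndexesRules(member) {
  var problems = [];
  if (member.buildIndexes === false) {
    // Assumption: an omitted priority means the default of 1.
    var priority = (member.priority === undefined) ? 1 : member.priority;
    if (priority !== 0) {
      problems.push("priority must be 0 when buildIndexes is false");
    }
    if (member.hidden !== true) {
      problems.push("member should be hidden so that it receives no queries");
    }
  }
  return problems;
}

// Made-up member documents.
var backupMember = {
  _id: 3, host: "backup.example.net:27017",
  buildIndexes: false, priority: 0, hidden: true
};
var incompleteMember = {
  _id: 4, host: "m5.example.net:27017",
  buildIndexes: false
};

var backupProblems = checkBuildIndexesRules(backupMember);         // no problems
var incompleteProblems = checkBuildIndexesRules(incompleteMember); // two problems
```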
members[n].hidden
Optional.
Type: boolean
Default: false
When this value is true, the replica set hides this instance and does not include the member in the output
of db.isMaster() or isMaster. This prevents read operations (i.e. queries) from ever reaching this
host by way of secondary read preference.
See also:
For more information on configuring tag sets for read preference and write concern, see Configure Replica
Set Tag Sets (page 691).
members[n].slaveDelay
Optional.
Type: integer
Default: 0
The number of seconds behind the primary that this replica set member should lag.
Use this option to create delayed members (page 624). Delayed members maintain a copy of the data that
reflects the state of the data at some time in the past.
See also:
Delayed Replica Set Members (page 624)
members[n].votes
Optional.
Type: integer
Default: 1
The number of votes a server will cast in a replica set election (page 635). The number of votes each
member has is either 1 or 0, and arbiters always have exactly 1 vote.
A replica set can have up to 50 members but only 7 voting members. If you need more than 7 members
in one replica set, set members[n].votes (page 713) to 0 for the additional non-voting members.
Changed in version 3.0.0: Members cannot have members[n].votes (page 713) greater than 1. For
details, see Replica Set Configuration Validation (page 939).
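The voting limits above can be sketched as a check on a configuration document. The function name checkVotingLimits and the sample host names are hypothetical, and the check covers only the rules stated in this section.

```javascript
// Hypothetical checker for the member and voting limits described above.
function checkVotingLimits(cfg) {
  var problems = [];
  var voting = 0;
  if (cfg.members.length > 50) {
    problems.push("a replica set can have at most 50 members");
  }
  cfg.members.forEach(function (m) {
    var votes = (m.votes === undefined) ? 1 : m.votes; // default is 1
    if (votes !== 0 && votes !== 1) {
      problems.push("member " + m._id + ": votes must be 0 or 1");
    }
    if (votes > 0) {
      voting++;
    }
  });
  if (voting > 7) {
    problems.push("only 7 members can vote; found " + voting);
  }
  return problems;
}

// Nine members with default votes exceed the limit of 7 voting members.
var nineMembers = [];
for (var i = 0; i < 9; i++) {
  nineMembers.push({ _id: i, host: "m" + i + ".example.net:27017" });
}
var tooManyVoters = checkVotingLimits({ _id: "rs0", members: nineMembers });

// Making two members non-voting brings the set within the limit.
nineMembers[7].votes = 0;
nineMembers[8].votes = 0;
var withinLimits = checkVotingLimits({ _id: "rs0", members: nineMembers });
```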
settings
settings
Optional.
Type: document
A document that contains configuration options that apply to the whole replica set.
The settings (page 714) document contains the following fields:
settings.chainingAllowed
New in version 2.2.4.
Optional.
Type: boolean
Default: true
When settings.chainingAllowed (page 714) is true, the replica set allows secondary
members to replicate from other secondary members. When settings.chainingAllowed (page 714) is
false, secondaries can replicate only from the primary.
See also:
Manage Chained Replication (page 698)
settings.getLastErrorDefaults
Optional.
Type: document
A document that specifies the write concern (page 639) for the replica set. The replica set will use this
write concern only when write operations (page 992) or getLastError specify no other write concern.
If settings.getLastErrorDefaults (page 714) is not set, the default write concern for the replica
set only requires confirmation from the primary.
settings.getLastErrorModes
Optional.
Type: document
A document used to define an extended write concern through the use of members[n].tags (page 713).
The extended write concern can provide data-center awareness.
For example, the following document defines an extended write concern named eastCoast and
associates it with a write to a member that has the east tag.
{ getLastErrorModes: { eastCoast: { "east": 1 } } }
Write operations to the replica set can use the extended write concern, e.g. { w: "eastCoast" }.
See Configure Replica Set Tag Sets (page 691) for more information and an example.
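The following sketch models how a mode such as eastCoast could be evaluated against the members that have acknowledged a write. It assumes that each { tag: n } pair requires acknowledgement from members carrying at least n distinct values of that tag. The function name modeSatisfied, the host names, and the tag values are hypothetical, and the code does not reflect the server's implementation.

```javascript
// Hypothetical evaluator: do the acknowledging members satisfy a
// getLastErrorModes entry? Assumes each { tag: n } pair requires at least
// n distinct values of that tag among the acknowledging members.
function modeSatisfied(mode, ackedMembers) {
  return Object.keys(mode).every(function (tag) {
    var distinctValues = {};
    ackedMembers.forEach(function (m) {
      if (m.tags && m.tags[tag] !== undefined) {
        distinctValues[m.tags[tag]] = true;
      }
    });
    return Object.keys(distinctValues).length >= mode[tag];
  });
}

var eastCoastMode = { east: 1 };
var taggedMembers = [
  { host: "nyc.example.net:27017", tags: { east: "nyc" } },
  { host: "sf.example.net:27017",  tags: { west: "sf" } }
];

// A write acknowledged by the east-tagged member satisfies the mode;
// a write acknowledged only by the west-tagged member does not.
var ackedByEast = modeSatisfied(eastCoastMode, [taggedMembers[0]]);
var ackedByWestOnly = modeSatisfied(eastCoastMode, [taggedMembers[1]]);
```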
settings.heartbeatTimeoutSecs
Optional.
Type: int
Default: 10
Number of seconds that the replica set members wait for a successful heartbeat from each other. If a
member does not respond in time, other members mark the delinquent member as inaccessible.
settings.electionTimeoutMillis
New in version 3.2.
Optional.
Type: int
Default: 10000 (10 seconds)
The time limit in milliseconds for detecting when a replica set's primary is unreachable:
Higher values result in slower failovers but decreased sensitivity to primary node or network slowness
or spottiness.
Lower values result in faster failover, but increased sensitivity to primary node or network slowness
or spottiness.
The setting only applies when using protocolVersion: 1.
settings.heartbeatIntervalMillis
New in version 3.2.
Internal use only.
The frequency in milliseconds of the heartbeats.
On this page
Overview (page 715)
Collection on all mongod Instances (page 716)
Collections on Replica Set Members (page 717)
Collections used in Master/Slave Replication (page 717)
Overview
Every mongod instance has its own local database, which stores data used in the replication process, and other
instance-specific data. The local database is invisible to replication: collections in the local database are not
replicated.
In replication, the local database stores internal replication data for each member of a replica set. The local
database stores the following collections:
Changed in version 2.4: When running with authentication (i.e. authorization), authenticating to the local
database is not equivalent to authenticating to the admin database. In previous versions, authenticating to the local
database provided access to all databases.
local.startup_log
On startup, each mongod instance inserts a document into startup_log (page 716) with diagnostic
information about the mongod instance itself and host information. startup_log (page 716) is a capped collection.
This information is primarily useful for diagnostic purposes.
Example
Consider the following prototype of a document from the startup_log (page 716) collection:
{
"_id" : "<string>",
"hostname" : "<string>",
"startTime" : ISODate("<date>"),
"startTimeLocal" : "<string>",
"cmdLine" : {
"dbpath" : "<path>",
"<option>" : <value>
},
"pid" : <number>,
"buildinfo" : {
"version" : "<string>",
"gitVersion" : "<string>",
"sysInfo" : "<string>",
"loaderFlags" : "<string>",
"compilerFlags" : "<string>",
"allocator" : "<string>",
"versionArray" : [ <num>, <num>, <...> ],
"javascriptEngine" : "<string>",
"bits" : <number>,
"debug" : <boolean>,
"maxBsonObjectSize" : <number>
}
}
Documents in the startup_log (page 716) collection contain the following fields:
local.startup_log._id
Includes the system hostname and a millisecond epoch value.
local.startup_log.hostname
The system's hostname.
local.startup_log.startTime
A UTC ISODate value that reflects when the server started.
local.startup_log.startTimeLocal
A string that reports the startTime (page 716) in the system's local time zone.
local.startup_log.cmdLine
An embedded document that reports the mongod runtime options and their values.
local.startup_log.pid
The process identifier for this process.
local.startup_log.buildinfo
An embedded document that reports information about the build environment and settings used to compile
this mongod. This is the same output as buildInfo. See buildInfo.
local.system.replset
local.system.replset (page 717) holds the replica set's configuration object as its single document. To
view the object's configuration information, issue rs.conf() from the mongo shell. You can also query this
collection directly.
local.oplog.rs
local.oplog.rs (page 717) is the capped collection that holds the oplog. You set its size at creation using
the oplogSizeMB setting. To resize the oplog after replica set initiation, use the Change the Size of the Oplog
(page 684) procedure. For additional information, see the Oplog Size (page 647) section.
local.replset.minvalid
This contains an object used internally by replica sets to track replication status.
local.slaves
Removed in version 3.0: Replica set members no longer mirror replication status of the set to the
local.slaves (page 717) collection. Use rs.status() instead.
On this page
States (page 718)
Each member of a replica set has a state that reflects its disposition within the set.
States
Core States
PRIMARY
Members in PRIMARY (page 718) state accept write operations. A replica set has at most one primary at a time.
A SECONDARY (page 718) member becomes primary after an election (page 635). Members in the PRIMARY
(page 718) state are eligible to vote.
SECONDARY
Members in SECONDARY (page 718) state replicate the primary's data set and can be configured to accept read
operations. Secondaries are eligible to vote in elections, and may be elected to the PRIMARY (page 718) state if
the primary becomes unavailable.
ARBITER
Members in ARBITER (page 718) state do not replicate data or accept write operations. They are eligible to
vote, and exist solely to break a tie during elections. Replica sets should only have a member in the ARBITER
(page 718) state if the set would otherwise have an even number of members, and could suffer from tied
elections. There should be at most one arbiter configured in any replica set.
See Replica Set Members (page 618) for more information on core states.
Other States
STARTUP
Each member of a replica set starts up in STARTUP (page 718) state. mongod then loads that member's
replica set configuration, and transitions the member's state to STARTUP2 (page 718). Members in STARTUP
(page 718) are not eligible to vote, as they are not yet a recognized member of any replica set.
STARTUP2
Each member of a replica set enters the STARTUP2 (page 718) state as soon as mongod finishes loading
that member's configuration, at which time it becomes an active member of the replica set. The member then
decides whether or not to undertake an initial sync. If a member begins an initial sync, the member remains in
STARTUP2 (page 718) until all data is copied and all indexes are built. Afterwards, the member transitions to
RECOVERING (page 719).
RECOVERING
A member of a replica set enters RECOVERING (page 719) state when it is not ready to accept reads. The
RECOVERING (page 719) state can occur during normal operation, and doesn't necessarily reflect an error
condition. Members in the RECOVERING (page 719) state are eligible to vote in elections, but are not eligible
to enter the PRIMARY (page 718) state.
A member transitions from RECOVERING (page 719) to SECONDARY (page 718) after replicating enough
data to guarantee a consistent view of the data for client reads. The only difference between RECOVERING
(page 719) and SECONDARY (page 718) states is that RECOVERING (page 719) prohibits client reads and
SECONDARY (page 718) permits them. SECONDARY (page 718) state does not guarantee anything about the
staleness of the data with respect to the primary.
Due to overload, a secondary may fall far enough behind the other members of the replica set such that it may
need to resync (page 690) with the rest of the set. When this happens, the member enters the RECOVERING
(page 719) state and requires manual intervention.
On this page
Read Preference Modes (page 721)
Use Cases (page 722)
Read Preferences for Database Commands (page 723)
Read preference describes how MongoDB clients route read operations to the members of a replica set.
By default, an application directs its read operations to the primary member in a replica set.
will be able to complete writes with { w: "majority" } (page 142) write concern. The node that can complete { w: "majority" }
(page 142) writes is the current primary, and the other node is a former primary that has not yet recognized its demotion, typically due to a network
partition. When this occurs, clients that connect to the former primary may observe stale data despite having requested read preference primary
(page 721), and new writes to the former primary will eventually roll back.
primary
All read operations use only the current replica set primary. 5 This is the default read mode. If the primary is
unavailable, read operations produce an error or throw an exception.
The primary (page 721) read preference mode is not compatible with read preference modes that use tag sets
(page 644). If you specify a tag set with primary (page 721), the driver will produce an error.
primaryPreferred
In most situations, operations read from the primary member of the set. However, if the primary is unavailable,
as is the case during failover situations, operations read from secondary members.
When the read preference includes a tag set (page 644), the client reads first from the primary, if available, and
then from secondaries that match the specified tags. If no secondaries have matching tags, the read operation
produces an error.
Since the application may receive data from a secondary, read operations using the primaryPreferred
(page 721) mode may return stale data in some situations.
Warning: Changed in version 2.2: mongos added full support for read preferences.
When connecting to a mongos instance older than 2.2, using a client that supports read preference modes,
primaryPreferred (page 721) will send queries to secondaries.
secondary
Operations read only from the secondary members of the set. If no secondaries are available, then this read
operation produces an error or exception.
Most sets have at least one secondary, but there are situations where there may be no available secondary. For
example, a set with a primary, a secondary, and an arbiter may not have any secondaries if a member is in
recovering state or unavailable.
When the read preference includes a tag set (page 644), the client attempts to find secondary members that
match the specified tag set and directs reads to a random secondary from among the nearest group (page 645).
If no secondaries have matching tags, the read operation produces an error. 23
Read operations using the secondary (page 721) mode may return stale data.
secondaryPreferred
In most situations, operations read from secondary members, but in situations where the set consists of a single
primary (and no other members), the read operation will use the set's primary.
When the read preference includes a tag set (page 644), the client attempts to find a secondary member that
matches the specified tag set and directs reads to a random secondary from among the nearest group (page 645).
If no secondaries have matching tags, the client ignores tags and reads from the primary.
Read operations using the secondaryPreferred (page 721) mode may return stale data.
nearest
The driver reads from the nearest member of the set according to the member selection (page 645) process.
23 If your set has more than one secondary, and you use the secondary (page 721) read preference mode, consider the following effect. If
you have a three member replica set (page 629) with a primary and two secondaries, and one secondary becomes unavailable, all secondary
(page 721) queries must target the remaining secondary. This will double the load on this secondary. Plan and provide capacity to support this as
needed.
Reads in the nearest (page 721) mode do not consider the member's type. Reads in nearest (page 721)
mode may read from both primaries and secondaries.
Set this mode to minimize the effect of network latency on read operations without preference for current or
stale data.
If you specify a tag set (page 644), the client attempts to find a replica set member that matches the specified
tag set and directs reads to an arbitrary member from among the nearest group (page 645).
Read operations using the nearest (page 721) mode may return stale data.
Note: All operations read from a member of the nearest group of the replica set that matches the specified
read preference mode. The nearest (page 721) mode prefers low latency reads over a member's primary or
secondary status.
For nearest (page 721), the client assembles a list of acceptable hosts based on tag set and then narrows that
list to the host with the shortest ping time and all other members of the set that are within the local threshold,
or acceptable latency. See Member Selection (page 645) for more information.
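The selection described for nearest (page 721) can be sketched as a filter by tag set followed by a latency window around the fastest matching member. The member field names (host, pingMs, tags), the function names, and the sample data are hypothetical and do not correspond to any driver API. The 15 millisecond threshold is the default mentioned under Minimize Latency, and the sketch assumes the list contains only primaries and secondaries.

```javascript
// Hypothetical model of a client's view of the set; not a driver API.

// Return true if the member carries every key/value pair in tagSet.
function matchesTagSet(member, tagSet) {
  return Object.keys(tagSet).every(function (key) {
    return Boolean(member.tags) && member.tags[key] === tagSet[key];
  });
}

// Select the nearest group: members that match the tag set and whose
// ping time is within thresholdMs of the fastest matching member.
function nearestGroup(members, tagSet, thresholdMs) {
  var eligible = members.filter(function (m) {
    return matchesTagSet(m, tagSet || {});
  });
  if (eligible.length === 0) {
    return [];
  }
  var fastest = Math.min.apply(null, eligible.map(function (m) {
    return m.pingMs;
  }));
  return eligible.filter(function (m) {
    return m.pingMs - fastest <= thresholdMs;
  });
}

// Made-up members with ping times in milliseconds.
var readMembers = [
  { host: "p.example.net",  pingMs: 30, tags: { dc: "east" } },
  { host: "s1.example.net", pingMs: 5,  tags: { dc: "east" } },
  { host: "s2.example.net", pingMs: 12, tags: { dc: "west" } },
  { host: "s3.example.net", pingMs: 40, tags: { dc: "east" } }
];

var anyDc = nearestGroup(readMembers, {}, 15);               // s1 and s2
var eastOnly = nearestGroup(readMembers, { dc: "east" }, 15); // s1 only
```

A client would then pick one member from the returned group for each read, which is why reads in this mode can land on either a primary or a secondary.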
Use Cases
Depending on the requirements of an application, you can configure different applications to use different read
preferences, or use different read preferences for different queries in the same application. Consider the following
applications for different read preference strategies.
Maximize Consistency To avoid stale reads, use primary (page 721) read preference and "majority"
(page 144) readConcern. If the primary is unavailable, e.g. during elections or when a majority of the replica
set is not accessible, read operations using primary (page 721) read preference produce an error or throw an
exception. In some circumstances, it may be possible for a replica set to temporarily have two primaries; however, only one
primary will be capable of confirming writes with the "majority" (page 142) write concern.
A partial network partition may segregate a primary (P_old) into a partition with a minority of the nodes, while the
other side of the partition contains a majority of nodes. The partition with the majority will elect a new primary
(P_new), but for a brief period, the old primary (P_old) may still continue to serve reads and writes, as it has not yet
detected that it can only see a minority of nodes in the replica set. During this period, if the old primary (P_old) is
still visible to clients as a primary, reads from this primary may reflect stale data.
A primary (P_old) may become unresponsive, which will trigger an election, and a new primary (P_new) can be
elected, serving reads and writes. If the unresponsive primary (P_old) starts responding again, two primaries will
be visible for a brief period. The brief period will end when P_old steps down. However, during the brief period,
clients might read from the old primary P_old, which can provide stale data.
To increase consistency, you can disable automatic failover; however, disabling automatic failover sacrifices
availability.
Maximize Availability To permit read operations when possible, use primaryPreferred (page 721). When
there's a primary, you will get consistent reads 5, but if there is no primary you can still query secondaries. However,
when using this read mode, consider the situation described in Reduce load on the primary (page 723).
Minimize Latency To always read from a low-latency node, use nearest (page 721). The driver or mongos will
read from the nearest member and those no more than 15 milliseconds 24 further away than the nearest member.
24 This threshold is configurable. See localPingThresholdMs for mongos or your driver documentation for the appropriate setting.
nearest (page 721) does not guarantee consistency. If the nearest member to your application server is a secondary
with some replication lag, queries could return stale data. nearest (page 721) only reflects network distance and
does not reflect I/O or CPU load.
Query From Geographically Distributed Members If the members of a replica set are geographically distributed,
you can create replica set tags that reflect the location of the instance and then configure your application to query
the members nearby.
For example, if members in east and west data centers are tagged (page 691) {dc: east} and {dc:
west}, your application servers in the east data center can read from nearby members with the following read
preference:
db.collection.find().readPref( { mode: 'nearest',
tags: [ {'dc': 'east'} ] } )
Although nearest (page 721) already favors members with low network latency, including the tag makes the choice
more predictable.
Reduce load on the primary To shift read load from the primary, use mode secondary (page 721). Although
secondaryPreferred (page 721) is tempting for this use case, it carries some risk: if all secondaries are
unavailable and your set has enough arbiters to prevent the primary from stepping down, then the primary will receive all
traffic from clients. If the primary is unable to handle this load, queries will compete with writes. For this reason, use
secondary (page 721) to distribute read load to replica sets, not secondaryPreferred (page 721).
Because some database commands read and return data from the database, all of the official drivers support full read
preference mode semantics (page 721) for the following commands:
group
mapReduce 25
aggregate 26
collStats
dbStats
count
distinct
geoNear
geoSearch
parallelCollectionScan
New in version 2.4: mongos adds support for routing commands to shards using read preferences. Previously
mongos sent all commands to the shards' primaries.
25 Only inline mapReduce operations that do not write data support read preference; otherwise, these operations must run on the primary
members.
26 Using the $out pipeline operator forces the aggregation pipeline to run on the primary.
Sharding
Sharding is the process of storing data records across multiple machines and is MongoDB's approach to meeting the
demands of data growth. As the size of the data increases, a single machine may not be sufficient to store the data or
provide an acceptable read and write throughput. Sharding solves the problem with horizontal scaling. With sharding,
you add more machines to support data growth and the demands of read and write operations.
On this page
Purpose of Sharding (page 725)
Sharding in MongoDB (page 727)
Data Partitioning (page 728)
Maintaining a Balanced Data Distribution (page 729)
Additional Resources (page 731)
Sharding is a method for storing data across multiple machines. MongoDB uses sharding to support deployments with
very large data sets and high throughput operations.
Database systems with large data sets and high throughput applications can challenge the capacity of a single server.
High query rates can exhaust the CPU capacity of the server. Larger data sets exceed the storage capacity of a single
machine. Finally, working set sizes larger than the system's RAM stress the I/O capacity of disk drives.
To address these issues of scale, database systems have two basic approaches: vertical scaling and sharding.
Vertical scaling adds more CPU and storage resources to increase capacity. Scaling by adding capacity has
limitations: high performance systems with large numbers of CPUs and large amounts of RAM are disproportionately
more expensive than smaller systems. Additionally, cloud-based providers may only allow users to provision smaller
instances. As a result there is a practical maximum capability for vertical scaling.
Sharding, or horizontal scaling, by contrast, divides the data set and distributes the data over multiple servers, or
shards. Each shard is an independent database, and collectively, the shards make up a single logical database.
Sharding addresses the challenge of scaling to support high throughput and large data sets:
Sharding reduces the number of operations each shard handles. Each shard processes fewer operations as the
cluster grows. As a result, a cluster can increase capacity and throughput horizontally.
For example, to insert data, the application only needs to access the shard responsible for that record.
Sharding reduces the amount of data that each server needs to store. Each shard stores less data as the cluster
grows.
For example, if a database has a 1 terabyte data set, and there are 4 shards, then each shard might hold only 256
GB of data. If there are 40 shards, then each shard might hold only 25 GB of data.
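The arithmetic behind these figures can be sketched as follows, assuming 1 terabyte is 1024 GB and a perfectly even distribution across shards, which a real cluster only approximates. The function name gbPerShard is hypothetical.

```javascript
// Hypothetical helper: data held by each shard under the assumption of
// a perfectly even distribution.
function gbPerShard(totalGB, shardCount) {
  return totalGB / shardCount;
}

var withFourShards = gbPerShard(1024, 4);    // 256 GB per shard
var withFortyShards = gbPerShard(1024, 40);  // 25.6 GB, about 25 GB per shard
```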
A sharded cluster has the following components: shards, query routers, and config servers.
Shards store the data. To provide high availability and data consistency, in a production sharded cluster, each shard is
a replica set 1 . For more information on replica sets, see Replica Sets (page 617).
Query Routers, or mongos instances, interface with client applications and direct operations to the appropriate shard
or shards. A client sends requests to a mongos, which then routes the operations to the shards and returns the results
to the clients. A sharded cluster can contain more than one mongos to divide the client request load, and most sharded
clusters have more than one mongos for this reason.
Config servers store the cluster's metadata. This data contains a mapping of the cluster's data set to the shards. The
query router uses this metadata to target operations to specific shards.
[1] For development and testing purposes only, each shard can be a single mongod instead of a replica set.
Changed in version 3.2: Starting in MongoDB 3.2, config servers for sharded clusters can be deployed as a replica
set (page 613). The replica set config servers must run the WiredTiger storage engine (page 587). MongoDB 3.2
deprecates the use of three mirrored mongod instances for config servers.
MongoDB distributes data, or shards, at the collection level. Sharding partitions a collection's data by the shard key.
Shard Keys
To shard a collection, you need to select a shard key. A shard key is either an indexed field or an indexed compound
field that exists in every document in the collection. MongoDB divides the shard key values into chunks and distributes
the chunks evenly across the shards. To divide the shard key values into chunks, MongoDB uses either range based
partitioning or hash based partitioning. See the Shard Key (page 739) documentation for more information.
For range-based sharding, MongoDB divides the data set into ranges determined by the shard key values to provide
range based partitioning. Consider a numeric shard key: If you visualize a number line that goes from negative
infinity to positive infinity, each value of the shard key falls at some point on that line. MongoDB partitions this line
into smaller, non-overlapping ranges called chunks, where a chunk is a range of values from some minimum value to
some maximum value.
Given a range based partitioning system, documents with close shard key values are likely to be in the same chunk,
and therefore on the same shard.
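As a minimal sketch of range based partitioning, the following plain JavaScript models chunks as non-overlapping ranges on a number line and looks up the chunk for a shard key value. The boundaries, shard names, and the findChunk helper are made up for illustration, and the sketch assumes each range includes its minimum and excludes its maximum.

```javascript
// Chunks cover the whole key line with non-overlapping [min, max) ranges.
// The boundaries and shard names are made up for this sketch.
const rangeChunks = [
  { min: -Infinity, max: 0,        shard: "shardA" },
  { min: 0,         max: 100,      shard: "shardB" },
  { min: 100,       max: Infinity, shard: "shardA" },
];

// Return the chunk whose range contains the given shard key value.
function findChunk(chunks, key) {
  return chunks.find((c) => key >= c.min && key < c.max);
}

console.log(findChunk(rangeChunks, 41).shard);  // shardB
console.log(findChunk(rangeChunks, 42).shard);  // shardB (close keys, same chunk)
console.log(findChunk(rangeChunks, 250).shard); // shardA
```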
For hash based partitioning, MongoDB computes a hash of a field's value, and then uses these hashes to create chunks.
With hash based partitioning, two documents with close shard key values are unlikely to be part of the same chunk.
This ensures a more random distribution of a collection in the cluster.
Range based partitioning supports more efficient range queries. Given a range query on the shard key, the query router
can easily determine which chunks overlap that range and route the query to only those shards that contain these
chunks.
However, range based partitioning can result in an uneven distribution of data, which may negate some of the benefits
of sharding. For example, if the shard key is a linearly increasing field, such as time, then all requests for a given time
range will map to the same chunk, and thus the same shard. In this situation, a small set of shards may receive the
majority of requests and the system would not scale very well.
Hash based partitioning, by contrast, ensures an even distribution of data at the expense of efficient range queries.
Hashed key values result in a random distribution of data across chunks and therefore shards. But random distribution
makes it unlikely that a range query on the shard key can target only a few shards. Such a query will more likely
have to query every shard in order to return a result.
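The trade-off can be illustrated with a small simulation. The sketch below counts how many chunks a range query touches when keys are placed by value and when they are placed by a hash of the value. The function toyHash is a toy stand-in chosen for readability and is not the hash function MongoDB uses. The four equal-width chunks are likewise an assumption.

```javascript
// Toy hash for illustration only; this is NOT the hash function MongoDB uses.
function toyHash(n) {
  return (n * 7919) % 1000; // result in 0..999 for non-negative integers
}

// Four equal-width chunks over the key space 0..999 (made-up boundaries).
function chunkIndex(value) {
  return Math.floor(value / 250);
}

// Collect the distinct chunks touched by a range query lo..hi (inclusive).
function chunksTouched(lo, hi, transform) {
  const touched = new Set();
  for (let key = lo; key <= hi; key++) touched.add(chunkIndex(transform(key)));
  return [...touched].sort((a, b) => a - b);
}

const identity = (k) => k;
console.log(chunksTouched(500, 509, identity)); // [ 2 ]          range based
console.log(chunksTouched(500, 509, toyHash));  // [ 0, 1, 2, 3 ] hash based
```

In this toy setup, ten adjacent keys live in one chunk under range based partitioning but are scattered over all four chunks once hashed.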
MongoDB allows administrators to direct the balancing policy using tag aware sharding. Administrators create and
associate tags with ranges of the shard key, and then assign those tags to the shards. Then, the balancer migrates
tagged data to the appropriate shards and ensures that the cluster always enforces the distribution of data that the tags
describe.
Tags are the primary mechanism to control the behavior of the balancer and the distribution of chunks in a cluster.
Most commonly, tag aware sharding serves to improve the locality of data for sharded clusters that span multiple data
centers.
See Tag Aware Sharding (page 748) for more information.
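The relationship between tags, shard key ranges, and shards can be sketched as a lookup. The tag names, ranges, shard names, and the eligibleShards helper below are all hypothetical. The sketch also assumes that data outside every tagged range may reside on any shard.

```javascript
// Made-up tags, shard key ranges, and shard names for this sketch.
const tagRanges = [
  { min: 10000, max: 20000,  tag: "NYC" },
  { min: 90000, max: 100000, tag: "SFO" },
];
const shardTags = { shard0000: ["NYC"], shard0001: ["SFO"], shard0002: [] };

// Shards eligible to hold a chunk that lies entirely inside [min, max).
function eligibleShards(min, max) {
  const range = tagRanges.find((r) => min >= r.min && max <= r.max);
  const all = Object.keys(shardTags);
  if (!range) return all; // assumption: untagged data may live on any shard
  return all.filter((s) => shardTags[s].includes(range.tag));
}

console.log(eligibleShards(10000, 15000)); // [ 'shard0000' ]
console.log(eligibleShards(50000, 60000)); // every shard
```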
The addition of new data or new servers can result in data distribution imbalances within the cluster,
such as when a particular shard contains significantly more chunks than another shard or when a chunk is significantly
larger than other chunks.
MongoDB ensures a balanced cluster using two background processes: splitting and the balancer.
Splitting
Splitting is a background process that keeps chunks from growing too large. When a chunk grows beyond a specified
chunk size (page 754), MongoDB splits the chunk in half. Inserts and updates trigger splits. Splits are an efficient
metadata change. To create splits, MongoDB does not migrate any data or affect the shards.
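Because a split only changes metadata, it can be modeled as replacing one range record with two. The following sketch assumes a hypothetical chunk record with min, max, and shard fields. The splitChunk helper is illustrative and is not a MongoDB API.

```javascript
// Split one chunk range into two at a split point. Only the range records
// change; both halves stay on the same shard and no documents move.
function splitChunk(chunk, splitPoint) {
  if (!(splitPoint > chunk.min && splitPoint < chunk.max)) {
    throw new Error("split point must fall strictly inside the chunk");
  }
  return [
    { min: chunk.min, max: splitPoint, shard: chunk.shard },
    { min: splitPoint, max: chunk.max, shard: chunk.shard },
  ];
}

const [left, right] = splitChunk({ min: 0, max: 100, shard: "shardA" }, 50);
console.log(left);  // { min: 0, max: 50, shard: 'shardA' }
console.log(right); // { min: 50, max: 100, shard: 'shardA' }
```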
Balancing
The balancer (page 750) is a background process that manages chunk migrations. The balancer can run from any of
the mongos instances in a cluster.
When the distribution of a sharded collection in a cluster is uneven, the balancer process migrates chunks from the
shard that has the largest number of chunks to the shard with the fewest chunks until the collection balances.
For example: if collection users has 100 chunks on shard 1 and 50 chunks on shard 2, the balancer will migrate
chunks from shard 1 to shard 2 until the collection achieves balance.
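The example above can be simulated in a few lines. This is a simplified sketch, not the balancer's actual algorithm: it moves one chunk at a time from the fullest shard to the emptiest, and it stops when the difference in chunk counts is at most threshold, a hypothetical parameter standing in for the real migration threshold.

```javascript
// Move one chunk at a time from the fullest shard to the emptiest shard
// until the difference is at most `threshold` (a hypothetical parameter).
function balance(chunkCounts, threshold = 1) {
  if (threshold < 1) throw new Error("threshold must be at least 1");
  const counts = { ...chunkCounts };
  let migrations = 0;
  for (;;) {
    const names = Object.keys(counts);
    const most = names.reduce((a, b) => (counts[b] > counts[a] ? b : a));
    const least = names.reduce((a, b) => (counts[b] < counts[a] ? b : a));
    if (counts[most] - counts[least] <= threshold) break;
    counts[most] -= 1;
    counts[least] += 1;
    migrations += 1;
  }
  return { counts, migrations };
}

console.log(balance({ shard1: 100, shard2: 50 }));
// { counts: { shard1: 75, shard2: 75 }, migrations: 25 }
```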
The shards manage chunk migrations as a background operation between an origin shard and a destination shard.
During a chunk migration, the destination shard is sent all the current documents in the chunk from the origin shard.
Next, the destination shard captures and applies all changes made to the data during the migration process. Finally,
the metadata regarding the location of the chunk on the config servers is updated.
If there's an error during the migration, the balancer aborts the process, leaving the chunk unchanged on the origin
shard. MongoDB removes the chunk's data from the origin shard after the migration completes successfully.
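The migration steps described above can be sketched against a small in-memory model. Everything here is hypothetical: the cluster object, its shards and config maps, and the migrateChunk function are illustrative stand-ins, not MongoDB internals.

```javascript
// Hypothetical in-memory model of a cluster, for illustration only.
// shards: shard name -> Map of chunk id -> array of documents
// config: Map of chunk id -> name of the shard that owns the chunk
function migrateChunk(cluster, chunkId, from, to, changesDuringCopy = []) {
  const origin = cluster.shards[from];
  const destination = cluster.shards[to];
  if (!origin || !destination) return false;
  try {
    // 1. Send the chunk's current documents to the destination shard.
    destination.set(chunkId, [...origin.get(chunkId)]);
    // 2. Apply changes made to the data while the copy was running.
    destination.get(chunkId).push(...changesDuringCopy);
    // 3. Record the chunk's new location in the cluster metadata.
    cluster.config.set(chunkId, to);
  } catch (err) {
    // Abort: drop any partial copy and leave the origin shard unchanged.
    destination.delete(chunkId);
    return false;
  }
  // 4. After a successful migration, remove the data from the origin shard.
  origin.delete(chunkId);
  return true;
}

const cluster = {
  shards: { shard1: new Map([["c1", [{ _id: 1 }]]]), shard2: new Map() },
  config: new Map([["c1", "shard1"]]),
};
console.log(migrateChunk(cluster, "c1", "shard1", "shard2", [{ _id: 2 }])); // true
console.log(cluster.config.get("c1")); // shard2
```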
Adding a shard to a cluster creates an imbalance since the new shard has no chunks. While MongoDB begins migrating
data to the new shard immediately, it can take some time before the cluster balances.
When removing a shard, the balancer migrates all chunks from that shard to other shards. After migrating all data and
updating the metadata, you can safely remove the shard.
These documents present the details of sharding in MongoDB. These include the components, the architectures, and the
behaviors of MongoDB sharded clusters. For an overview of sharding and sharded clusters, see Sharding Introduction
(page 725).
Sharded Cluster Components (page 732) A sharded cluster consists of shards, config servers, and mongos in-
stances.
Shards (page 732) A shard is a single server or replica set that holds a part of the sharded collection.
Config Servers (page 734) Config servers hold the metadata about the cluster, such as the shard location of the
data.
Sharded Cluster Architectures (page 736) Outlines the requirements for sharded clusters, and provides examples of
several possible architectures for sharded clusters.
Sharded Cluster Requirements (page 736) Discusses the requirements for sharded clusters in MongoDB.
Production Cluster Architecture (page 737) Outlines the components required to deploy a redundant and
highly available sharded cluster.
Continue reading from Sharded Cluster Architectures (page 736) for additional descriptions of sharded cluster
deployments.
Sharded Cluster Behavior (page 739) Discusses the operations of sharded clusters with regards to the automatic bal-
ancing of data in a cluster and other related availability and security considerations.
Shard Keys (page 739) MongoDB uses the shard key to divide a collection's data across the cluster's shards.
Sharded Cluster High Availability (page 742) Sharded clusters provide ways to address some availability con-
cerns.
Sharded Cluster Query Routing (page 744) The cluster's routers, or mongos instances, send reads and writes
to the relevant shard or shards.
Sharding Mechanics (page 750) Discusses the internal operation and behavior of sharded clusters, including chunk
migration, balancing, and the cluster metadata.
Sharded Collection Balancing (page 750) Balancing distributes a sharded collection's data to all of the
shards in the cluster.
Sharded Cluster Metadata (page 756) The cluster maintains internal metadata that reflects the location of data
within the cluster.
Continue reading from Sharding Mechanics (page 750) for more documentation of the behavior and operation
of sharded clusters.
Sharded clusters implement sharding. A sharded cluster consists of the following components:
Shards A shard is a MongoDB instance that holds a subset of a collection's data. Each shard is either a single
mongod instance or a replica set. In production, all shards are replica sets. For more information, see Shards
(page 732).
Config Servers Config servers (page 734) hold metadata about the sharded cluster. The metadata maps chunks to
shards.
Changed in version 3.2: Starting in MongoDB 3.2, config servers for sharded clusters can be deployed as
a replica set (page 613). The replica set config servers must run the WiredTiger storage engine (page 587).
MongoDB 3.2 deprecates the use of three mirrored mongod instances for config servers.
For more information, see Config Servers (page 734).
mongos Instances mongos instances route the reads and writes from applications to the shards. Applications do
not access the shards directly. For more information see Sharded Cluster Query Routing (page 744).
To deploy a sharded cluster, see Deploy a Sharded Cluster (page 757).
Shards
A shard is a replica set or a single mongod that contains a subset of the data for the sharded cluster. Together, the
cluster's shards hold the entire data set for the cluster.
Typically each shard is a replica set. The replica set provides redundancy and high availability for the data in each
shard.
Important: MongoDB shards data on a per collection basis. You must access all data in a sharded cluster via the
mongos instances. If you connect directly to a shard, you will see only its fraction of the cluster's data. There is no
particular order to the data set on a specific shard. MongoDB does not guarantee that any two contiguous chunks will
reside on a single shard.
Primary Shard
Every database has a primary shard [9] that holds all the un-sharded collections in that database.
To change the primary shard for a database, use the movePrimary command. The process of migrating the primary
shard may take significant time to complete, and you should not access the collections until it completes.
When you deploy a new sharded cluster with shards that were previously used as replica sets, all existing databases
continue to reside on their original shard. Databases created subsequently may reside on any shard in the cluster.
Shard Status
Use the sh.status() method in the mongo shell to see an overview of the cluster. This report includes which
shard is primary for the database and the chunk distribution across the shards. See sh.status() method for more
details.
Config Servers
On this page
Replica Set Config Servers (page 735)
Read and Write Operations on Config Servers (page 735)
Config Server Availability (page 735)
[9] The term primary shard has nothing to do with the term primary in the context of replica sets.
Config servers store the metadata (page 756) for a sharded cluster.
Warning: If the config servers become inaccessible, the cluster is not accessible. If you cannot recover the data
on a config server, the cluster will be inoperable.
Config servers store the cluster's metadata in the config database (page 816). The mongos instances cache this data
and use it to route reads and writes to shards.
MongoDB only writes data to the config servers when the metadata changes, such as
after a chunk migration (page 751), or
after a chunk split (page 754).
When writing to the replica set config servers, MongoDB uses a write concern (page 141) of "majority".
MongoDB reads data from the config server in the following cases:
A new mongos starts for the first time, or an existing mongos restarts.
After a change in the cluster metadata, such as after a chunk migration.
When reading from the replica set config servers, MongoDB uses a Read Concern (page 143) level of "majority"
(page 144).
If the config server replica set loses its primary and cannot elect a primary, the cluster's metadata becomes read only.
You can still read and write data from the shards, but no chunk migration or chunk splits will occur until the replica
set can elect a primary. If all config databases become unavailable, the cluster can become inoperable.
The mongos instances cache the metadata from the config servers. As such, if all config server members become
unavailable, you can still use the cluster if you do not restart the mongos instances until after the config servers are
accessible again. If you restart the mongos instances before the config servers are available, the mongos will be
unable to route reads and writes.
Clusters become inoperable without the cluster metadata. To ensure that the config servers remain available and intact,
backups of config servers are critical. The data on the config server is small compared to the data stored in a cluster,
and the config server has a relatively low activity load.
See A Config Server Replica Set Member Become Unavailable (page 743) for more information.
On this page
Data Quantity Requirements (page 737)
While sharding is a powerful and compelling feature, sharded clusters have significant infrastructure requirements
and increase the overall complexity of a deployment. As a result, only deploy sharded clusters when indicated by
application and operational requirements.
Sharding is the only solution for some classes of deployments. Use sharded clusters if:
your data set approaches or exceeds the storage capacity of a single MongoDB instance.
the size of your system's active working set will soon exceed the capacity of your system's maximum RAM.
a single MongoDB instance cannot meet the demands of your write operations, and all other approaches have
not reduced contention.
If these attributes are not present in your system, sharding will only add complexity to your system without adding
much benefit.
Important: It takes time and resources to deploy sharding. If your system has already reached or exceeded its
capacity, it will be difficult to deploy sharding without impacting your application.
As a result, if you think you will need to partition your database in the future, do not wait until your system is over
capacity to enable sharding.
When designing your data model, take into consideration your sharding needs.
Your cluster should manage a large quantity of data if sharding is to have an effect. The default chunk size is 64
megabytes, and the balancer (page 750) will not begin moving data across shards until the imbalance of chunks among
the shards exceeds the migration threshold (page 751). In practical terms, unless your cluster has many hundreds of
megabytes of data, your data will remain on a single shard.
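A rough estimate shows why small data sets stay on one shard. The sketch below assumes chunks are completely full, which overstates how densely real chunks are packed, and the estimateChunks helper is hypothetical. Compare the result against the migration threshold documented on page 751.

```javascript
// Rough lower bound on the number of chunks for a given amount of data,
// assuming every chunk is filled to the configured chunk size.
function estimateChunks(dataMB, chunkSizeMB = 64) {
  return Math.max(1, Math.ceil(dataMB / chunkSizeMB));
}

console.log(estimateChunks(100));  // 2
console.log(estimateChunks(1000)); // 16
```

With only a couple of chunks there is little for the balancer to redistribute, which matches the guidance above.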
In some situations, you may need to shard a small collection of data. But most of the time, sharding a small collection
is not worth the added complexity and overhead unless you need additional write capacity. If you have a small data
set, a properly configured single MongoDB instance or a replica set will usually be enough for your persistence layer
needs.
Chunk size is user configurable. For most deployments, the default value of 64 megabytes is ideal. See
Chunk Size (page 754) for more information.
In a production cluster, you must ensure that data is redundant and that your systems are highly available. To that end,
a production cluster must have the following components:
Config Servers Changed in version 3.2: Starting in MongoDB 3.2, config servers for sharded clusters can be
deployed as a replica set (page 613). The replica set config servers must run the WiredTiger storage engine
(page 587). MongoDB 3.2 deprecates the use of three mirrored mongod instances for config servers.
A single sharded cluster must have exclusive use of its config servers (page 734). If you have multiple
sharded clusters, each cluster must have its own replica set config servers.
Two or More Replica Sets As Shards These replica sets are the shards. For information on replica sets, see
Replication (page 613).
One or More Query Routers (mongos) The mongos instances are the routers for the cluster. Typically, de-
ployments have one mongos instance on each application server.
You may also deploy a group of mongos instances and use a proxy/load balancer between the application
and the mongos. In these deployments, you must configure the load balancer for client affinity so that
every connection from a single client reaches the same mongos.
Because cursors and other resources are specific to a single mongos instance, each client must interact
with only one mongos instance.
See also:
Deploy a Sharded Cluster (page 757)
Warning: Use the test cluster architecture for testing and development only.
For testing and development, you can deploy a sharded cluster with a minimum number of components. These non-
production clusters have the following components:
A replica set config server (page 734) with one member.
Changed in version 3.2: Starting in MongoDB 3.2, config servers for sharded clusters can be deployed as
a replica set (page 613). The replica set config servers must run the WiredTiger storage engine (page 587).
MongoDB 3.2 deprecates the use of three mirrored mongod instances for config servers.
At least one shard. Shards are either replica sets or standalone mongod instances.
See
Production Cluster Architecture (page 737)
These documents address the distribution of data and queries to a sharded cluster as well as specific security and
availability considerations for sharded clusters.
Shard Keys (page 739) MongoDB uses the shard key to divide a collection's data across the cluster's shards.
Sharded Cluster High Availability (page 742) Sharded clusters provide ways to address some availability concerns.
Sharded Cluster Query Routing (page 744) The cluster's routers, or mongos instances, send reads and writes to the
relevant shard or shards.
Tag Aware Sharding (page 748) Tags associate specific ranges of shard key values with specific shards for use in
managing deployment patterns.
Shard Keys
On this page
Considerations (page 740)
Hashed Shard Keys (page 740)
Impacts of Shard Keys on Cluster Operations (page 741)
Additional Information (page 742)
The shard key determines the distribution of the collection's documents among the cluster's shards. The shard key is
either an indexed field or an indexed compound field that exists in every document in the collection.
MongoDB partitions data in the collection using ranges of shard key values. Each range, or chunk, defines a non-
overlapping range of shard key values. MongoDB distributes the chunks, and their documents, among the shards in
the cluster.
When a chunk grows beyond the chunk size (page 754), MongoDB attempts to split the chunk into smaller chunks,
always based on ranges in the shard key.
Considerations
Shard keys are immutable and cannot be changed after insertion. See the system limits for sharded cluster for more
information.
The index on the shard key cannot be a multikey index (page 497).
Tip
MongoDB automatically computes the hashes when resolving queries using hashed indexes. Applications do not need
to compute hashes.
The shard key affects write and query performance by determining how MongoDB partitions data in the cluster
and how effectively the mongos instances can direct operations to the cluster. Consider the following operational
impacts of shard key selection:
Write Scaling Some possible shard keys will allow your application to take advantage of the increased write capacity
that the cluster can provide, while others do not. Consider the following example where you shard by the values of the
default _id field, which is ObjectId.
MongoDB generates ObjectId values upon document creation to produce a unique identifier for the object. How-
ever, the most significant bits of data in this value represent a time stamp, which means that they increment in a regular
and predictable pattern. Even though this value has high cardinality (page 764), when using this, any date, or other
monotonically increasing number as the shard key, all insert operations will be storing data into a single chunk, and
therefore, a single shard. As a result, the write capacity of this shard will define the effective write capacity of the
cluster.
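The hot-shard effect can be reproduced with a toy model. The chunk boundaries and shard names below are made up, and the integer keys stand in for any monotonically increasing value such as the timestamp prefix of an ObjectId.

```javascript
// Made-up chunk boundaries on a timestamp-like shard key.
const timeChunks = [
  { min: -Infinity, max: 1000,     shard: "shardA", inserts: 0 },
  { min: 1000,      max: 2000,     shard: "shardB", inserts: 0 },
  { min: 2000,      max: Infinity, shard: "shardC", inserts: 0 },
];

// Route an insert to the chunk whose [min, max) range holds the key.
function insertKey(key) {
  const chunk = timeChunks.find((c) => key >= c.min && key < c.max);
  chunk.inserts += 1;
}

// Every new key exceeds all earlier keys, like a timestamp prefix.
for (let key = 2000; key < 2100; key++) insertKey(key);

console.log(timeChunks.map((c) => `${c.shard}: ${c.inserts}`));
// [ 'shardA: 0', 'shardB: 0', 'shardC: 100' ]
```

All 100 inserts land in the chunk whose upper bound is positive infinity, so a single shard absorbs the entire write load.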
A shard key that increases monotonically will not hinder performance if you have a very low insert rate, or if most
of your write operations are update() operations distributed through your entire data set. Generally, choose shard
keys that have both high cardinality and will distribute write operations across the entire cluster.
Typically, a computed shard key that has some amount of randomness, such as one that includes a cryptographic
hash (e.g. MD5 or SHA1) of other content in the document, will allow the cluster to scale write operations. However,
random shard keys do not typically provide query isolation (page 741), which is another important characteristic of
shard keys.
New in version 2.4: MongoDB makes it possible to shard a collection on a hashed index. This can greatly improve
write scaling. See Shard a Collection Using a Hashed Shard Key (page 765).
Querying The mongos provides an interface for applications to interact with sharded clusters that hides the
complexity of data partitioning. A mongos receives queries from applications, and uses metadata from the config server
(page 734) to route queries to the mongod instances with the appropriate data. While the mongos succeeds in making
all querying operational in sharded environments, the shard key you select can have a profound effect on query
performance.
See also:
The Sharded Cluster Query Routing (page 744) and config server (page 734) sections for a more general overview of
querying in sharded environments.
Query Isolation Generally, the fastest queries in a sharded environment are those that mongos will route to a single
shard, using the shard key and the cluster metadata from the config server (page 734). For queries that don't include
the shard key, mongos must query all shards, wait for their responses and then return the result to the application.
These scatter/gather queries can be long running operations.
If your query includes the first component of a compound shard key [10], the mongos can route the query directly to a
single shard, or a small number of shards, which provides better performance. Even if you query values of the shard
key that reside in different chunks, the mongos will route queries directly to specific shards.
To select a shard key for a collection:
determine the most commonly included fields in queries for a given application
find which of these operations are most performance dependent.
[10] In many ways, you can think of the shard key as a cluster-wide index. However, be aware that sharded systems cannot enforce cluster-wide unique
indexes unless the unique field is in the shard key. Consider the Index Concepts (page 492) page for more information on indexes and compound
indexes.
If this field has low cardinality (i.e. is not sufficiently selective), you should add a second field to the shard key, making a
compound shard key. The data may become more splittable with a compound shard key.
See
Sharded Cluster Query Routing (page 744) for more information on query operations in the context of sharded clusters.
Sorting In sharded systems, the mongos performs a merge-sort of all sorted query results from the shards. See
Sharded Cluster Query Routing (page 744) and Use Indexes to Sort Query Results (page 575) for more information.
Indivisible Chunks An insufficiently granular shard key can result in chunks that are unsplittable. See Create a
Shard Key that is Easily Divisible (page 763) for more information.
Additional Information
On this page
Application Servers or mongos Instances Become Unavailable (page 742)
A Single mongod Becomes Unavailable in a Shard (page 742)
All Members of a Shard Become Unavailable (page 743)
A Config Server Replica Set Member Become Unavailable (page 743)
Renaming Mirrored Config Servers and Cluster Availability (page 743)
Shard Keys and Cluster Availability (page 743)
A production (page 737) cluster has no single point of failure. This section introduces the availability concerns for
MongoDB deployments in general and highlights potential failure scenarios and available resolutions.
If each application server has its own mongos instance, other application servers can continue to access the database.
Furthermore, mongos instances do not maintain persistent state, and they can restart and become unavailable without
losing any state or data. When a mongos instance starts, it retrieves a copy of the config database and can begin
routing queries.
Replica sets (page 613) provide high availability for shards. If the unavailable mongod is a primary, then the replica
set will elect (page 635) a new primary. If the unavailable mongod is a secondary, and it disconnects, the primary and
remaining secondary will continue to hold all data. In a three member replica set, even if a single member of the set
experiences catastrophic failure, two other members have full copies of the data. [11]
[11] If an unavailable secondary becomes available while it still has current oplog entries, it can catch up to the latest state of the set using the
Always investigate availability interruptions and failures. If a system is unrecoverable, replace it and create a new
member of the replica set as soon as possible to replace the lost redundancy.
If all members of a replica set shard are unavailable, all data held in that shard is unavailable. However, the data on
all other shards will remain available, and it is possible to read and write data to the other shards. However, your
application must be able to deal with partial results, and you should investigate the cause of the interruption and
attempt to recover the shard as soon as possible.
Changed in version 3.2: Starting in MongoDB 3.2, config servers for sharded clusters can be deployed as a replica
set (page 613). The replica set config servers must run the WiredTiger storage engine (page 587). MongoDB 3.2
deprecates the use of three mirrored mongod instances for config servers.
Replica sets (page 613) provide high availability for the config servers. If an unavailable config server is a primary,
then the replica set will elect (page 635) a new primary.
If the replica set config server loses its primary and cannot elect a primary, the cluster's metadata becomes read only.
You can still read and write data from the shards, but no chunk migration (page 750) or chunk splits (page 800) will
occur until a primary is available. If all config databases become unavailable, the cluster can become inoperable.
Note: All config servers must be running and available when you first initiate a sharded cluster.
If the sharded cluster is using mirrored config servers instead of a replica set and the name or address that a sharded
cluster uses to connect to a config server changes, you must restart every mongod and mongos instance in the sharded
cluster. Avoid downtime by using CNAMEs to identify config servers within the MongoDB deployment.
To avoid downtime when renaming config servers, use DNS names unrelated to physical or virtual hostnames to refer
to your config servers (page 734).
Generally, refer to each config server using the DNS alias (e.g. a CNAME record). When specifying the config server
connection string to mongos, use these names. These records make it possible to change the IP address or rename
config servers without changing the connection string and without having to restart the entire cluster.
If the shard key allows the mongos to isolate most operations to a single shard, then the failure of a single shard
will only render some data unavailable.
If your shard key distributes data required for every operation throughout the cluster, then the failure of an entire
shard will render the entire cluster unavailable.
In essence, this concern for reliability simply underscores the importance of choosing a shard key that isolates query
operations to a single shard.
On this page
Routing Process (page 744)
Detect Connections to mongos Instances (page 745)
Broadcast Operations and Targeted Operations (page 745)
Sharded and Non-Sharded Data (page 748)
MongoDB mongos instances route queries and write operations to shards in a sharded cluster. mongos instances provide the
only interface to a sharded cluster from the perspective of applications. Applications never connect or communicate
directly with the shards.
The mongos tracks what data is on which shard by caching the metadata from the config servers (page 734). The
mongos uses the metadata to route operations from applications and clients to the mongod instances. A mongos
has no persistent state and consumes minimal system resources.
The most common practice is to run mongos instances on the same systems as your application servers, but you can
maintain mongos instances on the shards or on other dedicated resources.
Changed in version 3.2: For aggregation operations (page 447) that run on multiple shards, if the operations do not
require running on the database's primary shard, these operations can route the results to any shard to merge the results
and avoid overloading the primary shard for that database.
Routing Process
A mongos instance uses the following processes to route queries and return results.
How mongos Determines which Shards Receive a Query A mongos instance routes a query to a cluster by:
1. Determining the list of shards that must receive the query.
2. Establishing a cursor on all targeted shards.
In some cases, when the shard key or a prefix of the shard key is a part of the query, the mongos can route the query
to a subset of the shards. Otherwise, the mongos must direct the query to all shards that hold documents for that
collection.
Example
Given the following shard key:
{ zipcode: 1, u_id: 1, c_date: 1 }
Depending on the distribution of chunks in the cluster, the mongos may be able to target the query at a subset of
shards, if the query contains the following fields:
{ zipcode: 1 }
{ zipcode: 1, u_id: 1 }
{ zipcode: 1, u_id: 1, c_date: 1 }
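The prefix rule in this example can be expressed as a small check. The usablePrefix helper below is a hypothetical illustration of the rule, not part of mongos. Whether a targetable query actually reaches only a subset of shards still depends on the distribution of chunks.

```javascript
// Return the longest prefix of the shard key whose fields all appear in the
// query. A non-empty result means the query may be targeted at a subset of
// shards; an empty result means it must go to all shards for the collection.
function usablePrefix(shardKeyFields, queryFields) {
  const prefix = [];
  for (const field of shardKeyFields) {
    if (!queryFields.includes(field)) break;
    prefix.push(field);
  }
  return prefix;
}

const shardKey = ["zipcode", "u_id", "c_date"];
console.log(usablePrefix(shardKey, ["zipcode", "u_id"]));   // [ 'zipcode', 'u_id' ]
console.log(usablePrefix(shardKey, ["zipcode", "c_date"])); // [ 'zipcode' ]
console.log(usablePrefix(shardKey, ["u_id", "c_date"]));    // []
```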
How mongos Handles Query Modifiers If the result of the query is not sorted, the mongos instance opens a result
cursor that round robins results from all cursors on the shards.
If the query specifies sorted results using the sort() cursor method, the mongos instance passes the $orderby
option to the shards. The primary shard for the database receives and performs a merge sort for all results before
returning the data to the client via the mongos.
If the query limits the size of the result set using the limit() cursor method, the mongos instance passes that limit
to the shards and then re-applies the limit to the result before returning the result to the client.
If the query specifies a number of records to skip using the skip() cursor method, the mongos cannot pass the skip
to the shards, but rather retrieves unskipped results from the shards and skips the appropriate number of documents
when assembling the complete result. However, when used in conjunction with a limit(), the mongos will pass
the limit plus the value of the skip() to the shards to improve the efficiency of these operations.
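The interaction of sort(), limit(), and skip() described above can be sketched as a small in-memory simulation. This is an illustration under stated assumptions, not the mongos implementation: the function names queryShard and routeQuery are hypothetical, each shard is modeled as a plain array, and the merge is done in one place with a simple concatenate-and-sort rather than a true merge sort on the primary shard.

```javascript
// Simulate one shard: sort its documents and return at most shardLimit of them.
// A shardLimit of 0 means "no limit".
function queryShard(docs, compare, shardLimit) {
  const sorted = docs.slice().sort(compare);
  return shardLimit > 0 ? sorted.slice(0, shardLimit) : sorted;
}

// Simulate the router: the skip is never passed to the shards, but when a
// limit is present each shard receives limit + skip as its own limit.
function routeQuery(shards, { compare, skip = 0, limit = 0 }) {
  const shardLimit = limit > 0 ? limit + skip : 0;
  const partials = shards.map((docs) => queryShard(docs, compare, shardLimit));
  // Merge the per-shard results (concat-and-sort stands in for a merge sort).
  const merged = [].concat(...partials).sort(compare);
  // Apply the skip to the merged result, then re-apply the limit.
  const skipped = merged.slice(skip);
  return limit > 0 ? skipped.slice(0, limit) : skipped;
}

const shards = [[5, 1, 9], [2, 8], [7, 3, 4]];
const result = routeQuery(shards, { compare: (a, b) => a - b, skip: 2, limit: 3 });
console.log(result); // [3, 4, 5]
```

Passing limit + skip to each shard is sufficient because no shard can contribute more than that many documents to the first limit + skip positions of the merged result.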
To detect if the MongoDB instance that your client is connected to is mongos, use the isMaster command. When
a client connects to a mongos, isMaster returns a document with a msg field that holds the string isdbgrid. For
example:
{
"ismaster" : true,
"msg" : "isdbgrid",
"maxBsonObjectSize" : 16777216,
"ok" : 1
}
If the application is instead connected to a mongod, the returned document does not include the isdbgrid string.
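As a minimal sketch, an application could wrap this check in a small helper. The function name isConnectedToMongos is hypothetical; it only inspects a reply document such as the one shown above and does not itself contact a server. In the mongo shell, the reply would typically come from db.runCommand( { isMaster: 1 } ).

```javascript
// Hypothetical helper: true when an isMaster reply carries the "isdbgrid" marker.
function isConnectedToMongos(isMasterReply) {
  return Boolean(isMasterReply) && isMasterReply.msg === "isdbgrid";
}

// Reply shaped like the example above (as returned through a mongos).
const mongosReply = { ismaster: true, msg: "isdbgrid", maxBsonObjectSize: 16777216, ok: 1 };
// A reply without the msg field, as a mongod would return.
const mongodReply = { ismaster: true, maxBsonObjectSize: 16777216, ok: 1 };

console.log(isConnectedToMongos(mongosReply)); // true
console.log(isConnectedToMongos(mongodReply)); // false
```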
Broadcast Operations
mongos instances broadcast queries to all shards for the collection unless the mongos can
determine which shard or subset of shards stores this data.
Multi-update operations are always broadcast operations.
The remove() operation is always a broadcast operation, unless the operation specifies the shard key in full.
Important: All update() and remove() operations for a sharded collection that specify the justOne or
multi: false option must include the shard key or the _id field in the query specification. update() and
remove() operations specifying justOne or multi: false in a sharded collection without the shard key or
the _id field return an error.
For queries that include the shard key or a portion of the shard key, mongos can target the query at a specific shard or
set of shards. This is the case only if the portion of the shard key included in the query is a prefix of the shard key. For
example, if the shard key is:
{ a: 1, b: 1, c: 1 }
The mongos program can route queries that include the full shard key or either of the following shard key prefixes to
a specific shard or set of shards:
{ a: 1 }
{ a: 1, b: 1 }
Depending on the distribution of data in the cluster and the selectivity of the query, mongos may still have to contact
Sharding operates on the collection level. You can shard multiple collections within a database or have multiple
databases with sharding enabled. 13 However, in production deployments, some databases and collections will use
sharding, while other databases and collections will only reside on a single shard.
Regardless of the data architecture of your sharded cluster, ensure that all queries and operations use the mongos
router to access the data cluster. Use the mongos even for operations that do not impact the sharded data.
On this page
Considerations (page 749)
Behavior and Operations (page 749)
Additional Resource (page 750)
MongoDB supports tagging a range of shard key values to associate that range with a shard or group of shards. Those
shards receive all inserts within the tagged range.
The balancer obeys tagged range associations, which enables the following deployment patterns:
12 mongos will route some queries, even some that include the shard key, to all shards, if needed.
13 As you configure sharding, you will use the enableSharding command to enable sharding for a database. This simply makes it possible
to use the shardCollection command on a collection within that database.
Considerations
Shard key range tags are distinct from replica set member tags (page 644).
Hash-based sharding only supports tag-aware sharding on an entire collection.
Shard ranges are always inclusive of the lower value and exclusive of the upper boundary.
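The half-open nature of tag ranges can be sketched as follows. This is only an illustration: the helper tagForValue, the tag names, and the numeric ranges are hypothetical, and the sketch assumes a single-field shard key whose values compare with the ordinary < and >= operators.

```javascript
// Hypothetical helper: find the tag whose range [min, max) contains the value.
function tagForValue(tagRanges, value) {
  for (const range of tagRanges) {
    // Lower bound is inclusive, upper bound is exclusive.
    if (value >= range.min && value < range.max) {
      return range.tag;
    }
  }
  return null; // the value falls outside every tagged range
}

const tagRanges = [
  { min: 10001, max: 10281, tag: "NYC" },
  { min: 94102, max: 94135, tag: "SFO" }
];

console.log(tagForValue(tagRanges, 10001)); // "NYC" (lower bound is included)
console.log(tagForValue(tagRanges, 10281)); // null  (upper bound is excluded)
```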
The balancer migrates chunks of documents in a sharded collection to the shards associated with a tag that has a shard
key range with an upper bound greater than the chunk's lower bound.
During balancing rounds, if the balancer detects that any chunks violate configured tags, the balancer migrates those
chunks to shards associated with those tags.
After configuring a tag with a shard key range and associating it with a shard or shards, the cluster may take some time
to balance the data among the shards. This depends on the division of chunks and the current distribution of data in
the cluster.
Once configured, the balancer respects tag ranges during future balancing rounds (page 750).
See also:
Manage Shard Tags (page 808)
Additional Resource
On this page
Cluster Balancer (page 750)
Migration Thresholds (page 751)
Shard Size (page 751)
Balancing is the process MongoDB uses to distribute data of a sharded collection evenly across a sharded cluster.
When a shard has too many of a sharded collection's chunks compared to other shards, MongoDB automatically
balances the chunks across the shards. The balancing procedure for sharded clusters is entirely transparent to the user
and application layer.
Cluster Balancer
The balancer process is responsible for redistributing the chunks of a sharded collection evenly among the shards for
every sharded collection. By default, the balancer process is always enabled.
Any mongos instance in the cluster can start a balancing round. When a balancer process is active, the responsible
mongos acquires a lock by modifying a document in the lock collection in the Config Database (page 816).
Changed in version 3.2: With replica set config servers, clock skew does not affect distributed lock management.
If you are using mirrored config servers, large differences in timekeeping can lead to failed distributed locks. With
mirrored config servers, minimize clock skew by running the network time protocol (NTP) ntpd on your servers.
To address uneven chunk distribution for a sharded collection, the balancer migrates chunks (page 751) from shards
with more chunks to shards with fewer chunks. The balancer migrates the chunks, one at a time, until
there is an even distribution of chunks for the collection across the shards. For details about chunk migration, see
Chunk Migration Procedure (page 752).
14 http://www.mongodb.com/lp/white-paper/multi-dc?jmp=docs
15 https://www.mongodb.com/presentations/webinar-multi-data-center-deployment?jmp=docs
Changed in version 2.6: Chunk migrations can have an impact on disk space. Starting in MongoDB 2.6, the source
shard automatically archives the migrated documents by default. For details, see moveChunk directory (page 753).
Chunk migrations carry some overhead in terms of bandwidth and workload, both of which can impact database
performance. The balancer attempts to minimize the impact by:
Moving only one chunk at a time. See also Chunk Migration Queuing (page 753).
Starting a balancing round only when the difference in the number of chunks between the shard with the greatest
number of chunks for a sharded collection and the shard with the lowest number of chunks for that collection
reaches the migration threshold (page 751).
You may disable the balancer temporarily for maintenance. See Disable the Balancer (page 794) for details.
You can also limit the window during which the balancer runs to prevent it from impacting production traffic. See
Schedule the Balancing Window (page 793) for details.
Note: The specification of the balancing window is relative to the local time zone of all individual mongos instances
in the cluster.
See also:
Manage Sharded Cluster Balancer (page 792).
Migration Thresholds
To minimize the impact of balancing on the cluster, the balancer will not begin balancing until the distribution of
chunks for a sharded collection has reached certain thresholds. The thresholds apply to the difference in number
of chunks between the shard with the most chunks for the collection and the shard with the fewest chunks for that
collection. The balancer has the following thresholds:
Number of Chunks    Migration Threshold
Fewer than 20       2
20-79               4
80 and greater      8
Once a balancing round starts, the balancer will not stop until, for the collection, the difference between the number
of chunks on any two shards for that collection is less than two or a chunk migration fails.
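The threshold table and the stopping rule can be sketched as a small simulation. This is a simplified model, not the balancer's code: the function names are hypothetical, the sketch assumes that "Number of Chunks" refers to the collection's total chunk count, and it ignores failed migrations, tags, and balancing windows.

```javascript
// Threshold from the table above, keyed on the collection's total chunk count
// (assumption: "Number of Chunks" means the total across all shards).
function migrationThreshold(totalChunks) {
  if (totalChunks < 20) return 2;
  if (totalChunks < 80) return 4;
  return 8;
}

// A round starts only when the spread between the fullest and emptiest shard
// reaches the threshold.
function shouldStartBalancing(chunksPerShard) {
  const counts = Object.values(chunksPerShard);
  const total = counts.reduce((sum, n) => sum + n, 0);
  const spread = Math.max(...counts) - Math.min(...counts);
  return spread >= migrationThreshold(total);
}

// Once started, move one chunk at a time from the fullest shard to the
// emptiest shard until the difference between any two shards is less than two.
function runBalancingRound(chunksPerShard) {
  const counts = Object.assign({}, chunksPerShard);
  const names = Object.keys(counts);
  let migrations = 0;
  for (;;) {
    const most = names.reduce((a, b) => (counts[b] > counts[a] ? b : a));
    const fewest = names.reduce((a, b) => (counts[b] < counts[a] ? b : a));
    if (counts[most] - counts[fewest] < 2) break;
    counts[most] -= 1;
    counts[fewest] += 1;
    migrations += 1;
  }
  return { counts, migrations };
}

console.log(shouldStartBalancing({ shardA: 10, shardB: 2, shardC: 0 })); // true
console.log(runBalancingRound({ shardA: 10, shardB: 2, shardC: 0 }));
```

In this model, a collection with 58 chunks split 30/28 across two shards does not trigger a round, because the spread of 2 is below the threshold of 4.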
Shard Size
By default, MongoDB will attempt to fill all available disk space with data on every shard as the data set grows. To
ensure that the cluster always has the capacity to handle data growth, monitor disk usage as well as other performance
metrics.
When adding a shard, you may set a maximum size for that shard. This prevents the balancer from migrating
chunks to the shard when the value of mem.mapped exceeds the maximum size. Use the maxSize parameter of
the addShard command to set the maximum size for the shard.
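As a minimal sketch, the command document for addShard with a maximum size could be assembled as shown below. The helper buildAddShardCommand is hypothetical, and the shard name and the size of 125 are illustrative values only. The maxSize field is expressed in megabytes. In the mongo shell, connected to a mongos, the resulting document would be passed to db.adminCommand().

```javascript
// Hypothetical helper: build an addShard command document.
// name and maxSize are optional; maxSize is in megabytes.
function buildAddShardCommand(host, options = {}) {
  const command = { addShard: host };
  if (options.name !== undefined) command.name = options.name;
  if (options.maxSize !== undefined) command.maxSize = options.maxSize;
  return command;
}

// Illustrative values; in the mongo shell: db.adminCommand(command)
const command = buildAddShardCommand("mongodb0.example.net:27017", { maxSize: 125 });
console.log(JSON.stringify(command));
```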
See also:
Change the Maximum Storage Size for a Given Shard (page 790) and Monitoring for MongoDB (page 203).
On this page
Chunk Migration (page 752)
moveChunk directory (page 753)
Jumbo Chunks (page 754)
Chunk migration moves the chunks of a sharded collection from one shard to another and is part of the balancer
(page 750) process.
Chunk Migration
MongoDB migrates chunks in a sharded cluster to distribute the chunks of a sharded collection evenly among shards.
Migrations may be either:
Manual. Only use manual migration in limited cases, such as to distribute data during bulk inserts. See Migrating
Chunks Manually (page 801) for more details.
Automatic. The balancer (page 750) process automatically migrates chunks when there is an uneven distribution
of a sharded collection's chunks across the shards. See Migration Thresholds (page 751) for more details.
Chunk Migration Procedure
All chunk migrations use the following procedure:
1. The balancer process sends the moveChunk command to the source shard.
2. The source starts the move with an internal moveChunk command. During the migration process, operations
to the chunk route to the source shard. The source shard is responsible for incoming write operations for the
chunk.
3. The destination shard builds any indexes required by the source that do not exist on the destination.
4. The destination shard begins requesting documents in the chunk and starts receiving copies of the data.
5. After receiving the final document in the chunk, the destination shard starts a synchronization process to ensure
that it has the changes to the migrated documents that occurred during the migration.
6. When fully synchronized, the destination shard connects to the config database and updates the cluster metadata
with the new location for the chunk.
7. After the destination shard completes the update of the metadata, and once there are no open cursors on the
chunk, the source shard deletes its copy of the documents.
Note: If the balancer needs to perform additional chunk migrations from the source shard, the balancer can
start the next chunk migration without waiting for the current migration process to finish this deletion step. See
Chunk Migration Queuing (page 753).
Changed in version 2.6: The source shard automatically archives the migrated documents by default. For more
information, see moveChunk directory (page 753).
The migration process ensures consistency and maximizes the availability of chunks during balancing.
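The ordering of the procedure above can be sketched with an in-memory model. This is a deliberately simplified illustration, not how MongoDB implements migrations: the function migrateChunk and the cluster object layout are hypothetical, step 3 (index builds) is omitted, and the only changes modeled during the copy are inserts.

```javascript
// Hypothetical model: cluster.shards[name][chunkId] is an array of documents,
// and cluster.config.chunks[chunkId] names the shard that owns the chunk.
function migrateChunk(cluster, chunkId, from, to, writesDuringCopy = []) {
  const source = cluster.shards[from];
  const destination = cluster.shards[to];

  // Steps 1-2: the source still owns the chunk while the move runs.
  // Step 4: the destination copies the chunk's documents.
  destination[chunkId] = source[chunkId].map((doc) => Object.assign({}, doc));

  // Writes that arrive during the copy are still handled by the source shard.
  for (const doc of writesDuringCopy) {
    source[chunkId].push(doc);
  }

  // Step 5: the destination synchronizes changes made during the migration.
  const copiedIds = new Set(destination[chunkId].map((doc) => doc._id));
  for (const doc of source[chunkId]) {
    if (!copiedIds.has(doc._id)) {
      destination[chunkId].push(Object.assign({}, doc));
    }
  }

  // Step 6: the cluster metadata is updated with the chunk's new location.
  cluster.config.chunks[chunkId] = to;

  // Step 7: only after the metadata update does the source delete its copy.
  delete source[chunkId];
}

const cluster = {
  config: { chunks: { chunk1: "shardA" } },
  shards: { shardA: { chunk1: [{ _id: 1 }, { _id: 2 }] }, shardB: {} }
};
migrateChunk(cluster, "chunk1", "shardA", "shardB", [{ _id: 3 }]);
console.log(cluster.config.chunks.chunk1); // "shardB"
```

The point of the ordering is that the metadata switches to the destination only after it has every document, and the source's copy is removed last.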
Chunk Migration Queuing
To migrate multiple chunks from a shard, the balancer migrates the chunks one at a
time. However, the balancer does not wait for the current migration's delete phase to complete before starting the next
chunk migration. See Chunk Migration (page 752) for the chunk migration process and the delete phase.
This queuing behavior allows shards to unload chunks more quickly in cases of a heavily imbalanced cluster, such as
when performing initial data loads without pre-splitting and when adding new shards.
This behavior also affects the moveChunk command, and migration scripts that use the moveChunk command may
proceed more quickly.
In some cases, the delete phases may persist longer. If multiple delete phases are queued but not yet complete, a crash
of the replica set's primary can orphan data from multiple migrations.
The _waitForDelete option, available as a setting for the balancer as well as the moveChunk command, can alter
the behavior so that the delete phase of the current migration blocks the start of the next chunk migration. The
_waitForDelete option is generally for internal testing purposes. For more information, see Wait for Delete (page 791).
Chunk Migration and Replication
Changed in version 3.0: The default value of secondaryThrottle became
true for all chunk migrations.
The new writeConcern field in the balancer configuration document allows you to specify write concern
(page 141) semantics with the _secondaryThrottle option.
By default, each document operation during chunk migration propagates to at least one secondary before the
balancer proceeds with the next document, which is equivalent to a write concern of { w: 2 }. You can set the
writeConcern option on the balancer configuration to set different write concern semantics.
To override this behavior and allow the balancer to continue without waiting for replication to a secondary, set the
_secondaryThrottle parameter to false. See Change Replication Behavior for Chunk Migration (page 791)
to update the _secondaryThrottle parameter for the balancer.
For the moveChunk command, the secondaryThrottle parameter is independent of the
_secondaryThrottle parameter for the balancer.
Independent of the secondaryThrottle setting, certain phases of the chunk migration have the following
replication policy:
MongoDB briefly pauses all application writes to the source shard before updating the config servers with the
new location for the chunk, and resumes the application writes after the update. The chunk move requires all
writes to be acknowledged by a majority of the members of the replica set both before and after committing the
chunk move to config servers.
When an outgoing chunk migration finishes and cleanup occurs, all writes must be replicated to a majority of
servers before further cleanup (from other outgoing migrations) or new incoming migrations can proceed.
moveChunk directory
Jumbo Chunks
During chunk migration, if the chunk exceeds the specified chunk size (page 754) or if the number of documents in the
chunk exceeds Maximum Number of Documents Per Chunk to Migrate, MongoDB does not migrate
the chunk. Instead, MongoDB attempts to split (page 754) the chunk. If the split is unsuccessful, MongoDB labels the
chunk as jumbo to avoid repeated attempts to migrate the chunk.
On this page
Chunk Size (page 754)
Limitations (page 755)
Indivisible Chunks (page 755)
As chunks grow beyond the specified chunk size (page 754) a mongos instance will attempt to split the chunk in half.
Splits may lead to an uneven distribution of the chunks for a collection across the shards. In such cases, the mongos
instances will initiate a round of migrations to redistribute chunks across shards. See Sharded Collection Balancing
(page 750) for more details on balancing chunks across shards.
Chunk Size
The default chunk size in MongoDB is 64 megabytes. You can increase or reduce the chunk size (page 805), mindful
of its effect on the cluster's efficiency.
1. Small chunks lead to a more even distribution of data at the expense of more frequent migrations. This creates
expense at the query routing (mongos) layer.
2. Large chunks lead to fewer migrations. This is more efficient both from the networking perspective and in terms
of internal overhead at the query routing layer. But, these efficiencies come at the expense of a potentially more
uneven distribution of data.
3. Chunk size affects the Maximum Number of Documents Per Chunk to Migrate.
For many deployments, it makes sense to avoid frequent and potentially spurious migrations at the expense of a slightly
less evenly distributed data set.
Limitations
Changing the chunk size affects when chunks split, but there are some limitations to its effects.
Automatic splitting only occurs during inserts or updates. If you lower the chunk size, it may take time for all
chunks to split to the new size.
Splits cannot be undone. If you increase the chunk size, existing chunks must grow through inserts or updates
until they reach the new size.
Note: Chunk ranges are inclusive of the lower boundary and exclusive of the upper boundary.
Indivisible Chunks
In some cases, chunks can grow beyond the specified chunk size (page 754) but cannot undergo a split; e.g. if a chunk
represents a single shard key value. See Considerations for Selecting Shard Keys (page 763) for considerations for
selecting a shard key.
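Why a chunk that holds a single shard key value cannot split can be sketched as follows. This is not MongoDB's split algorithm: the helper findSplitPoint is hypothetical and simply looks for a shard key value, at or after the middle of the chunk, that would leave documents on both sides of a split.

```javascript
// Hypothetical helper: given the chunk's shard key values in sorted order,
// return a value that can serve as a split point, or null if none exists.
// A usable split point must be greater than the lowest value in the chunk,
// so that both resulting ranges [min, split) and [split, max) are non-empty.
function findSplitPoint(sortedKeyValues) {
  if (sortedKeyValues.length === 0) return null;
  const lowest = sortedKeyValues[0];
  for (let i = Math.floor(sortedKeyValues.length / 2); i < sortedKeyValues.length; i++) {
    if (sortedKeyValues[i] > lowest) return sortedKeyValues[i];
  }
  return null; // every document shares one shard key value: indivisible
}

console.log(findSplitPoint([10001, 10002, 10003, 10004])); // 10003
console.log(findSplitPoint(["CA", "CA", "CA"]));           // null
```

However large the second chunk grows, no split point exists for it, which is the situation the text above describes.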
On this page
Example (page 755)
All sharded collections must have an index that starts with the shard key; i.e. the index can be an index on the shard
key or a compound index where the shard key is a prefix of the index.
If you shard a collection without any documents and without such an index, the shardCollection command will
create the index on the shard key. If the collection already has documents, you must create the index before using
shardCollection.
Important: The index on the shard key cannot be a multikey index (page 497).
Example
A sharded collection named people has for its shard key the field zipcode. It currently has the index {
zipcode: 1 }. You can replace this index with a compound index { zipcode: 1, username: 1 },
as follows:
1. Create an index on { zipcode: 1, username: 1 }:
db.people.createIndex( { zipcode: 1, username: 1 } );
2. When MongoDB finishes building the index, you can safely drop the existing index on { zipcode: 1 }:
db.people.dropIndex( { zipcode: 1 } );
Since the index on the shard key cannot be a multikey index, the index { zipcode: 1, username: 1 }
can only replace the index { zipcode: 1 } if there are no array values for the username field.
If you drop the last valid index for the shard key, recover by recreating an index on just the shard key.
For restrictions on shard key indexes, see limits-shard-keys.
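The requirement that a supporting index start with the shard key can be sketched as a simple check on key patterns. The helper indexStartsWithShardKey is hypothetical. It compares field names only, so it ignores index direction and cannot detect whether an index is multikey.

```javascript
// Hypothetical helper: true when the shard key's fields are the leading
// fields of the index key pattern, in the same order.
function indexStartsWithShardKey(indexKey, shardKey) {
  const indexFields = Object.keys(indexKey);
  const shardFields = Object.keys(shardKey);
  if (shardFields.length === 0 || shardFields.length > indexFields.length) {
    return false;
  }
  return shardFields.every((field, i) => indexFields[i] === field);
}

const shardKeyPattern = { zipcode: 1 };

// The compound index from the example has the shard key as a prefix.
console.log(indexStartsWithShardKey({ zipcode: 1, username: 1 }, shardKeyPattern)); // true

// Reversing the field order breaks the prefix requirement.
console.log(indexStartsWithShardKey({ username: 1, zipcode: 1 }, shardKeyPattern)); // false
```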
Config servers (page 734) store the metadata for a sharded cluster. The metadata reflects state and organization of the
sharded data sets and system. The metadata includes the list of chunks on every shard and the ranges that define the
chunks. The mongos instances cache this data and use it to route read and write operations to shards.
Config servers store the metadata in the Config Database (page 816).
Important: Always back up the config database before doing any maintenance on the config server.
To access the config database, issue the following command from the mongo shell:
use config
In general, you should never edit the content of the config database directly. The config database contains the
following collections:
changelog (page 817)
chunks (page 818)
collections (page 819)
databases (page 819)
lockpings (page 819)
locks (page 820)
mongos (page 820)
settings (page 820)
shards (page 821)
version (page 821)
For more information on these collections and their role in sharded clusters, see Config Database (page 816). See Read
and Write Operations on Config Servers (page 735) for more information about reads and updates to the metadata.
The following tutorials provide instructions for administering sharded clusters. For a higher-level overview, see
Sharding (page 725).
Sharded Cluster Deployment Tutorials (page 757) Instructions for deploying sharded clusters, adding shards,
selecting shard keys, and the initial configuration of sharded clusters.
Deploy a Sharded Cluster (page 757) Set up a sharded cluster by creating the needed data directories, starting
the required MongoDB instances, and configuring the cluster settings.
Considerations for Selecting Shard Keys (page 763) Choose the field that MongoDB uses to parse a
collection's documents for distribution over the cluster's shards. Each shard holds documents with values within
a certain range.
Shard a Collection Using a Hashed Shard Key (page 765) Shard a collection based on hashes of a field's
values in order to ensure even distribution over the collection's shards.
Add Shards to a Cluster (page 765) Add a shard to add capacity to a sharded cluster.
Continue reading from Sharded Cluster Deployment Tutorials (page 757) for additional tutorials.
Sharded Cluster Maintenance Tutorials (page 781) Procedures and tasks for common operations on active sharded
clusters.
View Cluster Configuration (page 781) View status information about the cluster's databases, shards, and
chunks.
Remove Shards from an Existing Sharded Cluster (page 797) Migrate a single shard's data and remove the
shard.
Manage Shard Tags (page 808) Use tags to associate specific ranges of shard key values with specific shards.
Continue reading from Sharded Cluster Maintenance Tutorials (page 781) for additional tutorials.
Sharded Cluster Data Management (page 799) Practices that address common issues in managing large sharded
data sets.
Troubleshoot Sharded Clusters (page 813) Presents solutions to common issues and concerns relevant to the
administration and use of sharded clusters. Refer to FAQ: MongoDB Diagnostics (page 857) for general diagnostic
information.
On this page
Considerations (page 758)
Deploy the Config Server Replica Set (page 758)
Start the mongos Instances (page 759)
Add Shards to the Cluster (page 759)
Enable Sharding for a Database (page 760)
Shard a Collection (page 760)
Using 3 Mirrored Config Servers (Deprecated) (page 761)
Considerations
Connectivity
All members of a sharded cluster must be able to connect to all other members of a sharded cluster,
including all shards and all config servers. Ensure that the network and security systems, including all interfaces and
firewalls, allow these connections.
Deploy the Config Server Replica Set
Changed in version 3.2: Starting in MongoDB 3.2, config servers for sharded clusters can be deployed as a replica
set (page 613). The replica set config servers must run the WiredTiger storage engine (page 587). MongoDB 3.2
deprecates the use of three mirrored mongod instances for config servers.
The following restrictions apply to a replica set configuration when used for config servers:
Must have zero arbiters (page 625).
Must have no delayed members (page 624).
Must build indexes (i.e. no member should have buildIndexes setting set to false).
The config servers store the sharded cluster's metadata. The following steps deploy a three-member replica set for the
config servers.
1. Start all the config servers with both the --configsvr and --replSet <name> options:
mongod --configsvr --replSet configReplSet --port <port> --dbpath <path>
2. Connect a mongo shell to one of the config servers and run rs.initiate() to initiate the replica set.
rs.initiate( {
_id: "configReplSet",
configsvr: true,
members: [
{ _id: 0, host: "<host1>:<port1>" },
{ _id: 1, host: "<host2>:<port2>" },
{ _id: 2, host: "<host3>:<port3>" }
]
} )
To use the deprecated mirrored config server deployment topology, see Start 3 Mirrored Config Servers (Deprecated)
(page 761).
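The restrictions on a config server replica set can be checked before calling rs.initiate() with a sketch such as the following. The helper configReplSetProblems is hypothetical. It assumes the standard replica set member fields arbiterOnly, slaveDelay, and buildIndexes, and the host names are illustrative.

```javascript
// Hypothetical helper: list the ways a replica set configuration document
// violates the restrictions for config servers. An empty list means no
// problems were found by these checks.
function configReplSetProblems(config) {
  const problems = [];
  if (config.configsvr !== true) {
    problems.push("configsvr must be true");
  }
  for (const member of config.members) {
    if (member.arbiterOnly === true) {
      problems.push(`member ${member._id} is an arbiter`);
    }
    if ((member.slaveDelay || 0) > 0) {
      problems.push(`member ${member._id} is a delayed member`);
    }
    if (member.buildIndexes === false) {
      problems.push(`member ${member._id} does not build indexes`);
    }
  }
  return problems;
}

const config = {
  _id: "configReplSet",
  configsvr: true,
  members: [
    { _id: 0, host: "cfg0.example.net:27019" },
    { _id: 1, host: "cfg1.example.net:27019" },
    { _id: 2, host: "cfg2.example.net:27019", arbiterOnly: true }
  ]
};
console.log(configReplSetProblems(config)); // one problem: member 2 is an arbiter
```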
Start the mongos Instances
The mongos instances are lightweight and do not require data directories. You can run a mongos instance on a
system that runs other cluster components, such as on an application server or a server running a mongod process. By
default, a mongos instance runs on port 27017.
When you start the mongos instance, specify the config servers, using either the sharding.configDB setting in
the configuration file or the --configdb command line option.
Note: All config servers must be running and available when you first initiate a sharded cluster.
1. Start one or more mongos instances. For --configdb, or sharding.configDB, specify the config server
replica set name followed by a slash (/) and at least one of the
config server hostnames and ports:
mongos --configdb configReplSet/<cfgsvr1:port1>,<cfgsvr2:port2>,<cfgsvr3:port3>
If using the deprecated mirrored config server deployment topology, see Start the mongos Instances (Deprecated)
(page 762).
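The format of the --configdb value for replica set config servers can be sketched with a small helper. The function buildConfigDBString is hypothetical, and the host names are illustrative.

```javascript
// Hypothetical helper: build "<replSetName>/<host1:port1>,<host2:port2>,..."
function buildConfigDBString(replSetName, hosts) {
  if (!replSetName || hosts.length === 0) {
    throw new Error("a replica set name and at least one host are required");
  }
  return replSetName + "/" + hosts.join(",");
}

const configdb = buildConfigDBString("configReplSet", [
  "cfg0.example.net:27019",
  "cfg1.example.net:27019",
  "cfg2.example.net:27019"
]);
console.log("mongos --configdb " + configdb);
```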
Add Shards to the Cluster
A shard can be a standalone mongod or a replica set. In a production environment, each shard should be a replica set.
Use the procedure in Deploy a Replica Set (page 657) to deploy replica sets for each shard.
1. From a mongo shell, connect to the mongos instance. Issue a command using the following syntax:
mongo --host <hostname of machine running mongos> --port <port mongos listens on>
For example, if a mongos is accessible at mongos0.example.net on port 27017, issue the following
command:
mongo --host mongos0.example.net --port 27017
2. Add each shard to the cluster using the sh.addShard() method, as shown in the examples below. Issue
sh.addShard() separately for each shard. If the shard is a replica set, specify the name of the replica set and
specify a member of the set. In production deployments, all shards should be replica sets.
Optional
You can instead use the addShard database command, which lets you specify a name and maximum size for
the shard. If you do not specify these, MongoDB automatically assigns a name and maximum size. To use the
database command, see addShard.
To add a shard for a standalone mongod on port 27017 of mongodb0.example.net, issue the
following command:
sh.addShard( "mongodb0.example.net:27017" )
Note: It might take some time for chunks to migrate to the new shard.
Enable Sharding for a Database
Before you can shard a collection, you must enable sharding for the collection's database. Enabling sharding for a
database does not redistribute data but makes it possible to shard the collections in that database.
Once you enable sharding for a database, MongoDB assigns a primary shard for that database where MongoDB stores
all data before sharding begins.
1. From a mongo shell, connect to the mongos instance. Issue a command using the following syntax:
mongo --host <hostname of machine running mongos> --port <port mongos listens on>
2. Issue the sh.enableSharding() method, specifying the name of the database for which to enable sharding.
Use the following syntax:
sh.enableSharding("<database>")
Optionally, you can enable sharding for a database using the enableSharding command, which uses the following
syntax:
db.runCommand( { enableSharding: <database> } )
Shard a Collection
Replace the <database>.<collection> string with the full namespace of your collection, which consists
of the name of your database, a dot (i.e. .), and the full name of the collection. The shard-key-pattern
represents your shard key, which you specify in the same form as you would an index key pattern.
Example
The following sequence of commands shards four collections:
sh.shardCollection("records.people", { "zipcode": 1, "name": 1 } )
sh.shardCollection("people.addresses", { "state": 1, "_id": 1 } )
sh.shardCollection("assets.chairs", { "type": 1, "_id": 1 } )
sh.shardCollection("events.alerts", { "_id": "hashed" } )
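The namespace strings used in these commands can be taken apart with a short sketch. The helper parseNamespace is hypothetical. It splits at the first dot, since a database name cannot contain a dot while a collection name can.

```javascript
// Hypothetical helper: split "<database>.<collection>" at the first dot.
function parseNamespace(namespace) {
  const dot = namespace.indexOf(".");
  if (dot <= 0 || dot === namespace.length - 1) {
    throw new Error(`invalid namespace: ${namespace}`);
  }
  return {
    database: namespace.slice(0, dot),
    collection: namespace.slice(dot + 1)
  };
}

console.log(parseNamespace("records.people"));
// { database: 'records', collection: 'people' }
```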
Start 3 Mirrored Config Servers (Deprecated)
Changed in version 3.2: Starting in MongoDB 3.2, config servers
for sharded clusters can be deployed as a replica set (page 613). The replica set config servers must run the WiredTiger
storage engine (page 587). MongoDB 3.2 deprecates the use of three mirrored mongod instances for config servers.
In production deployments, if using mirrored config servers, you must deploy exactly three config server instances,
each running on a different server to ensure good uptime and data safety. In test environments, you can run all three
instances on a single server.
Important: All members of a sharded cluster must be able to connect to all other members of a sharded cluster,
including all shards and all config servers. Ensure that the network and security systems, including all interfaces and
firewalls, allow these connections.
1. Create data directories for each of the three config server instances. By default, a config server stores its data
files in the /data/configdb directory. You can choose a different location. To create a data directory, issue a
command similar to the following:
mkdir /data/configdb
2. Start the three config server instances. Start each by issuing a command using the following syntax:
mongod --configsvr --dbpath <path> --port <port>
The default port for config servers is 27019. You can specify a different port. The following example starts a
config server using the default port and default data directory:
mongod --configsvr --dbpath /data/configdb --port 27019
Note: All config servers must be running and available when you first initiate a sharded cluster.
Start the mongos Instances (Deprecated)
Changed in version 3.2: Starting in MongoDB 3.2, config servers for
sharded clusters can be deployed as a replica set (page 613). The replica set config servers must run the WiredTiger
storage engine (page 587). MongoDB 3.2 deprecates the use of three mirrored mongod instances for config servers.
If using 3 mirrored config servers, when you start the mongos instance, specify the hostnames of the three config
servers, either in the configuration file or as command line parameters.
Tip
To avoid downtime, give each config server a logical DNS name (unrelated to the server's physical or virtual
hostname). Without logical DNS names, moving or renaming a config server requires shutting down every mongod and
mongos instance in the sharded cluster.
For example, to start a mongos that connects to config server instances running on the following hosts and on the
default ports:
cfg0.example.net
cfg1.example.net
cfg2.example.net
You would issue the following command:
mongos --configdb cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019
Each mongos in a sharded cluster must use the same configDB string, with identical host names listed in identical
order.
If you start a mongos instance with a string that does not exactly match the string used by the other mongos instances
in the cluster, the mongos instance returns a Config Database String Error (page 813) and refuses to start.
To add shards, enable sharding and shard a collection, see Add Shards to the Cluster (page 759), Enable Sharding for
a Database (page 760), and Shard a Collection (page 760).
For many collections there may be no single, naturally occurring key that possesses all the qualities of a good shard
key. The following strategies may help construct a useful shard key from existing data:
1. Compute a more ideal shard key in your application layer, and store this in all of your documents, potentially in
the _id field.
2. Use a compound shard key that uses two or three values from all documents that provide the right mix of
cardinality with scalable write operations and query isolation.
3. Determine that the impact of using a less than ideal shard key is insignificant in your use case, given:
limited write volume,
expected data size, or
application query patterns.
4. Use a hashed shard key. Choose a field that has high cardinality and create a hashed index (page 537) on that
field. MongoDB uses these hashed index values as shard key values, which ensures an even distribution of
documents across the shards.
Tip
MongoDB automatically computes the hashes when resolving queries using hashed indexes. Applications do
not need to compute hashes.
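The effect of a hashed shard key on monotonically increasing values can be sketched as follows. This is an illustration only: mix32 is a generic 32-bit integer mixing function (the finalizer used by MurmurHash3) and is not the hash function MongoDB uses, the helper names are hypothetical, and each shard is assumed to own one equal slice of the key space.

```javascript
// Stand-in hash (NOT MongoDB's hash): the 32-bit finalizer from MurmurHash3.
function mix32(n) {
  let h = n >>> 0;
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0;
}

// Ranged sharding: each shard owns an equal slice of the raw key space [0, keySpace).
function shardForRangedKey(value, keySpace, shardCount) {
  return Math.floor(value / (keySpace / shardCount));
}

// Hashed sharding: each shard owns an equal slice of the 32-bit hash space.
function shardForHashedKey(value, shardCount) {
  return Math.floor(mix32(value) / (2 ** 32 / shardCount));
}

// Count how many of the given values land on each shard.
function distribution(values, pickShard, shardCount) {
  const counts = new Array(shardCount).fill(0);
  for (const value of values) {
    counts[pickShard(value)] += 1;
  }
  return counts;
}

// The 100 most recent values of an increasing key in the space [0, 1000).
const recent = Array.from({ length: 100 }, (_, i) => 900 + i);

// With ranges on the raw value, every recent insert lands on the last shard.
console.log(distribution(recent, (v) => shardForRangedKey(v, 1000, 4), 4)); // [0, 0, 0, 100]

// With ranges on the hash, the same inserts are spread over the shards.
console.log(distribution(recent, (v) => shardForHashedKey(v, 4), 4));
```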
Choosing the correct shard key can have a great impact on the performance, capability, and functioning of your
database and cluster. Appropriate shard key choice depends on the schema of your data and the way that your
applications query and write data.
Create a Shard Key that is Easily Divisible
An easily divisible shard key makes it easy for MongoDB to distribute
content among the shards. Shard keys that have a limited number of possible values can result in chunks that are
unsplittable.
For instance, if a chunk represents a single shard key value, then MongoDB cannot split the chunk even when the
chunk exceeds the size at which splits (page 754) occur.
See also:
Cardinality (page 764)
Create a Shard Key that has High Degree of Randomness
A shard key with a high degree of randomness prevents
any single shard from becoming a bottleneck and will distribute write operations among the cluster.
See also:
Write Scaling (page 741)
Create a Shard Key that Targets a Single Shard
A shard key that targets a single shard makes it possible for the
mongos program to return most query operations directly from a single specific mongod instance. Your shard key
should be the primary field used by your queries. Fields with a high degree of randomness make it difficult to target
operations to specific shards.
See also:
Query Isolation (page 741)
Shard Using a Compound Shard Key The challenge when selecting a shard key is that there is not always an
obvious choice. Often, an existing field in your collection may not be the optimal key. In those situations, computing
a special purpose shard key into an additional field or using a compound shard key may help produce one that is more
ideal.
Cardinality Cardinality, in the context of MongoDB, refers to the ability of the system to partition data into chunks.
For example, consider a collection of data such as an address book that stores address records:
Consider the use of a state field as a shard key:
The state key's value holds the US state for a given address document. This field has a low cardinality, as all
documents that have the same value in the state field must reside on the same shard, even if a particular state's
chunk exceeds the maximum chunk size.
Since there are a limited number of possible values for the state field, MongoDB may distribute data unevenly
among a small number of fixed chunks. This may have a number of effects:
If MongoDB cannot split a chunk because all of its documents have the same shard key, migrations involv-
ing these un-splittable chunks will take longer than other migrations, and it will be more difficult for your
data to stay balanced.
If you have a fixed maximum number of chunks, you will never be able to use more than that number of
shards for this collection.
Consider the use of a zipcode field as a shard key:
While this field has a large number of possible values, and thus has potentially higher cardinality, it's possible
that a large number of users could have the same value for the shard key, which would make this chunk of users
un-splittable.
In these cases, cardinality depends on the data. If your address book stores records for a geographically
distributed contact list (e.g. dry cleaning businesses in America), then a value like zipcode would be sufficient.
However, if your address book is more geographically concentrated (e.g. ice cream stores in Boston,
Massachusetts), then you may have a much lower cardinality.
Consider the use of a phone-number field as a shard key:
Phone number has a high cardinality: because users will generally have a unique value for this field, MongoDB
will be able to split as many chunks as needed.
While high cardinality is necessary for ensuring an even distribution of data, having a high cardinality does not
guarantee sufficient query isolation (page 741) or appropriate write scaling (page 741).
If you choose a shard key with low cardinality, some chunks may grow too large for MongoDB to migrate. See Jumbo
Chunks (page 754) for more information.
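As a rough illustration of the cardinality discussion above, the following sketch counts how many distinct values a candidate shard key takes in a sample of documents and how many documents share the most common value. The function name shardKeyProfile and the sample records are hypothetical and not part of MongoDB; a large group of documents sharing one value hints at chunks that cannot be split.

```javascript
// Summarize a candidate shard key over a sample of documents:
// how many distinct values it takes, and how many documents
// share the single most common value.
function shardKeyProfile(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    const key = doc[field];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  let largestGroup = 0;
  for (const n of counts.values()) {
    if (n > largestGroup) largestGroup = n;
  }
  return { distinctValues: counts.size, largestGroup: largestGroup };
}

// Hypothetical sample of address book records.
const sample = [
  { state: "MA", zipcode: "02108", phone: "617-555-0101" },
  { state: "MA", zipcode: "02108", phone: "617-555-0102" },
  { state: "MA", zipcode: "02139", phone: "617-555-0103" },
  { state: "NY", zipcode: "10001", phone: "212-555-0104" }
];

const stateProfile = shardKeyProfile(sample, "state");
const zipProfile = shardKeyProfile(sample, "zipcode");
const phoneProfile = shardKeyProfile(sample, "phone");
console.log(stateProfile, zipProfile, phoneProfile);
```

In this sample the phone field yields one document per value, while state concentrates most documents under a single value, which mirrors the low and high cardinality cases described above.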
When selecting a shard key, it is difficult to balance the qualities of an ideal shard key, which sometimes dictate
opposing strategies. For instance, it's difficult to produce a key that has both a high degree of randomness for even
data distribution and a shard key that allows your application to target specific shards. For some workloads, it's more
important to have an even data distribution, and for others targeted queries are essential.
Therefore, the selection of a shard key is about balancing both your data and the performance characteristics caused
by different possible data distributions and system workloads.
On this page
Shard the Collection (page 765)
Specify the Initial Number of Chunks (page 765)
Note: If chunk migrations are in progress while creating a hashed shard key collection, the initial chunk distribution
may be uneven until the balancer automatically balances the collection.
To shard a collection using a hashed shard key, use an operation in the mongo shell that resembles the following:
sh.shardCollection( "records.active", { a: "hashed" } )
This operation shards the active collection in the records database, using a hash of the a field as the shard key.
If you shard an empty collection using a hashed shard key, MongoDB automatically creates and migrates empty chunks
so that each shard has two chunks. To control how many chunks MongoDB creates when sharding the collection, use
shardCollection with the numInitialChunks parameter.
Important: MongoDB 2.4 adds support for hashed shard keys. After sharding a collection with a hashed shard key,
you must use the MongoDB 2.4 or higher mongos and mongod instances in your sharded cluster.
Warning: MongoDB hashed indexes truncate floating point numbers to 64-bit integers before hashing. For
example, a hashed index would store the same value for a field that held a value of 2.3, 2.2, or 2.9. To
prevent collisions, do not use a hashed index for floating point numbers that cannot be reliably converted to
64-bit integers (and then back to floating point). MongoDB hashed indexes do not support floating point values
larger than 2^53.
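The truncation described in the warning can be pictured with a small sketch. This is only a model of the step before hashing, not MongoDB's hash function; the helper names truncatedKey and isSafeForHashedIndex are hypothetical.

```javascript
// Model of the pre-hash step described above: the fractional part is dropped.
// This is NOT MongoDB's hash function; it only shows which inputs collide.
function truncatedKey(value) {
  return Math.trunc(value);
}

// True when a value is already a whole number and lies within 2^53,
// the range in which doubles represent every integer exactly.
function isSafeForHashedIndex(value) {
  return Number.isInteger(value) && Math.abs(value) <= Math.pow(2, 53);
}

// 2.3, 2.2, and 2.9 all truncate to the same integer, so they would collide.
const collided = [2.3, 2.2, 2.9].map(truncatedKey);
console.log(collided);
console.log(isSafeForHashedIndex(2.3), isSafeForHashedIndex(42));
```

A check like this could be run in the application layer before choosing a numeric field as a hashed shard key.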
On this page
Considerations (page 766)
Add a Shard to a Cluster (page 766)
You add shards to a sharded cluster after you create the cluster or any time that you need to add capacity to the cluster.
If you have not created a sharded cluster, see Deploy a Sharded Cluster (page 757).
In production environments, all shards should be replica sets.
Considerations
Balancing When you add a shard to a sharded cluster, you affect the balance of chunks among the shards of a cluster
for all existing sharded collections. The balancer will begin migrating chunks so that the cluster will achieve balance.
See Sharded Collection Balancing (page 750) for more information.
Changed in version 2.6: Chunk migrations can have an impact on disk space. Starting in MongoDB 2.6, the source
shard automatically archives the migrated documents by default. For details, see moveChunk directory (page 753).
Capacity Planning When adding a shard to a cluster, always ensure that the cluster has enough capacity to support
the migration required for balancing the cluster without affecting legitimate production traffic.
2. Add a shard to the cluster using the sh.addShard() method, as shown in the examples below. Issue
sh.addShard() separately for each shard. If the shard is a replica set, specify the name of the replica
set and specify a member of the set. In production deployments, all shards should be replica sets.
Optional
You can instead use the addShard database command, which lets you specify a name and maximum size for
the shard. If you do not specify these, MongoDB automatically assigns a name and maximum size. To use the
database command, see addShard.
To add a shard for a standalone mongod on port 27017 of mongodb0.example.net, issue the fol-
lowing command:
sh.addShard( "mongodb0.example.net:27017" )
Note: It might take some time for chunks to migrate to the new shard.
On this page
Overview (page 767)
Prerequisites (page 767)
Procedures (page 767)
Overview
This tutorial converts a single three-member replica set to a sharded cluster with two shards. Each shard is an inde-
pendent three-member replica set. This tutorial is specific to MongoDB 3.2. For other versions of MongoDB, refer to
the corresponding version of the MongoDB Manual.
The procedure is as follows:
1. Create the initial three-member replica set and insert data into a collection. See Set Up Initial Replica Set
(page 767).
2. Start the config servers and a mongos. See Deploy Config Server Replica Set and mongos (page 768).
3. Add the initial replica set as a shard. See Add Initial Replica Set as a Shard (page 769).
4. Create a second shard and add to the cluster. See Add Second Shard (page 769).
5. Shard the desired collection. See Shard a Collection (page 770).
Prerequisites
This tutorial uses a total of ten servers: one server for the mongos and three servers each for the first replica set, the
second replica set, and the config server replica set (page 734).
Each server must have a resolvable domain, hostname, or IP address within your system.
The tutorial uses the default data directories (e.g. /data/db and /data/configdb). Cre-
ate the appropriate directories with appropriate permissions. To use different paths, see
https://docs.mongodb.org/manual/reference/configuration-options .
The tutorial uses the default ports (e.g. 27017 and 27019). To use different ports, see
https://docs.mongodb.org/manual/reference/configuration-options.
Procedures
Set Up Initial Replica Set This procedure creates the initial three-member replica set rs0. The replica
set members are on the following hosts: mongodb0.example.net, mongodb1.example.net, and
mongodb2.example.net.
Step 1: Start each member of the replica set with the appropriate options. For each member, start a mongod,
specifying the replica set name through the replSet option. Include any other parameters specific to your deploy-
ment. For replication-specific parameters, see cli-mongod-replica-set.
mongod --replSet "rs0"
Repeat this step for the other two members of the rs0 replica set.
Step 2: Connect a mongo shell to a replica set member. Connect a mongo shell to one member of the replica set
(e.g. mongodb0.example.net)
mongo mongodb0.example.net
Step 3: Initiate the replica set. From the mongo shell, run rs.initiate() to initiate a replica set that consists
of the current member.
rs.initiate()
Step 5: Create and populate a new collection. The following step adds one million documents to the collection
test_collection and can take several minutes depending on your system.
Issue the following operations on the primary of the replica set:
use test
var bulk = db.test_collection.initializeUnorderedBulkOp();
people = ["Marc", "Bill", "George", "Eliot", "Matt", "Trey", "Tracy", "Greg", "Steve", "Kristina", "K
for(var i=0; i<1000000; i++){
user_id = i;
name = people[Math.floor(Math.random()*people.length)];
number = Math.floor(Math.random()*10001);
bulk.insert( { "user_id":user_id, "name":name, "number":number });
}
bulk.execute();
For more information on deploying a replica set, see Deploy a Replica Set (page 657).
Deploy Config Server Replica Set and mongos Starting in MongoDB 3.2, config servers for sharded clusters
can be deployed as a replica set (page 613). The replica set config servers must run the WiredTiger storage engine
(page 587). MongoDB 3.2 deprecates the use of three mirrored mongod instances for config servers.
This procedure deploys the three-member replica set for the config servers (page 734) and the mongos.
The config servers use the following hosts: mongodb7.example.net, mongodb8.example.net, and
mongodb9.example.net.
The mongos uses mongodb6.example.net.
Step 1: Deploy the config servers as a three-member replica set. Start a config server on
mongodb7.example.net, mongodb8.example.net, and mongodb9.example.net. Specify the same
replica set name. The config servers use the default data directory /data/configdb and the default port 27019.
mongod --configsvr --replSet configReplSet
To modify the default settings or to include additional options specific to your deploy-
ment, see https://docs.mongodb.org/manual/reference/program/mongod or
https://docs.mongodb.org/manual/reference/configuration-options.
Connect a mongo shell to one of the config servers and run rs.initiate() to initiate the replica set.
rs.initiate( {
_id: "configReplSet",
configsvr: true,
members: [
{ _id: 0, host: "mongodb7.example.net:27019" },
{ _id: 1, host: "mongodb8.example.net:27019" },
{ _id: 2, host: "mongodb9.example.net:27019" }
]
} )
Step 2: Start a mongos instance. On mongodb6.example.net, start the mongos specifying the config server
replica set name followed by a slash (/) and at least one of the config server hostnames and ports.
This tutorial specifies a small --chunkSize of 1 MB to test sharding with the test_collection created earlier.
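The command for this step is not reproduced above. As a sketch only, assuming the replica set name configReplSet and the config server hosts and port from Step 1, the value passed to --configdb could be assembled as follows. The helper configDbString is hypothetical; the resulting string is meant to be run from a system shell on mongodb6.example.net.

```javascript
// Build the value for mongos --configdb:
//   <replSetName>/<host:port>,<host:port>,...
function configDbString(replSetName, hosts, port) {
  return replSetName + "/" + hosts.map(function (h) {
    return h + ":" + port;
  }).join(",");
}

// Hosts and port taken from Step 1 of this procedure.
const configHosts = [
  "mongodb7.example.net",
  "mongodb8.example.net",
  "mongodb9.example.net"
];
const configDb = configDbString("configReplSet", configHosts, 27019);

// --chunkSize is given in megabytes; 1 is the small size used in this tutorial.
const mongosCommand = "mongos --configdb " + configDb + " --chunkSize 1";
console.log(mongosCommand);
```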
Add Initial Replica Set as a Shard The following procedure adds the initial replica set rs0 as a shard.
Step 2: Add the shard. Add a shard to the cluster with the sh.addShard method:
sh.addShard( "rs0/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017" )
Add Second Shard The following procedure deploys a new replica set rs1 for the second shard and
adds it to the cluster. The replica set members are on the following hosts: mongodb3.example.net,
mongodb4.example.net, and mongodb5.example.net.
Step 1: Start each member of the replica set with the appropriate options. For each member, start a mongod,
specifying the replica set name through the replSet option. Include any other parameters specific to your deploy-
ment. For replication-specific parameters, see cli-mongod-replica-set.
mongod --replSet "rs1"
Repeat this step for the other two members of the rs1 replica set.
Step 2: Connect a mongo shell to a replica set member. Connect a mongo shell to one member of the replica set
(e.g. mongodb3.example.net)
mongo mongodb3.example.net
Step 3: Initiate the replica set. From the mongo shell, run rs.initiate() to initiate a replica set that consists
of the current member.
rs.initiate()
Step 4: Add the remaining members to the replica set. Add the remaining members with the rs.add() method.
rs.add("mongodb4.example.net")
rs.add("mongodb5.example.net")
Step 6: Add the shard. In a mongo shell connected to the mongos, add the shard to the cluster with the
sh.addShard() method:
sh.addShard( "rs1/mongodb3.example.net:27017,mongodb4.example.net:27017,mongodb5.example.net:27017" )
Shard a Collection
Step 2: Enable sharding for a database. Before you can shard a collection, you must first enable sharding for the
collection's database. Enabling sharding for a database does not redistribute data but makes it possible to shard the
collections in that database.
The following operation enables sharding on the test database:
sh.enableSharding( "test" )
Step 3: Determine the shard key. For the collection to shard, determine the shard key. The shard key (page 739)
determines how MongoDB distributes the documents between shards. Good shard keys:
have values that are evenly distributed among all documents,
group documents that are often accessed at the same time into contiguous chunks, and
allow for effective distribution of activity among shards.
Once you shard a collection with the specified shard key, you cannot change the shard key. For more information on
shard keys, see Shard Keys (page 739) and Considerations for Selecting Shard Keys (page 763).
This procedure will use the number field as the shard key for test_collection.
Step 4: Create an index on the shard key. Before sharding a non-empty collection, create an index on the shard
key (page 755).
use test
db.test_collection.createIndex( { number : 1 } )
Step 5: Shard the collection. In the test database, shard the test_collection, specifying number as the
shard key.
use test
sh.shardCollection( "test.test_collection", { "number" : 1 } )
The balancer (page 750) will redistribute chunks of documents when it next runs. As clients insert additional docu-
ments into this collection, the mongos will route the documents between the shards.
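To make the routing described above more concrete, the following simplified model shows how a ranged shard key on number maps a document to a shard. The chunk boundaries are invented for illustration and the function shardFor is hypothetical; the real mongos reads chunk ranges from the cluster metadata.

```javascript
// Each chunk covers the half-open range [min, max) of shard key values.
// -Infinity and Infinity stand in for MongoDB's MinKey and MaxKey.
const chunks = [
  { min: -Infinity, max: 2500, shard: "rs0" },
  { min: 2500, max: 5000, shard: "rs1" },
  { min: 5000, max: 7500, shard: "rs0" },
  { min: 7500, max: Infinity, shard: "rs1" }
];

// Return the shard that owns the chunk containing the given key value.
function shardFor(chunkList, number) {
  for (const chunk of chunkList) {
    if (number >= chunk.min && number < chunk.max) {
      return chunk.shard;
    }
  }
  return null; // not reachable when the chunks cover the whole key space
}

console.log(shardFor(chunks, 42), shardFor(chunks, 2500), shardFor(chunks, 10000));
```

Because every document with the same number value falls into the same chunk, a field with few distinct values limits how finely the balancer can spread data.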
Step 6: Confirm the shard is balancing. To confirm balancing activity, run db.stats() or
db.printShardingStatus() in the test database.
use test
db.stats()
db.printShardingStatus()
"avgObjSize" : 111,
"dataSize" : 112448944,
"storageSize" : 177561600,
"numExtents" : 21,
"indexes" : 4,
"indexSize" : 58540160,
"fileSize" : 536870912,
"extentFreeList" : {
"num" : 0,
"totalSize" : 0
},
"ok" : 1
}
Run these commands for a second time to demonstrate that chunks are migrating from rs0 to rs1.
On this page
Prerequisites (page 773)
Procedure (page 773)
Starting in 3.2, config servers for a sharded cluster can be deployed as a replica set. Using a replica set for the config
servers improves consistency across the config servers, since MongoDB can take advantage of the standard replica
set read and write protocols for the config data. In addition, this allows a sharded cluster to have more than 3 config
servers since a replica set can have up to 50 members.
The following procedure upgrades three mirrored config servers to a config server replica set (page 735) without
downtime. To use this procedure, all the sharded cluster binaries must be at least version 3.2.4.
During this procedure there will be a period of time where the config servers will be read-only. During this period, cer-
tain catalog operations will fail if attempted. Operations that will not be available include adding and dropping shards,
creating and dropping databases, creating and dropping sharded collections, and migrating chunks (both manually and
via the balancer process). Normal read and write operations to existing collections will not be affected.
See also:
Upgrade Config Servers to Replica Set (Downtime) (page 776)
Prerequisites
All binaries in the sharded clusters must be at least version 3.2.4. See Upgrade a Sharded Cluster to 3.2
(page 895) for instructions to upgrade the sharded cluster.
The existing config servers must be in sync.
Procedure
Note: The procedure refers to the first config server, second config server, and third config server as listed in the
configDB setting of the mongos. This means that, for the following example:
mongos --configdb confServer1:port1,confServer2:port2,confServer3:port3
_id (page 710) corresponds to the replica set name for the config servers.
configsvr (page 711) must be set to true.
members (page 711) array contains a document that specifies:
members._id (page 711) which is a numeric identifier for the member.
members.host (page 711) which is a string corresponding to the config server's hostname and port.
3. Restart this config server as a single member replica set with:
the --replSet option set to the replica set name specified during the rs.initiate(),
the --configsvrMode option set to the legacy config server mode Sync Cluster Connection Config
(sccc),
the --configsvr option,
the --storageEngine option set to the storage engine used by this config server. For this upgrade
procedure, the existing config server can be using either MMAPv1 or WiredTiger,
the --port option set to the same port as before restart, and
the --dbpath option set to the same path as before restart.
Include additional options as specific to your deployment.
16
Important: The config server must use the same port as before.
4. Start the new mongod instances to add to the replica set. These instances must use the WiredTiger (page 587)
storage engine. Starting in 3.2, the default storage engine is WiredTiger for new mongod instances with new
data paths.
Important:
Do not add existing config servers to the replica set.
Use new dbpaths for the new instances.
The number of new mongod instances to add depends on the config server currently in the single-member
replica set:
If the config server is using MMAPv1, start 3 new mongod instances.
If the config server is using WiredTiger, start 2 new mongod instances.
16 If before the restart, your config server did not explicitly specify the --configsvr option or the --port option, the restart with the
Note: The example in this procedure assumes that the existing config servers use MMAPv1.
For each new mongod instance to add, include the --configsvr and the --replSet options:
mongod --configsvr --replSet csReplSet --port <port> --dbpath <path>
5. Using the mongo shell connected to the replica set config server, add the new mongod instances as non-voting
(page 637), priority 0 (page 621) members:
rs.add( { host: <host:port>, priority: 0, votes: 0 } )
6. Once all the new members have been added as non-voting (page 637), priority 0 (page 621) members, ensure
that the new nodes have completed the initial sync (page 648) and have reached SECONDARY (page 718) state.
To check the state of the replica set members, run rs.status() in the mongo shell:
rs.status()
7. Shut down one of the other non-replica set config servers; i.e. either the second or third config server listed
in the configDB setting of the mongos. At this point the config servers will go read-only, meaning certain
operations - such as creating and dropping databases and sharded collections - will not be available.
8. Reconfigure the replica set to allow all members to vote and have default priority of 1.
var cfg = rs.conf();
cfg.members[0].priority = 1;
cfg.members[1].priority = 1;
cfg.members[2].priority = 1;
cfg.members[3].priority = 1;
cfg.members[0].votes = 1;
cfg.members[1].votes = 1;
cfg.members[2].votes = 1;
cfg.members[3].votes = 1;
rs.reconfig(cfg);
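The same reconfiguration can be written as a loop so that it does not depend on the exact number of members. The sketch below operates on a plain object that stands in for the document returned by rs.conf(); the helper name restoreVotingDefaults and the example hosts are hypothetical. In the mongo shell, the result would still be passed to rs.reconfig().

```javascript
// Set every member of a replica set configuration document to the
// default priority and vote count of 1.
function restoreVotingDefaults(cfg) {
  cfg.members.forEach(function (member) {
    member.priority = 1;
    member.votes = 1;
  });
  return cfg;
}

// Hypothetical stand-in for the document returned by rs.conf().
const exampleCfg = {
  _id: "csReplSet",
  members: [
    { _id: 0, host: "confServer1:27019", priority: 1, votes: 1 },
    { _id: 1, host: "newConf1:27019", priority: 0, votes: 0 },
    { _id: 2, host: "newConf2:27019", priority: 0, votes: 0 },
    { _id: 3, host: "newConf3:27019", priority: 0, votes: 0 }
  ]
};

const updatedCfg = restoreVotingDefaults(exampleCfg);
console.log(updatedCfg.members);
```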
9. Step down the first config server, i.e. the server started with --configsvrMode=sccc.
rs.stepDown(600)
sharding:
clusterRole: configsvr
replication:
replSetName: csReplSet
net:
port: <port>
storage:
dbPath: <path>
engine: <storageEngine>
If the first config server uses the MMAPv1 storage engine, the member will transition to "REMOVED" state.
At this point the config server data will return to being writeable and all catalog operations - including creating
and dropping databases and sharded collections - will once again be possible.
12. Restart mongos instances with updated --configdb or sharding.configDB setting.
For the updated --configdb or sharding.configDB setting, specify the replica set name for the config
servers and the members in the replica set.
mongos --configdb csReplSet/<rsconfigsver1:port1>,<rsconfigsver2:port2>,<rsconfigsver3:port3>
13. Verify that the restarted mongos instances are aware of the protocol change. Connect a mongo shell to a
mongos instance and check the mongos collection in the config database:
use config
db.mongos.find()
The ping value for the mongos instances should indicate some time after the restart.
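One way to picture this check is to compare each ping timestamp against the time of the restart. The sketch below works on plain objects shaped like documents from the mongos collection, using only the _id and ping fields; the sample values, the restart time, and the helper name mongosNotSeenSince are assumptions for illustration.

```javascript
// Return the _id of every mongos whose last ping is not later than
// the given restart time.
function mongosNotSeenSince(mongosDocs, restartTime) {
  return mongosDocs
    .filter(function (doc) {
      return doc.ping.getTime() <= restartTime.getTime();
    })
    .map(function (doc) {
      return doc._id;
    });
}

// Hypothetical documents and restart time.
const restartTime = new Date("2016-03-01T12:00:00Z");
const mongosDocs = [
  { _id: "mongos0.example.net:27017", ping: new Date("2016-03-01T12:05:00Z") },
  { _id: "mongos1.example.net:27017", ping: new Date("2016-03-01T11:40:00Z") }
];

const stale = mongosNotSeenSince(mongosDocs, restartTime);
console.log(stale);
```

Any instance listed in the result has not reported in since the restart and may still need to be restarted with the updated setting.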
14. If the first config server uses the MMAPv1 storage engine, remove the member from the replica set. Connect a
mongo shell to the current primary and use rs.remove():
Important: Only if the config server uses the MMAPv1 storage engine.
rs.remove("<hostname>:<port>")
On this page
Prerequisites (page 777)
Procedure (page 777)
New in version 3.2: Starting in 3.2, config servers for a sharded cluster can be deployed as a replica set.
The following procedure upgrades three mirrored config servers to a config server replica set (page 735). Using a
replica set for the config servers improves consistency across the config servers, since MongoDB can take advantage
of the standard replica set read and write protocols for the config data. In addition, this allows a sharded cluster to
have more than 3 config servers since a replica set can have up to 50 members.
Prerequisites
All binaries in the sharded clusters must be at least version 3.2. See Upgrade a Sharded Cluster to 3.2 (page 895)
for instructions to upgrade the sharded cluster.
The existing config servers must be in sync.
Procedure
Important: The procedure outlined in this tutorial requires downtime. If all the sharded cluster binaries are at least
version 3.2.4, you can also convert the config servers to replica set without downtime. For details, see Upgrade Config
Servers to Replica Set (page 772).
_id (page 710) corresponds to the replica set name for the config servers.
version (page 711) set to 1, corresponding to the initial version of the replica set configuration.
configsvr (page 711) must be set to true.
members (page 711) array contains a document that specifies:
members._id (page 711) which is a numeric identifier for the member.
members.host (page 711) which is a string corresponding to the config server's hostname and port.
3. Restart this config server as a single member replica set with:
the --replSet option set to the replica set name specified during the rs.initiate(),
the --configsvrMode option set to the legacy config server mode Sync Cluster Connection Config
(sccc),
the --configsvr option, and
the --storageEngine option set to the storage engine used by this config server. For this upgrade
procedure, the existing config server can be using either MMAPv1 or WiredTiger.
Include additional options as specific to your deployment.
mongod --configsvr --replSet csReplSet --configsvrMode=sccc --storageEngine <storageEngine> --po
net:
port: <port>
storage:
dbPath: <path>
engine: <storageEngine>
4. Start the new mongod instances to add to the replica set. These instances must use the WiredTiger (page 587)
storage engine. Starting in 3.2, the default storage engine is WiredTiger for new mongod instances with new
data paths.
Important:
Do not add existing config servers to the replica set.
Use new dbpaths for the new instances.
The number of new mongod instances to add depends on the config server currently in the single-member
replica set:
If the config server is using MMAPv1, start 3 new mongod instances.
If the config server is using WiredTiger, start 2 new mongod instances.
Note: The example in this procedure assumes that the existing config servers use MMAPv1.
For each new mongod instance to add, include the --configsvr and the --replSet options:
mongod --configsvr --replSet csReplSet --port <port> --dbpath <path>
5. Using the mongo shell connected to the replica set config server, add the new mongod instances as non-voting
(page 637), priority 0 (page 621) members:
rs.add( { host: <host:port>, priority: 0, votes: 0 } )
6. Once all the new members have been added as non-voting (page 637), priority 0 (page 621) members, ensure
that the new nodes have completed the initial sync (page 648) and have reached SECONDARY (page 718) state.
To check the state of the replica set members, run rs.status() in the mongo shell:
rs.status()
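Reading the status output by eye can be error-prone with several members. As a sketch, the following function lists the members that are not yet in PRIMARY or SECONDARY state, using the name and stateStr fields of each entry in the members array. The sample status object and the helper name membersNotReady are hypothetical.

```javascript
// List the names of members that have not reached PRIMARY or SECONDARY state.
function membersNotReady(status) {
  return status.members
    .filter(function (m) {
      return m.stateStr !== "PRIMARY" && m.stateStr !== "SECONDARY";
    })
    .map(function (m) {
      return m.name;
    });
}

// Hypothetical, heavily trimmed stand-in for the output of rs.status().
const exampleStatus = {
  set: "csReplSet",
  members: [
    { name: "confServer1:27019", stateStr: "PRIMARY" },
    { name: "newConf1:27019", stateStr: "SECONDARY" },
    { name: "newConf2:27019", stateStr: "STARTUP2" },
    { name: "newConf3:27019", stateStr: "SECONDARY" }
  ]
};

const pending = membersNotReady(exampleStatus);
console.log(pending);
```

An empty result suggests that every new member has finished its initial sync and the procedure can continue.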
7. Shut down one of the other non-replica set config servers; i.e. either the second or third config server listed in
the configDB setting of the mongos.
8. Reconfigure the replica set to allow all members to vote and have default priority of 1.
var cfg = rs.conf();
cfg.members[0].priority = 1;
cfg.members[1].priority = 1;
cfg.members[2].priority = 1;
cfg.members[3].priority = 1;
cfg.members[0].votes = 1;
cfg.members[1].votes = 1;
cfg.members[2].votes = 1;
cfg.members[3].votes = 1;
rs.reconfig(cfg);
9. Step down the first config server, i.e. the server started with --configsvrMode=sccc.
rs.stepDown()
Important: If the first config server uses the WiredTiger storage engine, do not remove.
rs.remove("<hostname>:<port>")
12. If the first config server uses WiredTiger (page 587), restart the first config server in config server replica set
(CSRS) mode; i.e. restart without the --configsvrMode=sccc option:
Important: If the first config server uses the MMAPv1 storage engine, do not restart.
mongod --configsvr --replSet csReplSet --storageEngine wiredTiger --port <port> --dbpath <path>
15. Re-enable the balancer as described in Enable the Balancer (page 795).
On this page
Convert a Cluster with a Single Shard into a Replica Set (page 780)
Convert a Sharded Cluster into a Replica Set (page 780)
This tutorial describes the process for converting a sharded cluster to a non-sharded replica set. To convert a replica
set into a sharded cluster, see Convert a Replica Set to a Sharded Cluster (page 767). See the Sharding (page 725)
documentation for more information on sharded clusters.
In the case of a sharded cluster with only one shard, that shard contains the full data set. Use the following procedure
to convert that cluster into a non-sharded replica set:
1. Reconfigure the application to connect to the primary member of the replica set hosting the single shard. That
system will be the new replica set.
2. Optionally remove the --shardsvr option, if your mongod started with this option.
Tip
Changing the --shardsvr option will change the port on which mongod listens for incoming connections.
The single-shard cluster is now a non-sharded replica set that will accept read and write operations on the data set.
You may now decommission the remaining sharding infrastructure.
Use the following procedure to transition from a sharded cluster with more than one shard to an entirely new replica
set.
1. With the sharded cluster running, deploy a new replica set (page 657) in addition to your sharded cluster. The
replica set must have sufficient capacity to hold all of the data files from all of the current shards combined. Do
not configure the application to connect to the new replica set until the data transfer is complete.
2. Stop all writes to the sharded cluster. You may reconfigure your application or stop all mongos instances.
If you stop all mongos instances, the applications will not be able to read from the database. In that case,
start a temporary mongos instance that applications cannot access for the data migration procedure.
3. Use mongodump and mongorestore (page 272) to migrate the data from the mongos instance to the new replica
set.
Note: Not all collections on all databases are necessarily sharded. Do not solely migrate the sharded collections.
Ensure that all databases and all collections migrate correctly.
4. Reconfigure the application to use the non-sharded replica set instead of the mongos instance.
The application will now use the un-sharded replica set for reads and writes. You may now decommission the remain-
ing unused sharded cluster infrastructure.
On this page
List Databases with Sharding Enabled (page 781)
List Shards (page 782)
View Cluster Details (page 782)
To list the databases that have sharding enabled, query the databases collection in the Config Database (page 816).
A database has sharding enabled if the value of the partitioned field is true. Connect to a mongos instance
with a mongo shell, and run the following operation to get a full list of databases with sharding enabled:
use config
db.databases.find( { "partitioned": true } )
Example
You can use the following sequence of commands to return a list of all databases in the cluster:
use config
db.databases.find()
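The effect of the partitioned filter can be seen on a small in-memory sample. The documents below are a hypothetical stand-in for the contents of the databases collection, and shardedDatabaseNames is an illustrative helper, not a MongoDB method.

```javascript
// Return the names of databases whose partitioned field is true.
function shardedDatabaseNames(databases) {
  return databases
    .filter(function (db) {
      return db.partitioned === true;
    })
    .map(function (db) {
      return db._id;
    });
}

// Hypothetical documents shaped like entries in the databases collection.
const databases = [
  { _id: "admin", partitioned: false, primary: "config" },
  { _id: "contacts", partitioned: true, primary: "shard0000" },
  { _id: "test", partitioned: false, primary: "shard0000" }
];

const shardedNames = shardedDatabaseNames(databases);
console.log(shardedNames);
```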
List Shards
To list the current set of configured shards, use the listShards command, as follows:
use admin
db.runCommand( { listShards : 1 } )
To view cluster details, issue db.printShardingStatus() or sh.status(). Both methods return the same
output.
Example
In the following example output from sh.status()
sharding version displays the version number of the shard metadata.
shards displays a list of the mongod instances used as shards in the cluster.
databases displays all databases in the cluster, including databases that do not have sharding enabled.
The chunks information for the foo database displays how many chunks are on each shard and displays the
range of each chunk.
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "shard0000", "host" : "m0.example.net:30001" }
{ "_id" : "shard0001", "host" : "m3.example2.net:50000" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "contacts", "partitioned" : true, "primary" : "shard0000" }
foo.contacts
shard key: { "zip" : 1 }
chunks:
shard0001 2
shard0002 3
shard0000 2
{ "zip" : { "$minKey" : 1 } } -->> { "zip" : "56000" } on : shard0001 { "t" : 2, "i" : 0
{ "zip" : 56000 } -->> { "zip" : "56800" } on : shard0002 { "t" : 3, "i" : 4 }
{ "zip" : 56800 } -->> { "zip" : "57088" } on : shard0002 { "t" : 4, "i" : 2 }
{ "zip" : 57088 } -->> { "zip" : "57500" } on : shard0002 { "t" : 4, "i" : 3 }
{ "zip" : 57500 } -->> { "zip" : "58140" } on : shard0001 { "t" : 4, "i" : 0 }
{ "zip" : 58140 } -->> { "zip" : "59000" } on : shard0000 { "t" : 4, "i" : 1 }
{ "zip" : 59000 } -->> { "zip" : { "$maxKey" : 1 } } on : shard0000 { "t" : 3, "i" : 3 }
{ "_id" : "test", "partitioned" : false, "primary" : "shard0000" }
On this page
Overview (page 783)
Considerations (page 783)
Procedure (page 783)
Changed in version 3.2: Starting in MongoDB 3.2, config servers for sharded clusters can be deployed as a replica
set (page 613). The replica set config servers must run the WiredTiger storage engine (page 587). MongoDB 3.2
deprecates the use of three mirrored mongod instances for config servers.
For replacing config servers deployed as three mirrored mongod instances, see Migrate Config Servers with the Same
Hostname (page 784) and Migrate Config Servers with Different Hostnames (page 785).
Overview
If the config server replica set becomes read only, i.e. does not have a primary, the sharded cluster cannot support
operations that change the cluster metadata, such as chunk splits and migrations. Although no chunks can be split or
migrated, applications will be able to write data to the sharded cluster.
If one of the config servers is unavailable or inoperable, repair or replace it as soon as possible. The following
procedure replaces a member of a config server replica set (page 734) with a new member.
The tutorial is specific to MongoDB 3.2. For earlier versions of MongoDB, refer to the corresponding version of the
MongoDB Manual.
Considerations
The following restrictions apply to a replica set configuration when used for config servers:
Must have zero arbiters (page 625).
Must have no delayed members (page 624).
Must build indexes (i.e. no member should have buildIndexes setting set to false).
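These restrictions can be checked against a replica set configuration before it is used for config servers. The following is a minimal sketch, assuming a configuration document shaped like the output of rs.conf(), whose members may carry the arbiterOnly, slaveDelay, and buildIndexes fields; the findConfigServerViolations name and the hostnames in the usage note are hypothetical.

```javascript
// Hypothetical helper: check a replica set configuration document (the
// shape returned by rs.conf()) against the three restrictions above.
// Returns a list of human-readable problems; an empty list means none found.
function findConfigServerViolations(conf) {
  var problems = [];
  conf.members.forEach(function (member) {
    if (member.arbiterOnly === true) {
      problems.push(member.host + ": arbiters are not allowed");
    }
    if (member.slaveDelay > 0) {
      problems.push(member.host + ": delayed members are not allowed");
    }
    if (member.buildIndexes === false) {
      problems.push(member.host + ": buildIndexes must not be false");
    }
  });
  return problems;
}
```

For example, findConfigServerViolations(rs.conf()) in a mongo shell connected to a member of the config server replica set would list any members that break the restrictions.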
Procedure
Step 1: Start the replacement config server. Start a mongod instance, specifying both the --configsvr and
--replSet options.
mongod --configsvr --replSet <replicaSetName>
Step 2: Add the new config server to the replica set. Connect a mongo shell to the primary of the config server
replica set and use rs.add() to add the new member.
rs.add("<hostnameNew>:<portNew>")
The initial sync process copies all the data from one member of the config server replica set to the new member without
restarting.
mongos instances automatically recognize the change in the config server replica set members without restarting.
Step 3: Shut down the member to replace. If replacing the primary member, step down the primary first before
shutting down.
Step 4: Remove the member to replace from the config server replica set. Upon completion of initial sync of
the replacement config server, from a mongo shell connected to the primary, use rs.remove() to remove the old
member.
rs.remove("<hostnameOld>:<portOld>")
mongos instances automatically recognize the change in the config server replica set members without restarting.
Step 5: If necessary, update mongos configuration or DNS entry. With replica set config servers, the mongos
instances specify in the --configdb or sharding.configDB setting the config server replica set name and at
least one of the replica set members.
As such, if the mongos instance does not specify the removed replica set member in the --configdb or
sharding.configDB setting, no further action is necessary.
If, however, a mongos instance specified the removed member in the --configdb or configDB setting, either:
Update the setting for the next time you restart the mongos, or
Modify the DNS entry that points to the system that provided the old config server, so that the same hostname
points to the new config server.
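Whether a given mongos needs attention can be decided by inspecting its config server setting. The following is a minimal sketch, assuming the setting has the form <replSetName>/<host1:port1>,<host2:port2>; the function names and the hostnames in the usage note are hypothetical.

```javascript
// Hypothetical helper: split a --configdb / sharding.configDB value of the
// form "<replSetName>/<host1:port1>,<host2:port2>" into its parts.
function parseConfigDB(value) {
  var slash = value.indexOf("/");
  if (slash < 0) {
    throw new Error("expected <replSetName>/<host:port>[,<host:port>...]");
  }
  return {
    replSetName: value.slice(0, slash),
    members: value.slice(slash + 1).split(",").map(function (host) {
      return host.trim();
    })
  };
}

// True if the mongos setting still names the member that was removed.
function referencesRemovedMember(configDBValue, removedMember) {
  return parseConfigDB(configDBValue).members.indexOf(removedMember) !== -1;
}
```

For instance, referencesRemovedMember("configRS/cfg1.example.net:27019,cfg2.example.net:27019", "cfg2.example.net:27019") reports that this mongos still lists the removed member, so its setting or the DNS entry should be updated.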
Important: This procedure applies to migrating config servers when using three mirrored mongod instances as
config servers.
Starting in MongoDB 3.2, config servers can be deployed as a replica set (page 613). MongoDB 3.2 deprecates the use
of three mirrored mongod instances for config servers.
For replacing config servers deployed as members of a replica set, see Replace a Config Server (page 783).
For a sharded cluster (page 731) that uses three mirrored config servers, the following procedure migrates a config
server (page 734) to a new system that uses the same hostname.
To migrate all three mirrored config servers, perform this procedure for each config server separately and migrate the
config servers in reverse order from how they are listed in the mongos instances' configDB string. Start with the
last config server listed in the configDB string.
1. Shut down the config server.
This renders all config data for the sharded cluster read only.
2. Change the DNS entry that points to the system that provided the old config server, so that the same hostname
points to the new system. How you do this depends on how you organize your DNS and hostname resolution
services.
3. Copy the contents of dbPath from the old config server to the new config server.
For example, to copy the contents of dbPath to a machine named mongodb.config2.example.net,
you might issue a command similar to the following:
rsync -az /data/configdb/ mongodb.config2.example.net:/data/configdb
4. Start the config server instance on the new system. The default invocation is:
mongod --configsvr
When you start the third config server, your cluster will become writable and it will be able to create new splits and
migrate chunks as needed.
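The reverse-order rule for migrating all three mirrored config servers can be derived directly from the configDB string. The following is a minimal sketch, assuming a comma-separated list of host:port entries; the mirroredMigrationOrder name and the hostnames are hypothetical.

```javascript
// Hypothetical helper: given the comma-separated configDB string used by the
// mongos instances for mirrored config servers, return the hosts in the
// order in which to migrate them (last listed first).
function mirroredMigrationOrder(configDB) {
  return configDB.split(",").map(function (host) {
    return host.trim();
  }).reverse();
}

var migrationOrder = mirroredMigrationOrder(
  "cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019"
);
```

Here migrationOrder starts with cfg2.example.net:27019 and ends with cfg0.example.net:27019.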
On this page
Overview (page 785)
Considerations (page 785)
Procedure (page 785)
Important: This procedure applies to migrating config servers when using three mirrored mongod instances as
config servers.
Changed in version 3.2: Starting in MongoDB 3.2, config servers can be deployed as a replica set (page 613). MongoDB
3.2 deprecates the use of three mirrored mongod instances for config servers.
For replacing config servers deployed as members of a replica set, see Replace a Config Server (page 783).
Overview
For a sharded cluster (page 731) that uses three mirrored config servers, all three config servers must be available in
order to support operations that result in cluster metadata changes, e.g. chunk splits and migrations. If one of the
config servers is unavailable or inoperable, you must replace it as soon as possible.
For a sharded cluster (page 731) that uses three mirrored config servers, this procedure migrates a config server
(page 734) to a new server that uses a different hostname. Use this procedure only if the config server will not be
accessible via the same hostname. If possible, avoid changing the hostname so that you can instead use the procedure
to migrate a config server and use the same hostname (page 784).
Considerations
With three mirrored config servers, changing a config server's (page 734) hostname requires downtime and requires
restarting every process in the sharded cluster.
While migrating config servers, always make sure that all mongos instances have three config servers specified in the
configDB setting at all times. Also ensure that you specify the config servers in the same order for each mongos
instance's configDB setting.
Procedure
Important: This procedure applies to migrating config servers when using three mirrored mongod instances as
config servers. For replacing config servers deployed as members of a replica set, see Replace a Config Server
(page 783).
1. Disable the cluster balancer process temporarily. See Disable the Balancer (page 794) for more information.
4. Start the config server instance on the new system. The default invocation is:
mongod --configsvr
On this page
Disable the Balancer (page 787)
Migrate Each Config Server Separately (page 787)
Restart the mongos Instances (page 788)
Migrate the Shards (page 788)
Re-Enable the Balancer (page 789)
The tutorial is specific to MongoDB 3.2. For earlier versions of MongoDB, refer to the corresponding version of the
MongoDB Manual.
Changed in version 3.2.
Starting in MongoDB 3.2, config servers for sharded clusters can be deployed as a replica set (page 613). The replica
set config servers must run the WiredTiger storage engine (page 587). MongoDB 3.2 deprecates the use of three
mirrored mongod instances for config servers.
This procedure moves the components of the sharded cluster to a new hardware system without downtime for reads
and writes.
Important: While the migration is in progress, do not attempt to change the cluster metadata (page 756). Do not
use any operation that modifies the cluster metadata in any way. For example, do not create or drop databases, create
or drop collections, or use any sharding commands.
If your cluster includes a shard backed by a standalone mongod instance, consider converting the standalone to a
replica set (page 669) to simplify migration and to let you keep the cluster online during future maintenance. Migrating
a shard as standalone is a multi-step process that may require downtime.
Disable the balancer to stop chunk migration (page 751) and do not perform any metadata write operations until the
process finishes. If a migration is in progress, the balancer will complete the in-progress migration before stopping.
To disable the balancer, connect to one of the cluster's mongos instances and issue the following method:
sh.stopBalancer()
Step 1: Start the replacement config server. Start a mongod instance, specifying both the --configsvr and
--replSet options.
mongod --configsvr --replSet <replicaSetName>
Step 2: Add the new config server to the replica set. Connect a mongo shell to the primary of the config server
replica set and use rs.add() to add the new member.
rs.add("<hostnameNew>:<portNew>")
The initial sync process copies all the data from one member of the config server replica set to the new member without
restarting.
mongos instances automatically recognize the change in the config server replica set members without restarting.
Step 3: Shut down the member to replace. If replacing the primary member, step down the primary first before
shutting down.
Changed in version 3.2: With replica set config servers, the mongos instances specify in the --configdb or
sharding.configDB setting the config server replica set name and at least one of the replica set members. The
mongos instances for the sharded cluster must specify the same config server replica set name but can specify different
members of the replica set.
If a mongos instance specifies a migrated replica set member in the --configdb or sharding.configDB
setting, update the config server setting for the next time you restart the mongos instance.
For more information, see Start the mongos Instances (page 759).
Migrate the shards one at a time. For each shard, follow the appropriate procedure in this section.
Migrate a Replica Set Shard
To migrate a replica set shard, migrate each member separately. First migrate the
non-primary members, and then migrate the primary last.
If the replica set has two voting members, add an arbiter (page 625) to the replica set to ensure the set keeps a majority
of its votes available during the migration. You can remove the arbiter after completing the migration.
Migrate the Primary in a Replica Set Shard
While migrating the replica set's primary, the set must elect a new
primary. This failover process renders the replica set unavailable to perform reads or accept writes for the
duration of the election, which typically completes quickly. If possible, plan the migration during a maintenance
window.
1. Step down the primary to allow the normal failover (page 635) process. To step down the primary, connect to
the primary and issue either the replSetStepDown command or the rs.stepDown() method. The
following example shows the rs.stepDown() method:
rs.stepDown()
2. Once the primary has stepped down and another member has assumed the PRIMARY (page 718) state, migrate
the stepped-down primary by following the Migrate a Member of a Replica Set Shard (page 788) procedure.
You can check the output of rs.status() to confirm the change in status.
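Confirming the change in status can be scripted. The following is a minimal sketch, assuming a document shaped like the output of rs.status(), in which each entry of members has name and stateStr fields; the currentPrimary name and the hostnames are hypothetical.

```javascript
// Hypothetical helper: return the name of the member reporting PRIMARY in a
// document shaped like rs.status() output, or null if no member currently
// reports PRIMARY (for example, while an election is in progress).
function currentPrimary(status) {
  for (var i = 0; i < status.members.length; i++) {
    if (status.members[i].stateStr === "PRIMARY") {
      return status.members[i].name;
    }
  }
  return null;
}

// Sample status after the old primary (shard-a1) has stepped down.
var sampleStatus = { members: [
  { name: "shard-a1.example.net:27018", stateStr: "SECONDARY" },
  { name: "shard-a2.example.net:27018", stateStr: "PRIMARY" },
  { name: "shard-a3.example.net:27018", stateStr: "SECONDARY" }
] };
var newPrimary = currentPrimary(sampleStatus);
```

In the mongo shell, currentPrimary(rs.status()) could be compared with the name of the stepped-down member before that member is migrated.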
Migrate a Standalone Shard
The ideal procedure for migrating a standalone shard is to convert the standalone to a
replica set (page 669) and then use the procedure for migrating a replica set shard (page 788). In production clusters,
all shards should be replica sets, which provides continued availability during maintenance windows.
Migrating a shard as standalone is a multi-step process during which part of the shard may be unavailable. If the shard
is the primary shard for a database, the process includes the movePrimary command. While the movePrimary
runs, you should stop modifying data in that database. To migrate the standalone shard, use the Remove Shards from
an Existing Sharded Cluster (page 797) procedure.
To complete the migration, re-enable the balancer to resume chunk migrations (page 751).
Connect to one of the cluster's mongos instances and pass true to the sh.setBalancerState() method:
sh.setBalancerState(true)
This procedure shuts down the mongod instance of a config server (page 734) in order to create a backup of a sharded
cluster's (page 725) metadata. The cluster's config servers store all of the cluster's metadata, most importantly the
mapping from chunks to shards.
When you perform this procedure, the cluster remains operational 17 .
1. Disable the cluster balancer process temporarily. See Disable the Balancer (page 794) for more information.
2. Shut down one of the config databases.
3. Create a full copy of the data files (i.e. the path specified by the dbPath option for the config instance.)
4. Restart the original configuration server.
5. Re-enable the balancer to allow the cluster to resume normal balancing operations. See the Disable the Balancer
(page 794) section for more information on managing the balancer process.
See also:
MongoDB Backup Methods (page 200).
17 While one of the three config servers is unavailable, the cluster cannot split any chunks nor can it migrate chunks between shards. Your
application will be able to write data to the cluster. See Config Servers (page 734) for more information.
On this page
Schedule a Window of Time for Balancing to Occur (page 790)
Configure Default Chunk Size (page 790)
Change the Maximum Storage Size for a Given Shard (page 790)
Change Replication Behavior for Chunk Migration (page 791)
The balancer is a process that runs on one of the mongos instances in a cluster and ensures that chunks are evenly
distributed throughout a sharded cluster. In most deployments, the default balancer configuration is sufficient for
normal operation. However, administrators might need to modify balancer behavior depending on application or
operational requirements. If you encounter a situation where you need to modify the behavior of the balancer, use the
procedures described in this document.
For conceptual information about the balancer, see Sharded Collection Balancing (page 750) and Cluster Balancer
(page 750).
You can schedule a window of time during which the balancer can migrate chunks, as described in the following
procedures:
Schedule the Balancing Window (page 793)
Remove a Balancing Window Schedule (page 794).
The mongos instances use their own local time zones when respecting the balancer window.
The default chunk size for a sharded cluster is 64 megabytes. In most situations, the default size is appropriate for
splitting and migrating chunks. For information on how chunk size affects deployments, see Chunk Size
(page 754).
Changing the default chunk size affects chunks that are processed during migrations and auto-splits but does not
retroactively affect all chunks.
To configure default chunk size, see Modify Chunk Size in a Sharded Cluster (page 805).
The maxSize field in the shards (page 821) collection in the config database (page 816) sets the maximum size
for a shard, allowing you to control whether the balancer will migrate chunks to a shard. If mem.mapped size 18 is
above a shard's maxSize, the balancer will not move chunks to the shard. Also, the balancer will not move chunks
off an overloaded shard. This must happen manually. The maxSize value only affects the balancer's selection of
destination shards.
By default, maxSize is not specified, allowing shards to consume the total amount of available space on their
machines if necessary.
You can set maxSize both when adding a shard and once a shard is running.
18 This value includes the mapped size of all data files including the local and admin databases. Account for this when setting maxSize.
To set maxSize when adding a shard, set the addShard command's maxSize parameter to the maximum size in
megabytes. For example, the following command run in the mongo shell adds a shard with a maximum size of 125
megabytes:
db.adminCommand( { addShard : "example.net:34008", maxSize : 125 } )
To set maxSize on an existing shard, insert or update the maxSize field in the shards (page 821) collection in the
config database (page 816). Set the maxSize in megabytes.
Example
Assume you have the following shard without a maxSize field:
{ "_id" : "shard0000", "host" : "example.net:34001" }
Run the following sequence of commands in the mongo shell to insert a maxSize of 125 megabytes:
use config
db.shards.update( { _id : "shard0000" }, { $set : { maxSize : 125 } } )
To later increase the maxSize setting to 250 megabytes, run the following:
use config
db.shards.update( { _id : "shard0000" }, { $set : { maxSize : 250 } } )
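The effect of maxSize on destination selection can be expressed as a simple comparison. The following is a minimal sketch, assuming a shard document from the shards collection and a mapped size in megabytes supplied by the caller; the canReceiveChunks name is hypothetical, and treating a missing or zero maxSize as "no limit" is an assumption of this sketch.

```javascript
// Hypothetical helper: decide whether the balancer would consider a shard as
// a destination, given the shard's document and its mapped size in megabytes.
// Assumption: a missing or zero maxSize means "no limit".
function canReceiveChunks(shardDoc, mappedSizeMB) {
  if (!shardDoc.maxSize) {
    return true;
  }
  // A mapped size above maxSize excludes the shard as a destination.
  return mappedSizeMB <= shardDoc.maxSize;
}

var limitedShard = { _id: "shard0000", host: "example.net:34001", maxSize: 125 };
var unlimitedShard = { _id: "shard0001", host: "example.net:34002" };
```

With these sample documents, limitedShard accepts chunks at a mapped size of 100 megabytes but not at 126, while unlimitedShard is never excluded by size.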
Secondary Throttle
Changed in version 3.0.0: The balancer configuration document added configurable
writeConcern to control the semantics of the _secondaryThrottle option.
The _secondaryThrottle parameter of the balancer and the moveChunk command affects the replication
behavior during chunk migration (page 753). By default, _secondaryThrottle is true, which means each
document move during chunk migration propagates to at least one secondary before the balancer proceeds with the next
document: this is equivalent to a write concern of { w: 2 }.
You can also configure the writeConcern for the _secondaryThrottle operation, to configure how
migrations will wait for replication to complete. For more information on the replication behavior during various steps of
chunk migration, see Chunk Migration and Replication (page 753).
To change the balancer's _secondaryThrottle and writeConcern values, connect to a mongos instance and
directly update the _secondaryThrottle value in the settings (page 820) collection of the config database
(page 816). For example, from a mongo shell connected to a mongos, issue the following command:
use config
db.settings.update(
{ "_id" : "balancer" },
{ $set : { "_secondaryThrottle" : false ,
"writeConcern": { "w": "majority" } } },
{ upsert : true }
)
The effects of changing the _secondaryThrottle and writeConcern value may not be immediate. To ensure
an immediate effect, stop and restart the balancer to enable the selected value of _secondaryThrottle. See
Manage Sharded Cluster Balancer (page 792) for details.
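When the same settings change is applied from scripts, building the update document in one place keeps the field names consistent. The following is a minimal sketch; the buildThrottleUpdate name is hypothetical, and the field names are the ones used in the update shown above.

```javascript
// Hypothetical helper: build the update document for the balancer settings.
// writeConcern is optional; when omitted, only _secondaryThrottle is set.
function buildThrottleUpdate(secondaryThrottle, writeConcern) {
  var fields = { _secondaryThrottle: Boolean(secondaryThrottle) };
  if (writeConcern !== undefined) {
    fields.writeConcern = writeConcern;
  }
  return { $set: fields };
}

var throttleUpdate = buildThrottleUpdate(false, { w: "majority" });
```

In a mongo shell connected to a mongos, the result could be passed as the second argument of db.getSiblingDB("config").settings.update(), together with the { "_id" : "balancer" } query and the { upsert : true } option.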
Wait for Delete
The _waitForDelete setting of the balancer and the moveChunk command affects how the
balancer migrates multiple chunks from a shard. By default, the balancer does not wait for the ongoing migration's
delete phase to complete before starting the next chunk migration. To have the delete phase block the start of the next
chunk migration, you can set the _waitForDelete to true.
For details on chunk migration, see Chunk Migration (page 752). For details on the chunk migration queuing behavior,
see Chunk Migration Queuing (page 753).
The _waitForDelete setting is generally for internal testing purposes. To change the balancer's _waitForDelete
value:
1. Connect to a mongos instance.
2. Update the _waitForDelete value in the settings (page 820) collection of the config database
(page 816). For example:
use config
db.settings.update(
{ "_id" : "balancer" },
{ $set : { "_waitForDelete" : true } },
{ upsert : true }
)
On this page
Check the Balancer State (page 793)
Check the Balancer Lock (page 793)
Schedule the Balancing Window (page 793)
Remove a Balancing Window Schedule (page 794)
Disable the Balancer (page 794)
Enable the Balancer (page 795)
Disable Balancing During Backups (page 795)
Disable Balancing on a Collection (page 796)
Enable Balancing on a Collection (page 796)
Confirm Balancing is Enabled or Disabled (page 796)
This page describes common administrative procedures related to balancing. For an introduction to balancing, see
Sharded Collection Balancing (page 750). For lower level information on balancing, see Cluster Balancer (page 750).
See also:
Configure Behavior of Balancer Process in Sharded Clusters (page 790)
sh.getBalancerState() checks if the balancer is enabled (i.e. that the balancer is permitted to run).
sh.getBalancerState() does not check if the balancer is actively balancing chunks.
To see if the balancer is enabled in your sharded cluster, issue the following command, which returns a boolean:
sh.getBalancerState()
New in version 3.0.0: You can also see if the balancer is enabled using sh.status(). The currently-enabled
field indicates whether the balancer is enabled, while the currently-running field indicates if the balancer is
currently running.
When this command returns, you will see output like the following:
{ "_id" : "balancer",
"process" : "mongos0.example.net:1292810611:1804289383",
"state" : 2,
"ts" : ObjectId("4d0f872630c42d1978be8a2e"),
"when" : "Mon Dec 20 2010 11:41:10 GMT-0500 (EST)",
"who" : "mongos0.example.net:1292810611:1804289383:Balancer:846930886",
"why" : "doing balance round" }
In some situations, particularly when your data set grows slowly and a migration can impact performance, it is useful
to ensure that the balancer is active only at certain times. The following procedure specifies the activeWindow,
which is the timeframe during which the balancer will be able to migrate chunks:
Step 1: Connect to mongos using the mongo shell. You can connect to any mongos in the cluster.
Step 2: Switch to the Config Database. Issue the following command to switch to the config database.
use config
Step 3: Ensure that the balancer is not stopped. The balancer will not activate in the stopped state. To ensure
that the balancer is not stopped, use sh.setBalancerState(), as in the following:
sh.setBalancerState( true )
The balancer will not start if you are outside of the activeWindow timeframe.
Step 4: Modify the balancer's window. Set the activeWindow using update(), as in the following:
db.settings.update(
{ _id: "balancer" },
{ $set: { activeWindow : { start : "<start-time>", stop : "<stop-time>" } } },
{ upsert: true }
)
Replace <start-time> and <stop-time> with time values using two digit hour and minute values (i.e. HH:MM)
that specify the beginning and end boundaries of the balancing window.
For HH values, use hour values ranging from 00 - 23.
For MM values, use minute values ranging from 00 - 59.
MongoDB evaluates the start and stop times relative to the time zone of each individual mongos instance in the
sharded cluster. If your mongos instances are physically located in different time zones, set the time zone on each
server to UTC+-00:00 so that the balancer window is uniformly interpreted.
Note: The balancer window must be sufficient to complete the migration of all data inserted during the day.
As data insert rates can change based on activity and usage patterns, it is important to ensure that the balancing window
you select will be sufficient to support the needs of your deployment.
Do not use the sh.startBalancer() method when you have set an activeWindow.
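Validating the HH:MM values before writing them to the settings collection catches malformed windows early. The following is a minimal sketch; the parseHHMM and inActiveWindow names are hypothetical, and both the inclusive boundaries and the treatment of a window whose stop time is earlier than its start time as wrapping past midnight are assumptions of this sketch rather than documented balancer behavior.

```javascript
// Convert a two-digit "HH:MM" string to minutes since midnight.
// Throws on malformed or out-of-range input.
function parseHHMM(text) {
  var match = /^(\d{2}):(\d{2})$/.exec(text);
  if (!match) {
    throw new Error("expected HH:MM, got: " + text);
  }
  var hours = Number(match[1]);
  var minutes = Number(match[2]);
  if (hours > 23 || minutes > 59) {
    throw new Error("time out of range: " + text);
  }
  return hours * 60 + minutes;
}

// True if "now" falls inside the window (boundaries included).
// Assumption: a window whose stop is earlier than its start, such as
// 23:00 to 06:00, wraps past midnight.
function inActiveWindow(start, stop, now) {
  var s = parseHHMM(start);
  var e = parseHHMM(stop);
  var n = parseHHMM(now);
  if (s <= e) {
    return n >= s && n <= e;
  }
  return n >= s || n <= e;
}
```

For example, inActiveWindow("23:00", "06:00", "02:30") treats 02:30 as inside an overnight window, while parseHHMM("24:00") throws because the hour is out of range.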
If you have set the balancing window (page 793) and wish to remove the schedule so that the balancer is always
running, use $unset to clear the activeWindow, as in the following:
use config
db.settings.update({ _id : "balancer" }, { $unset : { activeWindow : true } })
By default, the balancer may run at any time and only moves chunks as needed. To disable the balancer for a short
period of time and prevent all migration, use the following procedure:
1. Connect to any mongos in the cluster using the mongo shell.
2. Issue the following operation to disable the balancer:
sh.stopBalancer()
If a migration is in progress, the system will complete the in-progress migration before stopping.
3. To verify that the balancer will not start, issue the following command, which returns false if the balancer is
disabled:
sh.getBalancerState()
Optionally, to verify no migrations are in progress after disabling, issue the following operation in the mongo
shell:
use config
while( sh.isBalancerRunning() ) {
print("waiting...");
sleep(1000);
}
Note: To disable the balancer from a driver that does not have the sh.stopBalancer() or
sh.setBalancerState() helpers, issue the following command from the config database:
db.settings.update( { _id: "balancer" }, { $set : { stopped: true } } , { upsert: true } )
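The polling loop shown in this procedure waits indefinitely. A bounded variant is sketched below, with the balancer check and the sleep call passed in as functions so that the logic does not depend on a live cluster; the waitForBalancerToStop name and the maxAttempts parameter are hypothetical.

```javascript
// Hypothetical helper: poll until the balancer reports that it is no longer
// running, giving up after maxAttempts polls.
// isRunning and sleepFn are injected by the caller; in the mongo shell they
// could be function () { return sh.isBalancerRunning(); } and
// function () { sleep(1000); }.
// Returns true if the balancer stopped, false if it was still running.
function waitForBalancerToStop(isRunning, sleepFn, maxAttempts) {
  for (var attempt = 0; attempt < maxAttempts; attempt++) {
    if (!isRunning()) {
      return true;
    }
    sleepFn();
  }
  return !isRunning();
}
```

A false return value signals that a migration was still in progress when the attempts ran out, which a maintenance script can treat as a reason to stop rather than proceed.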
Use this procedure if you have disabled the balancer and are ready to re-enable it:
1. Connect to any mongos in the cluster using the mongo shell.
2. Issue one of the following operations to enable the balancer:
From the mongo shell, issue:
sh.setBalancerState(true)
From a driver that does not have the sh.startBalancer() helper, issue the following from the config
database:
db.settings.update( { _id: "balancer" }, { $set : { stopped: false } } , { upsert: true } )
If MongoDB migrates a chunk during a backup (page 200), you can end up with an inconsistent snapshot of your sharded
cluster. Never run a backup while the balancer is active. To ensure that the balancer is inactive during your backup
operation, do one of the following:
Set the balancing window (page 793) so that the balancer is inactive during the backup. Ensure that the backup
can complete while you have the balancer disabled.
Manually disable the balancer (page 794) for the duration of the backup procedure.
If you turn the balancer off while it is in the middle of a balancing round, the shut down is not instantaneous. The
balancer completes the chunk move in-progress and then ceases all further balancing rounds.
Before starting a backup operation, confirm that the balancer is not active. The following expression evaluates to
true when the balancer is disabled and no balancing round is in progress:
!sh.getBalancerState() && !sh.isBalancerRunning()
When the backup procedure is complete you can reactivate the balancer process.
You can disable balancing for a specific collection with the sh.disableBalancing() method. You may want
to disable the balancer for a specific collection to support maintenance operations or atypical workloads, for example,
during data ingestions or data exports.
When you disable balancing on a collection, MongoDB will not interrupt in-progress migrations.
To disable balancing on a collection, connect to a mongos with the mongo shell and call the
sh.disableBalancing() method.
For example:
sh.disableBalancing("students.grades")
The sh.disableBalancing() method accepts as its parameter the full namespace of the collection.
You can enable balancing for a specific collection with the sh.enableBalancing() method.
When you enable balancing for a collection, MongoDB will not immediately begin balancing data. However, if the
data in your sharded collection is not balanced, MongoDB will be able to begin distributing the data more evenly.
To enable balancing on a collection, connect to a mongos with the mongo shell and call the
sh.enableBalancing() method.
For example:
sh.enableBalancing("students.grades")
The sh.enableBalancing() method accepts as its parameter the full namespace of the collection.
To confirm whether balancing for a collection is enabled or disabled, query the collections collection in the
config database for the collection namespace and check the noBalance field. For example:
db.getSiblingDB("config").collections.findOne({_id : "students.grades"}).noBalance;
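The query above fails with an error when findOne() returns null, for example if the namespace has no document in the collections collection. The following is a minimal sketch of a null-safe interpretation, assuming that an absent noBalance field means balancing has not been disabled; the isBalancingEnabled name is hypothetical.

```javascript
// Hypothetical helper: interpret a document from the config database's
// collections collection. Returns null when no document was found.
// Assumption: an absent noBalance field means balancing is not disabled.
function isBalancingEnabled(collectionDoc) {
  if (collectionDoc === null || collectionDoc === undefined) {
    return null;
  }
  return collectionDoc.noBalance !== true;
}
```

In the mongo shell, the argument could be db.getSiblingDB("config").collections.findOne({ _id : "students.grades" }).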
On this page
Ensure the Balancer Process is Enabled (page 797)
Determine the Name of the Shard to Remove (page 797)
Remove Chunks from the Shard (page 797)
Check the Status of the Migration (page 798)
Move Unsharded Data (page 798)
Finalize the Migration (page 799)
To remove a shard you must ensure the shard's data is migrated to the remaining shards in the cluster. This procedure
describes how to safely migrate data and how to remove a shard.
This procedure describes how to safely remove a single shard. Do not use this procedure to migrate an entire cluster
to new hardware. To migrate an entire shard to new hardware, migrate individual shards as if they were independent
replica sets.
To remove a shard, first connect to one of the cluster's mongos instances using the mongo shell. Then use the sequence
of tasks in this document to remove a shard from the cluster.
To successfully migrate data from a shard, the balancer process must be enabled. Check the balancer state using
the sh.getBalancerState() helper in the mongo shell. For more information, see the section on balancer
operations (page 794).
To determine the name of the shard, connect to a mongos instance with the mongo shell and either:
Use the listShards command, as in the following:
db.adminCommand( { listShards: 1 } )
From the admin database, run the removeShard command. This begins draining chunks from the shard you are
removing to other shards in the cluster. For example, for a shard named mongodb0, run:
use admin
db.runCommand( { removeShard: "mongodb0" } )
"ok" : 1
}
Depending on your network capacity and the amount of data, this operation can take from a few minutes to several
days to complete.
To check the progress of the migration at any stage in the process, run removeShard from the admin database
again. For example, for a shard named mongodb0, run:
use admin
db.runCommand( { removeShard: "mongodb0" } )
In the output, the remaining document displays the remaining number of chunks that MongoDB must migrate to
other shards and the number of MongoDB databases that have primary status on this shard.
Continue checking the status of the removeShard command until the number of chunks remaining is 0. Always run the
command on the admin database. If you are on a database other than admin, you can use sh._adminCommand
to run the command on admin.
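Repeated status checks are easier to act on when the response is reduced to the values that matter. The following is a minimal sketch, assuming the removeShard response carries a state field and, while draining, a remaining document with chunks and dbs counts; the drainingSummary name and the sample response are hypothetical.

```javascript
// Hypothetical helper: summarize a removeShard response document.
// Assumes a "state" field and, while draining, a "remaining" document with
// "chunks" and "dbs" counts.
function drainingSummary(response) {
  if (response.state === "completed") {
    return { done: true, chunks: 0, dbs: 0 };
  }
  var remaining = response.remaining || {};
  return {
    done: false,
    chunks: Number(remaining.chunks || 0),
    dbs: Number(remaining.dbs || 0)
  };
}

// Sample response for a shard that is still draining.
var sampleResponse = {
  state: "ongoing",
  remaining: { chunks: 42, dbs: 1 },
  ok: 1
};
var summary = drainingSummary(sampleResponse);
```

In the mongo shell, the input could be the result of db.adminCommand( { removeShard: "mongodb0" } ); a summary with chunks equal to 0 and dbs greater than 0 indicates that unsharded data still has to be moved.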
If the shard is the primary shard for one or more databases in the cluster, then the shard will have unsharded data. If
the shard is not the primary shard for any databases, skip to the next task, Finalize the Migration (page 799).
In a cluster, a database with unsharded collections stores those collections only on a single shard. That shard becomes
the primary shard for that database. (Different databases in a cluster can have different primary shards.)
Warning: Do not perform this procedure until you have finished draining the shard.
1. To determine if the shard you are removing is the primary shard for any of the cluster's databases, issue one of
the following methods:
sh.status()
db.printShardingStatus()
In the resulting document, the databases field lists each database and its primary shard. For example, the
following database field shows that the products database uses mongodb0 as the primary shard:
{ "_id" : "products", "partitioned" : true, "primary" : "mongodb0" }
2. To move a database to another shard, use the movePrimary command. For example, to migrate all remaining
unsharded data from mongodb0 to mongodb1, issue the following command:
This command does not return until MongoDB completes moving all data, which may take a long time. The
response from this command will resemble the following:
{ "primary" : "mongodb1", "ok" : 1 }
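As an illustration of the shape of such a command, the following is a minimal sketch that builds a movePrimary command document from a database name and a destination shard. The buildMovePrimary name is hypothetical; the products database and the mongodb1 shard are taken from the examples in this procedure.

```javascript
// Hypothetical helper: build a movePrimary command document that names the
// database to move and the shard that becomes its new primary shard.
function buildMovePrimary(database, toShard) {
  return { movePrimary: database, to: toShard };
}

var movePrimaryCommand = buildMovePrimary("products", "mongodb1");
```

In a mongo shell connected to a mongos, the document could be passed to db.adminCommand().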
To clean up all metadata information and finalize the removal, run removeShard again. For example, for a shard
named mongodb0, run:
use admin
db.runCommand( { removeShard: "mongodb0" } )
Once the value of the state field is completed, you may safely stop the processes comprising the mongodb0
shard.
See also:
Backup and Restore Sharded Clusters (page 277)
Pre-splitting the chunk ranges in an empty sharded collection allows clients to insert data into an already partitioned
collection. In most situations a sharded cluster will create and distribute chunks automatically without user
intervention. However, in a limited number of cases, MongoDB cannot create enough chunks or distribute data fast enough to
support required throughput. For example:
If you want to partition an existing data collection that resides on a single shard.
If you want to ingest a large volume of data into a cluster that isn't balanced, or where the ingestion of data will
lead to data imbalance. For example, monotonically increasing or decreasing shard keys insert all data into a
single chunk.
These operations are resource intensive for several reasons:
Chunk migration requires copying all the data in the chunk from one shard to another.
MongoDB can migrate only a single chunk at a time.
MongoDB creates splits only after an insert operation.
Warning: Only pre-split an empty collection. If a collection already has data, MongoDB automatically splits the
collection's data when you enable sharding for the collection. Subsequent attempts to manually create splits can
lead to unpredictable chunk ranges and sizes as well as inefficient or ineffective balancing behavior.
Example
To create chunks for documents in the myapp.users collection using the email field as the shard key, use
the following operation in the mongo shell:
for ( var x=97; x<97+26; x++ ){
for( var y=97; y<97+26; y+=6 ) {
var prefix = String.fromCharCode(x) + String.fromCharCode(y);
db.adminCommand( { split : "myapp.users" , middle : { email : prefix } } );
}
}
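As a rough check on what the loop above produces, the following sketch enumerates the same two-letter prefixes without issuing any split commands. It is plain JavaScript that should also run in the mongo shell; the helper name splitPrefixes is an assumption introduced here for illustration and is not part of MongoDB.

```javascript
// Enumerate the two-letter prefixes used as split points in the example above.
// splitPrefixes is a hypothetical helper; it does not contact the cluster.
function splitPrefixes() {
    var prefixes = [];
    for (var x = 97; x < 97 + 26; x++) {            // first letter: a..z
        for (var y = 97; y < 97 + 26; y += 6) {     // second letter: a, g, m, s, y
            prefixes.push(String.fromCharCode(x) + String.fromCharCode(y));
        }
    }
    return prefixes;
}

var prefixes = splitPrefixes();
// 26 first letters x 5 second letters = 130 split points. If every split
// succeeds on an empty collection, the single initial chunk becomes 131 chunks.
```

Counting the split points before running the loop gives a quick sense of how many chunks the pre-split will leave for the balancer, or for a manual migration, to distribute.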
For information on the balancer and automatic distribution of chunks across shards, see Cluster Balancer
(page 750) and Chunk Migration (page 752). For information on manually migrating chunks, see Migrate
Chunks in a Sharded Cluster (page 801).
Normally, MongoDB splits a chunk after an insert if the chunk exceeds the maximum chunk size (page 754). However,
you may want to split chunks manually if:
you have a large amount of data in your cluster and very few chunks, as is the case after deploying a cluster
using existing data.
you expect to add a large amount of data that would initially reside in a single chunk or shard. For example, you
plan to insert a large amount of data with shard key values between 300 and 400, but all shard key values
between 250 and 500 are in a single chunk.
Note: New in version 2.6: MongoDB provides the mergeChunks command to combine contiguous chunk ranges
into a single chunk. See Merge Chunks in a Sharded Cluster (page 802) for more information.
The balancer may migrate recently split chunks to a new shard immediately if mongos predicts future insertions will
benefit from the move. The balancer does not distinguish between chunks split manually and those split automatically
by the system.
Warning: Be careful when splitting data in a sharded collection to create new chunks. When you shard a
collection that has existing data, MongoDB automatically creates chunks to evenly distribute the collection. To
split data effectively in a sharded cluster you must consider the number of documents in a chunk and the average
document size to create a uniform chunk size. When chunks have irregular sizes, shards may have an equal number
of chunks but have very different data sizes. Avoid creating splits that lead to a collection with differently sized
chunks.
Use sh.status() to determine the current chunk ranges across the cluster.
To split chunks manually, use the split command with either the middle or the find field. The mongo shell provides
the helper methods sh.splitFind() and sh.splitAt().
splitFind() splits the chunk that contains the first document returned that matches this query into two equally
sized chunks. You must specify the full namespace (i.e. <database>.<collection>) of the sharded collection
to splitFind(). The query in splitFind() does not need to use the shard key, though it nearly always makes
sense to do so.
Example
The following command splits the chunk that contains the value of 63109 for the zipcode field in the people
collection of the records database:
sh.splitFind( "records.people", { "zipcode": "63109" } )
Use splitAt() to split a chunk in two, using the queried document as the lower bound in the new chunk:
Example
The following command splits the chunk that contains the value of 63109 for the zipcode field in the people
collection of the records database.
sh.splitAt( "records.people", { "zipcode": "63109" } )
Note: splitAt() does not necessarily split the chunk into two equally sized chunks. The split occurs at the location
of the document matching the query, regardless of where that document is in the chunk.
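The difference between the two helpers can be pictured with a toy model in which a chunk is just a sorted array of shard key values. This is only an illustration of where the split point falls; it is not how MongoDB stores or splits chunks, and both function names are assumptions made for this sketch.

```javascript
// Toy model: a chunk is a sorted array of shard key values.
// Neither function talks to MongoDB; both names are hypothetical.

// splitAt-style split: the queried value becomes the lower bound of the new chunk.
function splitAtValue(chunk, value) {
    var left = chunk.filter(function (k) { return k < value; });
    var right = chunk.filter(function (k) { return k >= value; });
    return [left, right];
}

// splitFind-style split: cut in the middle so both halves are about equal.
function splitAtMedian(chunk) {
    var mid = Math.floor(chunk.length / 2);
    return [chunk.slice(0, mid), chunk.slice(mid)];
}

var zipcodes = ["10001", "10002", "10003", "10004", "63109", "94102"];
var byValue = splitAtValue(zipcodes, "63109");   // halves of 4 and 2 values
var byMedian = splitAtMedian(zipcodes);          // halves of 3 and 3 values
```

In the model, splitting at "63109" leaves an uneven pair because that value sits near the end of the range, while the median split ignores the queried value and balances the halves.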
In most circumstances, you should let the automatic balancer migrate chunks between shards. However, you may
want to migrate chunks manually in a few cases:
When pre-splitting an empty collection, migrate chunks manually to distribute them evenly across the shards.
Use pre-splitting in limited situations to support bulk data ingestion.
If the balancer in an active cluster cannot distribute chunks within the balancing window (page 793), then you
will have to migrate chunks manually.
To manually migrate chunks, use the moveChunk command. For more information on how the automatic balancer
moves chunks between shards, see Cluster Balancer (page 750) and Chunk Migration (page 752).
Example
Migrate a single chunk
The following example assumes that the field username is the shard key for a collection named users in the
myapp database, and that the value smith exists within the chunk to migrate. Migrate the chunk using the following
command in the mongo shell.
db.adminCommand( { moveChunk : "myapp.users",
find : {username : "smith"},
to : "mongodb-shard3.example.net" } )
This command moves the chunk that includes the shard key value smith to the shard named
mongodb-shard3.example.net. The command will block until the migration is complete.
Tip
To return a list of shards, use the listShards command.
Example
Evenly migrate chunks
To evenly migrate chunks for the myapp.users collection, place each prefix chunk on the next shard in turn
by running the following commands in the mongo shell:
var shServer = [ "sh0.example.net", "sh1.example.net", "sh2.example.net", "sh3.example.net", "sh4.exa
for ( var x=97; x<97+26; x++ ){
for( var y=97; y<97+26; y+=6 ) {
var prefix = String.fromCharCode(x) + String.fromCharCode(y);
db.adminCommand({moveChunk : "myapp.users", find : {email : prefix}, to : shServer[(y-97)/6]})
}
}
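The index expression (y-97)/6 in the loop above decides which shard receives each prefix chunk. The following sketch reproduces that arithmetic on its own so the mapping can be inspected without moving any data; shardIndexForPrefix is a hypothetical helper name, and it only makes sense for the second letters the loop actually generates (a, g, m, s, y).

```javascript
// Map a two-letter prefix from the pre-split example to an index into the
// shard list, mirroring the (y-97)/6 expression used in the loop above.
// shardIndexForPrefix is a hypothetical helper; it does not contact the cluster.
function shardIndexForPrefix(prefix) {
    return (prefix.charCodeAt(1) - 97) / 6;
}

var indexes = ["aa", "ag", "am", "as", "ay"].map(shardIndexForPrefix);
// indexes holds 0, 1, 2, 3, 4. Every first letter cycles through the same
// five positions, so the shard list needs at least five entries.
```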
See Create Chunks in a Sharded Cluster (page 800) for an introduction to pre-splitting.
New in version 2.2: The moveChunk command has the _secondaryThrottle parameter. When set to true,
MongoDB ensures that changes to shards as part of chunk migrations replicate to secondaries throughout the migration
operation. For more information, see Change Replication Behavior for Chunk Migration (page 791).
Changed in version 2.4: In 2.4, _secondaryThrottle is true by default.
Warning: The moveChunk command may produce the following error message:
This occurs when clients have too many open cursors that access the migrating chunk. You may either wait until
the cursors complete their operations or close the cursors manually.
Overview
The mergeChunks command allows you to collapse empty chunks into neighboring chunks on the same shard. A
chunk is empty if it has no documents associated with its shard key range.
Important: Empty chunks can make the balancer assess the cluster as properly balanced when it is not.
Procedure
Note: Examples in this procedure use a users collection in the test database, using the username field as a
shard key.
Identify Chunk Ranges In the mongo shell, identify the chunk ranges with the following operation:
sh.status()
The chunk ranges appear after the chunk counts for each sharded collection, as in the following excerpts:
Chunk counts:
chunks:
shard0000 7
shard0001 7
Chunk range:
{ "username" : "user36583" } -->> { "username" : "user43229" } on : shard0000 Timestamp(6, 0)
Verify a Chunk is Empty The mergeChunks command requires at least one empty input chunk. To check the
size of a chunk, use the dataSize command in the sharded collection's database. For example, the following checks
the amount of data in the chunk for the users collection in the test database:
Important: You must use the use <db> helper to switch to the database containing the sharded collection before
running the dataSize command.
use test
db.runCommand({
"dataSize": "test.users",
"keyPattern": { username: 1 },
"min": { "username": "user36583" },
"max": { "username": "user43229" }
})
If the input chunk to dataSize is empty, dataSize produces output similar to:
{ "size" : 0, "numObjects" : 0, "millis" : 0, "ok" : 1 }
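When checking several chunks, the emptiness test can be wrapped in a small function. The sketch below assumes the result document has the shape shown above (size, numObjects, and ok fields); isEmptyChunk is a hypothetical helper name, not a MongoDB method.

```javascript
// Decide whether a dataSize result document describes an empty chunk.
// isEmptyChunk is a hypothetical helper; it only inspects the document
// passed to it and does not run any command.
function isEmptyChunk(result) {
    return result.ok === 1 && result.size === 0 && result.numObjects === 0;
}

// Result documents shaped like the dataSize output shown above.
var emptyResult = { "size" : 0, "numObjects" : 0, "millis" : 0, "ok" : 1 };
var nonEmptyResult = { "size" : 4096, "numObjects" : 12, "millis" : 1, "ok" : 1 };

var emptyCheck = isEmptyChunk(emptyResult);       // true
var nonEmptyCheck = isEmptyChunk(nonEmptyResult); // false
```

In the mongo shell, the argument would be the document returned by the db.runCommand() call with dataSize shown earlier.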
Merge Chunks Merge two contiguous chunks on the same shard, where at least one of them contains no data, with an
operation that resembles the following:
db.adminCommand( { mergeChunks: "test.users",
bounds: [ { "username": "user68982" },
{ "username": "user95197" } ]
} )
On any failure condition, mergeChunks returns a document where the value of the ok field is 0.
View Merged Chunks Ranges After merging all empty chunks, confirm the new chunk, as follows:
sh.status()
When the first mongos connects to a set of config servers, it initializes the sharded cluster with a default chunk size
of 64 megabytes. This default chunk size works well for most deployments; however, if you notice that automatic
migrations have more I/O than your hardware can handle, you may want to reduce the chunk size. For automatic splits
and migrations, a small chunk size leads to more rapid and frequent migrations. The allowed range of the chunk size
is between 1 and 1024 megabytes, inclusive.
To modify the chunk size, use the following procedure:
1. Connect to any mongos in the cluster using the mongo shell.
2. Issue the following command to switch to the Config Database (page 816):
use config
3. Issue the following save() operation to store the global chunk size configuration value:
db.settings.save( { _id:"chunksize", value: <sizeInMB> } )
Note: The chunkSize and --chunkSize options, passed at startup to the mongos, do not affect the chunk size
after you have initialized the cluster.
To avoid confusion, always set the chunk size using the above procedure instead of the startup options.
If you lower the chunk size, it may take time for all chunks to split to the new size.
Splits cannot be undone.
If you increase the chunk size, existing chunks grow only through insertion or updates until they reach the new
size.
The allowed range of the chunk size is between 1 and 1024 megabytes, inclusive.
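Because values outside the allowed range are easy to enter by mistake, it can help to build the settings document through a small guard. The following is a minimal sketch under that assumption; chunkSizeSetting is a hypothetical helper name, and the only facts it relies on are the 1 to 1024 megabyte range and the document shape used in the save() operation above.

```javascript
// Build the document for the config.settings collection after checking the
// documented range. chunkSizeSetting is a hypothetical helper name.
function chunkSizeSetting(sizeInMB) {
    // The negated form also rejects NaN and non-numeric input.
    if (typeof sizeInMB !== "number" || !(sizeInMB >= 1 && sizeInMB <= 1024)) {
        throw new Error("chunk size must be between 1 and 1024 megabytes");
    }
    return { _id: "chunksize", value: sizeInMB };
}

var setting = chunkSizeSetting(32);

// In a mongo shell connected to a mongos, the document could then be stored
// with the same save() operation shown in the procedure:
//   db.getSiblingDB("config").settings.save(setting)
```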
If MongoDB cannot split a chunk that exceeds the specified chunk size (page 754) or contains a number of documents
that exceeds the max, MongoDB labels the chunk as jumbo (page 754).
If the chunk size no longer hits the limits, MongoDB clears the jumbo flag for the chunk when the mongos reloads
or rewrites the chunk metadata.
In cases where you need to clear the flag manually, the following procedures outline the steps to manually clear the
jumbo flag.
Procedures
Divisible Chunks The preferred way to clear the jumbo flag from a chunk is to attempt to split the chunk. If the
chunk is divisible, MongoDB removes the flag upon successful split of the chunk.
Step 1: Connect to mongos. Connect a mongo shell to a mongos.
Step 2: Find the jumbo Chunk. Run sh.status(true) to find the chunk labeled jumbo.
sh.status(true)
For example, the following output from sh.status(true) shows that chunk with shard key range { "x" : 2 }
-->> { "x" : 4 } is jumbo.
--- Sharding Status ---
sharding version: {
...
}
shards:
...
databases:
...
test.foo
shard key: { "x" : 1 }
chunks:
shard-b 2
shard-a 2
{ "x" : { "$minKey" : 1 } } -->> { "x" : 1 } on : shard-b Timestamp(2, 0)
{ "x" : 1 } -->> { "x" : 2 } on : shard-a Timestamp(3, 1)
{ "x" : 2 } -->> { "x" : 4 } on : shard-a Timestamp(2, 2) jumbo
{ "x" : 4 } -->> { "x" : { "$maxKey" : 1 } } on : shard-b Timestamp(3, 0)
Step 3: Split the jumbo Chunk. Use either sh.splitAt() or sh.splitFind() to split the jumbo chunk.
sh.splitAt( "test.foo", { x: 3 })
MongoDB removes the jumbo flag upon successful split of the chunk.
Indivisible Chunks In some instances, MongoDB cannot split the no-longer jumbo chunk, such as a chunk with
a range of a single shard key value, and the preferred method to clear the flag is not applicable. In such cases, you can
clear the flag using the following steps.
Important: Only use this method if the preferred method (page 806) is not applicable.
Before modifying the config database (page 816), always back up the config database.
If you clear the jumbo flag for a chunk that still exceeds the chunk size and/or the document number limit, MongoDB
will re-label the chunk as jumbo when MongoDB tries to move the chunk.
Step 1: Stop the balancer. Disable the cluster balancer process temporarily, following the steps outlined in Disable
the Balancer (page 794).
Step 2: Create a backup of config database. Use mongodump against a config server to create a backup of the
config database. For example:
mongodump --db config --port <config server port> --out <output file>
Step 3: Connect to mongos. Connect a mongo shell to a mongos.
Step 4: Find the jumbo Chunk. Run sh.status(true) to find the chunk labeled jumbo.
sh.status(true)
For example, the following output from sh.status(true) shows that chunk with shard key range { "x" : 2 }
-->> { "x" : 3 } is jumbo.
--- Sharding Status ---
sharding version: {
...
}
shards:
...
databases:
...
test.foo
shard key: { "x" : 1 }
chunks:
shard-b 2
shard-a 2
{ "x" : { "$minKey" : 1 } } -->> { "x" : 1 } on : shard-b Timestamp(2, 0)
{ "x" : 1 } -->> { "x" : 2 } on : shard-a Timestamp(3, 1)
{ "x" : 2 } -->> { "x" : 3 } on : shard-a Timestamp(2, 2) jumbo
{ "x" : 3 } -->> { "x" : { "$maxKey" : 1 } } on : shard-b Timestamp(3, 0)
Step 5: Update chunks collection. In the chunks collection of the config database, unset the jumbo flag for
the chunk. For example,
db.getSiblingDB("config").chunks.update(
{ ns: "test.foo", min: { x: 2 }, jumbo: true },
{ $unset: { jumbo: "" } }
)
Step 6: Restart the balancer. Restart the balancer, following the steps in Enable the Balancer (page 795).
Step 7: Optional. Clear current cluster meta information. To ensure that mongos instances update their cluster
information cache, run flushRouterConfig in the admin database.
db.adminCommand({ flushRouterConfig: 1 } )
On this page
Tag a Shard (page 808)
Tag a Shard Key Range (page 808)
Remove a Tag From a Shard Key Range (page 809)
View Existing Shard Tags (page 809)
Additional Resource (page 809)
In a sharded cluster, you can use tags to associate specific ranges of a shard key with a specific shard or subset of
shards.
Tag a Shard
Associate tags with a particular shard using the sh.addShardTag() method when connected to a mongos
instance. A single shard may have multiple tags, and multiple shards may also have the same tag.
Example
The following example adds the tag NYC to two shards, and the tags SFO and NRT to a third shard:
sh.addShardTag("shard0000", "NYC")
sh.addShardTag("shard0001", "NYC")
sh.addShardTag("shard0002", "SFO")
sh.addShardTag("shard0002", "NRT")
You may remove tags from a particular shard using the sh.removeShardTag() method when connected to a
mongos instance, as in the following example, which removes the NRT tag from a shard:
sh.removeShardTag("shard0002", "NRT")
To assign a tag to a range of shard keys use the sh.addTagRange() method when connected to a mongos instance.
Any given shard key range may only have one assigned tag. You cannot overlap defined ranges, or tag the same range
more than once.
Example
Given a collection named users in the records database, sharded by the zipcode field, the following operations
assign:
two ranges of zip codes in Manhattan and Brooklyn the NYC tag
one range of zip codes in San Francisco the SFO tag
sh.addTagRange("records.users", { zipcode: "10001" }, { zipcode: "10281" }, "NYC")
sh.addTagRange("records.users", { zipcode: "11201" }, { zipcode: "11240" }, "NYC")
sh.addTagRange("records.users", { zipcode: "94102" }, { zipcode: "94135" }, "SFO")
Note: Shard ranges are always inclusive of the lower value and exclusive of the upper boundary.
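The inclusive lower bound and exclusive upper bound can be made concrete with a small lookup over the three ranges defined above. This is a sketch in plain JavaScript that only models the boundary rule; the tagRanges array and the tagForZipcode function are assumptions introduced for illustration and do not read the cluster's actual tag configuration.

```javascript
// The three ranges from the sh.addTagRange() calls above, as plain data.
var tagRanges = [
    { min: "10001", max: "10281", tag: "NYC" },
    { min: "11201", max: "11240", tag: "NYC" },
    { min: "94102", max: "94135", tag: "SFO" }
];

// Return the tag whose range contains the zip code, or null if none does.
// tagForZipcode is a hypothetical helper; the comparison is on strings,
// which orders five-digit zip codes the same way as their numeric values.
function tagForZipcode(zipcode) {
    for (var i = 0; i < tagRanges.length; i++) {
        var r = tagRanges[i];
        if (zipcode >= r.min && zipcode < r.max) {   // min inclusive, max exclusive
            return r.tag;
        }
    }
    return null;
}

var lowerBound = tagForZipcode("10001");   // "NYC": the lower bound is included
var upperBound = tagForZipcode("10281");   // null: the upper bound is excluded
```

A zip code that must fall inside a tagged range therefore needs an upper boundary strictly greater than it.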
The mongod does not provide a helper for removing a tag range. You may delete a tag assignment from a shard key
range by removing the corresponding document from the tags (page 821) collection of the config database.
Each document in the tags (page 821) collection holds the namespace of the sharded collection and a minimum shard key
value.
Example
The following example removes the NYC tag assignment for the range of zip codes within Manhattan:
use config
db.tags.remove({ _id: { ns: "records.users", min: { zipcode: "10001" }}, tag: "NYC" })
The output from sh.status() lists tags associated with a shard, if any, for each shard. A shard's tags exist in the
shard's document in the shards (page 821) collection of the config database. To return all shards with a specific
tag, use a sequence of operations that resemble the following, which will return only those shards tagged with NYC:
use config
db.shards.find({ tags: "NYC" })
You can find tag ranges for all namespaces in the tags (page 821) collection of the config database. The output of
sh.status() displays all tag ranges. To return all shard key ranges tagged with NYC, use the following sequence
of operations:
use config
db.tags.find({ tag: "NYC" })
Additional Resource
Overview
The unique constraint on indexes ensures that only one document can have a given value for a field in a collection. For
sharded collections these unique indexes cannot enforce uniqueness because insert and indexing operations are local
to each shard.
MongoDB does not support creating new unique indexes in sharded collections and will not allow you to shard
collections with unique indexes on fields other than the _id field.
If you need to ensure that a field is always unique in a sharded collection, there are three options:
1. Enforce uniqueness of the shard key (page 739).
MongoDB can enforce uniqueness for the shard key. For compound shard keys, MongoDB will enforce
uniqueness on the entire key combination, and not for a specific component of the shard key.
You cannot specify a unique constraint on a hashed index (page 512).
2. Use a secondary collection to enforce uniqueness.
Create a minimal collection that only contains the unique field and a reference to a document in the main
collection. If you always insert into a secondary collection before inserting to the main collection, MongoDB
will produce an error if you attempt to use a duplicate key.
If you have a small data set, you may not need to shard this collection and you can create multiple unique
indexes. Otherwise you can shard on a single unique key.
3. Use guaranteed unique identifiers.
Universally unique identifiers (i.e. UUID) like the ObjectId are guaranteed to be unique.
Procedures
Process To shard a collection using the unique constraint, specify the shardCollection command in the
following form:
db.runCommand( { shardCollection : "test.users" , key : { email : 1 } , unique : true } );
Remember that the _id field index is always unique. By default, MongoDB inserts an ObjectId into the _id field.
However, you can manually insert your own value into the _id field and use this as the shard key. To use the _id
field as the shard key, use the following operation:
20 https://www.mongodb.com/presentations/webinar-multi-data-center-deployment?jmp=docs
Limitations
You can only enforce uniqueness on one single field in the collection using this method.
If you use a compound shard key, you can only enforce uniqueness on the combination of component keys in
the shard key.
In most cases, the best shard keys are compound keys that include elements that permit write scaling (page 741) and
query isolation (page 741), as well as high cardinality (page 764). These ideal shard keys are not often the same keys
that require uniqueness and enforcing unique values in these collections requires a different approach.
Unique Constraints on Arbitrary Fields If you cannot use a unique field as the shard key or if you need to enforce
uniqueness over multiple fields, you must create another collection to act as a proxy collection. This collection must
contain both a reference to the original document (i.e. its ObjectId) and the unique key.
If you must shard this proxy collection, then shard on the unique key using the above procedure (page 810);
otherwise, you can simply create multiple unique indexes on the collection.
The _id field holds the ObjectId of the document it reflects, and the email field is the field on which you want to
ensure uniqueness.
To shard this collection, use the following operation using the email field as the shard key:
db.runCommand( { shardCollection : "records.proxy" ,
key : { email : 1 } ,
unique : true } );
If you do not need to shard the proxy collection, use the following command to create a unique index on the email
field:
db.proxy.createIndex( { "email" : 1 }, { unique : true } )
You may create multiple unique indexes on this collection if you do not plan to shard the proxy collection.
To insert documents, use the following procedure in the JavaScript shell:
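db = db.getSiblingDB('records');
// Generate the shared identifier before inserting into either collection.
var primary_id = ObjectId();
db.proxy.insert({
"_id" : primary_id,
"email" : "example@example.net"
})
db.information.insert({
"_id" : primary_id,
"email": "example@example.net"
// additional information...
})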
You must insert a document into the proxy collection first. If this operation succeeds, the email field is unique, and
you may continue by inserting the actual document into the information collection.
See the full documentation of createIndex() and shardCollection.
Considerations
Your application must catch errors when inserting documents into the proxy collection and must enforce
consistency between the two collections.
If the proxy collection requires sharding, you must shard on the single field on which you want to enforce
uniqueness.
To enforce uniqueness on more than one field using sharded proxy collections, you must have one proxy
collection for every field for which to enforce uniqueness. If you create multiple unique indexes on a single proxy
collection, you will not be able to shard proxy collections.
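The first consideration, catching errors on the proxy insert and keeping the two collections consistent, might be sketched as follows. This is an illustration only: insertUnique and its return shape are assumptions, and the sketch relies on insert() returning an object with a hasWriteError() method, as the mongo shell's WriteResult does. The clean-up step is one possible way to keep the collections consistent, not a prescribed one.

```javascript
// Insert into the proxy collection first; write the main document only if
// the proxy insert succeeds. insertUnique is a hypothetical helper that
// takes two collection objects, for example db.proxy and db.information.
function insertUnique(proxy, main, doc) {
    var res = proxy.insert({ _id: doc._id, email: doc.email });
    if (res.hasWriteError()) {
        // Most likely a duplicate key on the unique email index.
        return { ok: false, stage: "proxy" };
    }
    res = main.insert(doc);
    if (res.hasWriteError()) {
        // Keep the two collections consistent: undo the proxy entry.
        proxy.remove({ _id: doc._id });
        return { ok: false, stage: "main" };
    }
    return { ok: true };
}

// Possible use in the mongo shell, following the example above:
//   var records = db.getSiblingDB("records");
//   insertUnique(records.proxy, records.information,
//                { _id: ObjectId(), email: "example@example.net" });
```

Passing the collections as arguments keeps the function independent of a particular database, which also makes it straightforward to exercise with stand-in objects.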
Use Guaranteed Unique Identifier The best way to ensure a field has unique values is to generate universally
unique identifiers (UUIDs), such as MongoDB's ObjectId values.
This approach is particularly useful for the _id field, which must be unique: for collections where you are not
sharding by the _id field, the application is responsible for ensuring that the _id field is unique.
files Collection
Most deployments will not need to shard the files collection. The files collection is typically small, and only
contains metadata. None of the required keys for GridFS lend themselves to an even distribution in a sharded situation.
If you must shard the files collection, use the _id field, possibly in combination with an application field.
Leaving files unsharded means that all the file metadata documents live on one shard. For production GridFS stores
you must store the files collection on a replica set.
chunks Collection
To shard the chunks collection by { files_id : 1 , n : 1 }, issue commands similar to the following:
db.fs.chunks.createIndex( { files_id : 1 , n : 1 } )
db.runCommand( { shardCollection : "test.fs.chunks" , key : { files_id : 1 , n : 1 } } )
You may also want to shard using just the files_id field, as in the following operation:
db.runCommand( { shardCollection : "test.fs.chunks" , key : { files_id : 1 } } )
Important: { files_id : 1 , n : 1 } and { files_id : 1 } are the only supported shard keys
for the chunks collection of a GridFS store.
The default files_id value is an ObjectId; as a result, the values of files_id are always ascending, and
applications will insert all new GridFS data to a single chunk and shard. If your write load is too high for a single server to
handle, consider a different shard key or use a different value for _id in the files collection.
On this page
Config Database String Error (page 813)
Cursor Fails Because of Stale Config Data (page 813)
Avoid Downtime when Moving Config Servers (page 814)
This section describes common strategies for troubleshooting sharded cluster deployments.
Changed in version 3.2: Starting in MongoDB 3.2, config servers are deployed as replica sets by default. The mongos
instances for the sharded cluster must specify the same config server replica set name but can specify hostname and
port of different members of the replica set.
If using the deprecated topology of three mirrored mongod instances for config servers, mongos instances in a
sharded cluster must specify an identical configDB string.
A query returns the following warning when one or more of the mongos instances has not yet updated its cache of
the cluster's metadata from the config database:
could not initialize cursor across all shards because : stale config detected
This warning should not propagate back to your application. The warning will repeat until all the mongos instances
refresh their caches. To force an instance to refresh its cache, run the flushRouterConfig command.
Use CNAMEs to identify your config servers to the cluster so that you can rename and renumber your config servers
without downtime.
On this page
Sharding Methods in the mongo Shell (page 815)
Sharding Database Commands (page 815)
Reference Documentation (page 816)
Name Description
sh._adminCommand() Runs a database command against the admin database, like db.runCommand(),
but can confirm that it is issued against a mongos.
sh.getBalancerLockDetails() Reports on the active balancer lock, if it exists.
sh._checkFullName() Tests a namespace to determine if it's well formed.
sh._checkMongos() Tests to see if the mongo shell is connected to a mongos instance.
sh._lastMigration() Reports on the last chunk migration.
sh.addShard() Adds a shard to a sharded cluster.
sh.addShardTag() Associates a shard with a tag, to support tag aware sharding (page 748).
sh.addTagRange() Associates a range of shard keys with a shard tag, to support tag aware sharding
(page 748).
sh.removeTagRange() Removes an association between a range of shard keys and a shard tag. Use to manage
tag aware sharding (page 748).
sh.disableBalancing() Disables balancing on a single collection in a sharded database. Does not affect
balancing of other collections in a sharded cluster.
sh.enableBalancing() Activates the sharded collection balancer process if previously disabled using
sh.disableBalancing().
sh.enableSharding() Enables sharding on a specific database.
sh.getBalancerHost() Returns the name of a mongos that's responsible for the balancer process.
sh.getBalancerState() Returns a boolean to report if the balancer is currently enabled.
sh.help() Returns help text for the sh methods.
sh.isBalancerRunning() Returns a boolean to report if the balancer process is currently migrating chunks.
sh.moveChunk() Migrates a chunk in a sharded cluster.
sh.removeShardTag() Removes the association between a shard and a shard tag.
sh.setBalancerState() Enables or disables the balancer which migrates chunks between shards.
sh.shardCollection() Enables sharding for a collection.
sh.splitAt() Divides an existing chunk into two chunks using a specific value of the shard key as
the dividing point.
sh.splitFind() Divides an existing chunk that contains a document matching a query into two
approximately equal chunks.
sh.startBalancer() Enables the balancer and waits for balancing to start.
sh.status() Reports on the status of a sharded cluster, as db.printShardingStatus().
sh.stopBalancer() Disables the balancer and waits for any in progress balancing rounds to complete.
sh.waitForBalancer() Internal. Waits for the balancer state to change.
sh.waitForBalancerOff() Internal. Waits until the balancer stops running.
sh.waitForDLock() Internal. Waits for a specified distributed sharded cluster lock.
sh.waitForPingChange() Internal. Waits for a change in ping state from one of the mongos in the sharded
cluster.
Name Description
flushRouterConfig Forces an update to the cluster metadata cached by a mongos.
addShard Adds a shard to a sharded cluster.
cleanupOrphaned Removes orphaned data with shard key values outside of the ranges of the chunks
owned by a shard.
checkShardingIndex Internal command that validates index on shard key.
enableSharding Enables sharding on a specific database.
listShards Returns a list of configured shards.
removeShard Starts the process of removing a shard from a sharded cluster.
getShardMap Internal command that reports on the state of a sharded cluster.
getShardVersion Internal command that returns the config server version.
mergeChunks Provides the ability to combine chunks on a single shard.
setShardVersion Internal command that sets the config server version.
shardCollection Enables the sharding functionality for a collection, allowing the collection to be
sharded.
shardingState Reports whether the mongod is a member of a sharded cluster.
unsetSharding Internal command that affects connections between instances in a MongoDB
deployment.
split Creates a new chunk.
splitChunk Internal command to split chunk. Instead use the methods sh.splitFind() and
sh.splitAt().
splitVector Internal command that determines split points.
medianKey Deprecated internal command. See splitVector.
moveChunk Internal command that migrates chunks between shards.
movePrimary Reassigns the primary shard when removing a shard from a sharded cluster.
isdbgrid Verifies that a process is a mongos.
Config Database (page 816) Complete documentation of the content of the config database that MongoDB uses to
store sharded cluster metadata.
Config Database
The config database supports sharded cluster operation. See the Sharding (page 725) section of this manual for full
documentation of sharded clusters.
Important: Consider the schema of the config database internal; it may change between releases of MongoDB.
The config database is not a dependable API, and users should not write data to the config database in the course
of normal operation or maintenance.
Warning: Modification of the config database on a functioning system may lead to instability or inconsistent
data sets. If you must modify the config database, use mongodump to create a full backup of the config
database.
To access the config database, connect to a mongos instance in a sharded cluster, and use the following helper:
use config
You can return a list of the collections, with the following helper:
show collections
Collections
config
config.changelog
The changelog (page 817) collection stores a document for each change to the metadata of a sharded
collection.
Example
The following example displays a single record of a chunk split from a changelog (page 817) collection:
{
"_id" : "<hostname>-<timestamp>-<increment>",
"server" : "<hostname><:port>",
"clientAddr" : "127.0.0.1:63381",
"time" : ISODate("2012-12-11T14:09:21.039Z"),
"what" : "split",
"ns" : "<database>.<collection>",
"details" : {
"before" : {
"min" : {
"<database>" : { $minKey : 1 }
},
"max" : {
"<database>" : { $maxKey : 1 }
},
"lastmod" : Timestamp(1000, 0),
"lastmodEpoch" : ObjectId("000000000000000000000000")
},
"left" : {
"min" : {
"<database>" : { $minKey : 1 }
},
"max" : {
"<database>" : "<value>"
},
"lastmod" : Timestamp(1000, 1),
"lastmodEpoch" : ObjectId(<...>)
},
"right" : {
"min" : {
"<database>" : "<value>"
},
"max" : {
"<database>" : { $maxKey : 1 }
},
"lastmod" : Timestamp(1000, 2),
"lastmodEpoch" : ObjectId("<...>")
}
}
}
Each document in the changelog (page 817) collection contains the following fields:
config.changelog._id
The value of changelog._id is: <hostname>-<timestamp>-<increment>.
config.changelog.server
The hostname of the server that holds this data.
config.changelog.clientAddr
A string that holds the address of the client, a mongos instance that initiates this change.
config.changelog.time
An ISODate timestamp that reflects when the change occurred.
config.changelog.what
Reflects the type of change recorded. Possible values are:
dropCollection
dropCollection.start
dropDatabase
dropDatabase.start
moveChunk.start
moveChunk.commit
split
multi-split
config.changelog.ns
Namespace where the change occurred.
config.changelog.details
A document that contains additional details regarding the change. The structure of the details
(page 818) document depends on the type of change.
config.chunks
The chunks (page 818) collection stores a document for each chunk in the cluster. Consider the following
example of a document for a chunk named records.pets-animal_\"cat\":
{
"_id" : "records.pets-animal_\"cat\"",
"lastmod" : Timestamp(1000, 3),
"lastmodEpoch" : ObjectId("5078407bd58b175c5c225fdc"),
"ns" : "records.pets",
"min" : {
"animal" : "cat"
},
"max" : {
"animal" : "dog"
},
"shard" : "shard0004"
}
These documents store the range of values for the shard key that describe the chunk in the min and max fields.
Additionally the shard field identifies the shard in the cluster that owns the chunk.
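As a rough illustration of how the min and max fields bound a chunk, the sketch below tests whether a shard key value belongs to the example chunk. The helper name chunkContains is hypothetical. It assumes a single-field shard key and a range that includes min but excludes max. It relies on JavaScript's own string ordering, which may differ from MongoDB's comparison rules, and it does not handle $minKey or $maxKey bounds.

```javascript
// Hypothetical helper: true if a shard key value falls inside a chunk's range.
// Assumes a single-field shard key whose values compare with >= and <
// (strings or numbers), and a range that includes min but excludes max.
function chunkContains(chunk, field, value) {
  const min = chunk.min[field];
  const max = chunk.max[field];
  return value >= min && value < max;
}

// The example chunk document from above.
const chunk = {
  _id: 'mydb.foo-a_"cat"',
  ns: "mydb.foo",
  min: { animal: "cat" },
  max: { animal: "dog" },
  shard: "shard0004"
};

console.log(chunkContains(chunk, "animal", "cow")); // true
console.log(chunkContains(chunk, "animal", "dog")); // false: max is exclusive
console.log(chunkContains(chunk, "animal", "ant")); // false: sorts before min
```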
config.collections
The collections (page 819) collection stores a document for each sharded collection in the cluster. Given
a collection named pets in the records database, a document in the collections (page 819) collection
would resemble the following:
{
"_id" : "records.pets",
"lastmod" : ISODate("1970-01-16T15:00:58.107Z"),
"dropped" : false,
"key" : {
"a" : 1
},
"unique" : false,
"lastmodEpoch" : ObjectId("5078407bd58b175c5c225fdc")
}
config.databases
The databases (page 819) collection stores a document for each database in the cluster, and tracks if the
database has sharding enabled. databases (page 819) represents each database in a distinct document. When
a database has sharding enabled, the primary field holds the name of the primary shard.
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "mydb", "partitioned" : true, "primary" : "shard0000" }
config.lockpings
The lockpings (page 819) collection keeps track of the active components in the sharded cluster. Given
a cluster with a mongos running on example.com:30000, the document in the lockpings (page 819)
collection would resemble:
{ "_id" : "example.com:30000:1350047994:16807", "ping" : ISODate("2012-10-12T18:32:54.892Z") }
config.locks
The locks (page 820) collection stores a distributed lock. This ensures that only one mongos instance can
perform administrative tasks on the cluster at once. The mongos acting as balancer takes a lock by inserting a
document resembling the following into the locks collection.
{
"_id" : "balancer",
"process" : "example.net:40000:1350402818:16807",
"state" : 2,
"ts" : ObjectId("507daeedf40e1879df62e5f3"),
"when" : ISODate("2012-10-16T19:01:01.593Z"),
"who" : "example.net:40000:1350402818:16807:Balancer:282475249",
"why" : "doing balance round"
}
If a mongos holds the balancer lock, the state field has a value of 2, which means that the balancer is active.
The when field indicates when the balancer began the current operation.
Changed in version 2.0: The value of the state field was 1 before MongoDB 2.0.
config.mongos
The mongos (page 820) collection stores a document for each mongos instance affiliated with the cluster.
mongos instances send pings to all members of the cluster every 30 seconds so the cluster can verify that the
mongos is active. The ping field shows the time of the last ping, while the up field reports the uptime of the
mongos as of the last ping. The cluster maintains this collection for reporting purposes.
The following document shows the status of the mongos running on example.com:30000.
{ "_id" : "example.com:30000", "ping" : ISODate("2012-10-12T17:08:13.538Z"), "up" : 13699, "wait
config.settings
The settings (page 820) collection holds the following sharding configuration settings:
Chunk size. To change chunk size, see Modify Chunk Size in a Sharded Cluster (page 805).
Balancer status. To change status, see Disable the Balancer (page 794).
The following is an example settings collection:
{ "_id" : "chunksize", "value" : 64 }
{ "_id" : "balancer", "stopped" : false }
config.shards
The shards (page 821) collection represents each shard in the cluster in a separate document. If the shard
is a replica set, the host field displays the name of the replica set, then a slash, then the hostname, as in the
following example:
{ "_id" : "shard0000", "host" : "shard1/localhost:30000" }
If the shard has tags (page 748) assigned, this document has a tags field that holds an array of the tags, as in
the following example:
{ "_id" : "shard0001", "host" : "localhost:30001", "tags": [ "NYC" ] }
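The host field format described above can be split apart with a few lines of string handling. The following is a minimal sketch: parseShardHost is a hypothetical helper, and the comma-separated list of hosts after the slash is an assumption about how multi-member replica sets appear.

```javascript
// Hypothetical helper that splits the host field of a config.shards document.
// Assumes "<replicaSetName>/<host>[,<host>...]" for replica sets and a bare
// "<host>:<port>" string for standalone shards.
function parseShardHost(host) {
  const slash = host.indexOf("/");
  if (slash === -1) {
    return { replicaSet: null, hosts: [host] };
  }
  return {
    replicaSet: host.slice(0, slash),
    hosts: host.slice(slash + 1).split(",")
  };
}

console.log(JSON.stringify(parseShardHost("shard1/localhost:30000")));
// {"replicaSet":"shard1","hosts":["localhost:30000"]}
console.log(JSON.stringify(parseShardHost("localhost:30001")));
// {"replicaSet":null,"hosts":["localhost:30001"]}
```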
config.tags
The tags (page 821) collection holds documents for each tagged shard key range in the cluster. The documents
in the tags (page 821) collection resemble the following:
{
"_id" : { "ns" : "records.users", "min" : { "zipcode" : "10001" } },
"ns" : "records.users",
"min" : { "zipcode" : "10001" },
"max" : { "zipcode" : "10281" },
"tag" : "NYC"
}
config.version
The version (page 821) collection holds the current metadata version number. This collection contains only
one document.
To access the version (page 821) collection you must use the db.getCollection() method. For example,
to display the collection's document:
mongos> db.getCollection("version").find()
{ "_id" : 1, "version" : 3 }
On this page
How does a collection differ from a table? (page 823)
How do I create a database and a collection? (page 823)
How do I define or alter the collection schema? (page 824)
Does MongoDB support SQL? (page 824)
Does MongoDB support transactions? (page 824)
Does MongoDB handle caching? (page 824)
How does a collection differ from a table?
Instead of tables, a MongoDB database stores its data in collections. A collection holds one or more BSON documents
(page 186). Documents are analogous to records or rows in a relational database table. Each document has one or
more fields (page 186); fields are similar to the columns in a relational database table.
See also:
SQL to MongoDB Mapping Chart (page 145), Introduction to MongoDB (page 3)
How do I create a database and a collection?
If a database does not exist, MongoDB creates the database when you first store data for that database.
If a collection does not exist, MongoDB creates the collection when you first store data for that collection. [1]
As such, you can switch to a non-existent database (use <dbname>) and perform the following operation:
use myNewDB
db.myNewCollection1.insert( { x: 1 } )
db.myNewCollection2.createIndex( { a: 1 } )
1 You can also create a collection explicitly using db.createCollection if you want to specify specific options, such as maximum size or
The insert operation creates both the database myNewDB and the collection myNewCollection1 if they do not
already exist.
The createIndex operation, which occurs after myNewDB has been created, creates the index and the collection
myNewCollection2 if the collection does not exist. If myNewDB did not exist, the createIndex operation
would have also created myNewDB.
How do I define or alter the collection schema?
You do not need to specify a schema for a collection in MongoDB. Although it is common for the documents in a
collection to have a largely homogeneous structure, it is not a requirement; i.e. documents in a single collection do not
need to have the same set of fields. The data type for a field can differ across documents in a collection as well.
To change the structure of the documents in a collection, update the documents to the new structure. For instance, add
new fields, remove existing ones, or update the value of a field to a new type.
Changed in version 3.2: Starting in MongoDB 3.2, however, you can enforce document validation rules (page 160)
for a collection during update and insert operations.
Some collection properties, such as a maximum size, can be specified when you explicitly create a collection
and modified later. See db.createCollection and collMod. If you are not specifying these properties,
you do not need to explicitly create the collection, since MongoDB creates new collections when you first store
data for them.
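The sketch below shows what explicit creation with options might look like. It assumes that db behaves like the mongo shell's db object and exposes createCollection(name, options). The collection names log and contacts and the field phone are hypothetical. A stand-in object replaces the real handle so the snippet runs without a server.

```javascript
// Sketch only: `db` is assumed to expose createCollection(name, options),
// as the mongo shell's db object does. Collection and field names are made up.
function createCollections(db) {
  // A capped collection with a maximum size of 1 MiB (size is in bytes).
  db.createCollection("log", { capped: true, size: 1048576 });

  // Document validation (MongoDB 3.2 and later): phone must be a string.
  db.createCollection("contacts", {
    validator: { phone: { $type: "string" } }
  });
}

// Stand-in for a real handle: it only records the calls it receives.
const calls = [];
const fakeDb = {
  createCollection(name, options) {
    calls.push({ name: name, options: options });
  }
};

createCollections(fakeDb);
console.log(JSON.stringify(calls.map((c) => c.name))); // ["log","contacts"]
```

In the mongo shell, the same two createCollection calls could be issued directly against db.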
Does MongoDB support SQL?
No. However, MongoDB does support a rich query language of its own. For examples on using MongoDB's query
language, see MongoDB CRUD Tutorials (page 99).
See also:
SQL to MongoDB Mapping Chart (page 145)
Does MongoDB support transactions?
MongoDB does not support multi-document transactions. However, MongoDB does provide atomic operations on a
single document.
For more details on MongoDB's isolation guarantees and behavior under concurrency, see FAQ: Concurrency
(page 835).
Does MongoDB handle caching?
Yes. MongoDB keeps most recently used data in RAM. If you have created indexes for your queries and your working
data set fits in RAM, MongoDB serves all queries from memory.
MongoDB does not cache the query results in order to return the cached results for identical queries.
For more information on MongoDB and memory use, see WiredTiger and Memory Use (page 589) and MMAPv1 and
Memory Use (page 597).
On this page
What is a namespace in MongoDB? (page 825)
If you remove a document, does MongoDB remove it from disk? (page 825)
When does MongoDB write updates to disk? (page 826)
How do I do transactions and locking in MongoDB? (page 826)
How do you aggregate data with MongoDB? (page 826)
Why does MongoDB log so many Connection Accepted events? (page 826)
Does MongoDB run on Amazon EBS? (page 827)
Why are MongoDB's data files so large? (page 827)
How do I optimize storage use for small documents? (page 827)
When should I use GridFS? (page 828)
How does MongoDB address SQL or Query injection? (page 828)
How does MongoDB provide concurrency? (page 830)
What is the compare order for BSON types? (page 830)
When multiplying values of mixed types, what type conversion rules apply? (page 831)
How do I query for fields that have null values? (page 831)
Are there any restrictions on the names of Collections? (page 832)
How do I isolate cursors from intervening write operations? (page 833)
When should I embed documents within other documents? (page 833)
Where can I learn more about data modeling in MongoDB? (page 834)
Can I manually pad documents to prevent moves during updates? (page 834)
This document answers common questions about application development using MongoDB.
If you don't find the answer you're looking for, check the complete list of FAQs (page 823) or post your question to
the MongoDB User Mailing List [2].
If you remove a document, does MongoDB remove it from disk?
Yes.
When you use remove(), the object will no longer exist in MongoDB's on-disk data storage.
2 https://groups.google.com/forum/?fromgroups#!forum/mongodb-user
3 Each index also has its own namespace.
4 MongoDB databases have a configurable limit on the number of namespaces in a database.
How do I do transactions and locking in MongoDB?
MongoDB does not have support for traditional locking or complex transactions with rollback. MongoDB aims to be
lightweight, fast, and predictable in its performance. This is similar to the MySQL MyISAM autocommit model. By
keeping transaction support extremely simple, MongoDB can provide greater performance especially for partitioned
or replicated systems with a number of database server processes.
MongoDB does have support for atomic operations within a single document. Given the possibilities provided by
nested documents, this feature provides support for a large number of use-cases.
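To make the single-document guarantee concrete, the sketch below applies two changes to one document in a single update. It assumes that collection exposes update(query, update), as a mongo shell collection does. The account fields (balance, purchases) and the helper name are hypothetical. A recording stand-in replaces the real collection so the snippet runs without a server.

```javascript
// Sketch only: `collection` is assumed to expose update(query, update),
// as a mongo shell collection does. Field names are hypothetical.
function recordPurchase(collection, accountId, item, price) {
  collection.update(
    // Match the account only if its balance covers the price.
    { _id: accountId, balance: { $gte: price } },
    // Both modifications target the same document, so they apply together.
    {
      $inc: { balance: -price },
      $push: { purchases: { item: item, price: price } }
    }
  );
}

// Stand-in collection that records arguments instead of contacting a server.
const recorded = [];
const fakeCollection = {
  update(query, update) {
    recorded.push({ query: query, update: update });
  }
};

recordPurchase(fakeCollection, 5, "book", 12);
console.log(JSON.stringify(recorded[0].update));
// {"$inc":{"balance":-12},"$push":{"purchases":{"item":"book","price":12}}}
```

Because the balance check is part of the query, the decrement and the embedded purchase record are either both applied to the matched document or not applied at all.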
See also:
The Atomicity and Transactions (page 88) page.
How do you aggregate data with MongoDB?
In version 2.1 and later, you can use the new aggregation framework (page 447), with the aggregate command.
MongoDB also supports map-reduce with the mapReduce command, as well as basic aggregation with the group,
count, and distinct commands.
See also:
The Aggregation (page 443) page.
Why does MongoDB log so many Connection Accepted events?
If you see a very large number of connection and re-connection messages in your MongoDB log, then clients are
frequently connecting to and disconnecting from the MongoDB server. This is normal behavior for applications that
do not use request pooling, such as CGI. Consider using FastCGI, an Apache Module, or some other kind of persistent
application server to decrease the connection overhead.
If these connections do not impact your performance you can use the run-time quiet option or the command-line
option --quiet to suppress these messages from the log.
Does MongoDB run on Amazon EBS?
Yes.
MongoDB users of all sizes have had a great deal of success using MongoDB on the EC2 platform using EBS disks.
See also:
Amazon EC2 [5]
Why are MongoDB's data files so large?
MongoDB aggressively preallocates data files to reserve space and avoid file system fragmentation. You can use the
storage.smallFiles setting to modify the file preallocation strategy.
See also:
Why are the files in my data directory larger than the data in my database? (page 853)
How do I optimize storage use for small documents?
Each MongoDB document contains a certain amount of overhead. This overhead is normally insignificant but becomes
significant if all documents are just a few bytes, as might be the case if the documents in your collection only have one
or two fields.
Consider the following suggestions and strategies for optimizing storage utilization for these collections:
Use the _id field explicitly.
MongoDB clients automatically add an _id field to each document and generate a unique 12-byte ObjectId for
the _id field. Furthermore, MongoDB always indexes the _id field. For smaller documents this may account
for a significant amount of space.
To optimize storage use, users can specify a value for the _id field explicitly when inserting documents into the
collection. This strategy allows applications to store a value in the _id field that would have occupied space in
another portion of the document.
You can store any value in the _id field, but because this value serves as a primary key for documents in the
collection, it must uniquely identify them. If the field's value is not unique, then it cannot serve as a primary key
as there would be collisions in the collection.
Use shorter field names.
MongoDB stores all field names in every document. For most documents, this represents a small fraction of the
space used by a document; however, for small documents the field names may represent a proportionally large
amount of space. Consider a collection of documents that resemble the following:
{ last_name : "Smith", best_score: 3.9 }
If you shorten the field named last_name to lname and the field named best_score to score, as follows,
you could save 9 bytes per document.
{ lname : "Smith", score : 3.9 }
5 https://docs.mongodb.org/ecosystem/platforms/amazon-ec2
Shortening field names reduces expressiveness and does not provide considerable benefit for larger documents
and where document overhead is not of significant concern. Shorter field names do not reduce the size of
indexes, because indexes have a predefined structure.
In general it is not necessary to use short field names.
Embed documents.
In some cases you may want to embed documents in other documents and save on the per-document overhead.
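The 9-byte saving quoted for the shorter field names can be checked with a rough size estimate. The sketch below is not a BSON encoder: it only handles flat documents whose values are strings or doubles, and the helper names are hypothetical. It assumes the usual BSON layout of a 4-byte length prefix, one type byte plus a null-terminated name per element, and a trailing null byte.

```javascript
// Byte length of a string when encoded as UTF-8.
const utf8Length = (s) => new TextEncoder().encode(s).length;

// Rough BSON size, in bytes, of a flat document whose values are only
// strings and doubles. A simplified sketch, not a general encoder.
function flatBsonSize(doc) {
  let size = 4 + 1; // int32 total length + trailing null byte
  for (const [name, value] of Object.entries(doc)) {
    size += 1;                    // element type byte
    size += utf8Length(name) + 1; // field name, null-terminated
    if (typeof value === "string") {
      size += 4 + utf8Length(value) + 1; // int32 length + bytes + null
    } else if (typeof value === "number") {
      size += 8;                  // stored as a 64-bit double
    } else {
      throw new TypeError("unsupported value type for field " + name);
    }
  }
  return size;
}

const longNames = flatBsonSize({ last_name: "Smith", best_score: 3.9 });
const shortNames = flatBsonSize({ lname: "Smith", score: 3.9 });
console.log(longNames, shortNames, longNames - shortNames); // 46 37 9
```

The difference comes entirely from the field names: last_name to lname saves 4 bytes and best_score to score saves 5.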
When should I use GridFS?
For documents in a MongoDB collection, you should always use GridFS for storing files larger than 16 MB.
In some situations, storing large files may be more efficient in a MongoDB database than on a system-level filesystem.
If your filesystem limits the number of files in a directory, you can use GridFS to store as many files as needed.
When you want to keep your files and metadata automatically synced and deployed across a number of systems
and facilities. When using geographically distributed replica sets (page 634) MongoDB can distribute files and
their metadata automatically to a number of mongod instances and facilities.
When you want to access information from portions of large files without having to load whole files into memory,
you can use GridFS to recall sections of files without reading the entire file into memory.
Do not use GridFS if you need to update the content of the entire file atomically. As an alternative you can store
multiple versions of each file and specify the current version of the file in the metadata. You can update the metadata
field that indicates latest status in an atomic update after uploading the new version of the file, and later remove
previous versions if needed.
Furthermore, if your files are all smaller than the 16 MB BSON Document Size limit, consider storing the file
manually within a single document. You may use the BinData data type to store the binary data. See your driver's
documentation for details on using BinData.
For more information on GridFS, see GridFS (page 603).
How does MongoDB address SQL or Query injection?
BSON
As a client program assembles a query in MongoDB, it builds a BSON object, not a string. Thus traditional SQL
injection attacks are not a problem. More details and some nuances are covered below.
MongoDB represents queries as BSON objects. Typically client libraries provide a convenient, injection free,
process to build these objects. Consider the following C++ example:
BSONObj my_query = BSON( "name" << a_name );
auto_ptr<DBClientCursor> cursor = c.query("tutorial.persons", my_query);
Here, my_query then will have a value such as { name : "Joe" }. If a_name contained special characters,
for example ,, :, and {, the query simply wouldn't match any documents. For example, users cannot hijack a
query and convert it to a delete.
JavaScript
Note: You can disable all server-side execution of JavaScript, by passing the --noscripting option on the
command line or setting security.javascriptEnabled in a configuration file.
All of the following MongoDB operations permit you to run arbitrary JavaScript expressions directly on the server:
$where
mapReduce
group
You must exercise care in these cases to prevent users from submitting malicious JavaScript.
Fortunately, you can express most queries in MongoDB without JavaScript and for queries that require JavaScript, you
can mix JavaScript and non-JavaScript in a single query. Place all the user-supplied fields directly in a BSON field and
pass JavaScript code to the $where field.
If you need to pass user-supplied values in a $where clause, you may escape these values with the CodeWScope
mechanism. When you set user-submitted values as variables in the scope document, you can avoid evaluating them
on the database server.
Field names in MongoDB's query language have semantic meaning. The dollar sign (i.e. $) is a reserved character used
to represent operators (e.g. $inc). Thus, you should ensure that your application's users cannot inject operators
into their inputs.
In some cases, you may wish to build a BSON object with a user-provided key. In these situations, keys will need
to substitute the reserved $ and . characters. Any character is sufficient, but consider using the Unicode full width
equivalents: U+FF04 (i.e. "＄") and U+FF0E (i.e. "．").
Consider the following example:
BSONObj my_object = BSON( a_key << a_name );
The user may have supplied a $ value in the a_key value. At the same time, my_object might be { $where :
"things" }. Consider the following cases:
Insert. Inserting this into the database does no harm. The insert process does not evaluate the object as a query.
Note: MongoDB client drivers, if properly implemented, check for reserved characters in keys on inserts.
Update. The update() operation permits $ operators in the update argument but does not support the
$where operator. Still, some users may be able to inject operators that can manipulate a single document
only. Therefore your application should escape keys, as mentioned above, if reserved characters are possible.
Query. Generally this is not a problem for queries that resemble { x : user_obj }: dollar signs are
not top level and have no effect. Theoretically it may be possible for the user to build a query themselves.
But checking the user-submitted content for $ characters in key names may help protect against this kind of
injection.
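The key-escaping and key-checking advice above could be sketched as follows. Both helper names are hypothetical. The replacement characters are the full width equivalents suggested earlier, and the check only flags keys that begin with a dollar sign.

```javascript
// Full width replacements for the reserved characters.
const FULLWIDTH_DOLLAR = "\uFF04";
const FULLWIDTH_DOT = "\uFF0E";

// Replace the reserved "$" and "." characters in a user-supplied key.
function escapeKey(key) {
  return key.replace(/\$/g, FULLWIDTH_DOLLAR).replace(/\./g, FULLWIDTH_DOT);
}

// Report whether any key, at any depth, starts with "$".
function hasOperatorKey(value) {
  if (value === null || typeof value !== "object") {
    return false;
  }
  if (Array.isArray(value)) {
    return value.some(hasOperatorKey);
  }
  return Object.keys(value).some(
    (key) => key.startsWith("$") || hasOperatorKey(value[key])
  );
}

console.log(escapeKey("$where") === "\uFF04where");   // true
console.log(escapeKey("a.b") === "a\uFF0Eb");         // true
console.log(hasOperatorKey({ name: "Joe" }));         // false
console.log(hasOperatorKey({ name: { $ne: null } })); // true
```

An application might call escapeKey before using a user-provided string as a field name, and reject user-submitted objects for which hasOperatorKey returns true.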
Driver-Specific Issues
See the PHP MongoDB Driver Security Notes [6] page in the PHP driver documentation for more information.
6 http://us.php.net/manual/en/mongo.security.php
How does MongoDB provide concurrency?
MongoDB uses multi-granularity locking [7] that allows operations to lock at the global, database or collection level,
and allows for individual storage engines to implement their own concurrency control below the collection level (e.g.,
at the document-level in WiredTiger).
MongoDB uses reader-writer locks that allow concurrent readers shared access to a resource, such as a database or
collection, but in MMAPv1, give exclusive access to a single write operation.
When writing to a replica set, the lock's scope applies to the primary.
In a sharded cluster, locks apply to each individual shard, not to the whole cluster; i.e. each mongod instance is
independent of the others in the sharded cluster and uses its own locks (page 835). The operations on one mongod
instance do not block the operations on any others.
For more information, see FAQ: Concurrency (page 835).
What is the compare order for BSON types?
MongoDB permits documents within a single collection to have fields with different BSON types. For instance, the
following documents may exist within a single collection.
{ x: "string" }
{ x: 42 }
When comparing values of different BSON types, MongoDB uses the following comparison order, from lowest to
highest:
1. MinKey (internal type)
2. Null
3. Numbers (ints, longs, doubles)
4. Symbol, String
5. Object
6. Array
7. BinData
8. ObjectId
9. Boolean
10. Date
11. Timestamp
12. Regular Expression
13. MaxKey (internal type)
MongoDB treats some types as equivalent for comparison purposes. For instance, numeric types undergo conversion
before comparison.
Changed in version 3.0.0: Date objects sort before Timestamp objects. Previously Date and Timestamp objects sorted
together.
The comparison treats a non-existent field as it would an empty BSON Object. As such, a sort on the a field in
documents { } and { a: null } would treat the documents as equivalent in sort order.
7 See the Wikipedia page on Multiple granularity locking (http://en.wikipedia.org/wiki/Multiple_granularity_locking) for more information.
With arrays, a less-than comparison or an ascending sort compares the smallest element of arrays, and a greater-than
comparison or a descending sort compares the largest element of the arrays. As such, when comparing a field whose
value is a single-element array (e.g. [ 1 ]) with non-array fields (e.g. 2), the comparison is between 1 and 2. A
comparison of an empty array (e.g. [ ]) treats the empty array as less than null or a missing field.
MongoDB sorts BinData in the following order:
1. First, the length or size of the data.
2. Then, by the BSON one-byte subtype.
3. Finally, by the data, performing a byte-by-byte comparison.
Consider the following mongo example:
db.test.insert( {x : 3 } );
db.test.insert( {x : 2.9 } );
db.test.insert( {x : new Date() } );
db.test.insert( {x : true } );
db.test.find().sort({x:1});
{ "_id" : ObjectId("4b03155dce8de6586fb002c7"), "x" : 2.9 }
{ "_id" : ObjectId("4b03154cce8de6586fb002c6"), "x" : 3 }
{ "_id" : ObjectId("4b031566ce8de6586fb002c9"), "x" : true }
{ "_id" : ObjectId("4b031563ce8de6586fb002c8"), "x" : "Tue Nov 17 2009 16:28:03 GMT-0500 (EST)" }
The $type operator provides access to BSON type comparison in the MongoDB query syntax. See the documentation
on BSON types and the $type operator for additional information.
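The cross-type ordering can be imitated in plain JavaScript for the types that have a direct JavaScript counterpart. The sketch below is only an approximation of the list above. The function names are hypothetical. Arrays are left out because they compare by element, and two objects are treated as equal.

```javascript
// Position of a JavaScript value in the comparison order listed above.
function typeRank(value) {
  if (value === null) return 2;             // Null
  if (typeof value === "number") return 3;  // Numbers
  if (typeof value === "string") return 4;  // String
  if (typeof value === "boolean") return 9; // Boolean
  if (value instanceof Date) return 10;     // Date
  if (typeof value === "object" && !Array.isArray(value)) return 5; // Object
  throw new TypeError("type not covered by this sketch");
}

// Comparator: order by type first, then by value within the same type.
function compareValues(a, b) {
  const rankDiff = typeRank(a) - typeRank(b);
  if (rankDiff !== 0) return rankDiff;
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}

const values = [3, 2.9, new Date(0), true, "abc", null];
const sorted = values.slice().sort(compareValues);
console.log(JSON.stringify(sorted));
// [null,2.9,3,"abc",true,"1970-01-01T00:00:00.000Z"]
```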
Warning: Data models that associate a field name with different data types within a collection are strongly
discouraged.
A lack of internal consistency complicates application code, and can lead to unnecessary complexity for application
developers.
See also:
The Tailable Cursors (page 133) page for an example of a C++ use of MinKey.
12.2.14 When multiplying values of mixed types, what type conversion rules apply?
The $mul operator multiplies the numeric value of a field by a number. For multiplication with values of mixed numeric types
(32-bit integer, 64-bit integer, float), the following type conversion rules apply:
                  32-bit Integer              64-bit Integer    Float
32-bit Integer    32-bit or 64-bit Integer    64-bit Integer    Float
64-bit Integer    64-bit Integer              64-bit Integer    Float
Float             Float                       Float             Float
Note:
If the product of two 32-bit integers exceeds the maximum value for a 32-bit integer, the result is a 64-bit integer.
Integer operations of any type that exceed the maximum value for a 64-bit integer produce an error.
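The conversion rules above amount to picking the wider of the two operand types, with the 32-bit case depending on the size of the product. The sketch below is a minimal illustration. The type labels and the function name are hypothetical, the lower 32-bit bound is an assumption (the text only mentions the maximum), and the 64-bit overflow error is not modeled.

```javascript
// Hypothetical labels for the three numeric types, ordered from narrow to wide.
const WIDTH = { int32: 0, int64: 1, double: 2 };

// Result type of multiplying a field of fieldType by a value of multiplierType.
// For two 32-bit integers the product decides between int32 and int64.
function mulResultType(fieldType, multiplierType, product) {
  const wider =
    WIDTH[fieldType] >= WIDTH[multiplierType] ? fieldType : multiplierType;
  if (wider !== "int32") {
    return wider;
  }
  const fitsInt32 = product >= -2147483648 && product <= 2147483647;
  return fitsInt32 ? "int32" : "int64";
}

console.log(mulResultType("int32", "double"));               // double
console.log(mulResultType("int64", "int32"));                // int64
console.log(mulResultType("int32", "int32", 6));             // int32
console.log(mulResultType("int32", "int32", 65536 * 65536)); // int64
```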
How do I query for fields that have null values?
The { cancelDate : null } query matches documents that either contain the cancelDate field whose
value is null or that do not contain the cancelDate field. If the queried index is sparse (page 519), however, then
the query will only match null values, not missing fields.
Changed in version 2.6: If using the sparse index results in an incomplete result, MongoDB will not use the index
unless a hint() explicitly specifies the index. See Sparse Indexes (page 519) for more information.
Given the following query:
db.test.find( { cancelDate: null } )
Type Check
The { cancelDate : { $type: 10 } } query matches documents that contain the cancelDate field
whose value is null only; i.e. the value of the cancelDate field is of BSON Type Null (i.e. 10):
db.test.find( { cancelDate : { $type: 10 } } )
The query returns only the document that contains the null value:
{ "_id" : 1, "cancelDate" : null }
Existence Check
The { cancelDate : { $exists: false } } query matches documents that do not contain the
cancelDate field:
db.test.find( { cancelDate : { $exists: false } } )
The query returns only the document that does not contain the cancelDate field:
{ "_id" : 2 }
See also:
The reference documentation for the $type and $exists operators.
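The difference between the three query shapes can be reproduced in memory. The sketch below is a stand-in for the server's matching logic, not MongoDB code. The matcher names are hypothetical, the two documents are the ones returned in the examples above, and sparse indexes are ignored.

```javascript
// True if the document has its own field with the given name.
const hasField = (doc, field) =>
  Object.prototype.hasOwnProperty.call(doc, field);

// In-memory stand-ins for the three query shapes discussed above.
const matchers = {
  // { cancelDate: null }: null value or missing field
  equalsNull: (doc, f) => !hasField(doc, f) || doc[f] === null,
  // { cancelDate: { $type: 10 } }: field present and null
  typeNull: (doc, f) => hasField(doc, f) && doc[f] === null,
  // { cancelDate: { $exists: false } }: field missing
  notExists: (doc, f) => !hasField(doc, f)
};

const docs = [{ _id: 1, cancelDate: null }, { _id: 2 }];

function matchingIds(matcherName) {
  return docs
    .filter((doc) => matchers[matcherName](doc, "cancelDate"))
    .map((doc) => doc._id);
}

console.log(JSON.stringify(matchingIds("equalsNull"))); // [1,2]
console.log(JSON.stringify(matchingIds("typeNull")));   // [1]
console.log(JSON.stringify(matchingIds("notExists")));  // [2]
```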
Are there any restrictions on the names of Collections?
Collection names can be any UTF-8 string with the following exceptions:
A collection name should begin with a letter or an underscore.
The empty string ("") is not a valid collection name.
Example
To create a collection _foo and insert the { a : 1 } document, use the following operation:
db.getCollection("_foo").insert( { a : 1 } )
How do I isolate cursors from intervening write operations?
Warning:
You cannot use snapshot() with sharded collections.
You cannot use snapshot() with the sort() or hint() cursor methods.
As an alternative, if your collection has a field or fields that are never modified, you can use a unique index on this
field or these fields to achieve a similar result as the snapshot(). Query with hint() to explicitly force the query
to use that index.
When should I embed documents within other documents?
When modeling data in MongoDB (page 162), embedding is frequently the choice for:
"contains" relationships between entities.
one-to-many relationships when the many objects always appear with or are viewed in the context of their
parents.
8 https://api.mongodb.org/
9 As a cursor returns documents other operations may interleave with the query: with MMAPv1 storage engine (page 595), if some of these
operations are updates (page 77) that cause the document to move (in the case of a table scan, caused by document growth) or that change the
indexed field on the index used by the query; then the cursor will return the same document more than once.
You should also consider embedding for performance reasons if you have a collection with a large number of small
documents. Nevertheless, if small, separate documents represent the natural model for the data, then you should
maintain that model.
If, however, you can group these small documents by some logical relationship and you frequently retrieve the
documents by this grouping, you might consider rolling-up the small documents into larger documents that contain an
array of embedded documents. Keep in mind that if you often only need to retrieve a subset of the documents within
the group, then rolling-up the documents may not provide better performance.
Rolling up these small documents into logical groupings means that queries to retrieve a group of documents involve
sequential reads and fewer random disk accesses.
Additionally, rolling up documents and moving common fields to the larger document benefits the index on these
fields. There would be fewer copies of the common fields and there would be fewer associated key entries in the
corresponding index. See Index Concepts (page 492) for more information on indexes.
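As an illustration of rolling up, the sketch below groups small documents that share a field into one larger document per group, moving the common field to the top level and embedding the rest in an array. The function name, the sensor readings, and the field names are all hypothetical.

```javascript
// Group small documents that share `key` into one document per key value.
// The common field moves to the larger document; the remaining fields are
// embedded in an array stored under `arrayField`.
function rollUp(docs, key, arrayField) {
  const groups = new Map();
  for (const doc of docs) {
    const { [key]: groupValue, ...rest } = doc;
    if (!groups.has(groupValue)) {
      groups.set(groupValue, { [key]: groupValue, [arrayField]: [] });
    }
    groups.get(groupValue)[arrayField].push(rest);
  }
  return Array.from(groups.values());
}

const smallDocs = [
  { sensor: "a", t: 1, v: 20.5 },
  { sensor: "a", t: 2, v: 20.7 },
  { sensor: "b", t: 1, v: 18.1 }
];

const rolledUp = rollUp(smallDocs, "sensor", "readings");
console.log(JSON.stringify(rolledUp));
// [{"sensor":"a","readings":[{"t":1,"v":20.5},{"t":2,"v":20.7}]},{"sensor":"b","readings":[{"t":1,"v":18.1}]}]
```

Three documents with three copies of the sensor field become two documents with two copies, which is the reduction in repeated fields and index entries described above.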
Where can I learn more about data modeling in MongoDB?
Begin by reading the documents in the Data Models (page 157) section. These documents contain a high level
introduction to data modeling considerations in addition to practical examples of data models targeted at particular issues.
Additionally, consider the following external resources that provide additional examples:
Schema Design by Example [10]
Dynamic Schema Blog Post [11]
MongoDB Data Modeling and Rails [12]
Ruby Example of Materialized Paths [13]
Sean Cribbs' Blog Post [14], which was the source for much of the data-modeling-trees content.
Can I manually pad documents to prevent moves during updates?
Warning: Do not manually pad documents in a capped collection. Applying manual padding to a document in a
capped collection can break replication. Also, the padding is not preserved if you re-sync the MongoDB instance.
10 http://www.mongodb.com/presentations/mongodb-melbourne-2012/schema-design-example
11 http://dmerr.tumblr.com/post/6633338010/schemaless
12 https://docs.mongodb.org/ecosystem/tutorial/model-data-for-ruby-on-rails/
13 http://github.com/banker/newsmonger/blob/master/app/models/comment.rb
14 http://seancribbs.com/tech/2009/09/28/modeling-a-tree-in-a-document-database
db.myCollection.update( { _id: 5 },
{ $unset: { paddingField: "" } }
)
db.myCollection.update( { _id: 5 },
{ $set: { realField: "Some text that I might have needed padding for" } }
)
See also:
Record Allocation Strategies (page 596)
On this page
What type of locking does MongoDB use? (page 835)
How granular are locks in MongoDB? (page 836)
How do I see the status of locks on my mongod instances? (page 836)
Does a read or write operation ever yield the lock? (page 837)
Which operations lock the database? (page 837)
Which administrative commands lock the database? (page 838)
Does a MongoDB operation ever lock more than one database? (page 839)
How does sharding affect concurrency? (page 839)
How does concurrency affect a replica set primary? (page 839)
How does concurrency affect secondaries? (page 839)
Does MongoDB support transactions? (page 839)
What isolation guarantees does MongoDB provide? (page 840)
Can reads see changes that have not been committed to disk? (page 840)
What type of locking does MongoDB use?
MongoDB uses multi-granularity locking [15] that allows operations to lock at the global, database or collection level,
and allows for individual storage engines to implement their own concurrency control below the collection level (e.g.,
at the document-level in WiredTiger).
15 See the Wikipedia page on Multiple granularity locking (http://en.wikipedia.org/wiki/Multiple_granularity_locking) for more information.
MongoDB uses reader-writer locks that allow concurrent readers shared access to a resource, such as a database or
collection, but in MMAPv1, give exclusive access to a single write operation.
In addition to a shared (S) locking mode for reads and an exclusive (X) locking mode for write operations, intent
shared (IS) and intent exclusive (IX) modes indicate an intent to read or write a resource using a finer granularity lock.
When locking at a certain granularity all higher levels are locked using an intent lock.
For example, when locking a collection for writing (using mode X), both the corresponding database lock and the
global lock must be locked in intent exclusive (IX) mode. A single database can simultaneously be locked in IS and
IX mode, but an exclusive (X) lock cannot coexist with any other modes, and a shared (S) lock can coexist only with
other shared (S) locks and intent shared (IS) locks.
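The coexistence rules just described can be written down as a small compatibility table. The sketch below is an illustration and not MongoDB's lock manager. The names are hypothetical, and the pairing of S with S follows from concurrent readers sharing access.

```javascript
// For each held mode, the modes that may be granted on the same resource.
const COMPATIBLE = {
  IS: new Set(["IS", "IX", "S"]),
  IX: new Set(["IS", "IX"]),
  S: new Set(["IS", "S"]),
  X: new Set()
};

const compatible = (held, requested) => COMPATIBLE[held].has(requested);

// Locks taken when locking a collection in `mode`: every higher level
// (database, global) takes the matching intent lock.
function locksForCollection(mode) {
  const intent = mode === "X" || mode === "IX" ? "IX" : "IS";
  return { global: intent, database: intent, collection: mode };
}

console.log(compatible("IS", "IX")); // true
console.log(compatible("S", "IX"));  // false
console.log(compatible("X", "IS"));  // false
console.log(JSON.stringify(locksForCollection("X")));
// {"global":"IX","database":"IX","collection":"X"}
```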
Locks are fair, with reads and writes being queued in order. However, to optimize throughput, when one request is
granted, all other compatible requests will be granted at the same time, potentially releasing them before a conflicting
request. For example, consider a case in which an X lock was just released, and in which the conflict queue contains
the following items:
IS IS X X S IS
In strict first-in, first-out (FIFO) ordering, only the first two IS modes would be granted. Instead MongoDB will
actually grant all IS and S modes, and once they all drain, it will grant X, even if new IS or S requests have been
queued in the meantime. As a grant will always move all other requests ahead in the queue, no starvation of any
request is possible.
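The granting behavior for the example queue can be simulated with a short sketch. This is a simplified model, not MongoDB's implementation. The function name is hypothetical, and it assumes no lock is currently held when a grant round starts.

```javascript
// For each held mode, the modes that can be granted alongside it.
const GRANTABLE_WITH = {
  IS: ["IS", "IX", "S"],
  IX: ["IS", "IX"],
  S: ["IS", "S"],
  X: []
};

// One grant round: the head of the queue is granted, along with every later
// request compatible with everything granted so far. The rest keep their order.
function grantRound(queue) {
  const granted = [];
  const waiting = [];
  for (const mode of queue) {
    const fits = granted.every((held) => GRANTABLE_WITH[held].includes(mode));
    if (fits) {
      granted.push(mode);
    } else {
      waiting.push(mode);
    }
  }
  return { granted: granted, waiting: waiting };
}

const first = grantRound(["IS", "IS", "X", "X", "S", "IS"]);
console.log(first.granted.join(" ")); // IS IS S IS
console.log(first.waiting.join(" ")); // X X

// A new IS request arrives while the first batch drains: X still goes first.
const second = grantRound(first.waiting.concat(["IS"]));
console.log(second.granted.join(" ")); // X
console.log(second.waiting.join(" ")); // X IS
```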
How granular are locks in MongoDB?
For WiredTiger
Beginning with version 3.0, MongoDB ships with the WiredTiger (page 587) storage engine.
For most read and write operations, WiredTiger uses optimistic concurrency control. WiredTiger uses only intent locks
at the global, database and collection levels. When the storage engine detects conflicts between two operations, one
will incur a write conflict causing MongoDB to transparently retry that operation.
Some global operations, typically short lived operations involving multiple databases, still require a global
instance-wide lock. Some other operations, such as dropping a collection, still require an exclusive database lock.
For MMAPv1
The MMAPv1 storage engine uses collection-level locking as of the 3.0 release series, an improvement on earlier
versions in which the database lock was the finest-grain lock. Third-party storage engines may either use collection-
level locking or implement their own finer-grained concurrency control.
For example, if you have six collections in a database using the MMAPv1 storage engine and an operation takes a
collection-level write lock, the other five collections are still available for read and write operations. An exclusive
database lock makes all six collections unavailable for the duration of the operation holding the lock.
To report on lock utilization, use any of the following methods:
db.serverStatus(),
db.currentOp(),
mongotop,
mongostat, and/or
the MongoDB Cloud Manager or Ops Manager, an on-premise solution available in MongoDB Enterprise
Advanced
Specifically, the locks document in the output of serverStatus, or the locks field in the current
operation reporting provides insight into the type of locks and amount of lock contention in your mongod
instance.
To terminate an operation, use db.killOp().
In some situations, read and write operations can yield their locks.
Long running read and write operations, such as queries, updates, and deletes, yield under many conditions. MongoDB
operations can also yield locks between individual document modifications in write operations that affect multiple
documents like update() with the multi parameter.
MongoDB's MMAPv1 (page 595) storage engine uses heuristics based on its access pattern to predict whether data is
likely in physical memory before performing a read. If MongoDB predicts that the data is not in physical memory,
an operation will yield its lock while MongoDB loads the data into memory. Once data is available in memory, the
operation will reacquire the lock to complete the operation.
For storage engines supporting document level concurrency control, such as WiredTiger (page 587), yielding is not
necessary when accessing storage as the intent locks, held at the global, database and collection level, do not block
other readers and writers.
Changed in version 2.6: MongoDB does not yield locks when scanning an index even if it predicts that the index is
not in memory.
Certain administrative commands can exclusively lock the database for extended periods of time. In some deploy-
ments, for large databases, you may consider taking the mongod instance offline so that clients are not affected. For
example, if a mongod is part of a replica set, take the mongod offline and let other members of the set service load
while maintenance is in progress.
The following administrative operations require an exclusive (i.e. write) lock on the database for extended periods:
db.collection.createIndex(), when issued without setting background to true,
reIndex,
compact,
db.repairDatabase(),
db.createCollection(), when creating a very large (i.e. many gigabytes) capped collection,
db.collection.validate(), and
db.copyDatabase(). This operation may lock all databases. See Does a MongoDB operation ever lock
more than one database? (page 839).
The following administrative commands lock the database but only hold the lock for a very short time:
db.collection.dropIndex(),
db.getLastError(),
db.isMaster(),
rs.status() (i.e. replSetGetStatus),
db.serverStatus(),
db.auth(), and
db.addUser().
12.3.7 Does a MongoDB operation ever lock more than one database?
Sharding improves concurrency by distributing collections over multiple mongod instances, allowing shard servers
(i.e. mongos processes) to perform any number of operations concurrently to the various downstream mongod
instances.
In a sharded cluster, locks apply to each individual shard, not to the whole cluster; i.e. each mongod instance is
independent of the others in the shard cluster and uses its own locks (page 835). The operations on one mongod
instance do not block the operations on any others.
With replica sets, when MongoDB writes to a collection on the primary, MongoDB also writes to the primary's oplog,
which is a special collection in the local database. Therefore, MongoDB must lock both the collection's database
and the local database. The mongod must lock both databases at the same time to keep the database consistent and
ensure that write operations, even with replication, are all-or-nothing operations.
When writing to a replica set, the lock's scope applies to the primary.
In replication, MongoDB does not apply writes serially to secondaries. Secondaries collect oplog entries in batches
and then apply those batches in parallel. Secondaries do not allow reads while applying the write operations, and apply
write operations in the order that they appear in the oplog.
See also:
Atomicity and Transactions (page 88)
MongoDB provides the following guarantees in the presence of concurrent read and write operations. These guarantees
hold on systems configured with either the MMAPv1 or WiredTiger storage engines.
1. Write operations are atomic with respect to a single document; i.e. if a write is updating multiple fields in the
document, a reader will never see the document with only some of the fields updated.
With a single mongod instance, a set of read and write operations to a single document is serializable. With
replica sets, only in the absence of a rollback, is a set of read and write operations to a single document serializ-
able.
2. Correctness with respect to query predicates, e.g. db.collection.find() will only return documents that
match and db.collection.update() will only write to matching documents.
3. Correctness with respect to sort. For read operations that request a sort order (e.g. db.collection.find()
or db.collection.aggregate()), the sort order will not be violated due to concurrent writes.
Although MongoDB provides these strong guarantees for single-document operations, read and write operations may
access an arbitrary number of documents during execution. Multi-document operations do not occur transactionally
and are not isolated from concurrent writes. This means that the following behaviors are expected under the normal
operation of the system, for both the MMAPv1 and WiredTiger storage engines:
1. Non-point-in-time read operations. Suppose a read operation begins at time t1 and starts reading documents. A
write operation then commits an update to one of the documents at some later time t2 . The reader may see the
updated version of the document, and therefore does not see a point-in-time snapshot of the data.
2. Non-serializable operations. Suppose a read operation reads a document d1 at time t1 and a write operation
updates d1 at some later time t3 . This introduces a read-write dependency such that, if the operations were to be
serialized, the read operation must precede the write operation. But also suppose that the write operation updates
document d2 at time t2 and the read operation subsequently reads d2 at some later time t4 . This introduces a
write-read dependency which would instead require the read operation to come after the write operation in a
serializable schedule. There is a dependency cycle which makes serializability impossible.
3. Dropped results for MMAPv1. For MMAPv1, reads may miss matching documents that are updated or deleted
during the course of the read operation. However, data that has not been modified during the operation will
always be visible.
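The dependency cycle described in item 2 above can be checked mechanically. The sketch below is illustrative only: it models the two operations as nodes in a precedence graph, where an edge means "must come before" in any equivalent serial order, and reports whether such an order can exist. The name hasCycle and the node labels are hypothetical.

```javascript
// Edges mean "must come before" in any equivalent serial order.
function hasCycle(edges) {
  const next = new Map();
  for (const [from, to] of edges) {
    if (!next.has(from)) next.set(from, []);
    next.get(from).push(to);
  }
  const visiting = new Set();
  const done = new Set();
  function visit(node) {
    if (done.has(node)) return false;
    if (visiting.has(node)) return true; // back edge: cycle found
    visiting.add(node);
    for (const n of next.get(node) || []) {
      if (visit(n)) return true;
    }
    visiting.delete(node);
    done.add(node);
    return false;
  }
  return [...next.keys()].some((node) => visit(node));
}

const deps = [
  ["read", "write"], // the read saw d1 (t1) before the write updated it (t3)
  ["write", "read"], // the write updated d2 (t2) before the read saw it (t4)
];
console.log(hasCycle(deps)); // true
```

A cycle means no serial ordering of the two operations is consistent with what each of them observed.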
See also:
Atomicity and Transactions (page 88)
12.3.13 Can reads see changes that have not been committed to disk?
Changed in version 3.2: MongoDB 3.2 introduces the readConcern (page 882) option. Clients using "majority"
readConcern cannot see the results of writes before they are made durable.
Readers using "local" (page 144) readConcern can see the results of writes before they are made durable,
regardless of write concern level or journaling configuration. As a result, applications may observe the following
behaviors:
1. MongoDB will allow a concurrent reader to see the result of the write operation before the write is acknowledged
to the client application. For details on when writes are acknowledged for different write concern levels, see
Write Concern (page 141).
2. Reads can see data which may subsequently be rolled back in cases such as replica set failover or power loss. It
does not mean that read operations can see documents in a partially written or otherwise inconsistent state.
Other systems refer to these semantics as read uncommitted.
Changed in version 3.2.
On this page
Is sharding appropriate for a new deployment? (page 841)
How does sharding work with replication? (page 842)
Can I change the shard key after sharding a collection? (page 842)
What happens to unsharded collections in sharded databases? (page 842)
How does MongoDB distribute data across shards? (page 842)
What happens if a client updates a document in a chunk during a migration? (page 842)
What happens to queries if a shard is inaccessible or slow? (page 842)
How does MongoDB distribute queries among shards? (page 843)
How does MongoDB sort queries in sharded environments? (page 843)
How does MongoDB ensure unique _id field values when using a shard key other than _id? (page 843)
I've enabled sharding and added a second shard, but all the data is still on one server. Why? (page 843)
Is it safe to remove old files in the moveChunk directory? (page 844)
How does mongos use connections? (page 844)
Why does mongos hold connections open? (page 844)
Where does MongoDB report on connections used by mongos? (page 844)
What does writebacklisten in the log mean? (page 844)
How should administrators deal with failed migrations? (page 844)
What is the process for moving, renaming, or changing the number of config servers? (page 845)
When do the mongos servers detect config server changes? (page 845)
Is it possible to quickly update mongos servers after updating a replica set configuration? (page 845)
What does the maxConns setting on mongos do? (page 845)
How do indexes impact queries in sharded systems? (page 845)
Can shard keys be randomly generated? (page 845)
Can shard keys have a non-uniform distribution of values? (page 846)
Can you shard on the _id field? (page 846)
What do moveChunk commit failed errors mean? (page 846)
How does draining a shard affect the balancing of uneven chunk distribution? (page 846)
This document answers common questions about horizontal scaling using MongoDB's sharding.
If you don't find the answer you're looking for, check the complete list of FAQs (page 823) or post your question to
the MongoDB User Mailing List (https://groups.google.com/forum/?fromgroups#!forum/mongodb-user).
12.4.1 Is sharding appropriate for a new deployment?
Sometimes.
If your data set fits on a single server, you should begin with an unsharded deployment.
Converting an unsharded database to a sharded cluster is easy and seamless, so there is little advantage in configuring
sharding while your data set is small.
Still, all production deployments should use replica sets to provide high availability and disaster recovery.
12.4.3 Can I change the shard key after sharding a collection?
No.
There is no automatic support in MongoDB for changing a shard key after sharding a collection. This reality
underscores the importance of choosing a good shard key (page 739). If you must change a shard key after sharding a
collection, the best option is to:
dump all data from MongoDB into an external format.
drop the original sharded collection.
configure sharding using a more ideal shard key.
pre-split (page 800) the shard key range to ensure initial even distribution.
restore the dumped data into MongoDB.
See shardCollection, sh.shardCollection(), the Shard Key (page 739), Deploy a Sharded Cluster
(page 757), and SERVER-4000 for more information.
12.4.4 What happens to unsharded collections in sharded databases?
In the current implementation, all databases in a sharded cluster have a primary shard. All unsharded collections
within that database will reside on the same shard.
12.4.5 How does MongoDB distribute data across shards?
Sharding must be specifically enabled on a collection. After enabling sharding on the collection, MongoDB will assign
various ranges of collection data to the different shards in the cluster. The cluster automatically corrects imbalances
between shards by migrating ranges of data from one shard to another.
12.4.6 What happens if a client updates a document in a chunk during a migration?
The mongos routes the operation to the old shard, where it will succeed immediately. Then the shard mongod
instances will replicate the modification to the new shard before the sharded cluster updates that chunk's ownership,
which effectively finalizes the migration process.
12.4.9 How does MongoDB sort queries in sharded environments?
If you call the cursor.sort() method on a query in a sharded environment, the mongod for each shard will sort
its results, and the mongos merges each shard's results before returning them to the client.
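The merge step can be pictured as combining several lists that are each already sorted. The following is a minimal sketch of that idea, not the mongos implementation; mergeSorted, the sample documents and the age field are hypothetical.

```javascript
// Merge per-shard result lists that are each already sorted by `key`,
// the way a router could combine them. Hypothetical sketch, not mongos code.
function mergeSorted(shardResults, key) {
  const positions = shardResults.map(() => 0);
  const merged = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < shardResults.length; i++) {
      if (positions[i] >= shardResults[i].length) continue; // shard exhausted
      const candidate = shardResults[i][positions[i]];
      if (best === -1 || candidate[key] < shardResults[best][positions[best]][key]) {
        best = i;
      }
    }
    if (best === -1) return merged; // every shard is exhausted
    merged.push(shardResults[best][positions[best]++]);
  }
}

const shardA = [{ age: 21 }, { age: 35 }];
const shardB = [{ age: 19 }, { age: 40 }];
console.log(mergeSorted([shardA, shardB], "age").map((d) => d.age).join(" "));
// 19 21 35 40
```

Because each shard has already sorted its own results, the router only needs to compare the next unread document from each shard.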
12.4.10 How does MongoDB ensure unique _id field values when using a shard
key other than _id?
If you do not use _id as the shard key, then your application/client layer must be responsible for keeping the _id
field unique. It is problematic for collections to have duplicate _id values.
If you're not sharding your collection by the _id field, then you should be sure to store a globally unique identifier in
that field. The default BSON ObjectId (page 192) works well in this case.
12.4.11 I've enabled sharding and added a second shard, but all the data is still on
one server. Why?
First, ensure that you've declared a shard key for your collection. Until you have configured the shard key, MongoDB
will not create chunks, and sharding will not occur.
Next, keep in mind that the default chunk size is 64 MB. As a result, in most situations, the collection needs to have at
least 64 MB of data before a migration will occur.
Additionally, the system which balances chunks among the servers attempts to avoid superfluous migrations. Depend-
ing on the number of shards, your shard key, and the amount of data, systems often require at least 10 chunks of data
to trigger migrations.
You can run db.printShardingStatus() to see all the chunks present in your cluster.
12.4.12 Is it safe to remove old files in the moveChunk directory?
Yes. mongod creates these files as backups during normal shard balancing operations. If some error occurs during a
migration (page 751), these files may be helpful in recovering documents affected during the migration.
Once the migration has completed successfully and there is no need to recover documents from these files, you may
safely delete these files. Or, if you have an existing backup of the database that you can use for recovery, you may also
delete these files after migration.
To determine if all migrations are complete, run sh.isBalancerRunning() while connected to a mongos in-
stance.
12.4.13 How does mongos use connections?
Each client maintains a connection to a mongos instance. Each mongos instance maintains a pool of connections to
the members of a replica set supporting the sharded cluster. Clients use connections between mongos and mongod
instances one at a time. Requests are not multiplexed or pipelined. When client requests complete, the mongos
returns the connection to the pool.
See the System Resource Utilization (page 295) section of the UNIX ulimit Settings (page 295) document.
12.4.14 Why does mongos hold connections open?
mongos uses a set of connection pools to communicate with each shard. These pools do not shrink when the number
of clients decreases.
This can lead to an unused mongos with a large number of open connections. If the mongos is no longer in use, it is
safe to restart the process to close existing connections.
12.4.15 Where does MongoDB report on connections used by mongos?
Connect to the mongos with the mongo shell, and run the following command:
db._adminCommand("connPoolStats");
12.4.16 What does writebacklisten in the log mean?
The writeback listener is a process that opens a long poll to relay writes back from a mongod or mongos after
migrations to make sure they have not gone to the wrong server. The writeback listener sends writes back to the
correct server if necessary.
These messages are a key part of the sharding infrastructure and should not cause concern.
12.4.17 How should administrators deal with failed migrations?
Failed migrations require no administrative intervention. Chunk migrations always preserve a consistent state. If a
migration fails to complete for some reason, the cluster retries the operation. When the migration completes successfully,
the data resides only on the new shard.
12.4.18 What is the process for moving, renaming, or changing the number of config servers?
See Sharded Cluster Tutorials (page 756) for information on migrating and replacing config servers.
12.4.19 When do the mongos servers detect config server changes?
mongos instances maintain a cache of the config database that holds the metadata for the sharded cluster. This
metadata includes the mapping of chunks to shards.
mongos updates its cache lazily by issuing a request to a shard and discovering that its metadata is out of date. There
is no way to control this behavior from the client, but you can run the flushRouterConfig command against any
mongos to force it to refresh its cache.
12.4.20 Is it possible to quickly update mongos servers after updating a replica set
configuration?
The mongos instances will detect these changes without intervention over time. However, if you want to force the
mongos to reload its configuration, run the flushRouterConfig command against each mongos directly.
12.4.22 How do indexes impact queries in sharded systems?
If the query does not include the shard key, the mongos must send the query to all shards as a scatter/gather
operation. Each shard will, in turn, use either the shard key index or another more efficient index to fulfill the query.
If the query includes multiple sub-expressions that reference the fields indexed by the shard key and the secondary
index, the mongos can route the queries to a specific shard and the shard will use the index that will allow it to fulfill
the query most efficiently. See this presentation (http://www.slideshare.net/mongodb/how-queries-work-with-sharding)
for more information.
12.4.23 Can shard keys be randomly generated?
Shard keys can be random. Random keys ensure optimal distribution of data across the cluster.
Sharded clusters attempt to route queries to specific shards when queries include the shard key as a parameter, because
these directed queries are more efficient. In many cases, random keys can make it difficult to direct queries to specific
shards.
12.4.24 Can shard keys have a non-uniform distribution of values?
Yes. There is no requirement that documents be evenly distributed by the shard key.
However, documents that have the same shard key must reside in the same chunk and therefore on the same server. If
your sharded data set has too many documents with the exact same shard key you will not be able to distribute those
documents across your sharded cluster.
12.4.25 Can you shard on the _id field?
You can use any field for the shard key. The _id field is a common shard key.
Be aware that ObjectId() values, which are the default value of the _id field, increment as a timestamp. As a
result, when used as a shard key, all new documents inserted into the collection will initially belong to the same chunk
on a single shard. Although the system will eventually divide this chunk and migrate its contents to distribute data
more evenly, at any moment the cluster can only direct insert operations at a single shard. This can limit the throughput
of inserts. If most of your write operations are updates, this limitation should not impact your performance. However,
if you have a high insert volume, this may be a limitation.
To address this issue, MongoDB 2.4 provides hashed shard keys (page 740).
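The effect of an ever-increasing shard key can be illustrated with a toy range table. The sketch below is hypothetical: the chunk boundaries, shard names and the chunkFor helper are made up for illustration and do not reflect how MongoDB stores chunk metadata. It shows only that keys which always increase land in the chunk holding the top of the key range.

```javascript
// Hypothetical range-based chunk table: each chunk covers [min, max) of the
// shard key. Values and shard names are made up for illustration.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard0000" },
  { min: 1000, max: 2000, shard: "shard0001" },
  { min: 2000, max: Infinity, shard: "shard0002" },
];

function chunkFor(keyValue) {
  return chunks.find((c) => keyValue >= c.min && keyValue < c.max);
}

// Keys that only ever increase (like the timestamp prefix of an ObjectId)
// all fall into the chunk whose upper bound is the maximum key.
const targets = [2500, 2501, 2502, 2503].map((k) => chunkFor(k).shard);
console.log(new Set(targets).size); // 1
```

A hashed shard key avoids this by routing on a hash of the value, so consecutive values are spread across the key range.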
12.4.26 What do moveChunk commit failed errors mean?
At the end of a chunk migration (page 752), the shard must connect to the config database to update the chunk's record
in the cluster metadata. If the shard fails to connect to the config database, MongoDB reports the following errors:
ERROR: moveChunk commit failed: version is at <n>|<nn> instead of <N>|<NN>
ERROR: TERMINATING
When this happens, the primary member of the shard's replica set then terminates to protect data consistency. If a
secondary member can access the config database, data on the shard becomes accessible again after an election.
The user will need to resolve the chunk migration failure independently. If you encounter this issue, contact the
MongoDB User Group (http://groups.google.com/group/mongodb-user) or MongoDB Support
(https://www.mongodb.org/about/support).
12.4.27 How does draining a shard affect the balancing of uneven chunk distribution?
The sharded cluster balancing process controls both migrating chunks from decommissioned shards (i.e. draining) and
normal cluster balancing activities. Consider the following behaviors for different versions of MongoDB in situations
where you remove a shard in a cluster with an uneven chunk distribution:
After MongoDB 2.2, the balancer first removes the chunks from the draining shard and then balances the
remaining uneven chunk distribution.
Before MongoDB 2.2, the balancer handles the uneven chunk distribution and then removes the chunks from
the draining shard.
On this page
What kinds of replication does MongoDB support? (page 847)
What does the term primary mean? (page 847)
What does the term secondary mean? (page 847)
How long does replica set failover take? (page 847)
Does replication work over the Internet and WAN connections? (page 848)
Can MongoDB replicate over a noisy connection? (page 848)
Why use journaling if replication already provides data redundancy? (page 848)
How many arbiters do replica sets need? (page 848)
What information do arbiters exchange with the rest of the replica set? (page 849)
Which members of a replica set vote in elections? (page 849)
Do hidden members vote in replica set elections? (page 850)
Is it normal for replica set members to use different amounts of disk space? (page 850)
Can I rename a replica set? (page 850)
12.5.3 What does the term secondary mean?
Secondary nodes are the read-only nodes in replica sets (page 613).
12.5.4 How long does replica set failover take?
It varies, but a replica set will generally select a new primary within a minute.
24 In some circumstances (page 722), two nodes in a replica set may transiently believe that they are the primary, but at most, one of them
will be able to complete writes with { w: "majority" } (page 142) write concern. The node that can complete { w: "majority" }
(page 142) writes is the current primary, and the other node is a former primary that has not yet recognized its demotion, typically due to a network
partition. When this occurs, clients that connect to the former primary may observe stale data despite having requested read preference primary
(page 721), and new writes to the former primary will eventually roll back.
For instance, it may take 10-30 seconds for the members of a replica set to declare a primary inaccessible (see
electionTimeoutMillis (page 715)). One of the remaining secondaries holds an election to elect itself as a
new primary. During the election, the cluster is unavailable for writes.
The election itself may take another 10-30 seconds.
Changed in version 3.2: Starting in MongoDB 3.2, with the replication election enhancements (page 881), MongoDB
reduces replica set failover time. See replication election enhancements (page 881) for details.
12.5.5 Does replication work over the Internet and WAN connections?
Yes.
For example, a deployment may maintain a primary and secondary in an East-coast data center along with a secondary
member for disaster recovery in a West-coast data center.
See also:
Deploy a Geographically Redundant Replica Set (page 662)
12.5.6 Can MongoDB replicate over a noisy connection?
Yes, but not without connection failures and the obvious latency.
Members of the set will attempt to reconnect to the other members of the set in response to networking flaps. This
does not require administrator intervention. However, if the network connections among the nodes in the replica set
are very slow, it might not be possible for the members of the set to keep up with the replication.
If the TCP connection between the secondaries and the primary instance breaks, a replica set will automatically elect
one of the secondary members of the set as primary.
12.5.7 Why use journaling if replication already provides data redundancy?
Journaling facilitates faster crash recovery. Prior to journaling, crashes often required database repairs or full
data resync. Both were slow, and the first was unreliable.
Journaling is particularly useful for protection against power failures, especially if your replica set resides in a single
data center or power circuit.
When a replica set runs with journaling, mongod instances can safely restart without any administrator intervention.
Note: Journaling requires some resource overhead for write operations. Journaling has no effect on read performance,
however.
Journaling is enabled by default on all 64-bit builds of MongoDB v2.0 and greater.
12.5.8 How many arbiters do replica sets need?
Some configurations do not require any arbiter instances. Arbiters vote in elections for primary but do not replicate
the data like secondary members.
Replica sets require a majority of the remaining nodes present to elect a primary. Arbiters allow you to construct this
majority without the overhead of adding replicating nodes to the system.
There are many possible replica set architectures (page 626).
A replica set with an odd number of voting nodes does not need an arbiter.
A common configuration consists of two replicating nodes that include a primary and a secondary, as well as an
arbiter for the third node. This configuration makes it possible for the set to elect a primary in the event of failure,
without requiring three replicating nodes.
You may also consider adding an arbiter to a set if it has an equal number of nodes in two facilities and network
partitions between the facilities are possible. In these cases, the arbiter will break the tie between the two facilities and
allow the set to elect a new primary.
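The arithmetic behind these recommendations is short. The sketch below assumes every member counted has one vote; majority and faultTolerance are hypothetical helper names.

```javascript
// Votes needed to elect a primary, and how many voting members can be lost.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

// Two data-bearing members alone: losing either one blocks elections.
console.log(majority(2), faultTolerance(2)); // 2 0
// Primary + secondary + arbiter: one member can fail.
console.log(majority(3), faultTolerance(3)); // 2 1
// A fourth voter raises the majority but not the tolerance.
console.log(majority(4), faultTolerance(4)); // 3 1
```

Adding an arbiter to a two-member set raises the number of voters to three without adding a third copy of the data, which is why an even number of voting members gains nothing over the next lower odd number.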
See also:
Replica Set Deployment Architectures (page 626)
12.5.9 What information do arbiters exchange with the rest of the replica set?
Arbiters never receive the contents of a collection but do exchange the following data with the rest of the replica set:
Credentials used to authenticate the arbiter with the replica set. All MongoDB processes within a replica set use
keyfiles. These exchanges are encrypted.
Replica set configuration data and voting data. This information is not encrypted. Only credential exchanges
are encrypted.
If your MongoDB deployment uses TLS/SSL, then all communications between arbiters and the other members of
the replica set are secure. See the documentation for Configure mongod and mongos for TLS/SSL (page 382) for more
information. Run all arbiters on secure networks, as with all MongoDB components.
See
The overview of Arbiter Members of Replica Sets (page ??).
12.5.10 Which members of a replica set vote in elections?
All members of a replica set, unless the value of votes (page 713) is equal to 0, vote in elections. This includes all
delayed (page 624), hidden (page 623) and secondary-only (page 621) members. Arbiters (page ??) always vote in
elections and always have 1 vote.
Additionally, the state of a voting member determines whether that member can vote. Only voting members
in the following states are eligible to vote:
PRIMARY
SECONDARY
RECOVERING
ARBITER
ROLLBACK
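The two conditions above (a non-zero votes value and an eligible state) can be combined into one check. This is an illustrative sketch only; canVote and the shape of the member object are hypothetical and are not part of any MongoDB API.

```javascript
// States listed above in which a voting member may cast a vote.
const VOTING_STATES = ["PRIMARY", "SECONDARY", "RECOVERING", "ARBITER", "ROLLBACK"];

// `member` is a hypothetical plain object: { votes: number, state: string }.
function canVote(member) {
  return member.votes !== 0 && VOTING_STATES.includes(member.state);
}

console.log(canVote({ votes: 1, state: "SECONDARY" })); // true
console.log(canVote({ votes: 0, state: "SECONDARY" })); // false
console.log(canVote({ votes: 1, state: "STARTUP2" }));  // false
```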
See also:
Replica Set Elections (page 635)
12.5.11 Do hidden members vote in replica set elections?
Hidden members (page 623) of replica sets do vote in elections. To exclude a member from voting in an election,
change the value of the member's members[n].votes (page 713) configuration to 0.
See also:
Replica Set Elections (page 635)
12.5.12 Is it normal for replica set members to use different amounts of disk space?
Yes.
Factors including different oplog sizes, different levels of storage fragmentation, and MongoDB's data file
preallocation can lead to some variation in storage utilization between nodes. Storage use disparities will be most
pronounced when you add members at different times.
12.5.13 Can I rename a replica set?
As of MongoDB 2.6 there are no tools or functions designed specifically to rename a replica set.
You can use the backup and restore procedure described in the Restore a Replica Set from MongoDB Backups
(page 270) tutorial to create a new replica set with the desired name. Downtime may be necessary in order to en-
sure parity between the original replica set and the new one.
On this page
Storage Engine Fundamentals (page 850)
Can you mix storage engines in a replica set? (page 851)
WiredTiger Storage Engine (page 851)
MMAPv1 Storage Engine (page 852)
Data Storage Diagnostics (page 855)
A storage engine is the part of a database that is responsible for managing how data is stored, both in memory and on
disk. Many databases support multiple storage engines, where different engines perform better for specific workloads.
For example, one storage engine might offer better performance for read-heavy workloads, and another might support
a higher-throughput for write operations.
See also:
Storage Engines (page 587)
Yes. You can have replica set members that use different storage engines.
When designing these multi-storage engine deployments consider the following:
the oplog on each member may need to be sized differently to account for differences in throughput between
different storage engines.
recovery from backups may become more complex if your backup captures data files from MongoDB: you may
need to maintain backups for each storage engine.
Yes. See:
Change Standalone to WiredTiger (page 589)
Change Replica Set to WiredTiger (page 590)
Change Sharded Cluster to WiredTiger (page 591)
The ratio of compressed data to uncompressed data depends on your data and the compression library used. By default,
collection data in WiredTiger use Snappy block compression; zlib compression is also available. Index data use prefix
compression by default.
With WiredTiger, MongoDB utilizes both the WiredTiger cache and the filesystem cache.
Changed in version 3.2: Starting in MongoDB 3.2, the WiredTiger cache, by default, will use the larger of either:
60% of RAM minus 1 GB, or
1 GB.
For systems with up to 10 GB of RAM, the new default setting is less than or equal to the 3.0 default setting (For
MongoDB 3.0, the WiredTiger cache uses either 1 GB or half of the installed physical RAM, whichever is larger).
For systems with more than 10 GB of RAM, the new default setting is greater than the 3.0 setting.
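The two default rules can be compared directly. The sketch below only restates the formulas quoted above; the function names are hypothetical, and RAM sizes are in gigabytes.

```javascript
// Default WiredTiger cache size in GB for a host with `ramGB` of RAM,
// following the two rules quoted above.
function defaultCacheGB32(ramGB) {
  return Math.max(0.6 * ramGB - 1, 1); // MongoDB 3.2 rule
}
function defaultCacheGB30(ramGB) {
  return Math.max(ramGB / 2, 1); // MongoDB 3.0 rule
}

for (const ram of [4, 10, 16]) {
  console.log(ram, defaultCacheGB32(ram).toFixed(1), defaultCacheGB30(ram).toFixed(1));
}
// 4 1.4 2.0
// 10 5.0 5.0
// 16 8.6 8.0
```

The two rules give the same value at 10 GB of RAM, because 0.6 x 10 - 1 = 0.5 x 10 = 5, which is where the comparison in the text changes direction.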
Via the filesystem cache, MongoDB automatically uses all free memory that is not used by the WiredTiger cache or
by other processes. Data in the filesystem cache is compressed.
To adjust the size of the WiredTiger cache, see storage.wiredTiger.engineConfig.cacheSizeGB and
--wiredTigerCacheSizeGB. Avoid increasing the WiredTiger cache size above its default value.
The default WiredTiger cache size value assumes that there is a single mongod instance per node. If a single node
contains multiple instances, then you should decrease the setting to accommodate the other mongod instances.
If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM
available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less
than the amount of RAM available in the container. The exact amount depends on the other processes running in the
container.
To view statistics on the cache and eviction rate, see the wiredTiger.cache field returned from the
serverStatus command.
MongoDB configures WiredTiger to create checkpoints (i.e. write the snapshot data to disk) at intervals of 60 seconds
or 2 gigabytes of journal data.
For journal data, MongoDB writes to disk according to the following intervals or condition:
New in version 3.2: Every 50 milliseconds.
MongoDB sets checkpoints to occur in WiredTiger on user data at an interval of 60 seconds or when 2 GB of
journal data has been written, whichever occurs first.
If the write operation includes a write concern of j: true (page 143), WiredTiger forces a sync of the
WiredTiger journal files.
Because MongoDB uses a journal file size limit of 100 MB, WiredTiger creates a new journal file approximately
every 100 MB of data. When WiredTiger creates a new journal file, WiredTiger syncs the previous journal file.
A memory-mapped file is a file with data that the operating system places in memory by way of the mmap() system
call. mmap() thus maps the file to a region of virtual memory. Memory-mapped files are the critical piece of the
MMAPv1 storage engine in MongoDB. By using memory mapped files, MongoDB can treat the contents of its data
files as if they were in memory. This provides MongoDB with an extremely fast and simple method for accessing and
manipulating data.
MongoDB uses memory mapped files for managing and interacting with all data.
Memory mapping assigns files to a block of virtual memory with a direct byte-for-byte correlation. MongoDB memory
maps data files to memory as it accesses documents. Unaccessed data is not mapped to memory.
Once mapped, the relationship between file and memory allows MongoDB to interact with the data in the file as if it
were memory.
In the default configuration for the MMAPv1 storage engine (page 595), MongoDB writes to the data files on disk
every 60 seconds and writes to the journal files roughly every 100 milliseconds.
To change the interval for writing to the data files, use the storage.syncPeriodSecs setting. For the journal
files, see storage.journal.commitIntervalMs setting.
These values represent the maximum amount of time between the completion of a write operation and when MongoDB
writes to the data files or to the journal files. In many cases MongoDB and the operating system flush data to disk
more frequently, so that the above values represent a theoretical maximum.
Why are the files in my data directory larger than the data in my database?
The data files in your data directory, which is the /data/db directory in default configurations, might be larger than
the data set inserted into the database. Consider the following possible causes:
MongoDB preallocates its data files to avoid filesystem fragmentation, and because of this, the size of these files does
not necessarily reflect the size of your data.
The storage.mmapv1.smallFiles option will reduce the size of these files, which may be useful if you have
many small databases on disk.
The oplog
If this mongod is a member of a replica set, the data directory includes the oplog.rs file, which is a preallocated
capped collection in the local database.
The default allocation is approximately 5% of disk space on 64-bit installations. In most cases, you should not need
to resize the oplog. See Oplog Sizing (page 647) for more information.
The journal
The data directory contains the journal files, which store write operations on disk before MongoDB applies them to
databases. See Journaling (page 598).
Empty records
MongoDB maintains lists of empty records in data files as it deletes documents and collections. MongoDB can reuse
this space, but will not, by default, return this space to the operating system.
To allow MongoDB to more effectively reuse the space, you can de-fragment your data. To de-fragment, use the
compact command. The compact command requires up to 2 gigabytes of extra disk space to run. Do not use compact if
you are critically low on disk space. For more information on its behavior and other considerations, see compact.
compact only removes fragmentation from MongoDB data files within a collection and does not return any disk space
to the operating system. To return disk space to the operating system, see How do I reclaim disk space? (page 853).
The following provides some options to consider when reclaiming disk space.
Note: You do not need to reclaim disk space for MongoDB to reuse freed space. See Empty records (page 853) for
information on reuse of freed space.
repairDatabase
You can use repairDatabase on a database to rebuild the database, de-fragmenting the associated storage in the
process.
repairDatabase requires free disk space equal to the size of your current data set plus 2 gigabytes. If the volume
that holds dbpath lacks sufficient space, you can mount a separate volume and use that for the repair. For additional
information and considerations, see repairDatabase.
Warning: Do not use repairDatabase if you are critically low on disk space.
repairDatabase will block all other operations and may take a long time to complete.
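As a rough illustration of the disk space requirement stated above (current data set size plus 2 gigabytes), the check can be sketched as follows. The helper names are hypothetical, and the caller is assumed to supply both sizes in bytes.

```javascript
var GB = 1024 * 1024 * 1024;

// Hypothetical helper: free space needed for repairDatabase, per the text:
// the size of the current data set plus 2 gigabytes.
function repairSpaceNeeded(dataSetBytes) {
  return dataSetBytes + 2 * GB;
}

// Hypothetical helper: true when the volume has enough free space for the repair.
function canRepair(freeBytes, dataSetBytes) {
  return freeBytes >= repairSpaceNeeded(dataSetBytes);
}
```

If the check fails for the volume that holds dbpath, the separate-volume approach described above still applies.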
For a secondary member of a replica set, you can perform a resync of the member (page 690) by: stopping the
secondary member to resync, deleting all data and subdirectories from the member's data directory, and restarting.
For details, see Resync a Member of a Replica Set (page 690).
Working set represents the total body of data that the application uses in the course of normal operation. Often this is
a subset of the total data size, but the specific size of the working set depends on actual moment-to-moment use of the
database.
If you run a query that requires MongoDB to scan every document in a collection, the working set will expand to
include every document. Depending on physical memory size, this may cause documents in the working set to page
out, or to be removed from physical memory by the operating system. The next time MongoDB needs to access these
documents, MongoDB may incur a hard page fault.
For best performance, the majority of your active set should fit in RAM.
With the MMAPv1 storage engine, page faults can occur as MongoDB reads from or writes data to parts of its data
files that are not currently located in physical memory. In contrast, operating system page faults happen when physical
memory is exhausted and pages of physical memory are swapped to disk.
If there is free memory, then the operating system can find the page on disk and load it to memory directly. However,
if there is no free memory, the operating system must:
find a page in memory that is stale or no longer needed, and write the page to disk.
read the requested page from disk and load it into memory.
This process, on an active system, can take a long time, particularly in comparison to reading a page that is already in
memory.
See Page Faults (page 234) for more information.
Page faults occur when MongoDB, with the MMAPv1 storage engine, needs access to data that isn't currently in active
memory. A hard page fault refers to situations when MongoDB must access a disk to access the data. A soft page
fault, by contrast, merely moves memory pages from one list to another, such as from an operating system file cache.
See Page Faults (page 234) for more information.
To view the statistics for a collection, including the data size, use the db.collection.stats() method from the
mongo shell. The following example issues db.collection.stats() for the orders collection:
db.orders.stats();
MongoDB also provides the following methods to return specific sizes for the collection:
db.collection.dataSize() to return data size in bytes for the collection.
db.collection.storageSize() to return allocation size in bytes, including unused space.
db.collection.totalSize() to return the data size plus the index size in bytes.
db.collection.totalIndexSize() to return the index size in bytes.
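The sizes above can also be read from a single stats document. The following is a minimal sketch, assuming the size, storageSize, and totalIndexSize fields of the db.collection.stats() output; the helper name and the sample values are made up for illustration.

```javascript
// Hypothetical helper: condenses a collection stats document into a few sizes (bytes).
function summarizeCollStats(stats) {
  return {
    dataBytes: stats.size,
    storageBytes: stats.storageSize,
    indexBytes: stats.totalIndexSize,
    // Allocated space not currently holding document data. With a compressing
    // storage engine, storageSize can be smaller than size, so clamp at zero.
    unusedBytes: Math.max(0, stats.storageSize - stats.size)
  };
}

// Made-up sample values, for illustration only.
var sampleCollStats = { size: 4000, storageSize: 16384, totalIndexSize: 8192 };
var collSummary = summarizeCollStats(sampleCollStats);

// In the mongo shell, the same helper could be applied to real output:
//   printjson(summarizeCollStats(db.orders.stats()));
```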
The following script prints the statistics for each database:
// Print the statistics for each database.
db.adminCommand("listDatabases").databases.forEach(function (d) {
    var mdb = db.getSiblingDB(d.name);
    printjson(mdb.stats());
});
The following script prints the statistics for each collection in each database:
// Print the statistics for each collection in each database.
db.adminCommand("listDatabases").databases.forEach(function (d) {
    var mdb = db.getSiblingDB(d.name);
    mdb.getCollectionNames().forEach(function (c) {
        var s = mdb[c].stats();
        printjson(s);
    });
});
To view the size of the data allocated for an index, use the db.collection.stats() method and check the
indexSizes field in the returned document.
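When a collection has several indexes, it can help to list them from largest to smallest. The following is a minimal sketch that assumes only the indexSizes field described above, which maps index names to sizes in bytes; the helper name and the sample index names and sizes are hypothetical.

```javascript
// Hypothetical helper: returns the indexes of a stats document, largest first.
function indexSizesDescending(stats) {
  var sizes = stats.indexSizes || {};
  return Object.keys(sizes)
    .map(function (name) { return { name: name, bytes: sizes[name] }; })
    .sort(function (a, b) { return b.bytes - a.bytes; });
}

// Made-up sample values, for illustration only.
var sampleIndexStats = { indexSizes: { "_id_": 8192, "customer_1_date_1": 24576 } };
var sortedIndexes = indexSizesDescending(sampleIndexStats);

// In the mongo shell:
//   printjson(indexSizesDescending(db.orders.stats()));
```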
The db.stats() method in the mongo shell returns the current state of the active database. For the description
of the returned fields, see dbStats Output.
On this page
How do I create an index? (page 856)
How does an index build affect database performance? (page 856)
How do I see what indexes exist on a collection? (page 856)
How can I see if a query uses an index? (page 856)
How do I determine which fields to index? (page 856)
How can I see the size of an index? (page 857)
How do write operations affect indexes? (page 857)
This document addresses some common questions regarding MongoDB indexes (page 487). For more information on
indexes, see Indexes (page 487).
Note: Index builds can impact performance; see How does an index build affect database performance? (page 856).
Administrators should consider the performance implications before building indexes.
When building an index on a collection, the database that holds the collection is unavailable for read or write operations
until the index build completes. If you need to build a large index, consider building the index in the background
(page 522). See Index Creation (page 521) and Build Indexes on Replica Sets (page 537).
To return information on currently running index creation operations, see currentOp-index-creation. To kill a running
index creation operation, see db.killOp(). The partially built index will be deleted.
A number of factors determine which fields to index, including selectivity (page 578), the support for multiple query
shapes, and size of the index (page 577). For more information, see Operational Considerations for Indexes (page 166)
and Indexing Tutorials (page 531).
The db.collection.stats() includes an indexSizes document which provides size information for each
index on the collection.
Depending on its size, an index may not fit into RAM. An index fits into RAM when your server has enough RAM
available for both the index and the rest of the working set. When an index is too large to fit into RAM, MongoDB
must read the index from disk, which is a much slower operation than reading from RAM.
In certain cases, an index does not need to fit entirely into RAM. For details, see Indexes that Hold Only Recent Values
in RAM (page 578).
On this page
Where can I find information about a mongod process that stopped running unexpectedly? (page 857)
Does TCP keepalive time affect MongoDB Deployments? (page 858)
What tools are available for monitoring MongoDB? (page 859)
Memory Diagnostics for the MMAPv1 Storage Engine (page 859)
Memory Diagnostics for the WiredTiger Storage Engine (page 860)
Sharded Cluster Diagnostics (page 862)
12.8.1 Where can I find information about a mongod process that stopped running
unexpectedly?
If mongod shuts down unexpectedly on a UNIX or UNIX-based platform, and if mongod fails to log a shutdown or
error message, then check your system logs for messages pertaining to MongoDB. For example, for logs located in
/var/log/messages, use the following commands:
sudo grep mongod /var/log/messages
sudo grep score /var/log/messages
25 https://groups.google.com/forum/?fromgroups#!forum/mongodb-user
If you experience socket errors between clients and servers or between members of a sharded cluster or replica
set that do not have other reasonable causes, check the TCP keepalive value (e.g. on Linux systems, the
tcp_keepalive_time value). A common keepalive period is 7200 seconds (2 hours); however, different
distributions and OS X may have different settings.
For MongoDB, you will have better results with shorter keepalive periods, on the order of 120 seconds (two minutes).
If your MongoDB deployment experiences keepalive-related issues, you must alter the keep alive value on all machines
hosting MongoDB processes. This includes all machines hosting mongos or mongod servers and all machines
hosting client processes that connect to MongoDB.
Note: For non-Linux systems, values greater than or equal to 600 seconds (10 minutes) will be ignored by mongod
and mongos. For Linux, values greater than 300 seconds (5 minutes) will be overridden on the mongod and mongos
sockets with a maximum of 300 seconds.
On Linux systems:
To view the keep alive setting, you can use one of the following commands:
sysctl net.ipv4.tcp_keepalive_time
Or:
cat /proc/sys/net/ipv4/tcp_keepalive_time
To change the tcp_keepalive_time value, you can use the following command:
echo <value> | sudo tee /proc/sys/net/ipv4/tcp_keepalive_time
This change does not persist across system reboots. To persist the setting, add the following line to
/etc/sysctl.conf:
net.ipv4.tcp_keepalive_time = <value>
On Linux, mongod and mongos processes limit the keepalive to a maximum of 300 seconds (5 minutes) on
their own sockets by overriding keepalive values greater than 5 minutes.
For OS X systems:
To view the keep alive setting, issue the following command:
sysctl net.inet.tcp.keepinit
To change the net.inet.tcp.keepinit value, you can use the following command:
sysctl -w net.inet.tcp.keepinit=<value>
The above method for setting the TCP keepalive is not persistent; you will need to reset the value each time
you reboot or restart a system. See your operating system's documentation for instructions on setting the TCP
keepalive value persistently.
For Windows systems:
To view the keep alive setting, issue the following command:
reg query HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v KeepAliveTime
The registry value is not present by default. The system default, used if the value is absent, is 7200000
milliseconds or 0x6ddd00 in hexadecimal.
To change the KeepAliveTime value, use the following command in an Administrator Command Prompt,
where <value> is expressed in hexadecimal (e.g. 120000 is 0x1d4c0):
reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\ /v KeepAliveTime /d <value>
Windows users should consider the Windows Server Technet Article on KeepAliveTime [26] for more information
on setting keep alive for MongoDB deployments on Windows systems.
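Because KeepAliveTime is expressed in milliseconds and entered in hexadecimal, converting a keepalive period given in seconds is a small calculation. The following sketch shows it; the helper name is hypothetical.

```javascript
// Hypothetical helper: converts a keepalive period in seconds to the
// hexadecimal millisecond value used for KeepAliveTime.
function keepAliveTimeHex(seconds) {
  var millis = Math.round(seconds * 1000);
  return "0x" + millis.toString(16);
}

var twoMinutes = keepAliveTimeHex(120);     // 120000 ms  -> "0x1d4c0"
var systemDefault = keepAliveTimeHex(7200); // 7200000 ms -> "0x6ddd00"
```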
You will need to restart mongod and mongos servers for new system-wide keepalive settings to take effect.
The MongoDB Cloud Manager [27] and Ops Manager, an on-premise solution available in MongoDB Enterprise
Advanced [28], include monitoring functionality, which collects data from running MongoDB deployments and provides
visualization and alerts based on that data.
For more information, see also the MongoDB Cloud Manager documentation [29] and Ops Manager documentation [30].
A full list of third-party tools is available as part of the Monitoring for MongoDB (page 203) documentation.
Always configure systems to have swap space. Without swap, your system may not be reliable in some situations with
extreme memory constraints, memory leaks, or multiple programs using the same memory. Think of the swap space
as something like a steam release valve that allows the system to release extra pressure without affecting the overall
functioning of the system.
Nevertheless, systems running MongoDB do not need swap for routine operation. Database files are memory-mapped
(page 852) and should constitute most of your MongoDB memory use. Therefore, it is unlikely that mongod will ever
use any swap space in normal operation. The operating system will release memory from the memory mapped files
without needing swap and MongoDB can write data to the data files without needing the swap system.
The working set is the portion of your data that clients access most often.
Your working set should stay in memory to achieve good performance. Otherwise many random disk IOs will occur,
and unless you are using SSD, this can be quite slow.
26 https://technet.microsoft.com/en-us/library/cc957549.aspx
27 https://cloud.mongodb.com/?jmp=docs
28 https://www.mongodb.com/products/mongodb-enterprise-advanced?jmp=docs
29 https://docs.cloud.mongodb.com/
30 https://docs.opsmanager.mongodb.com/current/application
One area to watch specifically in managing the size of your working set is index access patterns. If you are inserting
into indexes at random locations (as would happen with ids that are randomly generated by hashes), you will
continually be updating the whole index. If instead you are able to create your ids in approximately ascending order (for
example, day concatenated with a random id), all the updates will occur at the right side of the b-tree and the working
set size for index pages will be much smaller.
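One way to build the "day concatenated with a random id" scheme mentioned above is sketched below. The id format (UTC day as YYYYMMDD, a dash, then eight random hexadecimal digits) and the helper names are assumptions made for illustration; any scheme whose leading portion increases over time has the same effect on the index.

```javascript
// Left-pads a one-digit number with a zero.
function pad2(n) {
  return (n < 10 ? "0" : "") + n;
}

// Hypothetical id scheme: UTC day (YYYYMMDD), a dash, then 8 random hex digits.
// Ids created on later days sort after ids from earlier days, so inserts
// cluster at the right side of the index b-tree.
function makeDayPrefixedId(date) {
  var day = date.getUTCFullYear() + pad2(date.getUTCMonth() + 1) + pad2(date.getUTCDate());
  var suffix = "";
  for (var i = 0; i < 8; i++) {
    suffix += Math.floor(Math.random() * 16).toString(16);
  }
  return day + "-" + suffix;
}

// In the mongo shell, such an id could be supplied as the _id of a new document:
//   db.orders.insert({ _id: makeDayPrefixedId(new Date()), item: "abc" });
```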
It is fine if databases and thus virtual size are much larger than RAM.
The amount of RAM you need depends on several factors, including but not limited to:
The relationship between database storage (page 850) and working set.
The operating system's cache strategy for LRU (Least Recently Used)
The impact of journaling (page 598)
The number or rate of page faults and other MongoDB Cloud Manager gauges to detect when you need more
RAM
Each database connection thread will need up to 1 MB of RAM.
MongoDB defers to the operating system when loading data into memory from disk. It simply memory maps
(page 852) all its data files and relies on the operating system to cache data. The OS typically evicts the least-
recently-used data from RAM when it runs low on memory. For example if clients access indexes more frequently
than documents, then indexes will more likely stay in RAM, but it depends on your particular usage.
To calculate how much RAM you need, you must calculate your working set size, or the portion of your data that
clients use most often. This depends on your access patterns, what indexes you have, and the size of your documents.
Because MongoDB uses a thread per connection model, each database connection also will need up to 1 MB of RAM,
whether active or idle.
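The two quantities named above can be combined into a rough estimate. The following is only a back-of-the-envelope sketch: the helper name is hypothetical, the working set size must come from your own measurements, and the result ignores the operating system and any other processes on the machine.

```javascript
// Upper bound per connection thread, per the text above.
var PER_CONNECTION_MB = 1;

// Hypothetical helper: working set plus per-connection overhead, in megabytes.
function estimateRamMB(workingSetMB, connections) {
  return workingSetMB + connections * PER_CONNECTION_MB;
}

// Example: an 8 GB working set with 500 open connections.
var estimateMB = estimateRamMB(8192, 500); // 8692
```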
If page faults are infrequent, your working set fits in RAM. If fault rates rise higher than that, you risk performance
degradation. This is less critical with SSD drives than with spinning disks.
Because mongod uses memory-mapped files (page 852), the memory statistics in top require interpretation in a
special way. On a large database, VSIZE (virtual bytes) tends to be the size of the entire database. If the machine
has no other processes running, RSIZE (resident bytes) is the total memory of the machine, as this counts file
system cache contents.
For Linux systems, use the vmstat command to help determine how the system uses memory. On OS X systems use
vm_stat.
No.
If the cache does not have enough space to load additional data, WiredTiger evicts pages from the cache to free up
space.
The WiredTiger cache is only one component of the RAM used by MongoDB. MongoDB also automatically uses all free
memory on the machine via the filesystem cache (data in the filesystem cache is compressed).
In addition, the operating system will use any free RAM to buffer filesystem blocks.
To accommodate the additional consumers of RAM, you may have to decrease WiredTiger cache size.
The default WiredTiger cache size value assumes that there is a single mongod instance per node. If a single node
contains multiple instances, then you should decrease the setting to accommodate the other mongod instances.
If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM
available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less
than the amount of RAM available in the container. The exact amount depends on the other processes running in the
container.
To see statistics on the cache and eviction, use the serverStatus command. The wiredTiger.cache field
holds the information on the cache and eviction.
...
"wiredTiger" : {
...
"cache" : {
"tracked dirty bytes in the cache" : <num>,
"bytes currently in the cache" : <num>,
"maximum bytes configured" : <num>,
"bytes read into cache" : <num>,
"bytes written from cache" : <num>,
"pages evicted by application threads" : <num>,
"checkpoint blocked page eviction" : <num>,
"unmodified pages evicted" : <num>,
"page split during eviction deepened the tree" : <num>,
"modified pages evicted" : <num>,
"pages selected for eviction unable to be evicted" : <num>,
"pages evicted because they exceeded the in-memory maximum" : <num>,
"pages evicted because they had chains of deleted items" : <num>,
"failed eviction of pages that exceeded the in-memory maximum" : <num>,
"hazard pointer blocked page eviction" : <num>,
"internal pages evicted" : <num>,
"maximum page size at eviction" : <num>,
"eviction server candidate queue empty when topping up" : <num>,
"eviction server candidate queue not empty when topping up" : <num>,
"eviction server evicting pages" : <num>,
"eviction server populating queue, but not evicting pages" : <num>,
"eviction server unable to reach eviction goal" : <num>,
"pages split during eviction" : <num>,
"pages walked for eviction" : <num>,
"eviction worker thread evicting pages" : <num>,
"in-memory page splits" : <num>,
"percentage overhead" : <num>,
"tracked dirty pages in the cache" : <num>,
"pages currently held in the cache" : <num>,
"pages read into cache" : <num>,
"pages written from cache" : <num>,
},
...
For an explanation of some key cache and eviction statistics, such as wiredTiger.cache.bytes
currently in the cache and wiredTiger.cache.tracked dirty bytes in the cache, see
wiredTiger.cache.
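To put the key statistics in context, it can help to express them as fractions of the configured maximum. The following sketch uses only three fields shown in the output above; the helper name and the sample values are made up for illustration.

```javascript
// Hypothetical helper: expresses cache occupancy and dirty data as fractions
// of the configured maximum cache size.
function cacheUsage(cache) {
  var max = cache["maximum bytes configured"];
  return {
    fillRatio: cache["bytes currently in the cache"] / max,
    dirtyRatio: cache["tracked dirty bytes in the cache"] / max
  };
}

// Made-up sample values, for illustration only.
var sampleCache = {
  "maximum bytes configured": 1000,
  "bytes currently in the cache": 800,
  "tracked dirty bytes in the cache": 50
};
var usage = cacheUsage(sampleCache);

// In the mongo shell:
//   printjson(cacheUsage(db.serverStatus().wiredTiger.cache));
```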
With WiredTiger, MongoDB utilizes both the WiredTiger cache and the filesystem cache.
Changed in version 3.2: Starting in MongoDB 3.2, the WiredTiger cache, by default, will use the larger of either:
60% of RAM minus 1 GB, or
1 GB.
For systems with up to 10 GB of RAM, the new default setting is less than or equal to the 3.0 default setting (For
MongoDB 3.0, the WiredTiger cache uses either 1 GB or half of the installed physical RAM, whichever is larger).
For systems with more than 10 GB of RAM, the new default setting is greater than the 3.0 setting.
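The two default rules can be compared directly. The following sketch encodes the 3.2 rule (the larger of 60% of RAM minus 1 GB, or 1 GB) and the 3.0 rule (the larger of half of RAM, or 1 GB) as stated above; the function names are hypothetical, and the functions only model the documented defaults rather than query a running server.

```javascript
// Default WiredTiger cache size in GB under the MongoDB 3.2 rule.
function defaultCacheGB32(ramGB) {
  // (ramGB * 6) / 10 is 60% of RAM, written this way to limit rounding error.
  return Math.max((ramGB * 6) / 10 - 1, 1);
}

// Default WiredTiger cache size in GB under the MongoDB 3.0 rule.
function defaultCacheGB30(ramGB) {
  return Math.max(ramGB / 2, 1);
}

// 4 GB of RAM:  3.2 gives about 1.4 GB, 3.0 gives 2 GB (3.2 is smaller).
// 10 GB of RAM: both give 5 GB (the crossover point).
// 16 GB of RAM: 3.2 gives about 8.6 GB, 3.0 gives 8 GB (3.2 is larger).
```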
Via the filesystem cache, MongoDB automatically uses all free memory that is not used by the WiredTiger cache or
by other processes. Data in the filesystem cache is compressed.
To adjust the size of the WiredTiger cache, see storage.wiredTiger.engineConfig.cacheSizeGB and
--wiredTigerCacheSizeGB. Avoid increasing the WiredTiger cache size above its default value.
The default WiredTiger cache size value assumes that there is a single mongod instance per node. If a single node
contains multiple instances, then you should decrease the setting to accommodate the other mongod instances.
If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM
available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less
than the amount of RAM available in the container. The exact amount depends on the other processes running in the
container.
To view statistics on the cache and eviction rate, see the wiredTiger.cache field returned from the
serverStatus command.
The two most important factors in maintaining a successful sharded cluster are:
choosing an appropriate shard key (page 739) and
sufficient capacity to support current and future operations (page 736).
You can prevent most issues encountered with sharding by choosing the best possible shard key for
your deployment and by adding capacity to your cluster well before the current
resources become saturated. Continue reading for specific issues you may encounter in a production environment.
In a new sharded cluster, why does all data remain on one shard?
Your cluster must have sufficient data for sharding to make sense. Sharding works by migrating chunks between the
shards until each shard has roughly the same number of chunks.
The default chunk size is 64 megabytes. MongoDB will not begin migrations until the imbalance of chunks in the
cluster exceeds the migration threshold (page 751). While the default chunk size is configurable with the chunkSize
setting, these behaviors help prevent unnecessary chunk migrations, which can degrade the performance of your cluster
as a whole.
If you have just deployed a sharded cluster, make sure that you have enough data to make sharding effective. If you do
not have sufficient data to create more than eight 64 megabyte chunks, then all data will remain on one shard. Either
lower the chunk size (page 754) setting, or add more data to the cluster.
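The rule of thumb above (more than eight chunks' worth of data) can be sketched as a quick estimate. This is a simplification: the helper names are hypothetical, and actual chunk counts depend on how MongoDB splits your data, so treat the result as a rough indication only.

```javascript
// Hypothetical helper: approximate number of full-size chunks for a data size.
function estimatedChunks(dataMB, chunkSizeMB) {
  return Math.ceil(dataMB / chunkSizeMB);
}

// Hypothetical helper: true when the data could fill more than eight chunks,
// the point the text above gives for data to start spreading across shards.
function enoughDataToSpread(dataMB, chunkSizeMB) {
  return estimatedChunks(dataMB, chunkSizeMB) > 8;
}

// With the default 64 MB chunk size, 512 MB yields 8 chunks (not enough);
// lowering the chunk size to 32 MB yields 16 chunks for the same data.
```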
As a related problem, the system will split chunks only on inserts or updates, which means that if you configure
sharding and do not continue to issue insert and update operations, the database will not create any chunks. You can
either wait until your application inserts data or split chunks manually (page 800).
Finally, if your shard key has a low cardinality (page 764), MongoDB may not be able to create sufficient splits among
the data.
Why would one shard receive a disproportionate amount of traffic in a sharded cluster?
In some situations, a single shard or a subset of the cluster will receive a disproportionate portion of the traffic and
workload. In almost all cases this is the result of a shard key that does not effectively allow write scaling (page 741).
It's also possible that you have hot chunks. In this case, you may be able to solve the problem by splitting and then
migrating parts of these chunks.
In the worst case, you may have to consider re-sharding your data and choosing a different shard key (page 763) to
correct this pattern.
If you have just deployed your sharded cluster, you may want to consider the troubleshooting suggestions for a new
cluster where data remains on a single shard (page 863).
If the cluster was initially balanced, but later developed an uneven distribution of data, consider the following possible
causes:
You have deleted or removed a significant amount of data from the cluster. If you have added additional data, it
may have a different distribution with regards to its shard key.
Your shard key has low cardinality (page 764) and MongoDB cannot split the chunks any further.
Your data set is growing faster than the balancer can distribute data around the cluster. This is uncommon and
typically is the result of:
a balancing window (page 793) that is too short, given the rate of data growth.
an uneven distribution of write operations (page 741) that requires more data migration. You may have to
choose a different shard key to resolve this issue.
poor network connectivity between shards, which may lead to chunk migrations that take too long to
complete. Investigate your network configuration and interconnections between shards.
If migrations impact your cluster or application's performance, consider the following options, depending on the nature
of the impact:
1. If migrations only interrupt your cluster sporadically, you can limit the balancing window (page 793) to prevent
balancing activity during peak hours. Ensure that there is enough time remaining to keep the data from becoming
out of balance again.
2. If the balancer is always migrating chunks to the detriment of overall cluster performance:
You may want to attempt decreasing the chunk size (page 805) to limit the size of the migration.
Your cluster may be over capacity, and you may want to attempt to add one or two shards (page 765) to
the cluster to distribute load.
It's also possible that your shard key causes your application to direct all writes to a single shard. This kind of activity
pattern can require the balancer to migrate most data soon after writing it. Consider redeploying your cluster with a
shard key that provides better write scaling (page 741).
Release Notes
Always install the latest, stable version of MongoDB. See MongoDB Version Numbers (page 1061) for more
information.
See the following release notes for an account of the changes in major versions. Release notes also include instructions
for upgrade.
(3.2-series)
On this page
Minor Releases (page 866)
WiredTiger as Default (page 881)
Replication Election Enhancements (page 881)
Sharded Cluster Enhancements (page 882)
readConcern (page 882)
Partial Indexes (page 882)
Document Validation (page 883)
Aggregation Framework Enhancements (page 883)
MongoDB Tools Enhancements (page 886)
Encrypted Storage Engine (page 886)
Text Search Enhancements (page 886)
New Storage Engines (page 887)
General Enhancements (page 888)
Changes Affecting Compatibility (page 891)
Upgrade Process (page 894)
Known Issues in 3.2.1 (page 903)
Known Issues in 3.2.0 (page 903)
Download (page 904)
Additional Resources (page 904)
Dec 8, 2015
MongoDB 3.2 is now available. Key features include WiredTiger as the default storage engine, replication election
enhancements, config servers as replica sets, readConcern, and document validations.
Ops Manager 2.0 is also available. See the Ops Manager documentation [1] and the Ops Manager release notes [2] for more
information.
Minor Releases
3.2 Changelog
On this page
3.2.4 Changelog (page 866)
3.2.3 Changelog (page 870)
3.2.1 Changelog (page 876)
3.2.4 Changelog
Security
SERVER-22237 [3] Built-in role that allows full control over data, but not security or topology
Sharding
SERVER-21758 [4] Test behavior when nearest config server has severe replication lag
SERVER-22184 [5] Operations that fail against a secondary in a sharded cluster may have their error message
swallowed
SERVER-22239 [6] wait for replication after duplicate key error from insert operations
SERVER-22297 [7] Add targeted jstests for csrs upgrade during common operations
SERVER-22299 [8] Add a jstest that runs moveChunk directly against a mongod that is not yet sharding aware,
providing an SCCC connection string for the config servers
SERVER-22524 [9] Only interrupt mapReduce on catalog manager swap if it is outputting to a sharded collection
SERVER-22543 [10] multi_write_target.js should not rely on the order of shard ids
SERVER-22547 [11] add support for config server ReplSetTest options to ShardingTest
SERVER-22553 [12] mongos_shard_failure_tolerance.js should not rely on order of shard ids
SERVER-22569 [13] Initialization of eooElement static local variable isn't thread safe with MSVC 2013
1 http://docs.opsmanager.mongodb.com/current/
2 http://docs.opsmanager.mongodb.com/current/release-notes/application/
3 https://jira.mongodb.org/browse/SERVER-22237
4 https://jira.mongodb.org/browse/SERVER-21758
5 https://jira.mongodb.org/browse/SERVER-22184
6 https://jira.mongodb.org/browse/SERVER-22239
7 https://jira.mongodb.org/browse/SERVER-22297
8 https://jira.mongodb.org/browse/SERVER-22299
9 https://jira.mongodb.org/browse/SERVER-22524
10 https://jira.mongodb.org/browse/SERVER-22543
11 https://jira.mongodb.org/browse/SERVER-22547
12 https://jira.mongodb.org/browse/SERVER-22553
13 https://jira.mongodb.org/browse/SERVER-22569
Replication
SERVER-21698 [29] Add error-checking for isMaster() return values in jstests/libs/election_timing_test.js
SERVER-21972 [30] improve naming of ReplicationCoordinator and TopologyCoordinator unittests
SERVER-22269 [31] ReadConcern: majority does not reflect journaled state on PRIMARY
SERVER-22276 [32] implement j flag in write concern apply to secondary as well as primary
SERVER-22277 [33] test j flag in write concern apply to secondary as well as primary
SERVER-22287 [34] Merging replica sets with replication protocol version 1 may result in two primaries
14 https://jira.mongodb.org/browse/SERVER-22584
15 https://jira.mongodb.org/browse/SERVER-22585
16 https://jira.mongodb.org/browse/SERVER-22590
17 https://jira.mongodb.org/browse/SERVER-22592
18 https://jira.mongodb.org/browse/SERVER-22627
19 https://jira.mongodb.org/browse/SERVER-22783
20 https://jira.mongodb.org/browse/SERVER-22789
21 https://jira.mongodb.org/browse/SERVER-22797
22 https://jira.mongodb.org/browse/SERVER-22822
23 https://jira.mongodb.org/browse/SERVER-22849
24 https://jira.mongodb.org/browse/SERVER-22859
25 https://jira.mongodb.org/browse/SERVER-22862
26 https://jira.mongodb.org/browse/SERVER-22863
27 https://jira.mongodb.org/browse/SERVER-22878
28 https://jira.mongodb.org/browse/SERVER-22880
29 https://jira.mongodb.org/browse/SERVER-21698
30 https://jira.mongodb.org/browse/SERVER-21972
31 https://jira.mongodb.org/browse/SERVER-22269
32 https://jira.mongodb.org/browse/SERVER-22276
33 https://jira.mongodb.org/browse/SERVER-22277
34 https://jira.mongodb.org/browse/SERVER-22287
Query
SERVER-22344 [45] certain cursor options can trigger an invariant failure in GetMoreCmd
SERVER-22425 [46] execStats in system.profile reports winning plan and rejected plans
SERVER-22532 [47] $type with invalid integer type code fails with unhelpful message and leaks memory
SERVER-22626 [48] fix $type unit tests on experimental decimal build
SERVER-22793 [49] Unbounded memory usage by long-running query using projection
JavaScript
SERVER-9131 [51] Ensure documents with code elements do not conflict with internal JS functions
SERVER-22587 [52] Upgrade to spidermonkey 38.6.1esr
Storage
SERVER-21419 The ephemeralForTest storage engine should support the fsync command
SERVER-21924 Add log message for inMemory and ephemeralForTest storage engine
SERVER-22534 Change ephemeral storage to update durable OpTime
WiredTiger
SERVER-22437 Coverity analysis defect 77704: Redundant test
SERVER-22438 Coverity analysis defect 77705: Dereference before null check
SERVER-22570 WiredTiger changes for MongoDB 3.2.4
SERVER-22691 Incorrect initialization order in WiredTigerKVEngine
SERVER-22898 High fragmentation on WiredTiger databases under write workloads
Operations
SERVER-22440 Shell incorrectly issues first query in legacy read mode
Internals
SERVER-14501 De-inline ReplSettings class
SERVER-21881 dbhash checking in FSM framework doesn't handle TTL deletes
SERVER-22101 Generate minidumps when the hang analyzer is triggered on Windows
SERVER-22231 Add additional test suites to run resmoke.py validation hook
SERVER-22292 Use more reliable mechanism in the mongo shell to wait for process to terminate on Windows
SERVER-22314 Fix the detection of Python processes in the hang analyzer script
3.2.3 Changelog
Sharding
SERVER-18671 SecondaryPreferred can end up using unversioned connections
SERVER-20030 ForwardingCatalogManager::shutdown races with _replaceCatalogManager
SERVER-20036 Add interruption points to operations that hold distributed locks for a long time
SERVER-20037 Transfer responsibility for the release of distributed locks to new catalog manager
SERVER-20290 Recipient shard for migration can continue on retrieving data even after donor shard aborts
SERVER-20418 Make sure mongod and mongos always start the distlock pinger when running in SCCC mode
SERVER-20422 setShardVersion configdb string mismatch during config rs upgrade
SERVER-20580 Failure in csrs_upgrade_during_migrate.js
SERVER-20694 user-initiated finds against the config servers can fail with need to swap catalog manager error
SERVER-21382 Sharding migration transfers all document deletions
SERVER-21789 mongos replica set monitor should choose primary based on (rs config version, electionId)
SERVER-21896 Chunk metadata will not get refreshed after shard is removed
SERVER-21906 Race in ShardRegistry::reload and config.shard update can cause shard not found error
SERVER-21956 applyOps does not correctly propagate operation cancellation exceptions
SERVER-21994 cleanup_orphaned_basic.js
SERVER-21995 Queries against sharded collections fail after upgrade to CSRS due to caching of config server string in setShardVersion
SERVER-22010 min_optime_recovery.js failure in the sharding continuous config stepdown suite
SERVER-22016 Fatal assertion 28723 trying to rollback applyOps on a CSRS config server
SERVER-22027 AsyncResultMerger should not retry killed operations
SERVER-22079 Make sharding_rs1.js more compact
SERVER-22112 Circular call dependency between CatalogManager and CatalogCache
SERVER-22113 Remove unused sharding-specific getLocsInRange code in dbhelpers
SERVER-22114 Mongos can accumulate multiple copies of ChunkManager when a shard restarts
SERVER-22169 Deadlock during CatalogManager swap from SCCC -> CSRS
Replication
SERVER-21583 ApplyOps background index creation may deadlock
SERVER-21678 fromMigrate flag never set for deletes in oplog
SERVER-21744 Clients may fail to discover new primaries when clock skew between nodes is greater than electionTimeout
SERVER-21958 Eliminate unused flags from Cloner methods
SERVER-21988 Rollback does not wait for applier to finish before starting
SERVER-22109 Invariant failure when running applyOps to create an index with a bad ns field
SERVER-22152 priority_takeover_two_nodes_equal_priority.js fails if default priority node gets elected at beginning of test
SERVER-22190 electionTime field not set in heartbeat response from primary under protocol version 1
SERVER-22335 Do not prepare getmore when un-needed in bgsync fetcher
SERVER-22362 election_timing.js waits for wrong node to become primary
SERVER-22420 priority_takeover_two_nodes_equal_priority.js fails if existing primary's step down period expires
SERVER-22456 The oplog find query timeout is too low
Query
SERVER-17011 Cursor can return objects out of order if updated during query (legacy readMode only)
SERVER-18115 The planner can add an unnecessary in-memory sort stage for .min()/.max() queries
SERVER-20083 Add log statement at default log level for when an index filter is set or cleared successfully
SERVER-21776 Move per-operation log lines for queries out of the QUERY log component
SERVER-21869 Avoid wrapping of spherical queries in geo_full.js
SERVER-22002 Do not retry findAndModify operations on MMAPv1
SERVER-22100 memory pressure from find/getMore buffer preallocation causes concurrency suite slowness on Windows DEBUG
SERVER-22448 Query planner does not filter 2dsphere Index Version 3 correctly
Write Operations
SERVER-11983 Update on document without _id, in capped collection without _id index, creates an _id field
SERVER-21647 $rename changes field ordering
Aggregation
SERVER-21887 $sample takes disproportionately long time on newly created collection
SERVER-22048 Index access stats should be recorded for $match & mapReduce
Storage
SERVER-21388 Invariant Failure in CappedRecordStoreV1::cappedTruncateAfter
SERVER-22011 Direct writes to the local database can cause deadlock involving the WiredTiger write throttle
SERVER-22058 not all control paths return a value warning in non-MMAP V1 implementations of ::writingPtr
SERVER-22167 Failed to insert document larger than 256k
SERVER-22199 Collection drop command during checkpoint causes complete stall until end of checkpoint
WiredTiger
SERVER-21833 Compact does not release space to the system with WiredTiger
SERVER-21944 WiredTiger changes for 3.2.3
SERVER-22064 Coverity analysis defect 77699: Unchecked return value
SERVER-22279 SubplanStage fails to register its MultiPlanStage
MMAP
SERVER-21997 kill_cursors.js deadlocks
SERVER-22261 MMAPv1 LSNFile may be updated ahead of what is synced to data files
Operations
SERVER-20358 Usernames can contain NULL characters
SERVER-22007 List all commands crashes server
SERVER-22075 election_timing.js election timed out
Internals
SERVER-12108 setup_multiversion_mongodb.py script should support downloading windows binaries
SERVER-20409 Negative scaling with more than 10K connections
SERVER-21035 Delete the disabled fsm_all_sharded.js test runner
SERVER-21050 Add a failover workload to cause CSRS config server primary failovers
3.2.1 Changelog
Security
SERVER-21724 Backup role can't read system.profile
SERVER-21824 Disable kmip.js test in ESE suite; re-enable once fixed
SERVER-21890 Create a flag to allow server realm to be specified explicitly on Windows
Sharding
SERVER-20824 Test for sharding state recovery
SERVER-21076 Write tests to ensure that operations using DBDirectClient handle shard versioning properly
SERVER-21132 Add more basic tests for moveChunk
SERVER-21133 Add more basic test for mergeChunk
SERVER-21134 Add more basic tests for shardCollection
SERVER-21135 Add more basic tests for sharded implicit database creation
SERVER-21136 Add more basic tests for enableSharding
SERVER-21137 Add more basic tests for movePrimary
SERVER-21138 Add more basic tests for dropDatabase
SERVER-21139 Add more basic tests for drop collection
SERVER-21366 Long-running transactions in MigrateStatus::apply
SERVER-21586 Investigate v3.0 mongos and v3.2 cluster compatibility issues in jstests/sharding
SERVER-21704 JS Test single_node_config_server_smoke has race condition
SERVER-21706 Certain parameters to mapReduce trigger segmentation fault in a sharded cluster
SERVER-21786 Fix code coverage gaps in s/query directory exposed by code coverage tool
SERVER-21848 bulk write operations on config/admin triggers invariant failure
Replication
SERVER-21248 jstests for fast-failover correctness
SERVER-21667 do not set lastop on clients used by replication on secondaries
SERVER-21795 Do not reschedule more than one liveness timeout callback at a time
SERVER-21847 log range of operations read from sync source during replication
SERVER-21868 Shutdown may not be handled correctly on secondary nodes
SERVER-21930 Restart oplog query if oplog entries are not monotonically increasing
Query
SERVER-21600 Increase test coverage for killCursors command and OP_KILLCURSORS
SERVER-21602 Reduce execution time of cursor_timeout.js
SERVER-21637 Add mixed version tests for find/getMore commands
SERVER-21638 Audit and improve logging in new find/getMore commands code
SERVER-21750 getMore command does not set nreturned operation counter
Storage
SERVER-21384 Expand testing for in memory storage engines
SERVER-21545 collMod and invalid parameter triggers fassert on dropCollection on mmapv1
SERVER-21885 capped_truncate.js cannot be run with repeat
SERVER-21920 Use enhanced WiredTiger next_random cursors for oplog stones
WiredTiger
SERVER-21792 75% performance regression in insert workload under Windows between 3.0.7 and 3.2 with WiredTiger
SERVER-21872 WiredTiger changes for 3.2.1
Tools
TOOLS-954 Add bypassDocumentValidation option to mongorestore and mongoimport
TOOLS-982 Missing from text in mongorestore status message
Internals
SERVER-21164 Change assert to throw in rslib.js's wait loop
SERVER-21214 Dump config server data when the sharded concurrency suites fail
SERVER-21426 Add writeConcern support to benchRun
SERVER-21450 Modify MongoRunner to add enableMajorityReadConcern flag based on jsTestOptions
SERVER-21500 Include the name of the FSM workload in the WorkloadFailure description
SERVER-21516 Remove dbStats command from readConcern testing override
SERVER-21665 Suppress tar output in jstestfuzz tasks
SERVER-21714 Increase replSetTest.initiate() timeout for FSM tests
SERVER-21719 Add initiateTimeout rsOption for ShardingTest
SERVER-21725 Enable the analysis script move
SERVER-21737 remove deprecated release process configuration from master branch evergreen configuration
SERVER-21752 slow2_wt fails by exhausting host machine's memory
SERVER-21768 Remove the numCollections field from dbHash's response
SERVER-21772 findAndModify not captured by Profiler
SERVER-21793 create v3.2 branch and update evergreen configuration
SERVER-21849 Fix timestamp compare in min_optime_recovery.js
SERVER-21852 kill_cursors.js fails in small_oplog* configurations
SERVER-21871 Do not run min_optime_recovery.js on ephemeralForTest storageEngine
SERVER-21901 CheckReplDBHash checks the wrong node when dumping docs from missing collections
Fixed issue with setting optime when running with journaling disabled: SERVER-22495, SERVER-22728
Have read concern majority reflect journaled state on the primary: SERVER-22269
Fixed issue where specifying replication.enableMajorityReadConcern implied true regardless
of the actual boolean value: SERVER-22683
Fixed issue causing segfault when running aggregation that includes $lookup: SERVER-22537
All issues closed in 3.2.4
Fixed issue with MMAPv1 journaling where the last sequence number file (lsn file) may be ahead of what
is synced to the data files: SERVER-22261.
Fixed issue where, in some cases, insert operations fail to add the _id field to large documents: SERVER-22167.
Increased timeout for querying oplog to 1 minute: SERVER-22456.
All issues closed in 3.2.3
Replaced by MongoDB 3.2.3. Users wishing to run MongoDB 3.2 should skip 3.2.2 and upgrade directly to 3.2.3.
Fixed error where, during a regular shutdown of a replica set, secondaries may mark certain replicated but yet to
be applied operations as successfully applied: SERVER-21868.
Improve insert workload performance with WiredTiger on Windows: SERVER-20262.
WiredTiger as Default
Starting in 3.2, MongoDB uses WiredTiger as the default storage engine.
To specify the MMAPv1 storage engine, you must specify the storage engine setting either:
On the command line with the --storageEngine option:
mongod --storageEngine mmapv1
Or in a configuration file, using the storage.engine setting:
storage:
   engine: mmapv1
Note: For existing deployments, if you do not specify the --storageEngine or the storage.engine setting,
MongoDB 3.2 can automatically determine the storage engine used to create the data files in the --dbpath or
storage.dbPath.
If specifying --storageEngine or storage.engine, mongod will not start if dbPath contains data files
created by a storage engine other than the one specified.
See also:
Default Storage Engine Change (page 891)
Starting in MongoDB 3.2, the WiredTiger cache, by default, will use the larger of either:
60% of RAM minus 1 GB, or
1 GB.
For more information, see WiredTiger and Memory Use (page 589).
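The sizing rule above can be worked through as a short calculation. The following is a minimal sketch, not MongoDB source code. It assumes that "60% of RAM minus 1 GB" means (0.6 x RAM) - 1 GB, and the function name defaultWiredTigerCacheGB is hypothetical.

```javascript
// Sketch of the default cache sizing rule described in the text.
// Assumption: "60% of RAM minus 1 GB" is read as (0.6 * RAM) - 1 GB.
function defaultWiredTigerCacheGB(ramGB) {
  const sixtyPercentMinusOne = 0.6 * ramGB - 1;
  // The rule takes the larger of the computed value and 1 GB.
  return Math.max(sixtyPercentMinusOne, 1);
}

// Print the resulting cache size for a few illustrative host sizes.
for (const ramGB of [2, 4, 16, 64]) {
  const cacheGB = defaultWiredTigerCacheGB(ramGB);
  console.log(ramGB + " GB RAM -> " + cacheGB.toFixed(1) + " GB cache");
}
```

Under this reading, hosts with less than roughly 3.3 GB of RAM fall back to the 1 GB floor.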
MongoDB 3.2 configures WiredTiger to write to the journal files every 50 milliseconds. This is in addition to the
existing journal write intervals and conditions. For more information, see Journaling Process (page 599).
Starting in MongoDB 3.2, MongoDB reduces replica set failover time and accelerates the detection of multiple
simultaneous primaries.
As part of this enhancement, MongoDB introduces version 1 of the replication protocol. New replica sets will, by
default, use protocolVersion: 1 (page 711). Previous versions of MongoDB use version 0 of the protocol.
In addition, MongoDB introduces a new replica set configuration (page 709) option electionTimeoutMillis
(page 715). electionTimeoutMillis (page 715) specifies the time limit in milliseconds for detecting when a
replica set's primary is unreachable.
electionTimeoutMillis (page 715) only applies when using version 1 of the replication protocol
(page 711).
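As an illustration of how the two settings relate, the sketch below builds a replica set configuration document that selects protocol version 1 and sets an election timeout. The set name, host names, and the 5000 ms value are hypothetical. The helper electionTimeoutApplies is not part of MongoDB and only restates the rule above.

```javascript
// Illustrative replica set configuration document (names and values are made up).
const replSetConfig = {
  _id: "rs0",
  protocolVersion: 1,
  members: [
    { _id: 0, host: "db0.example.net:27017" },
    { _id: 1, host: "db1.example.net:27017" },
    { _id: 2, host: "db2.example.net:27017" }
  ],
  settings: {
    electionTimeoutMillis: 5000 // time limit for detecting an unreachable primary
  }
};

// Hypothetical helper: the timeout only has an effect under protocol version 1.
function electionTimeoutApplies(config) {
  return config.protocolVersion === 1 &&
    config.settings !== undefined &&
    typeof config.settings.electionTimeoutMillis === "number";
}

console.log(electionTimeoutApplies(replSetConfig));
// In the mongo shell, a document like this could be passed to rs.initiate(replSetConfig).
```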
MongoDB 3.2 deprecates the use of three mirrored mongod instances for config servers.
Instead, starting in 3.2, the config servers (page 734) for a sharded cluster can be deployed as a replica set. The replica
set config servers must run the WiredTiger storage engine.
This change improves consistency across the config servers, since MongoDB can take advantage of the standard replica
set read and write protocols for sharding config data. In addition, this allows a sharded cluster to have more than 3
config servers since a replica set can have up to 50 members.
For more information, see Config Servers (page 734). To deploy a new sharded cluster with replica set config servers,
see Deploy a Sharded Cluster (page 757).
readConcern
MongoDB 3.2 introduces the readConcern query option for replica sets and replica set shards. For the WiredTiger
storage engine (page 587), the readConcern option allows clients to choose a level of isolation for their reads. You
can specify a readConcern of "majority" to read data that has been written to a majority of nodes and thus
cannot be rolled back. By default, MongoDB uses a readConcern of "local" to return the most recent data
available to the node at the time of the query, even if the data has not been persisted to a majority of nodes and may be
rolled back. With the MMAPv1 storage engine (page 595), you can only specify a readConcern of "local".
readConcern requires MongoDB drivers updated for MongoDB 3.2.
Only replica sets using protocol version 1 (page 711) support "majority" (page 144) read concern.
Replica sets running protocol version 0 do not support "majority" (page 144) read concern.
For details on readConcern, including operations that support the option, see Read Concern (page 143).
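For example, a find command that requests a given read concern level could be built as follows. This is a sketch only. The collection name orders, the filter, and the helper findWithReadConcern are hypothetical, and the helper accepts only the two levels named above.

```javascript
// Builds a find command document carrying a readConcern level.
function findWithReadConcern(collection, filter, level) {
  const allowedLevels = ["local", "majority"];
  if (!allowedLevels.includes(level)) {
    throw new Error("unsupported readConcern level in this sketch: " + level);
  }
  return {
    find: collection,
    filter: filter,
    readConcern: { level: level }
  };
}

// Read only data acknowledged by a majority of nodes.
const majorityRead = findWithReadConcern("orders", { status: "shipped" }, "majority");
console.log(JSON.stringify(majorityRead));
// In the mongo shell, the document could be sent with db.runCommand(majorityRead).
```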
Partial Indexes
MongoDB 3.2 provides the option to create indexes that only index the documents in a collection that meet a
specified filter expression. By indexing a subset of the documents in a collection, partial indexes have lower
storage requirements and reduced performance costs for index creation and maintenance. You can specify a
partialFilterExpression option for all MongoDB index types (page 492).
The partialFilterExpression option accepts a document that specifies the condition using:
equality expressions (i.e. field: value or using the $eq operator),
$exists: true expression,
$gt, $gte, $lt, $lte expressions,
$type expressions,
$and operator at the top-level only
For details, see Partial Indexes (page 515).
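To make the effect of a filter expression concrete, the sketch below defines a partial index specification and a small local predicate that reports which documents the filter would select. The collection and field names are hypothetical. The predicate is a simplified model covering only equality, $exists: true, and the four range operators on top-level fields, and it does not reproduce MongoDB's full comparison semantics.

```javascript
// Index keys and options for a hypothetical restaurants collection.
const indexKeys = { cuisine: 1, name: 1 };
const indexOptions = { partialFilterExpression: { rating: { $gt: 5 } } };
// In the mongo shell: db.restaurants.createIndex(indexKeys, indexOptions)

// Simplified local model of filter matching (not MongoDB's matcher).
function matchesPartialFilter(doc, filter) {
  return Object.entries(filter).every(([field, cond]) => {
    const value = doc[field];
    if (cond === null || typeof cond !== "object") {
      return value === cond; // plain equality: { field: value }
    }
    return Object.entries(cond).every(([op, arg]) => {
      switch (op) {
        case "$eq": return value === arg;
        case "$exists": return arg === true && value !== undefined;
        case "$gt": return value > arg;
        case "$gte": return value >= arg;
        case "$lt": return value < arg;
        case "$lte": return value <= arg;
        default: throw new Error("operator not handled in this sketch: " + op);
      }
    });
  });
}

const sampleDocs = [
  { name: "A", cuisine: "thai", rating: 8 },
  { name: "B", cuisine: "thai", rating: 3 },
  { name: "C", cuisine: "thai" }
];
const indexed = sampleDocs.filter(d => matchesPartialFilter(d, indexOptions.partialFilterExpression));
console.log(indexed.map(d => d.name));
```

Only documents that satisfy the filter receive index entries, which is where the storage and maintenance savings come from.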
Document Validation
Starting in 3.2, MongoDB provides the capability to validate documents during updates and insertions. Validation
rules are specified on a per-collection basis.
To specify document validation on a new collection, use the new validator option in the
db.createCollection() method. To add document validation to an existing collection, use the new
validator option in the collMod command. For more information, see Document Validation (page 160).
To view the validation specifications for a collection, use the db.getCollectionInfos() method.
The following commands can bypass validation per operation using the new option
bypassDocumentValidation:
applyOps command
findAndModify command and db.collection.findAndModify() method
mapReduce command and db.collection.mapReduce() method
insert command
update command
$out for the aggregate command and db.collection.aggregate() method
For deployments that have enabled access control, you must have bypassDocumentValidation (page 429)
action. The built-in roles dbAdmin (page 416) and restore (page 420) provide this action.
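The command documents below sketch how these options fit together: a validator on a new collection, the same validator added to an existing collection, and an insert that bypasses validation. The collection name contacts, the validator rules, and the sample document are hypothetical.

```javascript
// Validation rule: a document needs a string phone or an email field.
const validator = {
  $or: [
    { phone: { $type: "string" } },
    { email: { $exists: true } }
  ]
};

// Create a new collection with the validator.
const createCmd = { create: "contacts", validator: validator };

// Add the validator to a collection that already exists.
const collModCmd = { collMod: "contacts", validator: validator };

// Insert a document that would fail validation, bypassing the check.
const insertCmd = {
  insert: "contacts",
  documents: [{ name: "legacy record" }],
  bypassDocumentValidation: true
};

for (const cmd of [createCmd, collModCmd, insertCmd]) {
  console.log(JSON.stringify(cmd));
}
// In the mongo shell, each document could be sent with db.runCommand(...).
```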
MongoDB introduces:
New stages, accumulators, and expressions.
Availability of accumulator expressions (page 885) in $project stage.
Performance improvements (page 885) on sharded clusters.
Starting in version 3.2, the following accumulator expressions, previously only available in the $group stage, are
now also available in the $project stage:
$avg
$min
$max
$sum
$stdDevPop
$stdDevSamp
When used as part of the $project stage, these accumulator expressions can accept either:
A single argument: <accumulator> : <arg>
Multiple arguments: <accumulator> : [ <arg1>, <arg2>, ... ]
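A pipeline using both argument forms might look like the sketch below. The field names (quizzes, labs, midterm, final) and the students collection are hypothetical.

```javascript
// $project stage using accumulator expressions in both forms.
const pipeline = [
  {
    $project: {
      quizAvg: { $avg: "$quizzes" },               // single argument (an array field)
      labMax: { $max: "$labs" },                   // single argument (an array field)
      examTotal: { $sum: ["$midterm", "$final"] }  // multiple arguments
    }
  }
];

console.log(JSON.stringify(pipeline, null, 2));
// In the mongo shell: db.students.aggregate(pipeline)
```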
General Enhancements
In MongoDB 3.2, the $project stage supports using square brackets [] to directly create new array fields.
For an example, see example-project-new-array-fields.
MongoDB 3.2 introduces the minDistance option for the $geoNear stage.
The $unwind stage no longer errors on a non-array operand. If the operand does not resolve to an array but is not
missing, null, or an empty array, $unwind treats the operand as a single-element array.
$unwind stage can:
include the array index of the array element in the output by specifying a new option
includeArrayIndex in the stage specification.
output those documents where the array field is missing, null or an empty array by specifying a new option
preserveNullAndEmptyArrays in the stage specification.
To support these new features, $unwind can now take an alternate syntax. See $unwind for details.
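The behavior described above can be modeled locally. The sketch below is a simplified simulation, not MongoDB's implementation. It handles only a top-level field, the function name unwind and the sample documents are hypothetical, and it assumes the index field is reported as null whenever the value has no array position.

```javascript
// Simplified model of $unwind with the two new options.
function unwind(docs, { path, includeArrayIndex, preserveNullAndEmptyArrays = false }) {
  const field = path.replace(/^\$/, "");
  const out = [];
  for (const doc of docs) {
    const value = doc[field];
    const nothingToUnwind = value === undefined || value === null ||
      (Array.isArray(value) && value.length === 0);
    if (nothingToUnwind) {
      if (preserveNullAndEmptyArrays) {
        const copy = { ...doc };
        if (Array.isArray(value)) delete copy[field]; // empty array: field omitted
        if (includeArrayIndex) copy[includeArrayIndex] = null; // assumed null
        out.push(copy);
      }
      continue;
    }
    if (!Array.isArray(value)) {
      // Non-array operand: treated as a single-element array.
      const copy = { ...doc };
      if (includeArrayIndex) copy[includeArrayIndex] = null; // assumed null
      out.push(copy);
      continue;
    }
    value.forEach((element, index) => {
      const copy = { ...doc, [field]: element };
      if (includeArrayIndex) copy[includeArrayIndex] = index;
      out.push(copy);
    });
  }
  return out;
}

const inventory = [
  { _id: 1, sizes: ["S", "M"] },
  { _id: 2, sizes: [] },
  { _id: 3, sizes: "M" },
  { _id: 4 },
  { _id: 5, sizes: null }
];

// Equivalent stage in the alternate syntax:
// { $unwind: { path: "$sizes", includeArrayIndex: "idx", preserveNullAndEmptyArrays: true } }
const defaultResult = unwind(inventory, { path: "$sizes" });
const preserved = unwind(inventory, { path: "$sizes", preserveNullAndEmptyArrays: true });
const withIndex = unwind(inventory, { path: "$sizes", includeArrayIndex: "idx" });
console.log(defaultResult.length, preserved.length, withIndex.length);
```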
Optimization
Compatibility
mongodump and mongorestore add support for archive files and standard output/input streams with a new
--archive option. This enhancement allows for the streaming of the dump data over a network device via a
pipe. For examples, see
mongodump to an Archive File and mongodump an Archive to Standard Output
mongorestore-example-archive-file and mongorestore-example-archive-stdin.
mongodump and mongorestore add support for compressed data dumps with a new --gzip option. This
enhancement reduces storage space for the dump files. For examples, see:
Compress mongodump Output
mongorestore-example-gzip.
Enterprise Feature
Available in MongoDB Enterprise only.
Encryption at rest, when used in conjunction with transport encryption and good security policies that protect relevant
accounts, passwords, and encryption keys, can help ensure compliance with security and privacy standards, including
HIPAA, PCI-DSS, and FERPA.
MongoDB Enterprise 3.2 introduces a native encryption option for the WiredTiger storage engine. This feature allows
MongoDB to encrypt data files such that only parties with the decryption key can decode and read the data. For details,
see Encrypted Storage Engine (page 338).
MongoDB 3.2 introduces a version 3 of the text index (page 508). Key features of the new version of the index are:
Improved case insensitivity (page 509).
Diacritic insensitivity (page 510).
Additional delimiters for tokenization (page 510).
Starting in MongoDB 3.2, version 3 is the default version for new text (page 508) indexes.
See also:
Text Index Version 3 Compatibility (page 892)
Enterprise Feature
Available in MongoDB Enterprise only.
Starting in 3.2, MongoDB Enterprise provides support for the following languages: Arabic, Farsi (specifically Dari
and Iranian Persian dialects), Urdu, Simplified Chinese, and Traditional Chinese.
For details, see Text Search with Basis Technology Rosette Linguistics Platform (page 567).
Enterprise Feature
Available in MongoDB Enterprise only.
MongoDB Enterprise 3.2 provides an in-memory storage engine. Other than some metadata, the in-memory storage
engine does not maintain any on-disk data. By avoiding disk I/O, the in-memory storage engine allows for more
predictable latency of database operations.
MongoDB 3.2 provides a new for-test storage engine. Other than some metadata, the for-test storage engine does not
maintain any on-disk data, removing the need to clean up between test runs. The for-test storage engine is unsupported.
General Enhancements
MongoDB 3.2 uses SpiderMonkey as the JavaScript engine for the mongo shell and mongod server. SpiderMonkey
provides support for additional platforms and has an improved memory management model.
This change affects all JavaScript behavior including the commands mapReduce, group, and the query operator
$where; however, this change should be completely transparent to the user.
See also:
SpiderMonkey Compatibility Changes (page 892)
To provide consistency with the MongoDB drivers' CRUD (Create/Read/Update/Delete) API, the mongo shell
introduces additional CRUD methods.
Starting in MongoDB 3.2, the WiredTiger storage engine supports the fsync command with the lock option or the
mongo shell method db.fsyncLock(). That is, for the WiredTiger storage engine, these operations can guarantee
that the data files do not change, ensuring consistency for the purposes of creating backups.
Platform Support
Starting in 3.2, 32-bit binaries are deprecated and will be unavailable in future releases.
MongoDB 3.2 deprecates support for Red Hat Enterprise Linux 5.
The $type operator accepts string aliases for the BSON types in addition to the numbers corresponding to the BSON
types.
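As a rough illustration of the alias form, the following sketch builds the same filter document once with a numeric type code and once with a string alias. The BSON_TYPE_ALIASES map and the typeFilter helper are hypothetical names used only for this example, the zipCode field is made up, and the map lists a handful of alias/number pairs rather than the full set.

```javascript
// Hypothetical lookup table for illustration; only a few alias/number pairs are listed.
const BSON_TYPE_ALIASES = {
  double: 1,
  string: 2,
  object: 3,
  array: 4,
  bool: 8,
  date: 9,
  null: 10
};

// Build a filter document that matches documents whose field has the given BSON type.
// The type argument may be a number or a string alias.
function typeFilter(field, type) {
  return { [field]: { $type: type } };
}

const byNumber = typeFilter("zipCode", 2);        // { zipCode: { $type: 2 } }
const byAlias = typeFilter("zipCode", "string");  // { zipCode: { $type: "string" } }
```

In the mongo shell, either filter document could be passed to db.collection.find(); the alias form tends to be easier to read.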
For explain operations run in executionStats or allPlansExecution mode, the explain output
contains the keysExamined statistic, representing the number of index keys examined during index scans.
Prior to 3.2, the keysExamined count in some queries did not include the last scanned key. As of 3.2, this error has been
corrected. For more information, see explain.executionStats.executionStages.inputStage.keysExamined.
The diagnostic logs and the system profiler report on this statistic.
Geospatial Optimization
MongoDB 3.2 introduces version 3 of 2dsphere indexes (page 503), which index GeoJSON geometries (page 580) at
a finer gradation. The new version improves performance of 2dsphere index (page 503) queries over smaller regions.
In addition, for both 2d indexes (page 505) and 2dsphere indexes (page 503), the performance of geoNear queries has
been improved for dense datasets.
See also:
2dsphere Index Version 3 Compatibility (page 892)
To facilitate analysis of the MongoDB server behavior by MongoDB engineers, MongoDB 3.2 introduces
a diagnostic data collection mechanism for logging server statistics to diagnostic files at periodic intervals.
By default, the mechanism captures data at 1 second intervals. To modify the interval, see
diagnosticDataCollectionPeriodMillis.
MongoDB creates a diagnostic.data directory under the mongod instance's --dbpath or
storage.dbPath. The diagnostic data is stored in files under this directory.
The maximum size of the diagnostic files is configurable with diagnosticDataCollectionFileSizeMB,
and the maximum size of the diagnostic.data directory is configurable with
diagnosticDataCollectionDirectorySizeMB.
The default values for the capture interval and the maximum sizes are chosen to provide useful data to MongoDB
engineers with minimal impact on performance and storage size. Typically, these values will only need modifications
as requested by MongoDB engineers for specific diagnostic purposes.
Write Concern
For replica sets using protocolVersion: 1 (page 711), secondaries acknowledge write operations after
the secondary members have written to their respective on-disk journals (page 598), regardless of the j
(page 142) option.
For replica sets using protocolVersion: 1 (page 711), w: "majority" (page 142) implies j: true
(page 142).
With j: true (page 143), MongoDB returns only after the requested number of members, including the
primary, have written to the journal. Previously, the j: true (page 143) write concern in a replica set only
required the primary to write to the journal, regardless of the w: <value> (page 141) write concern.
MongoDB 3.2 adds support for specifying the journal commit interval for the WiredTiger storage engine. See the
journalCommitInterval option. In previous versions, the option applied to the MMAPv1 storage engine
only.
For the corresponding configuration file setting, MongoDB 3.2 adds the
storage.journal.commitIntervalMs setting and deprecates storage.mmapv1.journal.commitIntervalMs.
The deprecated storage.mmapv1.journal.commitIntervalMs setting acts as an alias to the new
storage.journal.commitIntervalMs setting.
Some MongoDB 3.2 changes can affect compatibility and may require user actions. For a detailed list of compatibility
changes, see Compatibility Changes in MongoDB 3.2 (page 891).
On this page
Default Storage Engine Change (page 891)
Index Changes (page 891)
Aggregation Compatibility Changes (page 892)
SpiderMonkey Compatibility Changes (page 892)
Driver Compatibility Changes (page 893)
General Compatibility Changes (page 893)
Additional Information (page 893)
The following 3.2 changes can affect the compatibility with older versions of MongoDB. See also Release Notes for
MongoDB 3.2 (page 865) for the list of the 3.2 changes.
Default Storage Engine Change Starting in 3.2, MongoDB uses WiredTiger as the default storage engine.
Previous versions used MMAPv1 as the default storage engine.
For existing deployments, if you do not specify the --storageEngine option or the storage.engine setting,
MongoDB automatically determines the storage engine used to create the data files in the --dbpath or
storage.dbPath.
For new deployments, to use MMAPv1, you must explicitly specify the storage engine setting either:
On the command line with the --storageEngine option:
mongod --storageEngine mmapv1
Index Changes
Version 0 Indexes MongoDB 3.2 disallows the creation of version 0 indexes (i.e. {v: 0}). If version 0 indexes
exist, MongoDB 3.2 outputs a warning log message, specifying the collection and the index.
Starting in MongoDB 2.0, MongoDB started automatically upgrading v: 0 indexes during initial sync (page 648),
mongorestore or reIndex operations.
If a version 0 index exists, you can use any of the aforementioned operations as well as drop and recreate the index to
upgrade to the v: 1 version.
For example, if upon startup, a warning message indicated that an index index { v: 0, key: { x: 1.0
}, name: "x_1", ns: "test.legacyOrders" } is a version 0 index, to upgrade to the appropriate
version, you can drop and recreate the index:
1. Drop the index either by name:
use test
db.legacyOrders.dropIndex( "x_1" )
or by key:
use test
db.legacyOrders.dropIndex( { x: 1 } )
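To locate version 0 indexes without reading the startup log, the index specifications of a collection can be filtered on their v field. The following is a minimal sketch in plain JavaScript; findVersionZeroIndexes is a hypothetical helper name, and the sample array only imitates the shape of the specifications that db.collection.getIndexes() returns.

```javascript
// Return the specifications of all version 0 indexes from a getIndexes()-style array.
function findVersionZeroIndexes(indexSpecs) {
  return indexSpecs.filter((spec) => spec.v === 0);
}

// Sample input shaped like the index in the warning message above (illustrative data).
const sampleSpecs = [
  { v: 1, key: { _id: 1 }, name: "_id_", ns: "test.legacyOrders" },
  { v: 0, key: { x: 1 }, name: "x_1", ns: "test.legacyOrders" }
];

const legacyIndexes = findVersionZeroIndexes(sampleSpecs);
// legacyIndexes holds only the "x_1" specification
```

In the mongo shell, the helper could be given db.legacyOrders.getIndexes() instead of the sample array. Each returned specification carries the name to pass to dropIndex() and the key pattern needed to recreate the index; any other index options would have to be carried over by hand.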
Text Index Version 3 Compatibility Text index (version 3) (page 886) is incompatible with earlier versions of
MongoDB. Earlier versions of MongoDB will not start if a text index (version 3) (page 508) exists in the database.
2dsphere Index Version 3 Compatibility 2dsphere index (version 3) (page 890) is incompatible with earlier
versions of MongoDB. Earlier versions of MongoDB will not start if a 2dsphere index (version 3) exists in the
database.
JavaScript Changes in MongoDB 3.2
On this page
Modernized JavaScript Implementation (ES6) (page 893)
Changes to the mongo Shell (page 893)
Removed Non-Standard V8 Features (page 893)
In MongoDB 3.2, the JavaScript engine used for both the mongo shell and for server-side JavaScript in mongod
changed from V8 to SpiderMonkey [285].
To confirm which JavaScript engine you are using, you can use either the interpreterVersion() method in the
mongo shell or the javascriptEngine field in the output of db.serverBuildInfo().
In MongoDB 3.2, these appear as MozJS-38 and mozjs, respectively.
Modernized JavaScript Implementation (ES6) SpiderMonkey brings with it increased support for features defined
in the 6th edition of ECMAScript [286], abbreviated as ES6. ES6 adds many new language features, including:
arrow functions [287],
destructuring assignment [288],
for-of loops [289], and
generators [290].
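The four features listed above can be seen together in a short script. This is a sketch written against a standard ES6 engine; which parts of it the MozJS-38 build accepts has not been verified here, and the names square, countTo, and squares are made up for the example.

```javascript
// Arrow function: a concise function expression.
const square = (n) => n * n;

// Destructuring assignment: unpack array elements and object fields.
const [first, second] = [10, 20];
const { host, port } = { host: "localhost", port: 27017 };

// Generator: produces values lazily, one per yield.
function* countTo(limit) {
  for (let i = 1; i <= limit; i++) {
    yield i;
  }
}

// for-of loop: iterate over any iterable, including a generator.
const squares = [];
for (const n of countTo(3)) {
  squares.push(square(n));
}
// squares is now [1, 4, 9]
```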
Changes to the mongo Shell MongoDB 3.2 will return JavaScript and BSON undefined values intact if saved
into a collection. Previously, the mongo shell would convert undefined values into null.
MongoDB 3.2 also adds the disableJavaScriptJIT parameter to mongod, which allows you to disable the
JavaScript engine's JIT acceleration. The mongo shell has a corresponding --disableJavaScriptJIT flag.
Driver Compatibility Changes A driver upgrade is necessary to support the find and getMore commands.
Additional Information See also Release Notes for MongoDB 3.2 (page 865).
285 https://developer.mozilla.org/en-US/docs/SpiderMonkey
286 http://www.ecma-international.org/ecma-262/6.0/index.html
287 http://www.ecma-international.org/ecma-262/6.0/index.html#sec-arrow-function-definitions
288 http://www.ecma-international.org/ecma-262/6.0/index.html#sec-destructuring-assignment
289 http://www.ecma-international.org/ecma-262/6.0/index.html#sec-for-in-and-for-of-statements
290 http://www.ecma-international.org/ecma-262/6.0/index.html#sec-generator-function-definitions
Upgrade Process
On this page
Upgrade Recommendations and Checklists (page 894)
Upgrade Standalone mongod Instance to MongoDB 3.2 (page 894)
Upgrade a Replica Set to 3.2 (page 895)
Upgrade a Sharded Cluster to 3.2 (page 895)
Additional Resources (page 897)
Before you attempt any upgrade, please familiarize yourself with the content of this document.
If you need guidance on upgrading to 3.2, MongoDB offers 3.2 upgrade services [291] to help ensure a smooth transition
without interruption to your MongoDB application.
Upgrade Requirements To upgrade an existing MongoDB deployment to 3.2, you must be running a 3.0-series
release.
To upgrade from a 2.6-series release, you must upgrade to the latest 3.0-series release before upgrading to 3.2. For the
procedure to upgrade from the 2.6-series to a 3.0-series release, see Upgrade MongoDB to 3.0 (page 945).
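The requirement above amounts to a small lookup from the current release series to the sequence of upgrades needed. The sketch below encodes only the two cases this section describes; upgradePathTo32 is a hypothetical helper name, and any other series is rejected rather than guessed at.

```javascript
// Return the release series to upgrade through, in order, to reach 3.2.
// Only the 3.0 and 2.6 starting points described above are covered.
function upgradePathTo32(currentSeries) {
  switch (currentSeries) {
    case "3.0":
      return ["3.2"];
    case "2.6":
      return ["3.0", "3.2"];
    default:
      throw new Error("Upgrade path not covered here for series " + currentSeries);
  }
}

const pathFrom26 = upgradePathTo32("2.6");
// pathFrom26 is ["3.0", "3.2"]
```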
Preparedness Before beginning your upgrade, see the Compatibility Changes in MongoDB 3.2 (page 891) document
to ensure that your applications and deployments are compatible with MongoDB 3.2. Resolve the incompatibilities in
your deployment before starting the upgrade.
Always test your application in a staging environment before deploying the upgrade to your production
environment.
Upgrade Standalone mongod Instance to MongoDB 3.2 The following steps outline the procedure to upgrade
a standalone mongod from version 3.0 to 3.2. To upgrade from version 2.6 to 3.2, upgrade to the latest 3.0-series
release (page 945) first, and then use the following procedure to upgrade from 3.0 to 3.2.
Upgrade with Package Manager If you installed MongoDB from the MongoDB apt, yum, dnf, or zypper
repositories, you should upgrade to 3.2 using your package manager. Follow the appropriate installation instructions
(page 6) for your Linux system. This will involve adding a repository for the new release, then performing the actual
upgrade.
Step 1: Download 3.2 binaries. Download binaries of the latest release in the 3.2 series from the MongoDB Download
Page [292]. See Install MongoDB (page 5) for more information.
291 https://www.mongodb.com/contact/mongodb-3-2-upgrade-services?jmp=docs
292 http://www.mongodb.org/downloads?jmp=docs
Step 2: Replace with 3.2 binaries. Shut down your mongod instance. Replace the existing binary with the 3.2
mongod binary and restart mongod.
Note: MongoDB 3.2 generates core dumps on some mongod failures. For production environments, you may prefer
to turn off core dumps for the operating system, if not already.
Prerequisites All replica set members must be running version 3.0 before you can upgrade them to version 3.2. To
upgrade a replica set from an earlier MongoDB version, upgrade all members of the replica set to the latest 3.0-series
release (page 945) first, and then follow the procedure to upgrade from MongoDB 3.0 to 3.2.
Upgrade Binaries You can upgrade from MongoDB 3.0 to 3.2 using a rolling upgrade to minimize downtime by
upgrading the members individually while the other members are available:
Step 1: Upgrade secondary members of the replica set. Upgrade the secondary members of the replica
set one at a time:
Shut down the mongod instance and replace the 3.0 binary with the 3.2 binary.
Restart the member and wait for the member to recover to SECONDARY state before upgrading the next secondary
member. To check the member's state, issue rs.status() in the mongo shell.
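The wait in this step can be checked by scanning the members array of the rs.status() output for any member that is not yet in PRIMARY or SECONDARY state. The following is a minimal sketch; membersNotReady is a hypothetical helper name, and sampleStatus is a hand-written object that imitates only the fields the helper reads.

```javascript
// Return the names of members that are not in PRIMARY or SECONDARY state.
function membersNotReady(status) {
  return status.members
    .filter((m) => m.stateStr !== "PRIMARY" && m.stateStr !== "SECONDARY")
    .map((m) => m.name);
}

// Hand-written stand-in for rs.status() output (illustrative hosts and states).
const sampleStatus = {
  set: "rs0",
  members: [
    { _id: 0, name: "host1:27017", stateStr: "PRIMARY" },
    { _id: 1, name: "host2:27017", stateStr: "SECONDARY" },
    { _id: 2, name: "host3:27017", stateStr: "RECOVERING" }
  ]
};

const waitingOn = membersNotReady(sampleStatus);
// waitingOn is ["host3:27017"]
```

In the mongo shell, the helper could be called as membersNotReady(rs.status()); an empty result suggests it is safe to move on to the next secondary. Arbiters report the ARBITER state and would be listed by this sketch, so a set with arbiters would need an extra condition.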
Step 2: Step down the replica set primary. Connect a mongo shell to the primary and use rs.stepDown() to
step down the primary and force an election of a new primary.
Step 3: Upgrade the primary. When rs.status() shows that the primary has stepped down and another mem-
ber has assumed PRIMARY state, upgrade the stepped-down primary:
Shut down the stepped-down primary and replace the mongod binary with the 3.2 binary.
Restart.
Step 4: Upgrade the replication protocol. Connect a mongo shell to the current primary and upgrade the replication
protocol:
cfg = rs.conf();
cfg.protocolVersion=1;
rs.reconfig(cfg);
Replica set failover is not instant: the set cannot accept writes until the failover process completes,
which may take 30 seconds or more. Schedule the upgrade procedure during a maintenance window.
Note: MongoDB 3.2 generates core dumps on some mongod failures. For production environments, you may prefer
to turn off core dumps for the operating system, if not already.
Prerequisites
Version 3.0 or Greater To upgrade a sharded cluster to 3.2, all members of the cluster must be at least version
3.0. The upgrade process checks all components of the cluster and will produce warnings if any component
is running a version earlier than 3.0.
Stop Metadata Changes during the Upgrade During the upgrade, ensure that clients do not make changes to
the collection metadata. For example, during the upgrade, do not perform any of the following operations:
sh.enableSharding()
sh.shardCollection()
sh.addShard()
db.createCollection()
db.collection.drop()
db.dropDatabase()
any operation that creates a database
any other operation that modifies the cluster metadata in any way.
See the Sharding Reference (page 814) for a complete list of sharding commands. Not all commands on
the Sharding Reference (page 814) page modify the cluster metadata.
Disable the balancer (page 794)
Back up the config Database Optional but Recommended. As a precaution, take a backup of the config
database before upgrading the sharded cluster.
Upgrade Binaries
Step 1: Disable the Balancer. Disable the balancer as described in Disable the Balancer (page 794).
Step 2: Upgrade the shards. Upgrade the shards one at a time. If the shards are replica sets, for each shard:
1. Upgrade the secondary members of the replica set one at a time:
Shut down the mongod instance and replace the 3.0 binary with the 3.2 binary.
Restart the member and wait for the member to recover to SECONDARY state before upgrading the next
secondary member. To check the member's state, issue rs.status() in the mongo shell.
2. Step down the replica set primary.
Connect a mongo shell to the primary and use rs.stepDown() to step down the primary and force an election
of a new primary:
rs.stepDown()
3. When rs.status() shows that the primary has stepped down and another member has assumed PRIMARY
state, upgrade the stepped-down primary:
Shut down the stepped-down primary and replace the mongod binary with the 3.2 binary.
Restart.
4. Connect a mongo shell to the current primary and upgrade the replication protocol (page 711) for the
shard:
cfg = rs.conf();
cfg.protocolVersion=1;
rs.reconfig(cfg);
Step 3: Upgrade the config servers. Upgrade the config servers one at a time in reverse order of the configDB or
--configdb setting for the mongos. That is, if the mongos has the following --configdb listing:
mongos --configdb confserver1:port1,confserver2:port2,confserver3:port3
Upgrade first confserver3, then confserver2, and lastly confserver1; that is, start with the config server listed last
in the configDB setting, repeat for the config server listed second, and finally upgrade the config server listed first.
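The reverse ordering can be derived mechanically from the --configdb value. The following sketch splits the comma-separated list and reverses it; configServerUpgradeOrder is a hypothetical helper name, and the host names are the placeholders used in this step.

```javascript
// Split a --configdb value and return the config servers in upgrade order,
// that is, the reverse of the order in which they are listed.
function configServerUpgradeOrder(configdb) {
  return configdb
    .split(",")
    .map((host) => host.trim())
    .filter((host) => host.length > 0)
    .reverse();
}

const upgradeOrder = configServerUpgradeOrder(
  "confserver1:port1,confserver2:port2,confserver3:port3"
);
// upgradeOrder is ["confserver3:port3", "confserver2:port2", "confserver1:port1"]
```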
Step 4: Upgrade the mongos instances. Replace each mongos instance with the 3.2 binary and restart.
mongos --configdb <cfgsvr1:port1>,<cfgsvr2:port2>,<cfgsvr3:port3>
Step 5: Re-enable the balancer. Re-enable the balancer as described in Enable the Balancer (page 795).
Note: MongoDB 3.2 generates core dumps on some mongod failures. For production environments, you may prefer
to turn off core dumps for the operating system, if not already.
Once the sharded cluster binaries have been upgraded to 3.2, existing config servers will continue to run as mirrored
mongod instances. For instructions on upgrading existing config servers to a replica set, see Upgrade Config Servers
to Replica Set (page 772) (requires MongoDB version 3.2.4 or later versions).
Additional Resources
Getting ready for MongoDB 3.2? Get our help. [293]
293 https://www.mongodb.com/contact/mongodb-3-2-upgrade-services?jmp=docs
On this page
Downgrade Recommendations and Checklist (page 898)
Prerequisites (page 898)
Downgrade a Standalone mongod Instance (page 899)
Downgrade a 3.2 Replica Set (page 899)
Downgrade a 3.2 Sharded Cluster (page 900)
Before you attempt any downgrade, familiarize yourself with the content of this document, particularly the Downgrade
Recommendations and Checklist (page 898) and the procedure for downgrading sharded clusters (page 900).
Preparedness
Remove or downgrade version 3 text indexes (page 898) before downgrading MongoDB 3.2 to 3.0.
Remove or downgrade version 3 2dsphere indexes (page 899) before downgrading MongoDB 3.2 to 3.0.
Prerequisites
Text Index Version Check If you have version 3 text indexes (i.e. the default version for text indexes in MongoDB
3.2), drop the version 3 text indexes before downgrading MongoDB. After the downgrade, enable text search and
recreate the dropped text indexes.
To determine the version of your text indexes, run db.collection.getIndexes() to view index specifications.
For text indexes, the method returns the version information in the field textIndexVersion. For example,
the following shows that the text index on the quotes collection is version 3.
{
"v" : 1,
"key" : {
"_fts" : "text",
"_ftsx" : 1
},
"name" : "quote_text_translation.quote_text",
"ns" : "test.quotes",
"weights" : {
"quote" : 1,
"translation.quote" : 1
},
"default_language" : "english",
"language_override" : "language",
"textIndexVersion" : 3
}
2dsphere Index Version Check If you have version 3 2dsphere indexes (i.e. the default version for 2dsphere
indexes in MongoDB 3.2), drop the version 3 2dsphere indexes before downgrading MongoDB. After the downgrade,
recreate the 2dsphere indexes.
To determine the version of your 2dsphere indexes, run db.collection.getIndexes() to view
index specifications. For 2dsphere indexes, the method returns the version information in the field
2dsphereIndexVersion. For example, the following shows that the 2dsphere index on the locations
collection is version 3.
{
"v" : 1,
"key" : {
"geo" : "2dsphere"
},
"name" : "geo_2dsphere",
"ns" : "test.locations",
"sparse" : true,
"2dsphereIndexVersion" : 3
}
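Both version checks above come down to inspecting two fields of each index specification. The following is a minimal sketch that reports the names of indexes that would have to be dropped before a downgrade to 3.0; indexesBlockingDowngrade is a hypothetical helper name, and the sample array reuses abbreviated forms of the two specifications shown above.

```javascript
// Return the names of version 3 text indexes and version 3 2dsphere indexes
// found in a getIndexes()-style array of index specifications.
function indexesBlockingDowngrade(indexSpecs) {
  return indexSpecs
    .filter((spec) => spec.textIndexVersion === 3 || spec["2dsphereIndexVersion"] === 3)
    .map((spec) => spec.name);
}

// Abbreviated sample specifications (illustrative data).
const downgradeSample = [
  { v: 1, key: { _id: 1 }, name: "_id_" },
  { v: 1, key: { _fts: "text", _ftsx: 1 }, name: "quote_text_translation.quote_text", textIndexVersion: 3 },
  { v: 1, key: { geo: "2dsphere" }, name: "geo_2dsphere", "2dsphereIndexVersion": 3 }
];

const toDrop = indexesBlockingDowngrade(downgradeSample);
// toDrop is ["quote_text_translation.quote_text", "geo_2dsphere"]
```

In the mongo shell, the helper could be run against db.collection.getIndexes() for each collection in turn.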
Downgrade a Standalone mongod Instance The following steps outline the procedure to downgrade a standalone
mongod from version 3.2 to 3.0.
Step 1: Download the latest 3.0 binaries. For the downgrade, use the latest release in the 3.0 series.
Step 2: Replace with 3.0 binaries. Shut down your mongod instance. Replace the existing binary with the downloaded mongod binary and restart.
Downgrade a 3.2 Replica Set The following steps outline a rolling downgrade process for the replica set. The
rolling downgrade process minimizes downtime by downgrading the members individually while the other members
are available:
Step 1: Downgrade the protocolVersion. Connect a mongo shell to the current primary and downgrade the replication
protocol:
cfg = rs.conf();
cfg.protocolVersion=0;
rs.reconfig(cfg);
Step 2: Downgrade secondary members of the replica set. Downgrade each secondary member of the replica set,
one at a time:
1. Shut down the mongod. See Stop mongod Processes (page 246) for instructions on safely terminating mongod
processes.
2. Replace the 3.2 binary with the 3.0 binary and restart.
Important: If your mongod instance is using the WiredTiger (page 587) storage engine, you must include the
--storageEngine option (or storage.engine if using the configuration file) with the 3.0 binary.
3. Wait for the member to recover to SECONDARY state before downgrading the next secondary. To check the
member's state, use the rs.status() method in the mongo shell.
Step 3: Step down the primary. Use rs.stepDown() in the mongo shell to step down the primary and force
the normal failover (page 635) procedure.
rs.stepDown()
rs.stepDown() expedites the failover procedure and is preferable to shutting down the primary directly.
Step 4: Replace and restart former primary mongod. When rs.status() shows that the primary has stepped
down and another member has assumed PRIMARY state, shut down the previous primary and replace the mongod
binary with the 3.0 binary and start the new instance.
Important: If your mongod instance is using the WiredTiger (page 587) storage engine, you must include the
--storageEngine option (or storage.engine if using the configuration file) with the 3.0 binary.
Replica set failover is not instant: the set is unavailable to writes, and reads are interrupted, until the failover process
completes. Typically this takes 10 seconds or more. You may wish to plan the downgrade during a predetermined
maintenance window.
Requirements While the downgrade is in progress, you cannot make changes to the collection metadata. For example,
during the downgrade, do not do any of the following:
sh.enableSharding()
sh.shardCollection()
sh.addShard()
db.createCollection()
db.collection.drop()
db.dropDatabase()
any operation that creates a database
any other operation that modifies the cluster metadata in any way. See Sharding Reference (page 814) for a complete
list of sharding commands. Note, however, that not all commands on the Sharding Reference (page 814)
page modify the cluster metadata.
Step 1: Disable the Balancer. Turn off the balancer (page 750) in the sharded cluster, as described in Disable the
Balancer (page 794).
Step 2: Downgrade each shard, one at a time. For each replica set shard:
1. Downgrade the protocolVersion.
2. Downgrade the mongod secondaries before downgrading the primary.
3. To downgrade the primary, run replSetStepDown and then downgrade.
For details on downgrading a replica set, see Downgrade a 3.2 Replica Set (page 899).
Step 3: Downgrade the SCCC config servers. If the sharded cluster uses 3 mirrored mongod instances for the
config servers, downgrade all three instances in reverse order of their listing in the --configdb option for mongos.
For example, if mongos has the following --configdb listing:
--configdb confserver1,confserver2,confserver3
Downgrade first confserver3, then confserver2, and lastly, confserver1. If your mongod instance
is using the WiredTiger (page 587) storage engine, you must include the --storageEngine option (or
storage.engine if using the configuration file) with the 3.0 binary.
mongod --configsvr --dbpath <path> --port <port> --storageEngine <storageEngine>
Step 4: Downgrade the mongos instances. Downgrade the binaries and restart.
Step 5: Re-enable the balancer. Once the downgrade of sharded cluster components is complete, re-enable the
balancer (page 795).
Step 1: Disable the Balancer. Turn off the balancer (page 750) in the sharded cluster, as described in Disable the
Balancer (page 794).
Step 2: Prepare CSRS Config Servers for downgrade. If the sharded cluster uses CSRS (page 734):
1. Remove secondary members from the replica set (page 673) so that it has only a primary and two secondaries, and
ensure that only the primary can vote and is eligible to be primary; i.e. the other two members have 0 for votes (page 713)
and priority (page 713).
Connect a mongo shell to the primary and run:
rs.reconfig(
{
"_id" : <name>,
"configsvr" : true,
"protocolVersion" : NumberLong(1),
"members" : [
{
"_id" : 0,
"host" : "<host1>:<port1>",
"priority" : 1,
"votes" : 1
},
{
"_id" : 1,
"host" : "<host2>:<port2>",
"priority" : 0,
"votes" : 0
},
{
"_id" : 2,
"host" : "<host3>:<port3>",
"priority" : 0,
"votes" : 0
}
]
}
)
2. Step down the primary using replSetStepDown against the admin database. Ensure enough time for the
secondaries to catch up.
Connect a mongo shell to the primary and run:
db.adminCommand( { replSetStepDown: 360, secondaryCatchUpPeriodSecs: 300, force: true })
3. Shut down all members of the config server replica set, the mongos instances, and the shards.
4. Restart each config server as standalone 3.2 mongod; i.e. without the --replSet or, if using a configuration
file, replication.replSetName.
mongod --configsvr --dbpath <path> --port <port> --storageEngine <storageEngine>
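For item 1 of the preceding list, the reconfiguration document can also be derived from an existing configuration instead of being typed by hand. The sketch below returns a copy in which only one member keeps its vote and priority. singleVoterConfig is a hypothetical helper name, the sample configuration uses made-up hosts, and the sketch assumes the set has already been reduced to three members.

```javascript
// Return a copy of a replica set configuration in which only the member at
// primaryIndex has votes 1 and priority 1; every other member gets 0 for both.
// The input configuration is not modified.
function singleVoterConfig(cfg, primaryIndex) {
  const members = cfg.members.map((m, i) =>
    Object.assign({}, m, {
      priority: i === primaryIndex ? 1 : 0,
      votes: i === primaryIndex ? 1 : 0
    })
  );
  return Object.assign({}, cfg, { members: members });
}

// Illustrative three-member config server replica set configuration.
const currentCfg = {
  _id: "csrs",
  configsvr: true,
  members: [
    { _id: 0, host: "host1:27019", priority: 2, votes: 1 },
    { _id: 1, host: "host2:27019", priority: 1, votes: 1 },
    { _id: 2, host: "host3:27019", priority: 1, votes: 1 }
  ]
};

const reducedCfg = singleVoterConfig(currentCfg, 0);
// reducedCfg.members[0] keeps votes 1 and priority 1; the other two members have 0 for both
```

In the mongo shell, the input would typically come from rs.conf() and the result would be passed to rs.reconfig(); the index of the current primary has to be determined separately, for example from rs.status().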
Step 3: Update the protocolVersion for each shard. Restart each replica set shard and update the protocolVersion.
Connect a mongo shell to the current primary and downgrade the replication protocol:
cfg = rs.conf();
cfg.protocolVersion=0;
rs.reconfig(cfg);
Step 5: Downgrade Config Servers. Downgrade the binaries and restart. Downgrade in reverse order of their listing
in the --configdb option for mongos.
If your mongod instance is using the WiredTiger (page 587) storage engine, you must include the
--storageEngine option (or storage.engine if using the configuration file) with the 3.0 binary.
mongod --configsvr --dbpath <path> --port <port> --storageEngine <storageEngine>
Step 6: Downgrade each shard, one at a time. For each replica set shard, downgrade the mongod binaries
and restart. If your mongod instance is using the WiredTiger (page 587) storage engine, you must include the
--storageEngine option (or storage.engine if using the configuration file) with the 3.0 binary.
1. Downgrade the mongod secondaries before downgrading the primary.
2. To downgrade the primary, run replSetStepDown and then downgrade.
For details on downgrading a replica set, see Downgrade a 3.2 Replica Set (page 899).
Step 7: Re-enable the balancer. Once the downgrade of sharded cluster components is complete, re-enable the
balancer (page 795).
See Upgrade MongoDB to 3.2 (page 894) for full upgrade instructions.
Severe performance regression in insert workload under Windows with WiredTiger: SERVER-21792 [310]
Download
Additional Resources
On this page
Minor Releases (page 904)
Major Changes (page 933)
Replica Sets (page 934)
Sharded Clusters (page 935)
Security Improvements (page 935)
Improvements (page 936)
MongoDB Enterprise Features (page 937)
Additional Information (page 938)
Additional Resources (page 959)
March 3, 2015
MongoDB 3.0 is now available. Key features include support for the WiredTiger storage engine, pluggable storage
engine API, SCRAM-SHA-1 authentication mechanism, and improved explain functionality.
MongoDB Ops Manager, which includes Automation, Backup, and Monitoring, is now also available. See the Ops
Manager documentation [315] and the Ops Manager release notes [316] for more information.
Minor Releases
3.0 Changelog
310 https://jira.mongodb.org/browse/SERVER-21792
311 http://www.mongodb.org/downloads
312 https://github.com/mongodb/mongo/blob/v3.2/distsrc/THIRD-PARTY-NOTICES
313 http://bit.ly/1XXomL9
314 https://www.mongodb.com/contact/mongodb-3-2-upgrade-services?jmp=docs
315 http://docs.opsmanager.mongodb.com/current/
316 http://docs.opsmanager.mongodb.com/current/release-notes/application/
On this page
3.0.10 Changelog (page 905)
3.0.9 Changelog (page 906)
3.0.8 Changelog (page 908)
3.0.7 Changelog (page 910)
3.0.6 Changelog (page 913)
3.0.5 Changelog (page 914)
3.0.4 Changelog (page 917)
3.0.3 Changelog (page 920)
3.0.2 Changelog (page 924)
3.0.1 Changelog (page 926)
3.0.10 Changelog
Sharding
SERVER-18671 [317] SecondaryPreferred can end up using unversioned connections
SERVER-22569 [318] Initialization of eooElement static local variable isn't thread safe with MSVC 2013
Query
SERVER-22535 [319] Some index operations (drop index, abort index build, update TTL config) on collection
during active migration can cause migration to skip documents
Storage
SERVER-19800 [320] DataSizeChange forces an int into a bool
SERVER-22634 [321] Data size change for oplog deletes can overflow 32-bit int
WiredTiger
SERVER-22554 [322] WiredTiger data handles not closed when collection is dropped
MMAP
SERVER-22261 [323] MMAPv1 LSNFile may be updated ahead of what is synced to data files
Internals
SERVER-22292 [327] Use more reliable mechanism in the mongo shell to wait for process to terminate on windows
SERVER-22328 [328] bench_test_crud_commands.js fails due to resource contention from other resmoke jobs and
low timeout values
3.0.9 Changelog
Sharding
SERVER-19266 [330] An error document is returned with result set
SERVER-21382 [331] Sharding migration transfers all document deletions
SERVER-22114 [332] Mongos can accumulate multiple copies of ChunkManager when a shard restarts
Replication
SERVER-18219 [333] control reaches end of non-void function errors in GCC with WCE retry loop
SERVER-21583 [334] ApplyOps background index creation may deadlock
SERVER-22109 [335] Invariant failure when running applyOps to create an index with a bad ns field
Query
SERVER-19128 [336] Fatal assertion during secondary index build
SERVER-19996 [337] Queries which specify sort and batch size can generate results out of order, if documents
concurrently updated
SERVER-20083 [338] Add log statement at default log level for when an index filter is set or cleared successfully
SERVER-21602 [339] Reduce execution time of cursor_timeout.js
SERVER-21776 [340] Move per-operation log lines for queries out of the QUERY log component
Aggregation
SERVER-7656 [342] Optimize aggregation on sharded setup if first stage is exact match on shard key
Storage
SERVER-20858 [343] Invariant failure in OplogStones for non-capped oplog creation
SERVER-20866 [344] Race condition in oplog insert transaction rollback
SERVER-21545 [345] collMod and invalid parameter triggers fassert on dropCollection on mmapv1
SERVER-22014 [346] index_bigkeys_nofail.js triggers spurious failures when run in parallel with other tests
WiredTiger
SERVER-20961 [347] Large amounts of create and drop collections can cause listDatabases to be slow under
WiredTiger
SERVER-22129 [348] WiredTiger changes for MongoDB 3.0.9
Internals
SERVER-18373 [353] MONGO_COMPILER_UNREACHABLE should terminate if violated
SERVER-19110 [354] Ignore failed operations in mixed_storage_version_replication.js
SERVER-21934 [355] Add extra information to OSX stack traces to facilitate addr2line translation
SERVER-21960 [356] Include symbol name in stacktrace json when available
SERVER-22013 [357] coll_mod_bad_spec.js tries to pass filter to getCollectionInfos on v3.0 branch
SERVER-22054 [358] Authentication failure reports incorrect IP address
342 https://jira.mongodb.org/browse/SERVER-7656
343 https://jira.mongodb.org/browse/SERVER-20858
344 https://jira.mongodb.org/browse/SERVER-20866
345 https://jira.mongodb.org/browse/SERVER-21545
346 https://jira.mongodb.org/browse/SERVER-22014
347 https://jira.mongodb.org/browse/SERVER-20961
348 https://jira.mongodb.org/browse/SERVER-22129
349 https://jira.mongodb.org/browse/SERVER-20358
350 https://jira.mongodb.org/browse/SERVER-17747
351 https://jira.mongodb.org/browse/SERVER-18162
352 https://jira.mongodb.org/browse/SERVER-18953
353 https://jira.mongodb.org/browse/SERVER-18373
354 https://jira.mongodb.org/browse/SERVER-19110
355 https://jira.mongodb.org/browse/SERVER-21934
356 https://jira.mongodb.org/browse/SERVER-21960
357 https://jira.mongodb.org/browse/SERVER-22013
358 https://jira.mongodb.org/browse/SERVER-22054
3.0.8 Changelog
Sharding
SERVER-20407 [362] findAndModify on mongoS upserts to the wrong shard
SERVER-20839 [363] trace_missing_docs_test.js compares Timestamp instances using < operator in mongo shell
Query
SERVER-2454 [364] Queries that are killed during a yield should return error to user instead of partial result set
SERVER-21227 [365] MultiPlanStage::invalidate() should not flag and drop invalidated WorkingSetMembers
SERVER-21275 [366] Document not found due to WT commit visibility issue
Storage
SERVER-20650 [367] Backport MongoRocks changes to 3.0
SERVER-21543 [368] Lengthen delay before deleting old journal files
WiredTiger
SERVER-20303 [369] Negative scaling at low thread count under WiredTiger when inserting large documents
SERVER-21063 [370] MongoDB with WiredTiger can build very deep trees
SERVER-21442 [371] WiredTiger changes for MongoDB 3.0.8
SERVER-21553 [372] Oplog grows to 3x configured size
359 https://jira.mongodb.org/browse/SERVER-22191
360 https://jira.mongodb.org/browse/TOOLS-1002
361 https://jira.mongodb.org/browse/SERVER-21278
362 https://jira.mongodb.org/browse/SERVER-20407
363 https://jira.mongodb.org/browse/SERVER-20839
364 https://jira.mongodb.org/browse/SERVER-2454
365 https://jira.mongodb.org/browse/SERVER-21227
366 https://jira.mongodb.org/browse/SERVER-21275
367 https://jira.mongodb.org/browse/SERVER-20650
368 https://jira.mongodb.org/browse/SERVER-21543
369 https://jira.mongodb.org/browse/SERVER-20303
370 https://jira.mongodb.org/browse/SERVER-21063
371 https://jira.mongodb.org/browse/SERVER-21442
372 https://jira.mongodb.org/browse/SERVER-21553
Tools
TOOLS-702 [380] bsondump does not keep attribute order
TOOLS-920 [381] mongodump issue with temporary map/reduce collections
TOOLS-939 [382] Error restoring database insertion error: EOF
Internals
SERVER-8728 [383] jstests/profile1.js is a race and fails randomly
SERVER-20521 [384] Update Mongo-perf display names in Evergreen to sort better
SERVER-20527 [385] Delete resmoke.py from the 3.0 branch
SERVER-20876 [386] Hang in scenario with sharded ttl collection under WiredTiger
SERVER-21027 [387] Reduced performance of index lookups after removing documents from collection
SERVER-21099 [388] Improve logging in SecureRandom and PseudoRandom classes
SERVER-21150 [389] Basic startup logging should be done as early as possible in initAndListen
SERVER-21208 [390] server up check in perf.yml is in the wrong place
SERVER-21305 [391] Lock timeAcquiringMicros value is much higher than the actual time spent
SERVER-21433 [392] Perf.yml project should kill unwanted processes before starting tests
373 https://jira.mongodb.org/browse/SERVER-10512
374 https://jira.mongodb.org/browse/SERVER-19755
375 https://jira.mongodb.org/browse/SERVER-20699
376 https://jira.mongodb.org/browse/SERVER-20830
377 https://jira.mongodb.org/browse/SERVER-20834
378 https://jira.mongodb.org/browse/SERVER-21209
379 https://jira.mongodb.org/browse/SERVER-21477
380 https://jira.mongodb.org/browse/TOOLS-702
381 https://jira.mongodb.org/browse/TOOLS-920
382 https://jira.mongodb.org/browse/TOOLS-939
383 https://jira.mongodb.org/browse/SERVER-8728
384 https://jira.mongodb.org/browse/SERVER-20521
385 https://jira.mongodb.org/browse/SERVER-20527
386 https://jira.mongodb.org/browse/SERVER-20876
387 https://jira.mongodb.org/browse/SERVER-21027
388 https://jira.mongodb.org/browse/SERVER-21099
389 https://jira.mongodb.org/browse/SERVER-21150
390 https://jira.mongodb.org/browse/SERVER-21208
391 https://jira.mongodb.org/browse/SERVER-21305
392 https://jira.mongodb.org/browse/SERVER-21433
SERVER-21533 [393] Lock manager is not fair in the presence of compatible requests which can be granted immediately
3.0.7 Changelog
Security
SERVER-13647 [394] root (page 422) role does not contain sufficient privileges for a mongorestore of a system with security enabled
SERVER-15893 [395] root (page 422) role should be able to run validate on system collections
SERVER-19131 [396] clusterManager (page 417) role does not have permission for adding tag ranges
SERVER-19284 [397] Should not be able to create role with same name as builtin role
SERVER-20394 [398] Remove non-integer test case from iteration_count_control.js
SERVER-20401 [399] Publicly expose net.ssl.disabledProtocols
Sharding
SERVER-17886 [400] dbKillCursors op asserts on mongos when at log level 3
SERVER-20191 [401] multi-updates/remove can make successive queries skip shard version checking
SERVER-20460 [402] listIndexes on 3.0 mongos with 2.6 mongod instances returns erroneous not authorized
SERVER-20557 [403] Active window setting is not being processed correctly
Replication
SERVER-20262 [404] Replica set nodes can get stuck in a state where they will not step themselves down
SERVER-20473 [405] calling setMaintenanceMode(true) while running for election crashes server
Query
SERVER-17895 [406] Server should not clear collection plan cache periodically when write operations are issued
SERVER-19412 [407] NULL PlanStage in getStageByType causes segfault during stageDebug command
SERVER-19725 [408] NULL pointer crash in QueryPlanner::plan with $near operator
393 https://jira.mongodb.org/browse/SERVER-21533
394 https://jira.mongodb.org/browse/SERVER-13647
395 https://jira.mongodb.org/browse/SERVER-15893
396 https://jira.mongodb.org/browse/SERVER-19131
397 https://jira.mongodb.org/browse/SERVER-19284
398 https://jira.mongodb.org/browse/SERVER-20394
399 https://jira.mongodb.org/browse/SERVER-20401
400 https://jira.mongodb.org/browse/SERVER-17886
401 https://jira.mongodb.org/browse/SERVER-20191
402 https://jira.mongodb.org/browse/SERVER-20460
403 https://jira.mongodb.org/browse/SERVER-20557
404 https://jira.mongodb.org/browse/SERVER-20262
405 https://jira.mongodb.org/browse/SERVER-20473
406 https://jira.mongodb.org/browse/SERVER-17895
407 https://jira.mongodb.org/browse/SERVER-19412
408 https://jira.mongodb.org/browse/SERVER-19725
Write Operations
SERVER-11746 [413] Improve shard version checking for versioned (single) updates after yield
SERVER-19361 [414] Insert of document with duplicate _id fields should be forbidden
SERVER-20531 [415] Mongodb server crash: Invariant failure res.existing
Storage
SERVER-18624 [416] listCollections command should not be O(n^2) on MMAPv1
SERVER-20617 [417] wt_nojournal_toggle.js failing intermittently in noPassthrough_WT
SERVER-20638 [418] Reading the profiling level shouldn't create databases that don't exist
WiredTiger
SERVER-18250 [419] Once enabled journal cannot be disabled under WiredTiger
SERVER-20008 [420] Stress test deadlock in WiredTiger
SERVER-20091 [421] Poor query throughput and erratic behavior at high connection counts under WiredTiger
SERVER-20159 [422] Out of memory on index build during initial sync even with low cacheSize parameter
SERVER-20176 [423] Deletes with j:true slower on WT than MMAPv1
SERVER-20204 [424] Segmentation fault during index build on 3.0 secondary
Operations
SERVER-14750 [425] Convert RPM and DEB mongod.conf files to new YAML format
SERVER-18506 [426] Balancer section of printShardingStatus should respect passed-in configDB
409 https://jira.mongodb.org/browse/SERVER-20139
410 https://jira.mongodb.org/browse/SERVER-20219
411 https://jira.mongodb.org/browse/SERVER-20347
412 https://jira.mongodb.org/browse/SERVER-20364
413 https://jira.mongodb.org/browse/SERVER-11746
414 https://jira.mongodb.org/browse/SERVER-19361
415 https://jira.mongodb.org/browse/SERVER-20531
416 https://jira.mongodb.org/browse/SERVER-18624
417 https://jira.mongodb.org/browse/SERVER-20617
418 https://jira.mongodb.org/browse/SERVER-20638
419 https://jira.mongodb.org/browse/SERVER-18250
420 https://jira.mongodb.org/browse/SERVER-20008
421 https://jira.mongodb.org/browse/SERVER-20091
422 https://jira.mongodb.org/browse/SERVER-20159
423 https://jira.mongodb.org/browse/SERVER-20176
424 https://jira.mongodb.org/browse/SERVER-20204
425 https://jira.mongodb.org/browse/SERVER-14750
426 https://jira.mongodb.org/browse/SERVER-18506
Tools
TOOLS-767 [434] mongorestore: error parsing metadata: call of reflect.Value.Set on zero Value
TOOLS-847 [435] mongorestore exits in response to SIGHUP, even when run under nohup
TOOLS-874 [436] mongoimport $date close to epoch not working
TOOLS-916 [437] mongoexport throws reflect.Value.Type errors
Internals
SERVER-18178 [438] Fix mr_drop.js test to not fail from nondeterministic collection drop timing
SERVER-19819 [439] Update perf.yml to use new mongo-perf release
SERVER-19820 [440] Update perf.yml to use mongo-perf check script
SERVER-19899 [441] Mongo-perf analysis script: Check for per thread level regressions
SERVER-19901 [442] Mongo-perf analysis script: Compare to tagged baseline
SERVER-19902 [443] Mongo-perf analysis script: Use noise data for regression comparison instead of fixed percentage
SERVER-20035 [444] Updated perf_regression_check.py script to output report.json summarizing results
SERVER-20121 [445] XorShift PRNG should use unsigned arithmetic
SERVER-20216 [446] Extend optional Command properties to SASL
427 https://jira.mongodb.org/browse/SERVER-18516
428 https://jira.mongodb.org/browse/SERVER-18581
429 https://jira.mongodb.org/browse/SERVER-18749
430 https://jira.mongodb.org/browse/SERVER-18793
431 https://jira.mongodb.org/browse/SERVER-19088
432 https://jira.mongodb.org/browse/SERVER-19509
433 https://jira.mongodb.org/browse/SERVER-19661
434 https://jira.mongodb.org/browse/TOOLS-767
435 https://jira.mongodb.org/browse/TOOLS-847
436 https://jira.mongodb.org/browse/TOOLS-874
437 https://jira.mongodb.org/browse/TOOLS-916
438 https://jira.mongodb.org/browse/SERVER-18178
439 https://jira.mongodb.org/browse/SERVER-19819
440 https://jira.mongodb.org/browse/SERVER-19820
441 https://jira.mongodb.org/browse/SERVER-19899
442 https://jira.mongodb.org/browse/SERVER-19901
443 https://jira.mongodb.org/browse/SERVER-19902
444 https://jira.mongodb.org/browse/SERVER-20035
445 https://jira.mongodb.org/browse/SERVER-20121
446 https://jira.mongodb.org/browse/SERVER-20216
3.0.6 Changelog
Security
SERVER-19538 [455] Segfault when calling dbexit in SSLManager with auditing enabled
Querying
SERVER-19553 [456] Mongod shouldn't use sayPiggyBack to send KillCursor messages
Replication
SERVER-19719 [457] Failure to rollback noPadding should not cause fatal error
SERVER-19644 [458] Seg Fault on cloneCollection (specifically gridfs)
WiredTiger
SERVER-19673 [459] Excessive memory allocated by WiredTiger journal
SERVER-19987 [460] Limit the size of the per-session cursor cache
SERVER-19751 [461] WiredTiger panic halt in eviction-server
SERVER-19744 [462] WiredTiger changes for MongoDB 3.0.6
SERVER-19573 [463] MongoDb crash due to segfault
SERVER-19522 [464] Capped collection insert rate declines over time under WiredTiger
447 https://jira.mongodb.org/browse/SERVER-20316
448 https://jira.mongodb.org/browse/SERVER-20322
449 https://jira.mongodb.org/browse/SERVER-20383
450 https://jira.mongodb.org/browse/SERVER-20429
451 https://jira.mongodb.org/browse/SERVER-20464
452 https://jira.mongodb.org/browse/SERVER-20691
453 https://jira.mongodb.org/browse/TOOLS-894
454 https://jira.mongodb.org/browse/TOOLS-898
455 https://jira.mongodb.org/browse/SERVER-19538
456 https://jira.mongodb.org/browse/SERVER-19553
457 https://jira.mongodb.org/browse/SERVER-19719
458 https://jira.mongodb.org/browse/SERVER-19644
459 https://jira.mongodb.org/browse/SERVER-19673
460 https://jira.mongodb.org/browse/SERVER-19987
461 https://jira.mongodb.org/browse/SERVER-19751
462 https://jira.mongodb.org/browse/SERVER-19744
463 https://jira.mongodb.org/browse/SERVER-19573
464 https://jira.mongodb.org/browse/SERVER-19522
MMAPv1
SERVER-19805 [465] MMap memory mapped file address allocation code cannot handle addresses non-aligned to memory mapped granularity size
Networking
SERVER-19389 [466] Remove wire level endianness check
Aggregation Framework
SERVER-19553 [467] Mongod shouldn't use sayPiggyBack to send KillCursor messages
SERVER-19464 [468] $sort stage in aggregation doesn't call scoped connections done()
Internal Code
SERVER-19856 [472] Register for PRESHUTDOWN notifications on Windows Vista+
Tools
mongoimport
TOOLS-874 [474] mongoimport $date close to epoch not working
mongotop
TOOLS-864 [475] mongotop i/o timeout error
3.0.5 Changelog
465 https://jira.mongodb.org/browse/SERVER-19805
466 https://jira.mongodb.org/browse/SERVER-19389
467 https://jira.mongodb.org/browse/SERVER-19553
468 https://jira.mongodb.org/browse/SERVER-19464
469 https://jira.mongodb.org/browse/SERVER-19650
470 https://jira.mongodb.org/browse/SERVER-19236
471 https://jira.mongodb.org/browse/SERVER-19540
472 https://jira.mongodb.org/browse/SERVER-19856
473 https://jira.mongodb.org/browse/TOOLS-848
474 https://jira.mongodb.org/browse/TOOLS-874
475 https://jira.mongodb.org/browse/TOOLS-864
Querying
SERVER-19489 [476] Assertion failure and segfault in WorkingSet::free in 3.0.5-rc0
SERVER-18461 [477] Range predicates comparing against a BinData value should be covered, but are not in 2.6
SERVER-17815 [478] Plan ranking tie breaker is computed incorrectly
SERVER-17259 [479] Coverity analysis defect 56350: Dereference null return value
SERVER-18926 [480] Full text search extremely slow and uses a lot of memory under WiredTiger
Replication
SERVER-19375 [481] choosing syncsource should compare against last fetched optime rather than last applied
SERVER-19298 [482] Use userCreateNS w/options consistently in cloner
SERVER-18994 [483] producer thread can continue producing after a node becomes primary
SERVER-18455 [484] master/slave keepalives are not silent on slaves
SERVER-18280 [485] ReplicaSetMonitor should use electionId to avoid talking to old primaries
SERVER-17689 [486] Server crash during initial replication sync
Sharding
SERVER-18955 [487] mongoS doesn't set batch size (and keeps the old one, 0) on getMore if performed on first _cursor->more()
Storage
SERVER-19283 [488] WiredTiger changes for MongoDB 3.0.5
SERVER-18874 [489] Backport changes to RocksDB from mongo-partners repo
SERVER-18838 [490] DB fails to recover creates and drops after system crash
SERVER-17370 [491] Clean up storage engine-specific index and collection options
SERVER-15901 [492] Cleanup unused locks on the lock manager
476 https://jira.mongodb.org/browse/SERVER-19489
477 https://jira.mongodb.org/browse/SERVER-18461
478 https://jira.mongodb.org/browse/SERVER-17815
479 https://jira.mongodb.org/browse/SERVER-17259
480 https://jira.mongodb.org/browse/SERVER-18926
481 https://jira.mongodb.org/browse/SERVER-19375
482 https://jira.mongodb.org/browse/SERVER-19298
483 https://jira.mongodb.org/browse/SERVER-18994
484 https://jira.mongodb.org/browse/SERVER-18455
485 https://jira.mongodb.org/browse/SERVER-18280
486 https://jira.mongodb.org/browse/SERVER-17689
487 https://jira.mongodb.org/browse/SERVER-18955
488 https://jira.mongodb.org/browse/SERVER-19283
489 https://jira.mongodb.org/browse/SERVER-18874
490 https://jira.mongodb.org/browse/SERVER-18838
491 https://jira.mongodb.org/browse/SERVER-17370
492 https://jira.mongodb.org/browse/SERVER-15901
WiredTiger
SERVER-19513 [493] Truncating a capped collection may not unindex deleted documents in WiredTiger
SERVER-19283 [494] WiredTiger changes for MongoDB 3.0.5
SERVER-19189 [495] Improve performance under high number of threads with WT
SERVER-19178 [496] In WiredTiger capped collection truncates, avoid walking lists of deleted items
SERVER-19052 [497] Remove sizeStorer recalculations at startup with WiredTiger
SERVER-18926 [498] Full text search extremely slow and uses a lot of memory under WiredTiger
SERVER-18902 [499] Retrieval of large documents slower on WiredTiger than MMAPv1
SERVER-18875 [500] Oplog performance on WT degrades over time after accumulation of deleted items
SERVER-18838 [501] DB fails to recover creates and drops after system crash
SERVER-18829 [502] Cache usage exceeds configured maximum during index builds under WiredTiger
SERVER-18321 [503] Speed up background index build with WiredTiger LSM
SERVER-17689 [504] Server crash during initial replication sync
SERVER-17386 [505] Cursor cache causes excessive memory utilization in WiredTiger
SERVER-17254 [506] WT: drop collection while concurrent oplog tailing may greatly reduce throughput
SERVER-17078 [507] show databases taking extraordinarily long with wiredTiger
Networking
SERVER-19255 [508] Listener::waitUntilListening may return before listening has started
Shell
SERVER-18795 [517] db.printSlaveReplicationInfo()/rs.printSlaveReplicationInfo() can not work with ARBITER role
3.0.4 Changelog
513 https://jira.mongodb.org/browse/SERVER-17568
514 https://jira.mongodb.org/browse/SERVER-17329
515 https://jira.mongodb.org/browse/SERVER-18977
516 https://jira.mongodb.org/browse/SERVER-18911
517 https://jira.mongodb.org/browse/SERVER-18795
518 https://jira.mongodb.org/browse/SERVER-19054
519 https://jira.mongodb.org/browse/SERVER-18979
520 https://jira.mongodb.org/browse/SERVER-19382
521 https://jira.mongodb.org/browse/SERVER-19353
522 https://jira.mongodb.org/browse/SERVER-19298
523 https://jira.mongodb.org/browse/SERVER-19255
524 https://jira.mongodb.org/browse/SERVER-17728
525 https://jira.mongodb.org/browse/SERVER-17567
526 https://jira.mongodb.org/browse/SERVER-19540
527 https://jira.mongodb.org/browse/SERVER-18068
528 https://jira.mongodb.org/browse/SERVER-17259
529 https://jira.mongodb.org/browse/SERVER-15017
530 https://jira.mongodb.org/browse/SERVER-19525
Security
SERVER-18475 [531] authSchemaUpgrade fails when the system.users (page 300) contains non MONGODB-CR users
SERVER-18312 [532] Upgrade PCRE to latest
Querying
SERVER-18364 [533] Ensure non-negation predicates get chosen over negation predicates for multikey index bounds construction
SERVER-16265 [534] Add query details to getmore entry in profiler and db.currentOp()
SERVER-15225 [535] CachedPlanStage should execute for trial period and re-plan if query performs poorly
SERVER-13875 [536] ensureIndex() of 2dsphere index breaks after upgrading to 2.6 (with the new createIndex command)
Replication
SERVER-18566 [537] Primary member can trip fatal assertion if stepping down while running findAndModify op resulting in an upsert
SERVER-18511 [538] Report upstream progress when initial sync completes
SERVER-18409 [539] Retry failed heartbeats before marking a node as DOWN
SERVER-18326 [540] Rollback attempted during initial sync is fatal
SERVER-17923 [541] Creating/dropping multiple background indexes on the same collection can cause fatal error on secondaries
SERVER-17913 [542] New primary should log voters at default log level
SERVER-17807 [543] drain ops before restarting initial sync
SERVER-15252 [544] Write unit tests of ScatterGatherRunner
SERVER-15192 [545] Make all logOp listeners rollback-safe
SERVER-18190 [546] Secondary reads block replication
531 https://jira.mongodb.org/browse/SERVER-18475
532 https://jira.mongodb.org/browse/SERVER-18312
533 https://jira.mongodb.org/browse/SERVER-18364
534 https://jira.mongodb.org/browse/SERVER-16265
535 https://jira.mongodb.org/browse/SERVER-15225
536 https://jira.mongodb.org/browse/SERVER-13875
537 https://jira.mongodb.org/browse/SERVER-18566
538 https://jira.mongodb.org/browse/SERVER-18511
539 https://jira.mongodb.org/browse/SERVER-18409
540 https://jira.mongodb.org/browse/SERVER-18326
541 https://jira.mongodb.org/browse/SERVER-17923
542 https://jira.mongodb.org/browse/SERVER-17913
543 https://jira.mongodb.org/browse/SERVER-17807
544 https://jira.mongodb.org/browse/SERVER-15252
545 https://jira.mongodb.org/browse/SERVER-15192
546 https://jira.mongodb.org/browse/SERVER-18190
Sharding
SERVER-18822 [547] Sharded clusters with WiredTiger primaries may lose writes during chunk migration
SERVER-18246 [548] getmore on secondary in recovery mode can crash mongos
Storage
SERVER-18442 [549] better error message when attempting to change storage engine metadata options
WiredTiger
SERVER-18647 [550] WiredTiger changes for MongoDB 3.0.4
SERVER-18646 [551] Avoid WiredTiger checkpointing dead handles
SERVER-18629 [552] WiredTiger journal system syncs wrong directory
SERVER-18460 [553] Segfault during eviction under load
SERVER-18316 [554] Database with WT engine fails to recover after system crash
SERVER-18315 [555] Throughput drop during transaction pinned phase of checkpoints under WiredTiger
SERVER-18213 [556] Lots of WriteConflict during multi-upsert with WiredTiger storage engine
SERVER-18079 [557] Large performance drop with documents > 16k on Windows
SERVER-17944 [558] WiredTigerRecordStore::truncate spends a lot of time sleeping
HTTP Console
SERVER-18117 [559] Bring back the _replSet page in the html interface
Testing
SERVER-18318 [565] Disable jsCore_small_oplog suite in Windows
SERVER-17336 [566] fix core/compact_keeps_indexes.js in a master/slave test configuration
SERVER-13237 [567] benchRun should use a thread-safe random number generator
SERVER-18097 [568] Remove mongosTest_auth and mongosTest_WT tasks from evergreen.yml
3.0.3 Changelog
Security
SERVER-18290 [569] Adding a read role for a user doesn't seem to propagate to secondary until restart
SERVER-18239 [570] dumpauth.js uses ambiguous --db/--collection args
SERVER-18169 [571] Regression: Auth enabled arbiter cannot be shutdown using command
SERVER-18140 [572] Allow getParameter to be executed locally against an arbiter in an authenticated replica set
SERVER-18051 [573] OpenSSL internal error when using SCRAM-SHA1 authentication in FIPS mode
SERVER-18021 [574] Allow serverStatus to be executed locally against an arbiter in an authenticated replica set
SERVER-17908 [575] Allow getCmdLineOpts to be executed locally against an arbiter in an authenticated replica set
SERVER-17832 [576] Memory leak when mongod configured with SSL required and handle insecure connection
SERVER-17812 [577] LockPinger has audit-related GLE failure
SERVER-17591 [578] Add SSL flag to select supported protocols
SERVER-16073 [579] Allow disabling SSL Ciphers via hidden flag: sslCipherConfig
SERVER-12235 [580] Don't require a database read on every new localhost connection when auth is on
Querying
SERVER-18304 [581] duplicates on FindAndModify with remove option
SERVER-17815 [582] Plan ranking tie breaker is computed incorrectly
565 https://jira.mongodb.org/browse/SERVER-18318
566 https://jira.mongodb.org/browse/SERVER-17336
567 https://jira.mongodb.org/browse/SERVER-13237
568 https://jira.mongodb.org/browse/SERVER-18097
569 https://jira.mongodb.org/browse/SERVER-18290
570 https://jira.mongodb.org/browse/SERVER-18239
571 https://jira.mongodb.org/browse/SERVER-18169
572 https://jira.mongodb.org/browse/SERVER-18140
573 https://jira.mongodb.org/browse/SERVER-18051
574 https://jira.mongodb.org/browse/SERVER-18021
575 https://jira.mongodb.org/browse/SERVER-17908
576 https://jira.mongodb.org/browse/SERVER-17832
577 https://jira.mongodb.org/browse/SERVER-17812
578 https://jira.mongodb.org/browse/SERVER-17591
579 https://jira.mongodb.org/browse/SERVER-16073
580 https://jira.mongodb.org/browse/SERVER-12235
581 https://jira.mongodb.org/browse/SERVER-18304
582 https://jira.mongodb.org/browse/SERVER-17815
Replication
SERVER-18211 [583] MongoDB fails to correctly roll back collection creation
SERVER-17273 [584] Add support for secondaryCatchupPeriodSecs to rs.stepdown() shell helper
Sharding
SERVER-17812 [585] LockPinger has audit-related GLE failure
SERVER-17749 [586] collMod usePowerOf2Sizes fails on mongos
SERVER-16987 [587] sh.getRecentMigrations() shows aborted migration as success
Storage
SERVER-18211 [588] MongoDB fails to correctly roll back collection creation
SERVER-18111 [589] mongod allows user inserts into system.profile collection
SERVER-17939 [590] Backport mongo-rocks updates to v3.0 branch
SERVER-17745 [591] Improve dirty page estimation in mmapv1 on Windows
WiredTiger
SERVER-18205 [592] WiredTiger changes for MongoDB 3.0.3
SERVER-18192 [593] Crash running WiredTiger with cache_resident=true
SERVER-18014 [594] Dropping a collection can block creating a new collection for an extended time under WiredTiger
SERVER-17907 [595] B-tree eviction blocks access to collection for extended period under WiredTiger
SERVER-17892 [596] Explicitly turn checksum on for all collections/indexes in WiredTiger by default
Indexing
SERVER-18087 [597] index_retry.js and index_no_retry.js not checking for presence of progress field in currentOp() result
SERVER-17882 [598] Update with key too large to index crashes WiredTiger/RocksDB secondary
583 https://jira.mongodb.org/browse/SERVER-18211
584 https://jira.mongodb.org/browse/SERVER-17273
585 https://jira.mongodb.org/browse/SERVER-17812
586 https://jira.mongodb.org/browse/SERVER-17749
587 https://jira.mongodb.org/browse/SERVER-16987
588 https://jira.mongodb.org/browse/SERVER-18211
589 https://jira.mongodb.org/browse/SERVER-18111
590 https://jira.mongodb.org/browse/SERVER-17939
591 https://jira.mongodb.org/browse/SERVER-17745
592 https://jira.mongodb.org/browse/SERVER-18205
593 https://jira.mongodb.org/browse/SERVER-18192
594 https://jira.mongodb.org/browse/SERVER-18014
595 https://jira.mongodb.org/browse/SERVER-17907
596 https://jira.mongodb.org/browse/SERVER-17892
597 https://jira.mongodb.org/browse/SERVER-18087
598 https://jira.mongodb.org/browse/SERVER-17882
Write Ops
SERVER-18111 [599] mongod allows user inserts into system.profile collection
Networking
SERVER-17832 [600] Memory leak when mongod configured with SSL required and handle insecure connection
SERVER-17591 [601] Add SSL flag to select supported protocols
SERVER-16073 [602] Allow disabling SSL Ciphers via hidden flag: sslCipherConfig
Concurrency
SERVER-18304 [603] duplicates on FindAndModify with remove option
SERVER-16636 [604] Deadlock detection should check cycles for stability or should be disabled
Geo
SERVER-17835 [605] Aggregation geoNear deprecated uniqueDocs warning
SERVER-9220 [606] allow more than two values in the coordinate-array when using 2dsphere index
Aggregation Framework
SERVER-17835 [607] Aggregation geoNear deprecated uniqueDocs warning
MapReduce
SERVER-17889 [608] Using eval command to run mapReduce with non-inline out option triggers fatal assertion failure
Admin
SERVER-18290 [609] Adding a read role for a user doesn't seem to propagate to secondary until restart
SERVER-18169 [610] Regression: Auth enabled arbiter cannot be shutdown using command
SERVER-17820 [611] Windows service stop can lead to mongod abrupt termination due to long shutdown time
599 https://jira.mongodb.org/browse/SERVER-18111
600 https://jira.mongodb.org/browse/SERVER-17832
601 https://jira.mongodb.org/browse/SERVER-17591
602 https://jira.mongodb.org/browse/SERVER-16073
603 https://jira.mongodb.org/browse/SERVER-18304
604 https://jira.mongodb.org/browse/SERVER-16636
605 https://jira.mongodb.org/browse/SERVER-17835
606 https://jira.mongodb.org/browse/SERVER-9220
607 https://jira.mongodb.org/browse/SERVER-17835
608 https://jira.mongodb.org/browse/SERVER-17889
609 https://jira.mongodb.org/browse/SERVER-18290
610 https://jira.mongodb.org/browse/SERVER-18169
611 https://jira.mongodb.org/browse/SERVER-17820
JavaScript
SERVER-17453 [620] warn that db.eval() / eval command is deprecated
Shell
SERVER-17951 [621] db.currentOp() fails with read preference set
SERVER-17273 [622] Add support for secondaryCatchupPeriodSecs to rs.stepdown shell helper
SERVER-16987 [623] sh.getRecentMigrations shows aborted migration as success
Testing
SERVER-18302 [624] remove test buildlogger instance
SERVER-18262 [625] setup_multiversion_mongodb should retry links download on timeouts
SERVER-18239 [626] dumpauth.js uses ambiguous --db/--collection args
SERVER-18229 [627] Smoke.py with PyMongo 3.0.1 fails to run certain tests
SERVER-18073 [628] Fix smoke.py to work with pymongo 3.0
SERVER-17998 [629] Ignore socket exceptions in initial_sync_unsupported_auth_schema.js test
SERVER-18293 [630] ASAN tests should run on larger instance size
612 https://jira.mongodb.org/browse/SERVER-18344
613 https://jira.mongodb.org/browse/SERVER-18299
614 https://jira.mongodb.org/browse/SERVER-18082
615 https://jira.mongodb.org/browse/SERVER-17730
616 https://jira.mongodb.org/browse/SERVER-17694
617 https://jira.mongodb.org/browse/SERVER-17465
618 https://jira.mongodb.org/browse/SERVER-17961
619 https://jira.mongodb.org/browse/SERVER-17780
620 https://jira.mongodb.org/browse/SERVER-17453
621 https://jira.mongodb.org/browse/SERVER-17951
622 https://jira.mongodb.org/browse/SERVER-17273
623 https://jira.mongodb.org/browse/SERVER-16987
624 https://jira.mongodb.org/browse/SERVER-18302
625 https://jira.mongodb.org/browse/SERVER-18262
626 https://jira.mongodb.org/browse/SERVER-18239
627 https://jira.mongodb.org/browse/SERVER-18229
628 https://jira.mongodb.org/browse/SERVER-18073
629 https://jira.mongodb.org/browse/SERVER-17998
630 https://jira.mongodb.org/browse/SERVER-18293
3.0.2 Changelog
Security
SERVER-17719 [632] mongo Shell crashes if -p is missing and user matches
SERVER-17705 [633] Fix credentials field inconsistency in HTTP interface
SERVER-17671 [634] Refuse to complete initial sync from nodes with 2.4-style auth data
SERVER-17669 [635] Remove auth prompt in webserver when auth is not enabled
SERVER-17647 [636] Compute BinData length in v8
SERVER-17529 [637] Can't list collections when mongos is running 3.0 and config servers are running 2.6 and auth is on
Replication
SERVER-17677 [641] Replica Set member backtraces sometimes when removed from replica set
SERVER-17672 [642] serverStatus command with {oplog: 1} option can trigger segmentation fault in mongod
SERVER-17822 [643] OpDebug::writeConflicts should be a 64-bit type
WiredTiger
SERVER-17713 [646] WiredTiger using zlib compression can create invalid compressed stream
631 https://jira.mongodb.org/browse/SERVER-17761
632 https://jira.mongodb.org/browse/SERVER-17719
633 https://jira.mongodb.org/browse/SERVER-17705
634 https://jira.mongodb.org/browse/SERVER-17671
635 https://jira.mongodb.org/browse/SERVER-17669
636 https://jira.mongodb.org/browse/SERVER-17647
637 https://jira.mongodb.org/browse/SERVER-17529
638 https://jira.mongodb.org/browse/SERVER-8188
639 https://jira.mongodb.org/browse/SERVER-17469
640 https://jira.mongodb.org/browse/SERVER-17642
641 https://jira.mongodb.org/browse/SERVER-17677
642 https://jira.mongodb.org/browse/SERVER-17672
643 https://jira.mongodb.org/browse/SERVER-17822
644 https://jira.mongodb.org/browse/SERVER-17805
645 https://jira.mongodb.org/browse/SERVER-17613
646 https://jira.mongodb.org/browse/SERVER-17713
MMAPv1
SERVER-17616 [655] Removing or inserting documents with large indexed arrays consumes excessive memory
SERVER-17313 [656] Segfault in BtreeLogic::_insert when inserting into previously-dropped namespace
HTTP Console
SERVER-17729 [658] Cannot start mongod httpinterface: sockets higher than 1023 not supported
SERVER-17705 [659] Fix credentials field inconsistency in HTTP interface
SERVER-17669 [660] Remove auth prompt in webserver when auth is not enabled
Admin
SERVER-17570 [661] MongoDB 3.0 NT Service shutdown race condition with db.serverShutdown()
SERVER-17699 [662] locks section empty in diagnostic log and profiler output for some operations
SERVER-17337 [663] RPM Init script breaks with quotes in yaml config file
SERVER-16731 [664] Remove unused DBPATH init script variable
647 https://jira.mongodb.org/browse/SERVER-17642
648 https://jira.mongodb.org/browse/SERVER-17587
649 https://jira.mongodb.org/browse/SERVER-17562
650 https://jira.mongodb.org/browse/SERVER-17551
651 https://jira.mongodb.org/browse/SERVER-17532
652 https://jira.mongodb.org/browse/SERVER-17471
653 https://jira.mongodb.org/browse/SERVER-17382
654 https://jira.mongodb.org/browse/SERVER-16804
655 https://jira.mongodb.org/browse/SERVER-17616
656 https://jira.mongodb.org/browse/SERVER-17313
657 https://jira.mongodb.org/browse/SERVER-17706
658 https://jira.mongodb.org/browse/SERVER-17729
659 https://jira.mongodb.org/browse/SERVER-17705
660 https://jira.mongodb.org/browse/SERVER-17669
661 https://jira.mongodb.org/browse/SERVER-17570
662 https://jira.mongodb.org/browse/SERVER-17699
663 https://jira.mongodb.org/browse/SERVER-17337
664 https://jira.mongodb.org/browse/SERVER-16731
Networking
SERVER-17652 [665] Cannot start mongod due to sockets higher than 1023 not being supported
Testing
SERVER-17826 [666] Ignore ismaster exceptions in initial_sync_unsupported_auth_schema.js test
SERVER-17808 [667] Ensure availability in initial_sync_unsupported_auth_schema.js test
SERVER-17433 [668] ASAN leak in small oplog suite write_result.js
3.0.1 Changelog
Security
SERVER-17507 [669] MongoDB3 enterprise AuditLog
SERVER-17379 [670] Change "or" to "and" in webserver localhost exception check
SERVER-16944 [671] dbAdminAnyDatabase should have full parity with dbAdmin for a given database
SERVER-16849 [672] On mongos we always invalidate the user cache once, even if no user definitions are changing
SERVER-16452 [673] Failed login attempts should log source IP address
Querying
SERVER-17395 [674] Add FSM tests to stress yielding
SERVER-17387 [675] invalid projection for findAndModify triggers fassert() failure
SERVER-14723 [676] Crash during query planning for geoNear with multiple 2dsphere indices
SERVER-17486 [677] Crash when parsing invalid polygon coordinates
Replication
SERVER-17515 [678] copyDatabase fails to replicate indexes to secondary
SERVER-17499 [679] Using eval command to run getMore on aggregation cursor trips fatal assertion
SERVER-17487 [680] cloner dropDups removes _id entries belonging to other records
SERVER-17302 [681] consider blacklist in shouldChangeSyncSource
665 https://jira.mongodb.org/browse/SERVER-17652
666 https://jira.mongodb.org/browse/SERVER-17826
667 https://jira.mongodb.org/browse/SERVER-17808
668 https://jira.mongodb.org/browse/SERVER-17433
669 https://jira.mongodb.org/browse/SERVER-17507
670 https://jira.mongodb.org/browse/SERVER-17379
671 https://jira.mongodb.org/browse/SERVER-16944
672 https://jira.mongodb.org/browse/SERVER-16849
673 https://jira.mongodb.org/browse/SERVER-16452
674 https://jira.mongodb.org/browse/SERVER-17395
675 https://jira.mongodb.org/browse/SERVER-17387
676 https://jira.mongodb.org/browse/SERVER-14723
677 https://jira.mongodb.org/browse/SERVER-17486
678 https://jira.mongodb.org/browse/SERVER-17515
679 https://jira.mongodb.org/browse/SERVER-17499
680 https://jira.mongodb.org/browse/SERVER-17487
681 https://jira.mongodb.org/browse/SERVER-17302
Sharding
SERVER-17398 [682] Deadlock in MigrateStatus::startCommit
SERVER-17300 [683] Balancer tries to create config.tags index multiple times
SERVER-16849 [684] On mongos we always invalidate the user cache once, even if no user definitions are changing
SERVER-5004 [685] balancer should check for stopped between chunk moves in current round
Indexing
SERVER-17521 [686] improve createIndex validation of empty name
SERVER-17436 [687] MultiIndexBlock may access deleted collection after recovering from yield
Aggregation Framework
SERVER-17224 [688] Aggregation pipeline with 64MB document can terminate server
Write Ops
SERVER-17489 [689] in bulk ops, only mark last operation with commit=synchronous
SERVER-17276 [690] WriteConflictException retry loops needed for collection creation on upsert
Concurrency
SERVER-17501 [691] Increase journalling capacity limits
SERVER-17416 [692] Deadlock between MMAP V1 journal lock and oplog collection lock
SERVER-17395 [693] Add FSM tests to stress yielding
Storage
SERVER-17515 [694] copyDatabase fails to replicate indexes to secondary
SERVER-17436 [695] MultiIndexBlock may access deleted collection after recovering from yield
SERVER-17416 [696] Deadlock between MMAP V1 journal lock and oplog collection lock
SERVER-17381 [697] Rename rocksExperiment to RocksDB
SERVER-17369 [698] [Rocks] Fix the calculation of nextPrefix
682 https://jira.mongodb.org/browse/SERVER-17398
683 https://jira.mongodb.org/browse/SERVER-17300
684 https://jira.mongodb.org/browse/SERVER-16849
685 https://jira.mongodb.org/browse/SERVER-5004
686 https://jira.mongodb.org/browse/SERVER-17521
687 https://jira.mongodb.org/browse/SERVER-17436
688 https://jira.mongodb.org/browse/SERVER-17224
689 https://jira.mongodb.org/browse/SERVER-17489
690 https://jira.mongodb.org/browse/SERVER-17276
691 https://jira.mongodb.org/browse/SERVER-17501
692 https://jira.mongodb.org/browse/SERVER-17416
693 https://jira.mongodb.org/browse/SERVER-17395
694 https://jira.mongodb.org/browse/SERVER-17515
695 https://jira.mongodb.org/browse/SERVER-17436
696 https://jira.mongodb.org/browse/SERVER-17416
697 https://jira.mongodb.org/browse/SERVER-17381
698 https://jira.mongodb.org/browse/SERVER-17369
SERVER-17345 [699] WiredTiger -> session.truncate: the start cursor position is after the stop cursor position
SERVER-17331 [700] RocksDB configuring and monitoring
SERVER-17323 [701] MMAPV1Journal lock counts are changing during WT run
SERVER-17319 [702] invariant at shutdown rc9, rc10, rc11 with wiredTiger
SERVER-17293 [703] Server crash setting wiredTigerEngineRuntimeConfig:eviction=(threads_max=8)
WiredTiger
SERVER-17510 [704] Didn't find RecordId in WiredTigerRecordStore on collections after an idle period
SERVER-17506 [705] Race between inserts and checkpoints can lose records under WiredTiger
SERVER-17487 [706] cloner dropDups removes _id entries belonging to other records
SERVER-17481 [707] WiredTigerRecordStore::validate should call WT_SESSION::verify
SERVER-17451 [708] WiredTiger unable to start if crash leaves 0-length journal file
SERVER-17378 [709] WiredTiger's compact code can return Operation timed out error (invariant failure)
SERVER-17345 [710] WiredTiger -> session.truncate: the start cursor position is after the stop cursor position
SERVER-17319 [711] invariant at shutdown rc9, rc10, rc11 with wiredTiger
MMAPv1
SERVER-17501 [712] Increase journalling capacity limits
SERVER-17416 [713] Deadlock between MMAP V1 journal lock and oplog collection lock
SERVER-17388 [714] Invariant failure in MMAPv1 when disk full
RocksDB
SERVER-17381 [715] Rename rocksExperiment to RocksDB
SERVER-17369 [716] [Rocks] Fix the calculation of nextPrefix
SERVER-17331 [717] RocksDB configuring and monitoring
699 https://jira.mongodb.org/browse/SERVER-17345
700 https://jira.mongodb.org/browse/SERVER-17331
701 https://jira.mongodb.org/browse/SERVER-17323
702 https://jira.mongodb.org/browse/SERVER-17319
703 https://jira.mongodb.org/browse/SERVER-17293
704 https://jira.mongodb.org/browse/SERVER-17510
705 https://jira.mongodb.org/browse/SERVER-17506
706 https://jira.mongodb.org/browse/SERVER-17487
707 https://jira.mongodb.org/browse/SERVER-17481
708 https://jira.mongodb.org/browse/SERVER-17451
709 https://jira.mongodb.org/browse/SERVER-17378
710 https://jira.mongodb.org/browse/SERVER-17345
711 https://jira.mongodb.org/browse/SERVER-17319
712 https://jira.mongodb.org/browse/SERVER-17501
713 https://jira.mongodb.org/browse/SERVER-17416
714 https://jira.mongodb.org/browse/SERVER-17388
715 https://jira.mongodb.org/browse/SERVER-17381
716 https://jira.mongodb.org/browse/SERVER-17369
717 https://jira.mongodb.org/browse/SERVER-17331
Platform
SERVER-17252 [727] Upgrade PCRE Version from 8.30 to Latest
SERVER-14166 [728] Semantics of the osx-version-min flag should be improved
Internal Code
SERVER-17338 [729] NULL pointer crash when running copydb against stepped-down 2.6 primary
Testing
SERVER-17443 [730] get_replication_info_helper.js should assert.soon rather than assert for log messages
SERVER-17442 [731] increase tolerance for shutdown timeout in stepdown.js to fix windows build break
SERVER-17395 [732] Add FSM tests to stress yielding
Fixed issues where read preference of secondaryPreferred (page 721) can end up using unversioned
connections: SERVER-18671 [733]
718 https://jira.mongodb.org/browse/SERVER-17226
719 https://jira.mongodb.org/browse/SERVER-17405
720 https://jira.mongodb.org/browse/SERVER-17347
721 https://jira.mongodb.org/browse/SERVER-17484
722 https://jira.mongodb.org/browse/SERVER-17463
723 https://jira.mongodb.org/browse/SERVER-17460
724 https://jira.mongodb.org/browse/SERVER-14166
725 https://jira.mongodb.org/browse/SERVER-17517
726 https://jira.mongodb.org/browse/SERVER-16452
727 https://jira.mongodb.org/browse/SERVER-17252
728 https://jira.mongodb.org/browse/SERVER-14166
729 https://jira.mongodb.org/browse/SERVER-17338
730 https://jira.mongodb.org/browse/SERVER-17443
731 https://jira.mongodb.org/browse/SERVER-17442
732 https://jira.mongodb.org/browse/SERVER-17395
733 https://jira.mongodb.org/browse/SERVER-18671
Fixed issue with MMAPv1 journaling where the last sequence number file (lsn file) may be ahead of what
is synced to the data files: SERVER-22261 [734].
Fixed issue where a data size change for oplog deletes can overflow 32-bit int: SERVER-22634 [735].
Fixed issue with high fragmentation on WiredTiger databases under write workloads: SERVER-22898 [736].
All issues closed in 3.0.10 [737]
Fixed issue where queries which specify sort and batch size can return results out of order if documents are
concurrently updated. SERVER-19996 [738]
Fixed performance issue where large amounts of create and drop collections can cause listDatabases to
be slow under WiredTiger. SERVER-20961 [739]
Modified the authentication failure message to include the client IP address. SERVER-22054 [740]
All issues closed in 3.0.9 [741]
Fixed issue where findAndModify on mongos can upsert to the wrong shard. SERVER-20407 [742].
Fixed WiredTiger commit visibility issue which caused document not found. SERVER-21275 [743].
Fixed issue where the oplog can grow to 3x configured size. SERVER-21553 [744]
All issues closed in 3.0.8 [745]
Fix missed writes with concurrent inserts during chunk migration from shards with WiredTiger primaries:
SERVER-18822 [769]
Resolve write conflicts with multi-update updates with upsert=true with the WiredTiger storage engine:
SERVER-18213 [770]
Fix case where secondary reads could block replication: SERVER-18190 [771]
Improve performance on Windows with WiredTiger and documents larger than 16kb: SERVER-18079 [772]
Fix issue where WiredTiger data files are not correctly recovered following unexpected system restarts:
SERVER-18316 [773]
754 https://jira.mongodb.org/issues/?jql=project%20in%20(SERVER%2C%20TOOLS)%20AND%20fixVersion%20%3D%203.0.7%20AND%20resolution%20%3D%2
755 https://jira.mongodb.org/browse/SERVER-19751
756 https://jira.mongodb.org/browse/SERVER-19673
757 https://jira.mongodb.org/browse/SERVER-19573
758 https://jira.mongodb.org/browse/SERVER-19538
759 https://jira.mongodb.org/browse/SERVER-19464
760 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%223.0.6%22%20AND%20project%20%3D%20SERVER%20AND%20resolution%20%3D%20Fixed
761 https://jira.mongodb.org/browse/SERVER-19178
762 https://jira.mongodb.org/browse/SERVER-18875
763 https://jira.mongodb.org/browse/SERVER-19513
764 https://jira.mongodb.org/browse/SERVER-19189
765 https://jira.mongodb.org/browse/SERVER-18829
766 https://jira.mongodb.org/browse/SERVER-17836
767 https://jira.mongodb.org/browse/SERVER-18926
768 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%223.0.5%22%20AND%20project%20%3D%20SERVER%20AND%20resolution%20%3D%20Fixed
769 https://jira.mongodb.org/browse/SERVER-18822
770 https://jira.mongodb.org/browse/SERVER-18213
771 https://jira.mongodb.org/browse/SERVER-18190
772 https://jira.mongodb.org/browse/SERVER-18079
773 https://jira.mongodb.org/browse/SERVER-18316
Fixed race condition in WiredTiger between inserts and checkpoints that could result in lost records:
SERVER-17506 [786].
Resolved issue in WiredTiger's capped collections implementation that caused a server crash:
SERVER-17345 [787].
Fixed issue in initial sync with duplicate _id entries: SERVER-17487 [788].
Fixed deadlock condition in MMAPv1 between the journal lock and the oplog collection lock:
SERVER-17416 [789].
All issues closed in 3.0.1 [790]
774 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%223.0.4%22%20AND%20project%20%3D%20SERVER%20AND%20resolution%20%3D%20Fixed
775 https://jira.mongodb.org/browse/SERVER-17453
776 https://jira.mongodb.org/browse/SERVER-17802
777 https://jira.mongodb.org/browse/SERVER-17882
778 https://jira.mongodb.org/browse/SERVER-17889
779 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%223.0.3%22%20AND%20project%20%3D%20SERVER%20AND%20resolution%20%3D%20Fixed
780 https://jira.mongodb.org/browse/SERVER-17469
781 https://jira.mongodb.org/browse/SERVER-17652
782 https://jira.mongodb.org/browse/SERVER-17729
783 https://jira.mongodb.org/browse/SERVER-17713
784 https://jira.mongodb.org/browse/SERVER-17616
785 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%223.0.2%22%20AND%20project%20%3D%20SERVER%20AND%20resolution%20%3D%20Fixed
786 https://jira.mongodb.org/browse/SERVER-17506
787 https://jira.mongodb.org/browse/SERVER-17345
788 https://jira.mongodb.org/browse/SERVER-17487
789 https://jira.mongodb.org/browse/SERVER-17416
790 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%223.0.1%22%20AND%20project%20%3D%20SERVER%20AND%20resolution%20%3D%20Fixed
Major Changes
MongoDB 3.0 introduces a pluggable storage engine API that allows third parties to develop storage engines for
MongoDB.
WiredTiger
MongoDB 3.0 introduces support for the WiredTiger [791] storage engine. With the support for WiredTiger, MongoDB
now supports two storage engines:
MMAPv1, the storage engine available in previous versions of MongoDB and the default storage engine for
MongoDB 3.0, and
WiredTiger [792], available only in the 64-bit versions of MongoDB 3.0.
WiredTiger Usage
WiredTiger is an alternative to the default MMAPv1 storage engine. WiredTiger supports all MongoDB
features, including operations that report on server, database, and collection statistics. Switching to WiredTiger,
however, requires a change to the on-disk storage format (page 938). For instructions on changing the storage engine
to WiredTiger, see the appropriate sections in the Upgrade MongoDB to 3.0 (page 945) documentation.
MongoDB 3.0 replica sets and sharded clusters can have members with different storage engines; however,
performance can vary according to workload. For details, see the appropriate sections in the Upgrade MongoDB to 3.0
(page 945) documentation.
The WiredTiger storage engine requires the latest official MongoDB drivers. For more information, see WiredTiger
and Driver Version Compatibility (page 938).
See also:
Support for touch Command (page 939), WiredTiger Storage Engine (page 587) documentation
WiredTiger Configuration
To configure the behavior and properties of the WiredTiger storage engine, see
storage.wiredTiger configuration options. You can set WiredTiger options on the command line.
See also:
WiredTiger Storage Engine (page 587)
WiredTiger Concurrency and Compression
The 3.0 WiredTiger storage engine provides document-level locking and compression.
By default, WiredTiger compresses collection data using the snappy compression library. WiredTiger uses prefix
compression on all indexes by default.
See also:
WiredTiger (page 216) section in the Production Notes (page 214), the blog post New Compression Options in
MongoDB 3.0 [793]
791 http://wiredtiger.com
792 http://wiredtiger.com
793 https://www.mongodb.com/blog/post/new-compression-options-mongodb-30?jmp=docs
MMAPv1 Improvements
MMAPv1 Concurrency Improvement
In version 3.0, the MMAPv1 storage engine adds support for collection-level locking.
MMAPv1 Configuration Changes
To support multiple storage engines, some configuration settings for MMAPv1
have changed. See Configuration File Options Changes (page 938).
MMAPv1 Record Allocation Behavior Changes
MongoDB 3.0 no longer implements dynamic record allocation
and deprecates paddingFactor. The default allocation strategy for collections in instances that use MMAPv1
is power of 2 allocation (page 596), which has been improved to better handle large document sizes. In 3.0,
the usePowerOf2Sizes flag is ignored, so the power of 2 strategy is used for all collections that do not have
the noPadding flag set.
For collections with workloads that consist only of inserts or in-place updates (such as incrementing counters), you
can disable the power of 2 strategy. To disable the power of 2 strategy for a collection, use the collMod command
with the noPadding flag or the db.createCollection() method with the noPadding option.
Warning: Do not set noPadding if the workload includes removes or any updates that may cause documents
to grow. For more information, see No Padding Allocation Strategy (page 597).
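The difference between the two allocation strategies can be sketched with a simplified model. The function names below are hypothetical, and the rounding rules are an assumption not stated in this section: sizes up to 2 MB round up to the next power of 2 (with a 32-byte minimum), and larger sizes round up to the next multiple of 2 MB. Record headers and other storage overhead are ignored.

```javascript
// Simplified model of MMAPv1 record sizing; not the server's implementation.
// Assumption: sizes up to 2 MB round up to the next power of 2 (minimum 32
// bytes); larger sizes round up to the next multiple of 2 MB.
var TWO_MB = 2 * 1024 * 1024;

function powerOf2RecordSize(requiredBytes) {
  if (requiredBytes > TWO_MB) {
    // Large documents: next multiple of 2 MB.
    return Math.ceil(requiredBytes / TWO_MB) * TWO_MB;
  }
  var size = 32;
  while (size < requiredBytes) {
    size *= 2;
  }
  return size;
}

// With noPadding, the record is sized to the data and leaves no room to grow.
function noPaddingRecordSize(requiredBytes) {
  return requiredBytes;
}
```

Under this model a 100-byte document occupies a 128-byte record with the power of 2 strategy, which leaves 28 bytes for growth. With noPadding the same document has no spare room, which is why that strategy suits only insert-only or in-place update workloads.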
When low on disk space, MongoDB 3.0 no longer errors on all writes but only when the required disk allocation fails.
As such, MongoDB now allows in-place updates and removes when low on disk space.
See also:
Dynamic Record Allocation (page 939)
Replica Sets
In MongoDB 3.0, replica sets can have up to 50 members. [794] The following drivers support the larger replica sets:
C# (.NET) Driver 1.10
Java Driver 2.13
Python Driver (PyMongo) 3.0
Ruby Driver 2.0
Node.JS Driver 2.0
The C, C++, Perl, and PHP drivers, as well as the earlier versions of the Ruby, Python, and Node.JS drivers, discover
and monitor replica set members serially, and thus are not suitable for use with large replica sets.
The process that a primary member of a replica set uses to step down has the following changes:
Before stepping down, replSetStepDown will attempt to terminate long running user operations that would
block the primary from stepping down, such as an index build, a write operation or a map-reduce job.
794 The maximum number of voting members remains at 7.
To help prevent rollbacks, replSetStepDown will wait for an electable secondary to catch up to the state
of the primary before stepping down. Previously, a primary would wait for a secondary to catch up to within 10
seconds of the primary (i.e. a secondary with a replication lag of 10 seconds or less) before stepping down.
replSetStepDown now allows users to specify a secondaryCatchUpPeriodSecs parameter to
specify how long the primary should wait for a secondary to catch up before stepping down.
Initial sync builds indexes more efficiently for each collection and applies oplog entries in batches using threads.
Definition of w: majority (page 141) write concern changed to mean majority of voting nodes.
Stronger restrictions on Replica Set Configuration (page 709). For details, see Replica Set Configuration
Validation (page 939).
For pre-existing collections on secondary members, MongoDB 3.0 no longer automatically builds missing _id
indexes.
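The secondaryCatchUpPeriodSecs parameter of replSetStepDown, described above, can be illustrated with a small helper that builds the command document. The helper name is hypothetical. The check that the step-down period exceeds the catch-up period is an assumption about the command's validation and is not stated in this section.

```javascript
// Builds a replSetStepDown command document (hypothetical helper).
// Assumption: the step-down period must be greater than
// secondaryCatchUpPeriodSecs.
function buildStepDownCommand(stepDownSecs, secondaryCatchUpPeriodSecs) {
  if (typeof secondaryCatchUpPeriodSecs !== "number" || secondaryCatchUpPeriodSecs < 0) {
    throw new Error("secondaryCatchUpPeriodSecs must be a non-negative number");
  }
  if (typeof stepDownSecs !== "number" || stepDownSecs <= secondaryCatchUpPeriodSecs) {
    throw new Error("stepDownSecs must be greater than secondaryCatchUpPeriodSecs");
  }
  // The command name must be the first key of the command document.
  return {
    replSetStepDown: stepDownSecs,
    secondaryCatchUpPeriodSecs: secondaryCatchUpPeriodSecs
  };
}

// In the mongo shell, connected to the primary, the document could be sent as:
//   db.adminCommand(buildStepDownCommand(60, 10))
```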
See also:
Replication Changes (page 939) in Compatibility Changes in MongoDB 3.0 (page 938)
Sharded Clusters
Security Improvements
Improvements
MongoDB 3.0 includes a new query introspection system that provides an improved output format and a finer-grained
introspection into both query plan and query execution.
For details, see the new db.collection.explain() method and the new explain command as well as the
updated cursor.explain() method.
For information on the format of the new output, see https://docs.mongodb.org/manual/reference/explain-result
Enhanced Logging
To improve usability of the log messages for diagnosis, MongoDB categorizes some log messages under specific
components, or operations, and provides the ability to set the verbosity level for these components. For information,
see https://docs.mongodb.org/manual/reference/log-messages.
All MongoDB tools except for mongosniff and mongoperf are now written in Go and maintained as a separate
project.
New options for parallelized mongodump and mongorestore. You can control the number of collections
that mongorestore will restore at a time with the --numParallelCollections option.
New options --excludeCollection and --excludeCollectionsWithPrefix for mongodump to
exclude collections.
mongorestore can now accept BSON data input from standard input in addition to reading BSON data from
file.
mongostat and mongotop can now return output in JSON format with the --json option.
Added configurable write concern to mongoimport, mongorestore, and mongofiles. Use the
--writeConcern option. The default writeConcern has been changed to w:majority.
mongofiles now allows you to configure the GridFS prefix with the --prefix option so that you can use
custom namespaces and store multiple GridFS namespaces in a single database.
See also:
MongoDB Tools Changes (page 940)
Indexes
Background index builds will no longer automatically interrupt if dropDatabase, drop, dropIndexes
operations occur for the database or collection affected by the index builds. The dropDatabase, drop,
and dropIndexes commands will still fail with the error message a background operation is
currently running, as in 2.6.
If you specify multiple indexes to the createIndexes command,
the command only scans the collection once, and
if at least one index is to be built in the foreground, the operation will build all the specified indexes in the
foreground.
For sharded collections, indexes can now cover queries (page 70) that execute against the mongos if the index
includes the shard key.
See also:
Indexes (page 942) in Compatibility Changes in MongoDB 3.0 (page 938)
Query Enhancements
Most non-Enterprise MongoDB distributions now include support for TLS/SSL. Previously, only MongoDB
Enterprise distributions came with TLS/SSL support included; for non-Enterprise distributions, you had to build MongoDB
locally with the --ssl flag (i.e. scons --ssl).
32-bit MongoDB builds are available for testing, but are not for production use. 32-bit MongoDB builds do not include
the WiredTiger storage engine.
MongoDB builds for Solaris do not support the WiredTiger storage engine.
MongoDB builds are available for Windows Server 2003 and Windows Vista (as 64-bit Legacy), but the minimum
officially supported Windows version is Windows Server 2008.
See also:
Platform Support (page 944), Deprecation of 32-bit Versions (page 5)
Package Repositories
Non-Enterprise MongoDB Linux packages for 3.0 and later are in a new repository. Follow the appropriate Linux
installation instructions (page 6) to install the 3.0 packages from the new location.
Auditing
Auditing (page 340) in MongoDB Enterprise can filter on any field in the audit message (page 434), including the fields
returned in the param (page 435) document. This enhancement, along with the auditAuthorizationSuccess
parameter, enables auditing to filter on CRUD operations. However, enabling auditAuthorizationSuccess to
audit all authorization successes degrades performance more than auditing only the authorization failures.
Additional Information
Compatibility Changes in MongoDB 3.0
On this page
Storage Engine (page 938)
Replication Changes (page 939)
MongoDB Tools Changes (page 940)
Sharded Cluster Setting (page 940)
Security Changes (page 940)
Indexes (page 942)
Driver Compatibility Changes (page 942)
General Compatibility Changes (page 943)
The following 3.0 changes can affect the compatibility with older versions of MongoDB. See Release Notes for
MongoDB 3.0 (page 904) for the full list of the 3.0 changes.
Storage Engine
Configuration File Options Changes
With the introduction of additional storage engines in 3.0, some
configuration file options have changed:
Previous Setting                    New Setting
storage.journal.commitIntervalMs    storage.mmapv1.journal.commitIntervalMs
storage.journal.debugFlags          storage.mmapv1.journal.debugFlags
storage.nsSize                      storage.mmapv1.nsSize
storage.preallocDataFiles           storage.mmapv1.preallocDataFiles
storage.quota.enforced              storage.mmapv1.quota.enforced
storage.quota.maxFilesPerDB         storage.mmapv1.quota.maxFilesPerDB
storage.smallFiles                  storage.mmapv1.smallFiles
3.0 mongod instances are backward compatible with existing configuration files, but will issue warnings if you
attempt to use the old settings.
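When reviewing an existing configuration file, the renames listed above can be applied mechanically. The following sketch encodes the mapping from that list; the function names are hypothetical, and it works on dotted setting names rather than on a parsed configuration file.

```javascript
// Mapping of pre-3.0 setting names to their 3.0 names, taken from the list above.
var RENAMED_SETTINGS = {
  "storage.journal.commitIntervalMs": "storage.mmapv1.journal.commitIntervalMs",
  "storage.journal.debugFlags": "storage.mmapv1.journal.debugFlags",
  "storage.nsSize": "storage.mmapv1.nsSize",
  "storage.preallocDataFiles": "storage.mmapv1.preallocDataFiles",
  "storage.quota.enforced": "storage.mmapv1.quota.enforced",
  "storage.quota.maxFilesPerDB": "storage.mmapv1.quota.maxFilesPerDB",
  "storage.smallFiles": "storage.mmapv1.smallFiles"
};

function isRenamed(name) {
  return Object.prototype.hasOwnProperty.call(RENAMED_SETTINGS, name);
}

// Returns the 3.0 name for a setting, or the name unchanged if it was not renamed.
function renameSetting(name) {
  return isRenamed(name) ? RENAMED_SETTINGS[name] : name;
}

// Returns the settings in `names` that would trigger a warning on a 3.0 mongod.
function findOldSettings(names) {
  return names.filter(isRenamed);
}
```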
Data Files Must Correspond to Configured Storage Engine
The files in the dbPath directory must correspond
to the configured storage engine (i.e. --storageEngine). mongod will not start if dbPath contains data files
created by a storage engine other than the one specified by --storageEngine.
See also:
Change Storage Engine to WiredTiger sections in Upgrade MongoDB to 3.0 (page 945)
WiredTiger and Driver Version Compatibility
For MongoDB 3.0 deployments that use the WiredTiger storage
engine, the following operations return no output when issued in previous versions of the mongo shell or drivers:
db.getCollectionNames()
db.collection.getIndexes()
show collections
show tables
Use the 3.0 mongo shell or the 3.0 compatible version (page 942) of the official drivers when connecting to 3.0
mongod instances that use WiredTiger. The 2.6.8 mongo shell is also compatible with 3.0 mongod instances that
use WiredTiger.
db.fsyncLock() is not Compatible with WiredTiger
With WiredTiger the db.fsyncLock() and
db.fsyncUnlock() operations cannot guarantee that the data files do not change. As a result, do not use these
methods to ensure consistency for the purposes of creating backups.
Support for touch Command
If a storage engine does not support touch, the touch command returns an error.
The MMAPv1 storage engine supports touch.
The WiredTiger storage engine does not support touch.
Dynamic Record Allocation
MongoDB 3.0 no longer supports dynamic record allocation and deprecates paddingFactor.
MongoDB 3.0 deprecates the newCollectionsUsePowerOf2Sizes parameter such that you can no longer use
the parameter to disable the power of 2 sizes allocation for a collection. Instead, use the collMod command with the
noPadding flag or the db.createCollection() method with the noPadding option. Only set noPadding
for collections with workloads that consist only of inserts or in-place updates (such as incrementing counters).
Warning: Only set noPadding to true for collections whose workloads have no update operations that cause
documents to grow, such as for collections with workloads that are insert-only. For more information, see No
Padding Allocation Strategy (page 597).
For more information, see MMAPv1 Record Allocation Behavior Changes (page 934).
Replication Changes
Replica Set Oplog Format Change
MongoDB 3.0 is not compatible with oplog entries generated by versions of
MongoDB before 2.2.1. If you upgrade from one of these versions, you must wait for new oplog entries to overwrite
all old oplog entries generated by one of these versions before upgrading to 3.0.0 or earlier.
Secondaries may abort if they replay a pre-2.6 oplog with an index build operation that would fail on a 2.6 or later
primary.
Replica Set Configuration Validation
MongoDB 3.0 provides a stricter validation of replica set configuration
settings (page 709) and rejects invalid replica set configurations.
Stricter validations include:
Arbiters can only have 1 vote. Previously, arbiters could also have a value of 0 for members[n].votes
(page 713). If an arbiter has any value other than 1 for members[n].votes (page 713), you must fix the
setting.
Non-arbiter members can only have value of 0 or 1 for members[n].votes (page 713). If a non-arbiter
member has any other value for members[n].votes (page 713), you must fix the setting.
_id (page 710) in the Replica Set Configuration (page 709) must specify the same name as that specified by
--replSet or replication.replSetName. Otherwise, you must fix the setting.
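The stricter checks listed above can be sketched as a pre-upgrade check over a replica set configuration document. This is an illustration rather than the server's validation code: the function name is hypothetical, and treating a member without a votes field as having 1 vote is an assumption.

```javascript
// Sketch of the stricter 3.0 checks described above; not the server's code.
// Assumption: a member without an explicit votes field counts as 1 vote.
function validateReplSetConfig(config, replSetName) {
  var errors = [];
  if (config._id !== replSetName) {
    errors.push("_id '" + config._id + "' does not match replSetName '" + replSetName + "'");
  }
  (config.members || []).forEach(function (member, n) {
    var votes = member.votes === undefined ? 1 : member.votes;
    if (member.arbiterOnly) {
      // Arbiters can only have 1 vote.
      if (votes !== 1) {
        errors.push("members[" + n + "]: arbiters must have votes: 1");
      }
    } else if (votes !== 0 && votes !== 1) {
      // Non-arbiter members can only have 0 or 1 votes.
      errors.push("members[" + n + "]: votes must be 0 or 1");
    }
  });
  return errors;
}

// In the mongo shell, the current configuration could be checked with:
//   validateReplSetConfig(rs.conf(), "rs0")   // "rs0" is a placeholder name
```

An empty array means that none of the three rules is violated; each string in a non-empty result names a setting to fix before upgrading.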
Change of w: majority Semantics
A write concern with a w: majority (page 141) value is satisfied when a
majority of the voting members replicates a write operation. In previous versions, majority referred to a majority of all
voting and non-voting members of the set.
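The effect of this change can be worked through for a set with non-voting members. The sketch below uses hypothetical function names and assumes that a member without a votes field has 1 vote.

```javascript
// Assumption: a member without an explicit votes field has 1 vote.
function memberVotes(member) {
  return member.votes === undefined ? 1 : member.votes;
}

// 3.0 semantics: majority of the voting members.
function majorityOfVotingMembers(members) {
  var voting = members.filter(function (m) { return memberVotes(m) > 0; }).length;
  return Math.floor(voting / 2) + 1;
}

// Earlier semantics: majority of all members, voting or not.
function majorityOfAllMembers(members) {
  return Math.floor(members.length / 2) + 1;
}

// Example: a 9-member set with 7 voting and 2 non-voting members.
var exampleMembers = [];
for (var i = 0; i < 9; i++) {
  exampleMembers.push({ _id: i, votes: i < 7 ? 1 : 0 });
}
```

For this example set, w: majority requires acknowledgement from 4 members under the 3.0 definition, compared with 5 under the earlier definition.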
Remove local.slaves Collection
MongoDB 3.0 removes the local.slaves collection that tracked the secondaries'
replication progress. To track the replication progress, use the rs.status() method.
Replica Set State Change
The FATAL replica set state does not exist as of 3.0.0.
HTTP Interface
The HTTP Interface (i.e. net.http.enabled) no longer reports replication data.
Require a Running MongoDB Instance
The 3.0 versions of the MongoDB tools mongodump, mongorestore,
mongoexport, mongoimport, mongofiles, and mongooplog must connect to running MongoDB instances;
these tools can no longer modify the data files directly with --dbpath as in previous versions. Ensure that you start
your mongod instance(s) before using these tools.
Removed Options
Removed --dbpath, --journal, and --filter options for mongodump, mongorestore,
mongoimport, mongoexport, and bsondump.
Removed --locks option for mongotop.
Removed --noobjcheck option for bsondump and mongorestore.
Removed --csv option for mongoexport. Use the new --type option to specify the export format type
(csv or json).
See also:
MongoDB Tools Enhancements (page 936)
Security Changes
MongoDB 2.4 User Model Removed
MongoDB 3.0 completely removes support for the deprecated 2.4 user model.
MongoDB 3.0 will exit with an error message if there is user data with the 2.4 schema, i.e. if authSchema version
is less than 3.
To verify the version of your existing 2.6 schema, query the system.version collection in the admin database:
use admin
db.system.version.find( { _id: "authSchema" })
If you are currently using auth and you have schema version 2 or 3, the query returns the currentVersion of the
existing authSchema.
If you do not currently have any users or you are using authSchema version 1, the query will not return any result.
If your authSchema version is less than 3 or the query does not return any results, see Upgrade User Authorization
Data to 2.6 Format (page 1005) to upgrade the authSchema version before upgrading to MongoDB 3.0.
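The decision described above can be expressed as a small check on the result of the authSchema query. The function name is hypothetical; it takes the query result as an array of documents.

```javascript
// Returns true if the authSchema must be upgraded before moving to 3.0,
// following the rules above: no result, or a currentVersion below 3.
function needsAuthSchemaUpgrade(results) {
  if (results.length === 0) {
    // No users, or authSchema version 1: the query returns nothing.
    return true;
  }
  return results[0].currentVersion < 3;
}

// In the mongo shell, the input could be obtained with:
//   db.getSiblingDB("admin").system.version.find({ _id: "authSchema" }).toArray()
```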
After upgrading MongoDB to 3.0 from 2.6, to use the new SCRAM-SHA-1 challenge-response mechanism if you have
existing user data, you will need to upgrade the authentication schema a second time. This upgrades the MONGODB-CR
user model to SCRAM-SHA-1 user model. See Upgrade to SCRAM-SHA-1 (page 949) for details.
Localhost Exception Changed
In 3.0, the localhost exception changed so that these connections only have access
to create the first user on the admin database. In previous versions, connections that gained access using the localhost
exception had unrestricted access to the MongoDB instance.
See Localhost Exception (page 320) for more information.
db.addUser() Removed
3.0 removes the legacy db.addUser() method. Use db.createUser() and
db.updateUser() instead.
TLS/SSL Certificates Validation
By default, when running in SSL mode, a MongoDB instance will only
start if its certificate (i.e. net.ssl.PEMKeyFile) is valid. You can disable this behavior with the
net.ssl.allowInvalidCertificates setting or the --sslAllowInvalidCertificates command
line option.
To start the mongo shell with --ssl, you must explicitly specify either the --sslCAFile or
--sslAllowInvalidCertificates option at startup. See TLS/SSL Configuration for Clients (page 386) for
more information.
TLS/SSL Certificate Hostname Validation
By default, MongoDB validates the hostnames of hosts attempting
to connect using certificates against the hostnames listed in those certificates. In certain deployment situations
this behavior may be undesirable. It is now possible to disable such hostname validation without disabling
validation of the rest of the certificate information with the net.ssl.allowInvalidHostnames setting or the
--sslAllowInvalidHostnames command line option.
SSLv3 Ciphers Disabled
In light of vulnerabilities in legacy SSL ciphers [795], these ciphers have been explicitly
disabled in MongoDB. No configuration changes are necessary.
mongo Shell Version Compatibility
Versions of the mongo shell before 3.0 are not compatible with 3.0 deployments
of MongoDB that enforce access control. If you have a 3.0 MongoDB deployment that requires access control,
you must use 3.0 versions of the mongo shell.
HTTP Status Interface and REST API Compatibility
Neither the HTTP status interface nor the REST API support
the SCRAM-SHA-1 (page 321) challenge-response user authentication mechanism introduced in version 3.0.
Indexes
Remove dropDups Option
The dropDups option is no longer available for createIndex(), ensureIndex(),
and createIndexes.
Changes to Restart Behavior during Background Indexing
For 3.0 mongod instances, if a background index
build is in progress when the mongod process terminates, the index build restarts as a
foreground index build when the instance restarts. If the index build encounters any errors, such as a duplicate key
error, the mongod will exit with an error.
To start the mongod after a failed index build, use the storage.indexBuildRetry setting or the
--noIndexBuildRetry option to skip the index build on start up.
2d Indexes and Geospatial Near Queries
For $near queries that use a 2d (page 505) index:
MongoDB no longer uses a default limit of 100 documents.
Specifying a batchSize() is no longer analogous to specifying a limit().
For $nearSphere queries that use a 2d (page 505) index, MongoDB no longer uses a default limit of 100 documents.
Driver Compatibility Changes
Each officially supported driver has released a version that includes support for all
new features introduced in MongoDB 3.0. Upgrading to one of these versions is strongly recommended as part of the
upgrade process.
A driver upgrade is necessary in certain scenarios due to changes in functionality:
Use of the SCRAM-SHA-1 authentication method
Use of functionality that calls listIndexes or listCollections
The minimum 3.0-compatible driver versions are:
795 https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2014-3566
findAndModify Return Document
In MongoDB 3.0, when performing an update with findAndModify that
also specifies upsert: true and either the new option is not set or new: false, findAndModify returns
null in the value field if the query does not match any document, regardless of the sort specification.
In previous versions, findAndModify returns an empty document {} in the value field if a sort is specified for
the update, and upsert: true, and the new option is not set or new: false.
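Application code that runs against both 2.6 and 3.0 servers during an upgrade may see either return shape. The following is a minimal sketch, not part of any MongoDB driver, of a check that treats both shapes as "no previous document". The helper name hadPreviousDocument and the sample document are assumptions made for illustration.

```javascript
// Hypothetical helper: not part of any MongoDB driver or shell API.
// Returns true when the `value` field of a findAndModify result holds a
// pre-existing document, and false for both "no match" shapes described
// above: null (MongoDB 3.0) and the empty document {} (earlier versions,
// when a sort was specified).
function hadPreviousDocument(value) {
  if (value === null || value === undefined) {
    return false;
  }
  return Object.keys(value).length > 0;
}

console.log(hadPreviousDocument(null));               // false
console.log(hadPreviousDocument({}));                 // false
console.log(hadPreviousDocument({ _id: 1, qty: 5 })); // true
```

Passing the value field of the command result to this helper keeps the calling code independent of the server version.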
upsert:true with a Dotted _id Query When you execute an update() with upsert: true and the
query matches no existing document, MongoDB will refuse to insert a new document if the query specifies conditions
on the _id field using dot notation (page 189).
This restriction ensures that the order of fields embedded in the _id document is well-defined and not bound to the
order specified in the query.
If you attempt to insert a document in this way, MongoDB will raise an error.
For example, consider an update operation that specifies upsert: true and whose query specifies conditions on the
_id field using dot notation. The update results in an error when MongoDB constructs the document to insert.
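One way to find affected queries before upgrading is to scan query documents for dotted _id keys. The sketch below is a hypothetical pre-flight check, not a MongoDB API: the function name and the sample field names and values are assumptions, and it only inspects top-level keys (it does not descend into operators such as $and).

```javascript
// Hypothetical pre-flight check: returns the top-level query keys that place
// conditions on fields embedded in _id using dot notation, such as "_id.name".
function dottedIdConditions(query) {
  return Object.keys(query).filter(function (key) {
    return key.indexOf("_id.") === 0;
  });
}

// Illustrative query documents; the field names and values are made up.
var dottedQuery = { "_id.name": "Robert Frost", "_id.uid": 0 };
var wholeIdQuery = { _id: { name: "Robert Frost", uid: 0 } };

console.log(dottedIdConditions(dottedQuery).length);  // 2
console.log(dottedIdConditions(wholeIdQuery).length); // 0
```

A non-empty result flags a query that would be refused when combined with upsert: true and no matching document.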
796 https://docs.mongodb.org/ecosystem/drivers/c
797 https://github.com/mongodb/mongo-c-driver/releases
798 https://github.com/mongodb/mongo-cxx-driver
799 https://github.com/mongodb/mongo-cxx-driver/releases
800 https://docs.mongodb.org/ecosystem/drivers/csharp
801 https://github.com/mongodb/mongo-csharp-driver/releases
802 https://docs.mongodb.org/ecosystem/drivers/java
803 https://github.com/mongodb/mongo-java-driver/releases
804 https://docs.mongodb.org/ecosystem/drivers/node-js
805 https://github.com/mongodb/node-mongodb-native/releases
806 https://docs.mongodb.org/ecosystem/drivers/perl
807 http://search.cpan.org/dist/MongoDB/
808 https://docs.mongodb.org/ecosystem/drivers/php
809 http://pecl.php.net/package/mongo
810 https://docs.mongodb.org/ecosystem/drivers/python
811 https://pypi.python.org/pypi/pymongo/
812 https://docs.mongodb.org/ecosystem/drivers/python
813 https://pypi.python.org/pypi/motor/
814 https://docs.mongodb.org/ecosystem/drivers/ruby
815 https://rubygems.org/gems/mongo
816 https://docs.mongodb.org/ecosystem/drivers/scala
817 https://github.com/mongodb/casbah/releases
Deprecate Access to system.indexes and system.namespaces MongoDB 3.0 deprecates direct access
to system.indexes and system.namespaces collections. Use the createIndexes and listIndexes
commands instead. See also WiredTiger and Driver Version Compatibility (page 938).
Collection Name Validation MongoDB 3.0 more consistently enforces the collection naming
restrictions. Ensure your application does not create or depend on invalid collection names.
Platform Support Commercial support is no longer provided for MongoDB on 32-bit platforms (Linux and
Windows). Linux RPM and DEB packages are also no longer available. However, binary archives are still available.
Linux Package Repositories Non-Enterprise MongoDB Linux packages for 3.0 and later are in a new repository.
Follow the appropriate Linux installation instructions (page 6) to install the 3.0 packages from the new location.
Removed/Deprecated Commands The following commands and methods are no longer available in MongoDB
3.0:
closeAllDatabases
getoptime
text
indexStats, db.collection.getIndexStats(), and db.collection.indexStats()
The following commands and methods are deprecated in MongoDB 3.0:
diagLogging
eval, db.eval()
db.collection.copyTo()
In addition, you cannot use the now deprecated eval command or the db.eval() method to invoke mapReduce
or db.collection.mapReduce().
Date and Timestamp Comparison Order MongoDB 3.0 no longer treats the Timestamp (page 196) and the Date
(page 197) data types as equivalent for comparison purposes. Instead, the Timestamp (page 196) data type has a higher
comparison/sort order (i.e. is greater) than the Date (page 197) data type. If your application relies on the equivalent
comparison/sort order of Date and Timestamp objects, modify your application accordingly before upgrading.
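The effect on a mixed sort can be pictured with a toy comparator. This is only a model of the 3.0 rule stated above (any Timestamp orders after any Date); the server compares BSON values, not these plain objects, and the { type, seconds } shape is an assumption made for illustration.

```javascript
// Toy model only: the server compares BSON values, not these plain objects.
// Each value is { type: "date" | "timestamp", seconds: <number> } (hypothetical shape).
var TYPE_RANK_30 = { date: 0, timestamp: 1 }; // 3.0: Timestamp sorts after Date

function compareIn30(a, b) {
  if (a.type !== b.type) {
    return TYPE_RANK_30[a.type] - TYPE_RANK_30[b.type];
  }
  return a.seconds - b.seconds;
}

var values = [
  { type: "timestamp", seconds: 100 },
  { type: "date", seconds: 900 },
  { type: "date", seconds: 50 }
];
var sorted = values.slice().sort(compareIn30);
console.log(sorted.map(function (v) { return v.type + ":" + v.seconds; }).join(", "));
// date:50, date:900, timestamp:100
```

Note that the Timestamp lands last even though its numeric value is smaller than one of the Dates; an application that expected the two types to interleave by value would see a different order.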
Server Status Output Change The serverStatus command and the db.serverStatus() method no
longer return workingSet, indexCounters, and recordStats sections in the output.
Unix Socket Permissions Change Unix domain socket file permissions now default to 0700. To change the
permissions, MongoDB provides the net.unixDomainSocket.filePermissions setting as well as the
--filePermissions option.
cloneCollection The cloneCollection command and the db.cloneCollection() method will now
return an error if the collection already exists, instead of inserting into it.
Some changes in 3.0 can affect compatibility (page 938) and may require user actions. For a detailed list of
compatibility changes, see Compatibility Changes in MongoDB 3.0 (page 938).
Upgrade Process
Upgrade MongoDB to 3.0
On this page
Upgrade Recommendations and Checklists (page 945)
Upgrade MongoDB Processes (page 945)
Upgrade Existing MONGODB-CR Users to Use SCRAM-SHA-1 (page 949)
General Upgrade Procedure (page 949)
In the general case, the upgrade from MongoDB 2.6 to 3.0 is a binary-compatible drop-in upgrade: shut down the
mongod instances and replace them with mongod instances running 3.0. However, before you attempt any upgrade,
familiarize yourself with the content of this document, particularly the procedure for upgrading sharded clusters
(page 947).
If you need guidance on upgrading to 3.0, MongoDB offers consulting818 to help ensure a smooth transition without
interruption to your MongoDB application.
Upgrade Requirements To upgrade an existing MongoDB deployment to 3.0, you must be running 2.6. If you're
running a version of MongoDB before 2.6, you must upgrade to 2.6 before upgrading to 3.0. See Upgrade MongoDB to
2.6 (page 1001) for the procedure to upgrade from 2.4 to 2.6. Once upgraded to MongoDB 2.6, you cannot downgrade
to any version earlier than MongoDB 2.4.
If your existing MongoDB deployment is already running with authentication and authorization, your user data model
authSchema must be at least version 3. To verify the version of your existing authSchema, see MongoDB 2.4
User Model Removed (page 941). To upgrade your authSchema version, see Upgrade User Authorization Data to
2.6 Format (page 1005) for details.
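The version requirement can be expressed as a small check on the authSchema document. This sketch assumes the document has the shape { _id: "authSchema", currentVersion: <number> } and that no document is stored when the deployment has no users or still uses schema version 1; the function name and the returned labels are hypothetical.

```javascript
// Hypothetical helper. `versionDoc` is assumed to be the authSchema document
// from the admin database's system.version collection, or null when the
// query returns no result.
function authSchemaStatus(versionDoc) {
  if (!versionDoc) {
    // Assumed meaning of "no result": no users exist, or the schema is version 1.
    return "no-users-or-version-1";
  }
  return versionDoc.currentVersion >= 3 ? "ok" : "upgrade-required";
}

console.log(authSchemaStatus({ _id: "authSchema", currentVersion: 3 })); // ok
console.log(authSchemaStatus({ _id: "authSchema", currentVersion: 2 })); // upgrade-required
console.log(authSchemaStatus(null));                                     // no-users-or-version-1
```

Only the "ok" result satisfies the requirement for a deployment that already runs with authentication and authorization.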
Preparedness Always test your application in a staging environment before deploying the upgrade to your
production environment.
Some changes in MongoDB 3.0 require manual checks and intervention. Before beginning your upgrade, see the
Compatibility Changes in MongoDB 3.0 (page 938) document to ensure that your applications and deployments are
compatible with MongoDB 3.0. Resolve the incompatibilities in your deployment before starting the upgrade.
Downgrade Limitations Once upgraded to MongoDB 3.0, you cannot downgrade to a version lower than 2.6.8.
If you upgrade to 3.0 and have run authSchemaUpgrade, you cannot downgrade to 2.6 without disabling --auth
or restoring a pre-upgrade backup, as authSchemaUpgrade discards the MONGODB-CR credentials used in 2.6.
See Upgrade Existing MONGODB-CR Users to Use SCRAM-SHA-1 (page 949).
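The 2.6.8 floor is a numeric comparison per version component, not a string comparison (2.6.11 is newer than 2.6.8). The sketch below illustrates that check; the function names are hypothetical and the version strings in the usage lines are examples only.

```javascript
// Compare dotted version strings numerically, so that "2.6.11" > "2.6.8".
// Returns -1, 0, or 1.
function compareVersions(a, b) {
  var pa = a.split(".").map(Number);
  var pb = b.split(".").map(Number);
  for (var i = 0; i < Math.max(pa.length, pb.length); i++) {
    var diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) {
      return diff < 0 ? -1 : 1;
    }
  }
  return 0;
}

// Hypothetical helper: true when `target` is at or above 2.6.8 and below 3.0.
function isSupportedDowngradeTarget(target) {
  return compareVersions(target, "2.6.8") >= 0 && compareVersions(target, "3.0.0") < 0;
}

console.log(isSupportedDowngradeTarget("2.6.8"));  // true
console.log(isSupportedDowngradeTarget("2.6.11")); // true
console.log(isSupportedDowngradeTarget("2.6.7"));  // false
```

The check covers only the version floor; the authSchemaUpgrade restriction described above still applies separately.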
Upgrade Standalone mongod Instance to MongoDB 3.0 The following steps outline the procedure to upgrade a
standalone mongod from version 2.6 to 3.0. To upgrade from version 2.4 to 3.0, upgrade to version 2.6 (page 1001)
first, and then use the following procedure to upgrade from 2.6 to 3.0.
Upgrade Binaries If you installed MongoDB from the MongoDB apt, yum, or zypper repositories, you should
upgrade to 3.0 using your package manager. Follow the appropriate installation instructions (page 6) for your Linux
system. This will involve adding a repository for the new release, then performing the actual upgrade.
Otherwise, you can manually upgrade MongoDB:
Step 1: Download 3.0 binaries. Download binaries of the latest release in the 3.0 series from the MongoDB
Download Page819 . See Install MongoDB (page 5) for more information.
Step 2: Replace 2.6 binaries. Shut down your mongod instance. Replace the existing binary with the 3.0 mongod
binary and restart mongod.
Change Storage Engine for Standalone to WiredTiger To change the storage engine for a standalone mongod
instance to WiredTiger, see Change Standalone to WiredTiger (page 589).
Upgrade a Replica Set to 3.0
Prerequisites
If the oplog contains entries generated by versions of MongoDB that precede version 2.2.1, you must wait for
the entries to be overwritten by later versions before you can upgrade to MongoDB 3.0. For more information,
see Replica Set Oplog Format Change (page 939).
Stricter validation in MongoDB 3.0 (page 939) of replica set configuration may invalidate previously-valid
replica set configuration, preventing replica sets from starting in MongoDB 3.0. For more information, see
Replica Set Configuration Validation (page 939).
All replica set members must be running version 2.6 before you can upgrade them to version 3.0. To upgrade a
replica set from an earlier MongoDB version, upgrade all members of the replica set to version 2.6 (page 1001)
first, and then follow the procedure to upgrade from MongoDB 2.6 to 3.0.
Upgrade Binaries You can upgrade from MongoDB 2.6 to 3.0 using a rolling upgrade to minimize downtime by
upgrading the members individually while the other members are available:
Step 1: Upgrade secondary members of the replica set. Upgrade the secondary members of the set one at a time
by shutting down the mongod and replacing the 2.6 binary with the 3.0 binary. After upgrading a mongod instance,
wait for the member to recover to SECONDARY state before upgrading the next instance. To check the member's state,
issue rs.status() in the mongo shell.
Step 2: Step down the replica set primary. Use rs.stepDown() in the mongo shell to step down the primary
and force the set to failover (page 635). rs.stepDown() expedites the failover procedure and is preferable to
shutting down the primary directly.
819 http://www.mongodb.org/downloads?jmp=docs
Step 3: Upgrade the primary. When rs.status() shows that the primary has stepped down and another member
has assumed PRIMARY state, shut down the previous primary, replace the mongod binary with the 3.0 binary,
and start the new instance.
Replica set failover is not instant and will render the set unavailable to accept writes until the failover process
completes. This may take 30 seconds or more, so schedule the upgrade procedure during a maintenance window.
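The "wait for SECONDARY" check between members can be sketched as a function over the rs.status() output. The sketch relies only on the members array and each member's name and stateStr fields; the function name and the sample host names are hypothetical.

```javascript
// Returns the names of members that are not yet in a steady state.
// `status` has the shape of rs.status() output: { members: [ { name, stateStr } ] }.
function membersStillRecovering(status) {
  var steadyStates = ["PRIMARY", "SECONDARY", "ARBITER"];
  return status.members
    .filter(function (member) {
      return steadyStates.indexOf(member.stateStr) === -1;
    })
    .map(function (member) {
      return member.name;
    });
}

// Sample data with made-up host names, standing in for rs.status().
var sampleStatus = {
  members: [
    { name: "db0.example.net:27017", stateStr: "PRIMARY" },
    { name: "db1.example.net:27017", stateStr: "SECONDARY" },
    { name: "db2.example.net:27017", stateStr: "RECOVERING" }
  ]
};

console.log(membersStillRecovering(sampleStatus).length); // 1
```

In the mongo shell, the same function can be given rs.status() directly; proceed to the next member only when the returned list is empty.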
Change Replica Set Storage Engine to WiredTiger To change the storage engine for a replica set to WiredTiger,
see Change Replica Set to WiredTiger (page 590).
Upgrade a Sharded Cluster to 3.0 Only upgrade sharded clusters to 3.0 if all members of the cluster are currently
running instances of 2.6. The only supported upgrade path for sharded clusters running 2.4 is via 2.6. The upgrade
process checks all components of the cluster and will produce warnings if any component is running version 2.4.
Considerations The upgrade process does not require any downtime. However, while you upgrade the sharded
cluster, ensure that clients do not make changes to the collection meta-data. For example, during the upgrade, do not
do any of the following:
sh.enableSharding()
sh.shardCollection()
sh.addShard()
db.createCollection()
db.collection.drop()
db.dropDatabase()
any operation that creates a database
any other operation that modifies the cluster metadata in any way. See Sharding Reference (page 814) for a
complete list of sharding commands. Note, however, that not all commands on the Sharding Reference (page 814)
page modify the cluster metadata.
Upgrade Sharded Clusters Optional but Recommended. As a precaution, take a backup of the config database
before upgrading the sharded cluster.
Step 1: Disable the Balancer. Turn off the balancer (page 750) in the sharded cluster, as described in Disable the
Balancer (page 794).
Step 2: Upgrade the cluster's metadata. Start a single 3.0 mongos instance with the configDB pointing to the
cluster's config servers and with the --upgrade option.
To run a mongos with the --upgrade option, you can upgrade an existing mongos instance to 3.0, or if you need
to avoid reconfiguring a production mongos instance, you can use a new 3.0 mongos that can reach all the config
servers.
To upgrade the meta data, run:
mongos --configdb <configDB string> --upgrade
You can include the --logpath option to output the log messages to a file instead of the standard output. Also
include any other options required to start mongos instances in your cluster, such as --sslOnNormalPorts or
--sslPEMKeyFile.
The 3.0 mongos will output informational log messages.
<timestamp> I SHARDING [mongosMain] MongoS version 3.0.0 starting: ...
...
<timestamp> I SHARDING [mongosMain] starting upgrade of config server from v5 to v6
<timestamp> I SHARDING [mongosMain] starting next upgrade step from v5 to v6
<timestamp> I SHARDING [mongosMain] about to log new metadata event: ...
<timestamp> I SHARDING [mongosMain] checking that version of host ... is compatible with 2.6
...
<timestamp> I SHARDING [mongosMain] upgrade of config server to v6 successful
...
<timestamp> I SHARDING [mongosMain] distributed lock 'configUpgrade/...' unlocked.
<timestamp> I - [mongosMain] Config database is at version v6
Step 3: Ensure mongos --upgrade process completes successfully. The mongos will exit upon completion
of the meta data upgrade process. If successful, the process will log the following messages:
<timestamp> I SHARDING [mongosMain] upgrade of config server to v6 successful
...
<timestamp> I - [mongosMain] Config database is at version v6
After a successful upgrade, restart the mongos instance. If mongos fails to start, check the log for more information.
If the mongos instance loses its connection to the config servers during the upgrade or if the upgrade is otherwise
unsuccessful, you may always safely retry the upgrade.
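When the mongos output is captured with --logpath, the success check in Step 3 can be scripted. The sketch below looks for the two messages quoted above in a block of log text; the function name is hypothetical and the sample text reuses the log lines from this page with their <timestamp> placeholders.

```javascript
// The two messages that the text above gives as the sign of a successful upgrade.
var REQUIRED_MESSAGES = [
  "upgrade of config server to v6 successful",
  "Config database is at version v6"
];

// Hypothetical helper: true only when every required message appears in the log text.
function metadataUpgradeSucceeded(logText) {
  return REQUIRED_MESSAGES.every(function (message) {
    return logText.indexOf(message) !== -1;
  });
}

var sampleLog = [
  "<timestamp> I SHARDING [mongosMain] starting upgrade of config server from v5 to v6",
  "<timestamp> I SHARDING [mongosMain] upgrade of config server to v6 successful",
  "<timestamp> I - [mongosMain] Config database is at version v6"
].join("\n");

console.log(metadataUpgradeSucceeded(sampleLog)); // true
```

A false result means the messages were not found in the text that was checked; the log itself remains the place to look for the cause.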
Step 4: Upgrade the remaining mongos instances to 3.0. Upgrade the other mongos instances in the sharded
cluster and restart them without the --upgrade option.
After you have successfully upgraded all mongos instances, you can proceed to upgrade the other components in
your sharded cluster.
Warning: Do not upgrade the mongod instances until after you have upgraded all the mongos instances.
Step 5: Upgrade the config servers. After you have successfully upgraded all mongos instances, upgrade all 3
mongod config server instances, leaving the first config server listed in the mongos --configdb argument to
upgrade last.
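The ordering rule in Step 5 can be derived from the --configdb string itself. This is a minimal sketch: the function name and host names are hypothetical, and the relative order of the servers other than the first is an assumption, since the text only requires that the first listed server goes last.

```javascript
// Hypothetical helper: given the comma-separated --configdb value, return the
// config servers in an order that leaves the first listed server for last.
function configServerUpgradeOrder(configdb) {
  var hosts = configdb.split(",").map(function (host) {
    return host.trim();
  });
  return hosts.slice(1).concat(hosts.slice(0, 1));
}

// Made-up host names for illustration.
var order = configServerUpgradeOrder(
  "cfg1.example.net:27019,cfg2.example.net:27019,cfg3.example.net:27019"
);
console.log(order.join(" -> "));
// cfg2.example.net:27019 -> cfg3.example.net:27019 -> cfg1.example.net:27019
```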
Step 6: Upgrade the shards. Upgrade each shard, one at a time, upgrading the mongod secondaries before running
replSetStepDown and upgrading the primary of each shard.
Step 7: Re-enable the balancer. Once the upgrade of sharded cluster components is complete, re-enable the
balancer (page 795).
Change Sharded Cluster Storage Engine to WiredTiger For a sharded cluster in MongoDB 3.0, you can choose
to update the shards to use WiredTiger storage engine and have the config servers use MMAPv1. If you update the
config servers to use WiredTiger, you must update all three config servers to use WiredTiger.
To change a sharded cluster to use WiredTiger, see Change Sharded Cluster to WiredTiger (page 591).
Upgrade Existing MONGODB-CR Users to Use SCRAM-SHA-1 After upgrading the binaries, see Upgrade to
SCRAM-SHA-1 (page 949) for details on SCRAM-SHA-1 upgrade scenarios.
General Upgrade Procedure Except as described on this page, moving between 2.6 and 3.0 is a drop-in
replacement:
Step 1: Stop the existing mongod instance. For example, on Linux, run 2.6 mongod with the --shutdown
option as follows:
mongod --dbpath /var/mongod/data --shutdown
Replace /var/mongod/data with your MongoDB dbPath. See also the Stop mongod Processes (page 246) for
alternate methods of stopping a mongod instance.
Step 2: Start the new mongod instance. Ensure you start the 3.0 mongod with the same dbPath:
mongod --dbpath /var/mongod/data
Upgrade to SCRAM-SHA-1
On this page
Overview (page 949)
Considerations (page 950)
Upgrade 2.6 MONGODB-CR Users to SCRAM-SHA-1 (page 952)
Result (page 952)
Additional Resources (page 953)
Overview MongoDB 3.0 includes support for the SCRAM-SHA-1 (page 321) challenge-response user authentication
mechanism, which changes how MongoDB uses and stores user credentials.
For deployments that already contain user authentication data, to use the SCRAM-SHA-1 mechanism, you must
upgrade the authentication schema in addition to upgrading the MongoDB processes.
You may, alternatively, opt to continue to use the MONGODB-CR challenge-response mechanism and skip this upgrade.
See Upgrade Scenarios (page 949) for details.
Upgrade Scenarios The following scenarios are possible when upgrading from 2.6 to 3.0:
Continue to Use MONGODB-CR If you are upgrading from a 2.6 database with existing user authentication data,
to continue to use MONGODB-CR for existing challenge-response users, no upgrade to the existing user data is
required. However, the authentication mechanism used by new challenge-response users created in 3.0 depends on how
the user data reached the 3.0 deployment:
If you populated MongoDB 3.0 user data by importing the 2.6 user authentication data, including user data, new
challenge-response users created in MongoDB 3.0 will use SCRAM-SHA-1.
If you run the MongoDB 3.0 binary against the 2.6 data files, including the user authentication data files, new
challenge-response users created in MongoDB 3.0 will continue to use MONGODB-CR.
You can execute the upgrade to SCRAM-SHA-1 at any point in the future.
Use SCRAM-SHA-1
If you are starting with a new 3.0 installation without any users or upgrading from a 2.6 database that has no
users, to use SCRAM-SHA-1, no user data upgrade is required. All newly created users will have the correct
format for SCRAM-SHA-1.
If you are upgrading from a 2.6 database with existing user data, to use SCRAM-SHA-1, follow the steps in
Upgrade 2.6 MONGODB-CR Users to SCRAM-SHA-1 (page 952).
Important: Before you attempt any upgrade, familiarize yourself with the Considerations (page 950) as the upgrade
to SCRAM-SHA-1 is irreversible short of restoring from backups.
Recommendation SCRAM-SHA-1 represents a significant improvement in security over MONGODB-CR, the
previous default authentication mechanism: you are strongly urged to upgrade. For advantages of using SCRAM-SHA-1,
see SCRAM-SHA-1 (page 321).
Considerations
Backwards Incompatibility The procedure to upgrade to SCRAM-SHA-1 discards the MONGODB-CR credentials
used by 2.6. As such, the procedure is irreversible, short of restoring from backups.
The procedure also disables MONGODB-CR as an authentication mechanism.
Upgrade Binaries Before upgrading the authentication model, you should first upgrade MongoDB binaries to 3.0.
For sharded clusters, ensure that all cluster components are 3.0.
Upgrade Drivers You must upgrade all drivers used by applications that will connect to upgraded database instances
to versions that support SCRAM-SHA-1. The minimum driver versions that support SCRAM-SHA-1 are:
Requirements To upgrade the authentication model, you must have a user in the admin database with the role
userAdminAnyDatabase (page 421).
Timing Because downgrades are more difficult after you upgrade the user authentication model, once you upgrade
the MongoDB binaries to version 3.0, allow your MongoDB deployment to run for a day or two before following this
procedure.
This allows 3.0 some time to burn in and decreases the likelihood of downgrades occurring after the user privilege
model upgrade. The user authentication and access control will continue to work as it did in 2.6.
If you decide to upgrade the user authentication model immediately instead of waiting the recommended burn in
period, then for sharded clusters, you must wait at least 10 seconds after upgrading the sharded clusters to run the
authentication upgrade command.
Replica Sets For a replica set, it is only necessary to run the upgrade process on the primary as the changes will
automatically replicate to the secondaries.
Sharded Clusters For a sharded cluster, connect to one mongos instance and run the upgrade procedure to upgrade
the cluster's authentication data. By default, the procedure will upgrade the authentication data of the shards as well.
To override this behavior, run authSchemaUpgrade with the upgradeShards: false option. If you choose
to override, you must run the upgrade procedure on the mongos first, and then run the procedure on the primary
members of each shard.
For a sharded cluster, do not run the upgrade process directly against the config servers (page 734). Instead, perform
the upgrade process using one mongos instance to interact with the config database.
Important: To use the SCRAM-SHA-1 authentication mechanism, a driver upgrade is necessary if your current
driver version does not support SCRAM-SHA-1. See required driver versions (page 950) for details.
Step 1: Connect to the MongoDB instance. Connect and authenticate to the mongod instance for a single
deployment, the primary mongod for a replica set, or a mongos for a sharded cluster as an admin database user with the
role userAdminAnyDatabase (page 421).
Step 2: Upgrade authentication schema. Use the authSchemaUpgrade command in the admin database to
update the user data using the mongo shell.
Sharded cluster authSchemaUpgrade consideration. For a sharded cluster without shard local users
(page 319), authSchemaUpgrade will, by default, upgrade the authorization data of the shards as well, completing
the upgrade.
You can, however, override this behavior by including upgradeShards: false in the command, as in the
following example:
db.adminCommand(
{authSchemaUpgrade: 1, upgradeShards: false }
);
If you override the default behavior or your cluster has shard local users, after running authSchemaUpgrade on
a mongos instance, you must also connect to the primary of each shard and repeat the upgrade process there.
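The rules above decide where authSchemaUpgrade has to run in a sharded cluster. The following sketch turns them into an ordered list of targets and command documents. The function name, the shape of the cluster argument, and the host names are assumptions for illustration; only the authSchemaUpgrade and upgradeShards fields come from the text.

```javascript
// Hypothetical planner. `cluster` is an illustrative object:
//   { mongos, shardPrimaries: [...], upgradeShards: <boolean, optional>,
//     hasShardLocalUsers: <boolean> }
// Returns the commands to run, in order: the mongos first, then each shard
// primary when the shards are not fully covered by the mongos run.
function authUpgradePlan(cluster) {
  var mongosCommand = { authSchemaUpgrade: 1 };
  if (cluster.upgradeShards === false) {
    mongosCommand.upgradeShards = false;
  }
  var steps = [{ target: cluster.mongos, command: mongosCommand }];
  if (cluster.upgradeShards === false || cluster.hasShardLocalUsers) {
    cluster.shardPrimaries.forEach(function (host) {
      steps.push({ target: host, command: { authSchemaUpgrade: 1 } });
    });
  }
  return steps;
}

// Made-up host names for illustration.
var plan = authUpgradePlan({
  mongos: "mongos0.example.net:27017",
  shardPrimaries: ["shardA0.example.net:27018", "shardB0.example.net:27018"],
  upgradeShards: false,
  hasShardLocalUsers: false
});
console.log(plan.length); // 3
```

Each command document in the plan is what would be passed to db.adminCommand() while connected to the listed target.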
Result After this procedure is complete, all users in the database will have SCRAM-SHA-1-style credentials, and
any subsequently-created users will also have this type of credentials.
Additional Resources
Blog Post: Improved Password-Based Authentication in MongoDB 3.0: SCRAM Explained (Part 1)843
Blog Post: Improved Password-Based Authentication in MongoDB 3.0: SCRAM Explained (Part 2)844
Downgrade MongoDB from 3.0
On this page
Downgrade Recommendations and Checklist (page 953)
Downgrade MongoDB Processes (page 953)
General Downgrade Procedure (page 958)
Before you attempt any downgrade, familiarize yourself with the content of this document, particularly the Downgrade
Recommendations and Checklist (page 953) and the procedure for downgrading sharded clusters (page 955).
Downgrade Path Once upgraded to MongoDB 3.0, you cannot downgrade to a version lower than 2.6.8.
Important: If you upgrade to MongoDB 3.0 and have run authSchemaUpgrade, you cannot downgrade to the
2.6 series without disabling --auth.
Note: Optional. Consider compacting collections after downgrading. Otherwise, older versions will not be able
to reuse free space regions larger than 2MB created while running 3.0. This can result in wasted space but no data loss
following the downgrade.
Downgrade a Standalone mongod Instance If you have changed the storage engine to WiredTiger, change the
storage engine to MMAPv1 before downgrading to 2.6.
Change Storage Engine to MMAPv1 To change storage engine to MMAPv1 for a standalone mongod instance,
you will need to manually export and upload the data using mongodump and mongorestore.
Specify additional options as appropriate, such as username and password if running with authorization enabled. See
mongodump for available options.
Step 3: Create data directory for MMAPv1. Create a new data directory for MMAPv1. Ensure that the user
account running mongod has read and write permissions for the new directory.
Step 4: Restart the mongod with MMAPv1. Restart the 3.0 mongod, specifying the newly created data directory
for MMAPv1 as the --dbpath. You do not have to specify --storageEngine as MMAPv1 is the default.
mongod --dbpath <newMMAPv1DBPath>
Downgrade Binaries The following steps outline the procedure to downgrade a standalone mongod from version
3.0 to 2.6.
Once upgraded to MongoDB 3.0, you cannot downgrade to a version lower than 2.6.8.
Step 1: Download 2.6 binaries. Download binaries of the latest release in the 2.6 series from the MongoDB Down-
load Page845 . See Install MongoDB (page 5) for more information.
Step 2: Replace with 2.6 binaries. Shut down your mongod instance. Replace the existing binary with the 2.6
mongod binary and restart mongod.
Downgrade a 3.0 Replica Set If you have changed the storage engine to WiredTiger, change the storage engine to
MMAPv1 before downgrading to 2.6.
Change Storage Engine to MMAPv1 You can update members to use the MMAPv1 storage engine in a rolling
manner.
Note: When running a replica set with mixed storage engines, performance can vary according to workload.
To change the storage engine to MMAPv1 for an existing secondary replica set member, remove the member's data
and perform an initial sync (page 690):
Step 1: Shut down the secondary member. Stop the mongod instance for the secondary member.
845 http://www.mongodb.org/downloads
Step 2: Prepare data directory for MMAPv1. Prepare --dbpath directory for initial sync.
For the stopped secondary member, either delete the content of the data directory or create a new data directory. If
creating a new directory, ensure that the user account running mongod has read and write permissions for the new
directory.
Step 3: Restart the secondary member with MMAPv1. Restart the 3.0 mongod, specifying the MMAPv1 data
directory as the --dbpath. Specify additional options as appropriate for the member. You do not have to specify
--storageEngine since MMAPv1 is the default.
mongod --dbpath <preparedMMAPv1DBPath>
Since no data exists in the --dbpath, the mongod will perform an initial sync. The length of the initial sync process
depends on the size of the database and network connection between members of the replica set.
Repeat for the remaining secondary members. Once all the secondary members have switched to MMAPv1, step
down the primary, and update the stepped-down member.
Downgrade Binaries Once upgraded to MongoDB 3.0, you cannot downgrade to a version lower than 2.6.8.
The following steps outline a rolling downgrade process for the replica set. The rolling downgrade process
minimizes downtime by downgrading the members individually while the other members are available:
Step 1: Downgrade secondary members of the replica set. Downgrade each secondary member of the replica set,
one at a time:
1. Shut down the mongod. See Stop mongod Processes (page 246) for instructions on safely terminating mongod
processes.
2. Replace the 3.0 binary with the 2.6 binary and restart.
3. Wait for the member to recover to SECONDARY state before downgrading the next secondary. To check the
member's state, use the rs.status() method in the mongo shell.
Step 2: Step down the primary. Use rs.stepDown() in the mongo shell to step down the primary and force
the normal failover (page 635) procedure.
rs.stepDown()
rs.stepDown() expedites the failover procedure and is preferable to shutting down the primary directly.
Step 3: Replace and restart former primary mongod. When rs.status() shows that the primary has stepped
down and another member has assumed PRIMARY state, shut down the previous primary and replace the mongod
binary with the 2.6 binary and start the new instance.
Replica set failover is not instant and will render the set unavailable to writes and interrupt reads until the failover
process completes. Typically this takes 10 seconds or more. You may wish to plan the downgrade during a predetermined
maintenance window.
Downgrade a 3.0 Sharded Cluster
Requirements While the downgrade is in progress, you cannot make changes to the collection meta-data. For
example, during the downgrade, do not do any of the following:
sh.enableSharding()
sh.shardCollection()
sh.addShard()
db.createCollection()
db.collection.drop()
db.dropDatabase()
any operation that creates a database
any other operation that modifies the cluster meta-data in any way. See Sharding Reference (page 814) for a
complete list of sharding commands. Note, however, that not all commands on the Sharding Reference (page 814)
page modify the cluster meta-data.
Change Storage Engine to MMAPv1 If you have changed the storage engine to WiredTiger, change the storage
engine to MMAPv1 before downgrading to 2.6.
Change Shards to Use MMAPv1 To change the storage engine to MMAPv1, refer to the procedure in Change
Storage Engine to MMAPv1 for replica set members (page 954) and Change Storage Engine to MMAPv1 for standalone
mongod (page 953) as appropriate for your shards.
Step 1: Disable the Balancer. Turn off the balancer (page 750) in the sharded cluster, as described in Disable the
Balancer (page 794).
Step 2: Stop the last config server listed in the mongos configDB setting.
Step 3: Export data of the second config server listed in the mongos configDB setting.
mongodump --out <exportDataDestination>
Specify additional options as appropriate, such as username and password if running with authorization enabled. See
mongodump for available options.
Step 4: For the second config server, create a new data directory for MMAPv1. Ensure that the user account
running mongod has read and write permissions for the new directory.
Step 5: Restart the second config server with MMAPv1. Specify the newly created MMAPv1 data directory as
the --dbpath as well as any additional options as appropriate.
mongod --dbpath <newMMAPv1DBPath> --configsvr
Step 6: Upload the exported data using mongorestore to the second config server.
mongorestore <exportDataDestination>
Specify additional options as appropriate, such as username and password if running with authorization enabled. See
mongorestore for available options.
Step 10: For the third config server, create a new data directory for MMAPv1. Ensure that the user account
running mongod has read and write permissions for the new directory.
Step 11: Restart the third config server with MMAPv1. Specify the newly created MMAPv1 data directory as
the --dbpath as well as any additional options as appropriate.
mongod --dbpath <newMMAPv1DBPath> --configsvr
Step 12: Upload the exported data using mongorestore to the third config server.
mongorestore <exportDataDestination>
Step 13: Export data of the first config server listed in the mongos configDB setting.
mongodump --out <exportDataDestination>
Specify additional options as appropriate, such as username and password if running with authorization enabled. See
mongodump for available options.
Step 14: For the first config server, create data directory for MMAPv1. Ensure that the user account running
mongod has read and write permissions for the new directory.
Step 15: Restart the first config server with MMAPv1. Specify the newly created MMAPv1 data directory as the
--dbpath as well as any additional options as appropriate.
mongod --dbpath <newMMAPv1DBPath> --configsvr
Step 16: Upload the exported data using mongorestore to the first config server.
mongorestore <exportDataDestination>
Step 17: Enable writes to the sharded cluster's metadata. Restart the second config server, specifying the newly
created MMAPv1 data directory as the --dbpath. Specify additional options as appropriate.
mongod --dbpath <newMMAPv1DBPath> --configsvr
Once all three config servers are up, the sharded cluster's metadata is available for writes.
Step 18: Re-enable the balancer. Once all three config servers are up and running with MMAPv1, re-enable the
balancer (page 795).
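The export, restart, and restore cycle that the steps above repeat for each config server can be sketched as a small shell helper. This is a minimal sketch, not part of the documented procedure: the function name plan_configsvr_conversion, its two arguments, and the example paths are hypothetical, and mkdir -p is assumed as the way to create the directory on Linux. The helper only prints the commands so that they can be reviewed and run by hand; stopping the running instance and any authentication options are left out.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the commands for converting
# one config server to MMAPv1, in the order given by the steps above.
# $1: directory for the exported data   (<exportDataDestination>)
# $2: new MMAPv1 data directory         (<newMMAPv1DBPath>)
plan_configsvr_conversion() {
    export_dir=$1
    new_dbpath=$2
    # Export the config server's data.
    echo "mongodump --out $export_dir"
    # Create the new data directory; the account running mongod
    # needs read and write permissions on it.
    echo "mkdir -p $new_dbpath"
    # Restart the config server on the new data directory.
    echo "mongod --dbpath $new_dbpath --configsvr"
    # Load the exported data into the restarted config server.
    echo "mongorestore $export_dir"
}

# Example call with made-up paths.
plan_configsvr_conversion /backup/cfg2 /data/configdb-mmapv1
```

Printing instead of executing keeps the sketch safe to run anywhere and leaves the decision about when to stop and restart each config server with the operator.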
Downgrade Binaries
Once upgraded to MongoDB 3.0, you cannot downgrade to a version lower than 2.6.8.
The downgrade procedure for a sharded cluster reverses the order of the upgrade procedure. The version v6 config
database is backwards compatible with MongoDB 2.6.
Step 1: Disable the Balancer. Turn off the balancer (page 750) in the sharded cluster, as described in Disable the
Balancer (page 794).
Step 3: Downgrade the config servers. Downgrade all 3 mongod config server instances, leaving the first system
in the mongos --configdb argument to downgrade last.
Step 4: Downgrade the mongos instances. Downgrade and restart each mongos, one at a time. The downgrade
process is a binary drop-in replacement.
Step 5: Re-enable the balancer. Once the downgrade of sharded cluster components is complete, re-enable the
balancer (page 795).
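The balancer steps that open and close the procedure above can be issued from the command line through the mongo shell. The following is a minimal sketch under stated assumptions: the function name balancer_commands and the host and port values are hypothetical, and the helper only prints the invocations rather than contacting a mongos. sh.stopBalancer(), sh.getBalancerState(), and sh.setBalancerState() are the mongo shell helpers for controlling the balancer.

```shell
#!/bin/sh
# Hypothetical helper: print the mongo shell invocations that bracket a
# sharded cluster downgrade. It does not contact any server.
# $1: hostname of a mongos, $2: its port (example values are used below).
balancer_commands() {
    host=$1
    port=$2
    # Before the downgrade: stop the balancer.
    echo "mongo --host $host --port $port --eval 'sh.stopBalancer()'"
    # Check the balancer state; it should report false while disabled.
    echo "mongo --host $host --port $port --eval 'printjson(sh.getBalancerState())'"
    # After all components are downgraded: re-enable the balancer.
    echo "mongo --host $host --port $port --eval 'sh.setBalancerState(true)'"
}

# Example call with a made-up mongos address.
balancer_commands mongos.example.net 27017
```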
General Downgrade Procedure
Except as described on this page, moving between 2.6 and 3.0 is a drop-in replacement:
Step 1: Stop the existing mongod instance. For example, on Linux, run 3.0 mongod with the --shutdown
option as follows:
mongod --dbpath /var/mongod/data --shutdown
Replace /var/mongod/data with your MongoDB dbPath. See also Stop mongod Processes (page 246) for
alternate methods of stopping a mongod instance.
Step 2: Start the new mongod instance. Ensure you start the 2.6 mongod with the same dbPath:
mongod --dbpath /var/mongod/data
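The two steps above can be combined into one reviewable plan. This is a minimal sketch, assuming a Linux host where the 3.0 and 2.6 binaries are installed side by side; the MONGOD_30 and MONGOD_26 paths and the function name plan_binary_downgrade are hypothetical. The script prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical install locations for the two binary versions.
MONGOD_30=/opt/mongodb-3.0/bin/mongod
MONGOD_26=/opt/mongodb-2.6/bin/mongod

# Print the stop and start commands for a given dbPath.
plan_binary_downgrade() {
    dbpath=$1
    # Step 1: stop the running 3.0 instance (the --shutdown option is Linux only).
    echo "$MONGOD_30 --dbpath $dbpath --shutdown"
    # Step 2: start the 2.6 binary against the same dbPath.
    echo "$MONGOD_26 --dbpath $dbpath"
}

# Example call using the dbPath from the steps above.
plan_binary_downgrade /var/mongod/data
```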
Download
Additional Resources
On this page
Minor Releases (page 959)
Major Changes (page 985)
Security Improvements (page 987)
Query Engine Improvements (page 987)
Improvements (page 987)
Operational Changes (page 988)
MongoDB Enterprise Features (page 989)
Additional Information (page 990)
April 8, 2014
MongoDB 2.6 is now available. Key features include aggregation enhancements, text-search integration, query-engine
improvements, a new write-operation protocol, and security enhancements.
Minor Releases
2.6 Changelog
846 http://www.mongodb.org/downloads
847 https://github.com/mongodb/mongo/blob/v3.0/distsrc/THIRD-PARTY-NOTICES
848 http://bit.ly/1CpOu6t
849 http://www.mongodb.com/blog/post/announcing-mongodb-30?jmp=docs
850 https://www.mongodb.com/lp/white-paper/mongodb-3.0?jmp=docs
851 https://www.mongodb.com/webinar/Whats-New-in-MongoDB-3-0?jmp=docs
On this page
2.6.11 Changes (page 960)
2.6.10 Changes (page 961)
2.6.9 Changes (page 963)
2.6.8 Changes (page 964)
2.6.7 Changes (page 966)
2.6.6 Changes (page 966)
2.6.5 Changes (page 969)
2.6.4 Changes (page 972)
2.6.3 Changes (page 976)
2.6.2 Changes (page 976)
2.6.1 Changes (page 980)
2.6.11 Changes
Querying
SERVER-19553 [852] mongod shouldn't use sayPiggyBack to send killCursor messages
SERVER-18620 [853] Reduce frequency of "staticYield can't unlock" log message
SERVER-18461 [854] Range predicates comparing against a BinData value should be covered, but are not in 2.6
SERVER-17815 [855] Plan ranking tie breaker is computed incorrectly
SERVER-16265 [856] Add query details to getmore entry in profiler and db.currentOp()
SERVER-15217 [857] v2.6 query plan ranking test NonCoveredIxisectFetchesLess relies on order of
deleted record list
SERVER-14070 [858] Compound index not providing sort if equality predicate given on sort field
Replication
SERVER-18280 [859] ReplicaSetMonitor should use electionId to avoid talking to old primaries
SERVER-18795 [860] db.printSlaveReplicationInfo()/rs.printSlaveReplicationInfo()
cannot work with ARBITER role
Sharding
SERVER-19464 [861] $sort stage in aggregation doesn't call scoped connections done()
SERVER-18955 [862] mongos doesn't set batch size (and keeps the old one, 0) on getMore if performed on first
_cursor->more()
852 https://jira.mongodb.org/browse/SERVER-19553
853 https://jira.mongodb.org/browse/SERVER-18620
854 https://jira.mongodb.org/browse/SERVER-18461
855 https://jira.mongodb.org/browse/SERVER-17815
856 https://jira.mongodb.org/browse/SERVER-16265
857 https://jira.mongodb.org/browse/SERVER-15217
858 https://jira.mongodb.org/browse/SERVER-14070
859 https://jira.mongodb.org/browse/SERVER-18280
860 https://jira.mongodb.org/browse/SERVER-18795
861 https://jira.mongodb.org/browse/SERVER-19464
862 https://jira.mongodb.org/browse/SERVER-18955
Indexing
SERVER-19559 [863] Document growth of "key too large" document makes it disappear from the index
SERVER-16348 [864] Assertion failure n >= 0 && n < static_cast<int>(_files.size())
src/mongo/db/storage/extent_manager.cpp 109
SERVER-13875 [865] ensureIndex() of 2dsphere index breaks after upgrading to 2.6 (with the new
createIndex command)
2.6.10 Changes
Security
SERVER-18312 [870] Upgrade PCRE to latest
SERVER-17812 [871] LockPinger has audit-related GLE failure
SERVER-17647 [872] Compute BinData length in v8
SERVER-17591 [873] Add SSL flag to select supported protocols
SERVER-16849 [874] On mongos we always invalidate the user cache once, even if no user definitions are changing
SERVER-11980 [875] Improve user cache invalidation enforcement on mongos
Querying
SERVER-18364 [876] Ensure non-negation predicates get chosen over negation predicates for multikey index
bounds construction
SERVER-17815 [877] Plan ranking tie breaker is computed incorrectly
SERVER-16256 [878] $all clause with elemMatch uses wider bounds than needed
863 https://jira.mongodb.org/browse/SERVER-19559
864 https://jira.mongodb.org/browse/SERVER-16348
865 https://jira.mongodb.org/browse/SERVER-13875
866 https://jira.mongodb.org/browse/SERVER-19389
867 https://jira.mongodb.org/browse/SERVER-18097
868 https://jira.mongodb.org/browse/SERVER-18068
869 https://jira.mongodb.org/browse/SERVER-18371
870 https://jira.mongodb.org/browse/SERVER-18312
871 https://jira.mongodb.org/browse/SERVER-17812
872 https://jira.mongodb.org/browse/SERVER-17647
873 https://jira.mongodb.org/browse/SERVER-17591
874 https://jira.mongodb.org/browse/SERVER-16849
875 https://jira.mongodb.org/browse/SERVER-11980
876 https://jira.mongodb.org/browse/SERVER-18364
877 https://jira.mongodb.org/browse/SERVER-17815
878 https://jira.mongodb.org/browse/SERVER-16256
Replication
SERVER-18211 [879] MongoDB fails to correctly roll back collection creation
SERVER-17771 [880] Reconfiguring a replica set to remove a node causes a segmentation fault on 2.6.8
SERVER-13542 [881] Expose electionId on primary in isMaster
Sharding
SERVER-17812 [882] LockPinger has audit-related GLE failure
SERVER-17805 [883] logOp / OperationObserver should always check shardversion
SERVER-17749 [884] collMod usePowerOf2Sizes fails on mongos
SERVER-11980 [885] Improve user cache invalidation enforcement on mongos
Storage
SERVER-18211 [886] MongoDB fails to correctly roll back collection creation
SERVER-17653 [887] ERROR: socket XXX is higher than 1023; not supported on 2.6.*
Write Ops
SERVER-18111 [889] mongod allows user inserts into system.profile collection
SERVER-13542 [890] Expose electionId on primary in isMaster
Networking
SERVER-18096 [891] Shard primary incorrectly reuses closed sockets after relinquish and re-election
SERVER-17591 [892] Add SSL flag to select supported protocols
Testing
SERVER-18262 [899] setup_multiversion_mongodb should retry links download on timeouts
SERVER-18229 [900] smoke.py with PyMongo 3.0.1 fails to run certain tests
SERVER-18073 [901] Fix smoke.py to work with PyMongo 3.0
2.6.9 Changes
Querying
SERVER-14723 [903] Crash during query planning for geoNear with multiple 2dsphere indexes
SERVER-14071 [904] For queries with sort(), bad non-blocking plan can be cached if there are zero results
SERVER-8188 [905] Configurable idle cursor timeout
Storage
SERVER-15907 [908] Use ftruncate rather than fallocate when running on tmpfs
895 https://jira.mongodb.org/browse/SERVER-18312
896 https://jira.mongodb.org/browse/SERVER-17780
897 https://jira.mongodb.org/browse/SERVER-16563
898 https://jira.mongodb.org/browse/SERVER-17951
899 https://jira.mongodb.org/browse/SERVER-18262
900 https://jira.mongodb.org/browse/SERVER-18229
901 https://jira.mongodb.org/browse/SERVER-18073
902 https://jira.mongodb.org/browse/SERVER-16073
903 https://jira.mongodb.org/browse/SERVER-14723
904 https://jira.mongodb.org/browse/SERVER-14071
905 https://jira.mongodb.org/browse/SERVER-8188
906 https://jira.mongodb.org/browse/SERVER-17429
907 https://jira.mongodb.org/browse/SERVER-17441
908 https://jira.mongodb.org/browse/SERVER-15907
Aggregation Framework
SERVER-17426 [909] Aggregation framework query by _id returns duplicates in sharded cluster (orphan
documents)
SERVER-17224 [910] Aggregation pipeline with 64MB document can terminate server
2.6.8 Changes
Replication
SERVER-16599 [923] copydb and clone commands can crash the server if a primary steps down
SERVER-16315 [924] Replica set nodes should not threaten to veto nodes whose config version is higher than their
own
SERVER-16274 [925] secondary fasserts trying to replicate an index
SERVER-15471 [926] Better error message when replica is not found in GhostSync::associateSlave
Sharding
SERVER-17191 [927] Spurious warning during upgrade of sharded cluster
SERVER-17163 [928] Fatal error "logOp but not primary" in MigrateStatus::go
SERVER-16984 [929] UpdateLifecycleImpl can return empty collectionMetadata even if ns is
sharded
SERVER-10904 [930] Possible for _master and _slaveConn to be pointing to different connections even with
primary read pref
Storage
SERVER-17087 [931] Add listCollections command functionality to 2.6 shell & client
SERVER-14572 [932] Increase C runtime stdio file limit
Tools
SERVER-17216 [933] 2.6 mongostat cannot be used with 3.0 mongod
SERVER-14190 [934] mongorestore parseMetadataFile passes non-null terminated string to
fromjson
2.6.7 Changes
Stability
SERVER-16237 [939] Don't check the shard version if the primary server is down
Querying
SERVER-16408 [940] max_time_ms.js should not run in parallel suite.
Replication
SERVER-16732 [941] SyncSourceFeedback::replHandshake() may perform an illegal erase from a
std::map in some circumstances
Sharding
SERVER-16683 [942] Decrease mongos memory footprint when shards have several tags
SERVER-15766 [943] prefix_shard_key.js depends on primary allocation to particular shards
SERVER-14306 [944] mongos can cause shards to hit the in-memory sort limit by requesting more results than
needed.
Packaging
SERVER-16081 [945] /etc/init.d/mongod startup script fails, with dirname message
2.6.6 Changes
Security
SERVER-15673 [946] Disable SSLv3 ciphers
SERVER-15515 [947] New test for mixed version replSet, 2.4 primary, user updates
SERVER-15500 [948] New test for system.user operations
938 https://jira.mongodb.org/browse/SERVER-16421
939 https://jira.mongodb.org/browse/SERVER-16237
940 https://jira.mongodb.org/browse/SERVER-16408
941 https://jira.mongodb.org/browse/SERVER-16732
942 https://jira.mongodb.org/browse/SERVER-16683
943 https://jira.mongodb.org/browse/SERVER-15766
944 https://jira.mongodb.org/browse/SERVER-14306
945 https://jira.mongodb.org/browse/SERVER-16081
946 https://jira.mongodb.org/browse/SERVER-15673
947 https://jira.mongodb.org/browse/SERVER-15515
948 https://jira.mongodb.org/browse/SERVER-15500
Stability
SERVER-12061 [949] Do not silently ignore read errors when syncing a replica set node
SERVER-12058 [950] Primary should abort if encountered problems writing to the oplog
Querying
SERVER-16291 [951] Cannot set/list/clear index filters on the secondary
SERVER-15958 [952] The isMultiKey value is not correct in the output of aggregation explain plan
SERVER-15899 [953] Querying against path in document containing long array of subdocuments with nested
arrays causes stack overflow
SERVER-15696 [954] $regex, $in and $sort with index returns too many results
SERVER-15639 [955] Text queries can return incorrect results and leak memory when multiple predicates given
on same text index prefix field
SERVER-15580 [956] Evaluating candidate query plans with concurrent writes on same collection may crash
mongod
SERVER-15528 [957] Distinct queries can scan many index keys without yielding read lock
SERVER-15485 [958] CanonicalQuery::canonicalize can leak a LiteParsedQuery
SERVER-15403 [959] $min and $max equal errors in 2.6 but not in 2.4
SERVER-15233 [960] Cannot run planCacheListQueryShapes on a Secondary
SERVER-14799 [961] count with hint doesn't work when hint is a document
Replication
SERVER-16107 [962] 2.6 mongod crashes with segfault when added to a 2.8 replica set with >= 12 nodes.
SERVER-15994 [963] listIndexes and listCollections can be run on secondaries without slaveOk bit
SERVER-15849 [964] do not forward replication progress for nodes that are no longer part of a replica set
SERVER-15491 [965] SyncSourceFeedback can crash due to a SocketException in
authenticateInternalUser
949 https://jira.mongodb.org/browse/SERVER-12061
950 https://jira.mongodb.org/browse/SERVER-12058
951 https://jira.mongodb.org/browse/SERVER-16291
952 https://jira.mongodb.org/browse/SERVER-15958
953 https://jira.mongodb.org/browse/SERVER-15899
954 https://jira.mongodb.org/browse/SERVER-15696
955 https://jira.mongodb.org/browse/SERVER-15639
956 https://jira.mongodb.org/browse/SERVER-15580
957 https://jira.mongodb.org/browse/SERVER-15528
958 https://jira.mongodb.org/browse/SERVER-15485
959 https://jira.mongodb.org/browse/SERVER-15403
960 https://jira.mongodb.org/browse/SERVER-15233
961 https://jira.mongodb.org/browse/SERVER-14799
962 https://jira.mongodb.org/browse/SERVER-16107
963 https://jira.mongodb.org/browse/SERVER-15994
964 https://jira.mongodb.org/browse/SERVER-15849
965 https://jira.mongodb.org/browse/SERVER-15491
Sharding
SERVER-15318 [966] copydb should not use exhaust flag when used against mongos
SERVER-14728 [967] Shard depends on string comparison of replica set connection string
SERVER-14506 [968] special top chunk logic can move max chunk to a shard with incompatible tag
SERVER-14299 [969] For sharded limit=N queries with sort, mongos can request >N results from shard
SERVER-14080 [970] Have migration result reported in the changelog correctly
SERVER-12472 [971] Fail MoveChunk if an index is needed on TO shard and data exists
Storage
SERVER-16283 [972] Can't start new wiredtiger node with log file or config file in data directory - false detection
of old mmapv1 files
SERVER-15986 [973] Starting with different storage engines in the same dbpath should error/warn
SERVER-14057 [974] Changing TTL expiration time with collMod does not correctly update index definition
Data Aggregation
SERVER-15552 [977] Errors writing to temporary collections during mapReduce command execution
should be operation-fatal
2.6.5 Changes
Security
SERVER-15465 [992] OpenSSL crashes on stepdown
SERVER-15360 [993] User document changes made on a 2.4 primary and replicated to a 2.6 secondary don't make
the 2.6 secondary invalidate its user cache
SERVER-14887 [994] Allow user document changes made on a 2.4 primary to replicate to a 2.6 secondary
SERVER-14727 [995] Details of SASL failures aren't logged
SERVER-12551 [996] Audit DML/CRUD operations
Querying
SERVER-15287 [998] Query planner sort analysis incorrectly allows index key pattern plugin fields to provide sort
SERVER-15286 [999] Assertion in date indexes when opposite-direction-sorted and double or filtered
SERVER-15279 [1000] Disable hash-based index intersection (AND_HASH) by default
SERVER-15152 [1001] When evaluating plans, some index candidates cause complete index scan
SERVER-15015 [1002] Assertion failure when combining $max and $min and reverse index scan
SERVER-15012 [1003] Server crashes on indexed rooted $or queries using a 2d index
SERVER-14969 [1004] Dropping index during active aggregation operation can crash server
SERVER-14961 [1005] Plan ranker favors intersection plans if predicate generates empty range index scan
SERVER-14892 [1006] Invalid {$elemMatch: {$where}} query causes memory leak
SERVER-14706 [1007] Queries that use negated $type predicate over a field may return incomplete results when
an index is present on that field
SERVER-13104 [1008] Plan enumerator doesn't enumerate all possibilities for a nested $or
SERVER-14984 [1009] Server aborts when running $centerSphere query with NaN radius
SERVER-14981 [1010] Server aborts when querying against 2dsphere index with
coarsestIndexedLevel:0
SERVER-14831 [1011] Text search trips assertion when default language only supported in
textIndexVersion=1 used
Replication
SERVER-15038 [1012] Multiple background index builds may not interrupt cleanly for commands, on secondaries
SERVER-14887 [1013] Allow user document changes made on a 2.4 primary to replicate to a 2.6 secondary
SERVER-14805 [1014] Use multithreaded oplog replay during initial sync
Sharding
SERVER-15056 [1015] Sharded connection cleanup on setup error can crash mongos
SERVER-13702 [1016] Commands without optional query may target to wrong shards on mongos
998 https://jira.mongodb.org/browse/SERVER-15287
999 https://jira.mongodb.org/browse/SERVER-15286
1000 https://jira.mongodb.org/browse/SERVER-15279
1001 https://jira.mongodb.org/browse/SERVER-15152
1002 https://jira.mongodb.org/browse/SERVER-15015
1003 https://jira.mongodb.org/browse/SERVER-15012
1004 https://jira.mongodb.org/browse/SERVER-14969
1005 https://jira.mongodb.org/browse/SERVER-14961
1006 https://jira.mongodb.org/browse/SERVER-14892
1007 https://jira.mongodb.org/browse/SERVER-14706
1008 https://jira.mongodb.org/browse/SERVER-13104
1009 https://jira.mongodb.org/browse/SERVER-14984
1010 https://jira.mongodb.org/browse/SERVER-14981
1011 https://jira.mongodb.org/browse/SERVER-14831
1012 https://jira.mongodb.org/browse/SERVER-15038
1013 https://jira.mongodb.org/browse/SERVER-14887
1014 https://jira.mongodb.org/browse/SERVER-14805
1015 https://jira.mongodb.org/browse/SERVER-15056
1016 https://jira.mongodb.org/browse/SERVER-13702
Storage
SERVER-15369 [1018] explicitly zero .ns files on creation
SERVER-15319 [1019] Verify 2.8 freelist is upgrade-downgrade safe with 2.6
SERVER-15111 [1020] partially written journal last section causes recovery to fail
Indexing
SERVER-14848 [1021] Port index_id_desc.js to v2.6 and master branches
SERVER-14205 [1022] ensureIndex failure reports ok: 1 on some failures
Write Operations
SERVER-15106 [1023] Incorrect nscanned and nscannedObjects for idhack updates in 2.6.4 profiler or slow query
log
SERVER-15029 [1024] The $rename modifier uses incorrect dotted source path
SERVER-14829 [1025] UpdateIndexData::clear() should reset all member variables
Data Aggregation
SERVER-15087 [1026] Server crashes when running concurrent mapReduce and dropDatabase commands
SERVER-14969 [1027] Dropping index during active aggregation operation can crash server
SERVER-14168 [1028] Warning logged when incremental MR collections are unsuccessfully dropped on
secondaries
Packaging
SERVER-14679 [1029] (CentOS 7/RHEL 7) init.d script should create directory for pid file if it is missing
SERVER-14023 [1030] Support for RHEL 7 Enterprise .rpm packages
SERVER-13243 [1031] Support for Ubuntu 14 Trusty Enterprise .deb packages
SERVER-11077 [1032] Support for Debian 7 Enterprise .deb packages
1017 https://jira.mongodb.org/browse/SERVER-15156
1018 https://jira.mongodb.org/browse/SERVER-15369
1019 https://jira.mongodb.org/browse/SERVER-15319
1020 https://jira.mongodb.org/browse/SERVER-15111
1021 https://jira.mongodb.org/browse/SERVER-14848
1022 https://jira.mongodb.org/browse/SERVER-14205
1023 https://jira.mongodb.org/browse/SERVER-15106
1024 https://jira.mongodb.org/browse/SERVER-15029
1025 https://jira.mongodb.org/browse/SERVER-14829
1026 https://jira.mongodb.org/browse/SERVER-15087
1027 https://jira.mongodb.org/browse/SERVER-14969
1028 https://jira.mongodb.org/browse/SERVER-14168
1029 https://jira.mongodb.org/browse/SERVER-14679
1030 https://jira.mongodb.org/browse/SERVER-14023
1031 https://jira.mongodb.org/browse/SERVER-13243
1032 https://jira.mongodb.org/browse/SERVER-11077
2.6.4 Changes
Security
SERVER-14701 [1041] The backup auth role should allow running the collstats command for all resources
SERVER-14518 [1042] Allow disabling hostname validation for SSL
SERVER-14268 [1043] Potential information leak
SERVER-14170 [1044] Cannot read from secondary if both audit and auth are enabled in a sharded cluster
SERVER-13833 [1045] userAdminAnyDatabase role should be able to create indexes on admin.system.users and
admin.system.roles
SERVER-12512 [1046] Add role-based, selective audit logging.
SERVER-9482 [1047] Add build flag for sslFIPSMode
Querying
SERVER-14625 [1048] Query planner can construct incorrect bounds for negations inside $elemMatch
1033 https://jira.mongodb.org/browse/SERVER-10642
1034 https://jira.mongodb.org/browse/SERVER-14964
1035 https://jira.mongodb.org/browse/SERVER-12551
1036 https://jira.mongodb.org/browse/SERVER-14904
1037 https://jira.mongodb.org/browse/SERVER-13770
1038 https://jira.mongodb.org/browse/SERVER-14284
1039 https://jira.mongodb.org/browse/SERVER-14076
1040 https://jira.mongodb.org/browse/SERVER-14778
1041 https://jira.mongodb.org/browse/SERVER-14701
1042 https://jira.mongodb.org/browse/SERVER-14518
1043 https://jira.mongodb.org/browse/SERVER-14268
1044 https://jira.mongodb.org/browse/SERVER-14170
1045 https://jira.mongodb.org/browse/SERVER-13833
1046 https://jira.mongodb.org/browse/SERVER-12512
1047 https://jira.mongodb.org/browse/SERVER-9482
1048 https://jira.mongodb.org/browse/SERVER-14625
SERVER-14607 [1049] hash intersection of fetched and non-fetched data can discard data from a result
SERVER-14532 [1050] Improve logging in the case of plan ranker ties
SERVER-14350 [1051] Server crash when $centerSphere has non-positive radius
SERVER-14317 [1052] Dead code in IDHackRunner::applyProjection
SERVER-14311 [1053] skipping of index keys is not accounted for in plan ranking by the index scan stage
SERVER-14123 [1054] some operations can create BSON object larger than the 16MB limit
SERVER-14034 [1055] Sorted $in query with large number of elements can't use merge sort
SERVER-13994 [1056] do not aggressively pre-fetch data for parallelCollectionScan
Replication
SERVER-14665 [1057] Build failure for v2.6 in closeall.js caused by access violation reading _me
SERVER-14505 [1058] "cannot dropAllIndexes when index builds in progress" assertion failure
SERVER-14494 [1059] Dropping collection during active background index build on secondary triggers segfault
SERVER-13822 [1060] Running resync before replset config is loaded can crash mongod
SERVER-11776 [1061] Replication isself check should allow mapped ports
Sharding
SERVER-14551 [1062] Runner yield during migration cleanup (removeRange) results in fassert
SERVER-14431 [1063] Invalid chunk data after splitting on a key that's too large
SERVER-14261 [1064] stepdown during migration range delete can abort mongod
SERVER-14032 [1065] v2.6 mongos doesn't verify _id is present for config server upserts
SERVER-13648 [1066] better stats from migration cleanup
SERVER-12750 [1067] mongos shouldn't accept initial query with exhaust flag set
SERVER-9788 [1068] mongos does not re-evaluate read preference once a valid replica set member is chosen
SERVER-9526 [1069] Log messages regarding chunks not very informative when the shard key is of type BinData
1049 https://jira.mongodb.org/browse/SERVER-14607
1050 https://jira.mongodb.org/browse/SERVER-14532
1051 https://jira.mongodb.org/browse/SERVER-14350
1052 https://jira.mongodb.org/browse/SERVER-14317
1053 https://jira.mongodb.org/browse/SERVER-14311
1054 https://jira.mongodb.org/browse/SERVER-14123
1055 https://jira.mongodb.org/browse/SERVER-14034
1056 https://jira.mongodb.org/browse/SERVER-13994
1057 https://jira.mongodb.org/browse/SERVER-14665
1058 https://jira.mongodb.org/browse/SERVER-14505
1059 https://jira.mongodb.org/browse/SERVER-14494
1060 https://jira.mongodb.org/browse/SERVER-13822
1061 https://jira.mongodb.org/browse/SERVER-11776
1062 https://jira.mongodb.org/browse/SERVER-14551
1063 https://jira.mongodb.org/browse/SERVER-14431
1064 https://jira.mongodb.org/browse/SERVER-14261
1065 https://jira.mongodb.org/browse/SERVER-14032
1066 https://jira.mongodb.org/browse/SERVER-13648
1067 https://jira.mongodb.org/browse/SERVER-12750
1068 https://jira.mongodb.org/browse/SERVER-9788
1069 https://jira.mongodb.org/browse/SERVER-9526
Storage
SERVER-14198 [1070] Std::set<pointer> and Windows Heap Allocation Reuse produces non-deterministic results
SERVER-13975 [1071] Creating index on collection named "system" can cause server to abort
SERVER-13729 [1072] Reads & Writes are blocked during data file allocation on Windows
SERVER-13681 [1073] MongoDB stalls during background flush on Windows
Indexing
SERVER-14494 [1074] Dropping collection during active background index build on secondary triggers
segfault
Write Ops
SERVER-14257 [1075] remove command can cause process termination by throwing unhandled exception if
profiling is enabled
SERVER-14024 [1076] Update fails when query contains part of a DBRef and results in an insert (upsert:true)
SERVER-13764 [1077] debug mechanisms report incorrect nscanned / nscannedObjects for updates
Geo
SERVER-14039 [1079] $nearSphere query with 2d index, skip, and limit returns incomplete results
SERVER-13701 [1080] Query using 2d index throws exception when using explain()
Text Search
SERVER-14738 [1081] Updates to documents with text-indexed fields may lead to incorrect entries
SERVER-14027 [1082] Renaming collection within same database fails if wildcard text index present
Tools
SERVER-14212 [1083] mongorestore may drop system users and roles
SERVER-14048 [1084] mongodump against mongos can't send dump to standard output
1070 https://jira.mongodb.org/browse/SERVER-14198
1071 https://jira.mongodb.org/browse/SERVER-13975
1072 https://jira.mongodb.org/browse/SERVER-13729
1073 https://jira.mongodb.org/browse/SERVER-13681
1074 https://jira.mongodb.org/browse/SERVER-14494
1075 https://jira.mongodb.org/browse/SERVER-14257
1076 https://jira.mongodb.org/browse/SERVER-14024
1077 https://jira.mongodb.org/browse/SERVER-13764
1078 https://jira.mongodb.org/browse/SERVER-13734
1079 https://jira.mongodb.org/browse/SERVER-14039
1080 https://jira.mongodb.org/browse/SERVER-13701
1081 https://jira.mongodb.org/browse/SERVER-14738
1082 https://jira.mongodb.org/browse/SERVER-14027
1083 https://jira.mongodb.org/browse/SERVER-14212
1084 https://jira.mongodb.org/browse/SERVER-14048
Admin
SERVER-14556 [1085] Default dbpath for mongod --configsvr changes in 2.6
SERVER-14355 [1086] Allow dbAdmin role to manually create system.profile collections
JavaScript
SERVER-14254 [1088] Do not store native function pointer as a property in function prototype
SERVER-13798 [1089] v8 garbage collection can cause crash due to independent lifetime of DBClient and Cursor
objects
SERVER-13707 [1090] mongo shell may crash when converting invalid regular expression
Shell
SERVER-14341 [1091] negative opcounter values in serverStatus
SERVER-14107 [1092] Querying for a document containing a value of either type Javascript or
JavascriptWithScope crashes the shell
Testing
SERVER-14731 [1096] plan_cache_ties.js sometimes fails
SERVER-14147 [1097] make index_multi.js retry on connection failure
SERVER-13615 [1098] sharding_rs2.js intermittent failure due to reliance on opcounters
1085 https://jira.mongodb.org/browse/SERVER-14556
1086 https://jira.mongodb.org/browse/SERVER-14355
1087 https://jira.mongodb.org/browse/SERVER-14283
1088 https://jira.mongodb.org/browse/SERVER-14254
1089 https://jira.mongodb.org/browse/SERVER-13798
1090 https://jira.mongodb.org/browse/SERVER-13707
1091 https://jira.mongodb.org/browse/SERVER-14341
1092 https://jira.mongodb.org/browse/SERVER-14107
1093 https://jira.mongodb.org/browse/SERVER-13833
1094 https://jira.mongodb.org/browse/SERVER-12512
1095 https://jira.mongodb.org/browse/SERVER-14341
1096 https://jira.mongodb.org/browse/SERVER-14731
1097 https://jira.mongodb.org/browse/SERVER-14147
1098 https://jira.mongodb.org/browse/SERVER-13615
2.6.3 Changes
SERVER-14302 [1099] Fixed: Equality queries on _id with projection may return no results on sharded
collections
SERVER-14304 [1100] Fixed: Equality queries on _id with projection on _id may return orphan documents on
sharded collections
2.6.2 Changes
Security
SERVER-13727 [1101] The backup (page 420) authorization role now includes privileges to run the collStats
command.
SERVER-13804 [1102] The built-in role restore (page 420) now has privileges on system.roles collection.
SERVER-13612 [1103] Fixed: SSL-enabled server appears not to be sending the list of supported certificate issuers
to the client
SERVER-13753 [1104] Fixed: mongod may terminate if x.509 authentication certificate is invalid
SERVER-13945 [1105] For replica set/sharded cluster member authentication (page 355), now matches x.509
cluster certificates by attributes instead of by substring comparison.
SERVER-13868 [1106] Now marks V1 users as probed on databases that do not have surrogate user documents.
SERVER-13850 [1107] Now ensures that the user cache entry is up to date before using it to determine a user's
roles in user management commands on mongos.
SERVER-13588 [1108] Fixed: Shell prints startup warning when auth enabled
Querying
SERVER-13731 [1109] Fixed: Stack overflow when parsing deeply nested $not query
SERVER-13890 [1110] Fixed: Index bounds builder constructs invalid bounds for multiple negations joined by an
$or
SERVER-13752 [1111] Verified assertion on empty $in clause and sort on second field in a compound index.
SERVER-13337 [1112] Re-enabled idhack for queries with projection.
SERVER-13715 [1113] Fixed: Aggregation pipeline execution can fail with $or and blocking sorts
SERVER-13714 [1114] Fixed: non-top-level indexable $not triggers query planning bug
1099 https://jira.mongodb.org/browse/SERVER-14302
1100 https://jira.mongodb.org/browse/SERVER-14304
1101 https://jira.mongodb.org/browse/SERVER-13727
1102 https://jira.mongodb.org/browse/SERVER-13804
1103 https://jira.mongodb.org/browse/SERVER-13612
1104 https://jira.mongodb.org/browse/SERVER-13753
1105 https://jira.mongodb.org/browse/SERVER-13945
1106 https://jira.mongodb.org/browse/SERVER-13868
1107 https://jira.mongodb.org/browse/SERVER-13850
1108 https://jira.mongodb.org/browse/SERVER-13588
1109 https://jira.mongodb.org/browse/SERVER-13731
1110 https://jira.mongodb.org/browse/SERVER-13890
1111 https://jira.mongodb.org/browse/SERVER-13752
1112 https://jira.mongodb.org/browse/SERVER-13337
1113 https://jira.mongodb.org/browse/SERVER-13715
1114 https://jira.mongodb.org/browse/SERVER-13714
SERVER-13769 [1115] Fixed: distinct command on indexed field with geo predicate fails to execute
SERVER-13675 [1116] Fixed: Plans with differing performance can tie during plan ranking
SERVER-13899 [1117] Fixed: Whole index scan query solutions can use incompatible indexes, return incorrect
results
SERVER-13852 [1118] Fixed: IndexBounds::endKeyInclusive not initialized by constructor
SERVER-14073 [1119] planSummary no longer truncated at 255 characters
SERVER-14174 [1120] Fixed: If ntoreturn is a limit (rather than batch size) extra data gets buffered during plan
ranking
SERVER-13789 [1121] Some nested queries no longer trigger an assertion error
SERVER-14064 [1122] Added planSummary information for count command log message.
SERVER-13960 [1123] Queries containing $or no longer miss results if multiple clauses use the same index.
SERVER-14180 [1124] Fixed: Crash with and clause, $elemMatch, and nested $mod or regex
SERVER-14176 [1125] Natural order sort specification no longer ignored if query is specified.
SERVER-13754 [1126] Bounds no longer combined for $or queries that can use merge sort.
Geospatial
SERVER-13687 [1127] Results of $near query on compound multi-key 2dsphere index are now sorted by
distance.
Write Operations
SERVER-13802 [1128] Insert field validation no longer stops at first Timestamp() field.
Replication
SERVER-13993 [1129] Fixed: log a message when shouldChangeSyncTarget() believes a node should
change sync targets
SERVER-13976 [1130] Fixed: Cloner needs to detect failure to create collection
Sharding
SERVER-13616 [1131] Resolved: type 7 (OID) error when acquiring distributed lock for first time
SERVER-13812 [1132] Now catches exception thrown by getShardsForQuery for geo query.
1115 https://jira.mongodb.org/browse/SERVER-13769
1116 https://jira.mongodb.org/browse/SERVER-13675
1117 https://jira.mongodb.org/browse/SERVER-13899
1118 https://jira.mongodb.org/browse/SERVER-13852
1119 https://jira.mongodb.org/browse/SERVER-14073
1120 https://jira.mongodb.org/browse/SERVER-14174
1121 https://jira.mongodb.org/browse/SERVER-13789
1122 https://jira.mongodb.org/browse/SERVER-14064
1123 https://jira.mongodb.org/browse/SERVER-13960
1124 https://jira.mongodb.org/browse/SERVER-14180
1125 https://jira.mongodb.org/browse/SERVER-14176
1126 https://jira.mongodb.org/browse/SERVER-13754
1127 https://jira.mongodb.org/browse/SERVER-13687
1128 https://jira.mongodb.org/browse/SERVER-13802
1129 https://jira.mongodb.org/browse/SERVER-13993
1130 https://jira.mongodb.org/browse/SERVER-13976
1131 https://jira.mongodb.org/browse/SERVER-13616
1132 https://jira.mongodb.org/browse/SERVER-13812
SERVER-14138 [1133] mongos will now correctly target multiple shards for nested field shard key predicates.
SERVER-11332 [1134] Fixed: Authentication requests delayed if first config server is unresponsive
Map/Reduce
SERVER-14186 [1135] Resolved: rs.stepDown during mapReduce causes fassert in logOp
SERVER-13981 [1136] Temporary map/reduce collections are now correctly replicated to secondaries.
Storage
SERVER-13750 [1137] convertToCapped on empty collection no longer aborts after invariant() failure.
SERVER-14056 [1138] Moving large collection across databases with renameCollection no longer triggers fatal assertion.
SERVER-14082 [1139] Fixed: Excessive freelist scanning for MaxBucket
SERVER-13737 [1140] CollectionOptions parser now skips size/max elements whose values are non-numeric.
Diagnostics
SERVER-13587 [1144] Resolved: ndeleted (page 303) in system.profile documents reports 1 too few documents removed
SERVER-13368 [1145] Improved exposure of timing information in currentOp.
Tools
SERVER-10464 [1147] mongodump can now query oplog.$main and oplog.rs when using --dbpath.
SERVER-13760 [1148] mongoexport can now handle large timestamps on Windows.
Shell
SERVER-13865 [1149] Shell now returns correct WriteResult for compatibility-mode upsert with non-OID equality predicate on _id field.
SERVER-13037 [1150] Fixed typo in error message for compatibility mode.
Internal Code
SERVER-13794 [1151] Fixed: Unused snapshot history consuming significant heap space
SERVER-13446 [1152] Removed Solaris builds' dependency on ILLUMOS libc.
SERVER-14092 [1153] MongoDB upgrade 2.4 to 2.6 check no longer returns an error in internal collections.
SERVER-14000 [1154] Added new lsb file location for Debian 7.1
Testing
SERVER-13723 [1155] Stabilized tags.js after a change in its timeout when it was ported to use write commands.
SERVER-13494 [1156] Fixed: setup_multiversion_mongodb.py doesn't download 2.4.10 because of non-numeric version sorting
SERVER-13603 [1157] Fixed: Test suites with options tests fail when run with --nopreallocj
SERVER-13948 [1158] Fixed: awaitReplication() failures related to getting a config version from master causing test failures
SERVER-13839 [1159] Fixed sync2.js failure.
SERVER-13972 [1160] Fixed connections_opened.js failure.
SERVER-13712 [1161] Reduced peak disk usage of test suites.
SERVER-14249 [1162] Added tests for querying oplog via mongodump using --dbpath
SERVER-10462 [1163] Fixed: Windows file locking related buildbot failures
1147 https://jira.mongodb.org/browse/SERVER-10464
1148 https://jira.mongodb.org/browse/SERVER-13760
1149 https://jira.mongodb.org/browse/SERVER-13865
1150 https://jira.mongodb.org/browse/SERVER-13037
1151 https://jira.mongodb.org/browse/SERVER-13794
1152 https://jira.mongodb.org/browse/SERVER-13446
1153 https://jira.mongodb.org/browse/SERVER-14092
1154 https://jira.mongodb.org/browse/SERVER-14000
1155 https://jira.mongodb.org/browse/SERVER-13723
1156 https://jira.mongodb.org/browse/SERVER-13494
1157 https://jira.mongodb.org/browse/SERVER-13603
1158 https://jira.mongodb.org/browse/SERVER-13948
1159 https://jira.mongodb.org/browse/SERVER-13839
1160 https://jira.mongodb.org/browse/SERVER-13972
1161 https://jira.mongodb.org/browse/SERVER-13712
1162 https://jira.mongodb.org/browse/SERVER-14249
1163 https://jira.mongodb.org/browse/SERVER-10462
2.6.1 Changes
Querying
SERVER-13066 [1169] Negations over multikey fields do not use index
SERVER-13495 [1170] Concurrent GETMORE and KILLCURSORS operations can cause race condition and server crash
SERVER-13503 [1171] The $where operator should not be allowed under $elemMatch
SERVER-13537 [1172] Large skip and limit values can cause crash in blocking sort stage
SERVER-13557 [1173] Incorrect negation of $elemMatch value in 2.6
SERVER-13562 [1174] Queries that use tailable cursors do not stream results if skip() is applied
SERVER-13566 [1175] Using the OplogReplay flag with extra predicates can yield incorrect results
SERVER-13611 [1176] Missing sort order for compound index leads to unnecessary in-memory sort
SERVER-13618 [1177] Optimization for sorted $in queries not applied to reverse sort order
SERVER-13661 [1178] Increase the maximum allowed depth of query objects
SERVER-13664 [1179] Query with $elemMatch using a compound multikey index can generate incorrect results
SERVER-13677 [1180] Query planner should traverse through $all while handling $elemMatch object predicates
SERVER-13766 [1181] Dropping index or collection while $or query is yielding triggers fatal assertion
1164 https://jira.mongodb.org/browse/SERVER-13739
1165 https://jira.mongodb.org/browse/SERVER-13287
1166 https://jira.mongodb.org/browse/SERVER-13563
1167 https://jira.mongodb.org/browse/SERVER-13691
1168 https://jira.mongodb.org/browse/SERVER-13515
1169 https://jira.mongodb.org/browse/SERVER-13066
1170 https://jira.mongodb.org/browse/SERVER-13495
1171 https://jira.mongodb.org/browse/SERVER-13503
1172 https://jira.mongodb.org/browse/SERVER-13537
1173 https://jira.mongodb.org/browse/SERVER-13557
1174 https://jira.mongodb.org/browse/SERVER-13562
1175 https://jira.mongodb.org/browse/SERVER-13566
1176 https://jira.mongodb.org/browse/SERVER-13611
1177 https://jira.mongodb.org/browse/SERVER-13618
1178 https://jira.mongodb.org/browse/SERVER-13661
1179 https://jira.mongodb.org/browse/SERVER-13664
1180 https://jira.mongodb.org/browse/SERVER-13677
1181 https://jira.mongodb.org/browse/SERVER-13766
Geospatial
SERVER-13666 [1182] $near queries with out-of-bounds points in legacy format can lead to crashes
SERVER-13540 [1183] The geoNear command no longer returns distance in radians for legacy point
SERVER-13486 [1184] The geoNear command can create too large BSON objects for aggregation.
Replication
SERVER-13500 [1185] Changing replica set configuration can crash running members
SERVER-13589 [1186] Background index builds from a 2.6.0 primary fail to complete on 2.4.x secondaries
SERVER-13620 [1187] Replicated data definition commands will fail on secondaries during background index build
SERVER-13496 [1188] Creating index with same name but different spec in mixed version replicaset can abort replication
Sharding
SERVER-12638 [1189] Initial sharding with hashed shard key can result in duplicate split points
SERVER-13518 [1190] The _id field is no longer automatically generated by mongos when missing
SERVER-13777 [1191] Migrated ranges waiting for deletion do not report cursors still open
Security
SERVER-9358 [1192] Log rotation can overwrite previous log files
SERVER-13644 [1193] Sensitive credentials in startup options are not redacted and may be exposed
SERVER-13441 [1194] Inconsistent error handling in user management shell helpers
Write Operations
SERVER-13466 [1195] Error message in collection creation failure contains incorrect namespace
SERVER-13499 [1196] Yield policy for batch-inserts should be the same as for batch-updates/deletes
SERVER-13516 [1197] Array updates on documents with more than 128 BSON elements may crash mongod
1182 https://jira.mongodb.org/browse/SERVER-13666
1183 https://jira.mongodb.org/browse/SERVER-13540
1184 https://jira.mongodb.org/browse/SERVER-13486
1185 https://jira.mongodb.org/browse/SERVER-13500
1186 https://jira.mongodb.org/browse/SERVER-13589
1187 https://jira.mongodb.org/browse/SERVER-13620
1188 https://jira.mongodb.org/browse/SERVER-13496
1189 https://jira.mongodb.org/browse/SERVER-12638
1190 https://jira.mongodb.org/browse/SERVER-13518
1191 https://jira.mongodb.org/browse/SERVER-13777
1192 https://jira.mongodb.org/browse/SERVER-9358
1193 https://jira.mongodb.org/browse/SERVER-13644
1194 https://jira.mongodb.org/browse/SERVER-13441
1195 https://jira.mongodb.org/browse/SERVER-13466
1196 https://jira.mongodb.org/browse/SERVER-13499
1197 https://jira.mongodb.org/browse/SERVER-13516
Decreased mongos memory footprint when shards have several tags SERVER-16683 [1217]
Removed check for shard version if the primary server is down SERVER-16237 [1218]
Fixed: /etc/init.d/mongod startup script failure with dirname message SERVER-16081 [1219]
Fixed: mongos can cause shards to hit the in-memory sort limit by requesting more results than needed SERVER-14306 [1220]
All issues closed in 2.6.7 [1221]
Fixed: Evaluating candidate query plans with concurrent writes on same collection may crash mongod SERVER-15580 [1222]
Fixed: 2.6 mongod crashes with segfault when added to a 2.8 replica set with 12 or more members SERVER-16107 [1223]
Fixed: $regex, $in and $sort with index returns too many results SERVER-15696 [1224]
Change: moveChunk will fail if there is data on the target shard and a required index does not exist. SERVER-12472 [1225]
Primary should abort if encountered problems writing to the oplog SERVER-12058 [1226]
All issues closed in 2.6.6 [1227]
Plan ranker will no longer favor intersection plans if predicate generates empty range index scan SERVER-14961 [1231]
Generate Community and Enterprise packages for SUSE 11 SERVER-10642 [1232]
All issues closed in 2.6.5 [1233]
Fix for text index where under specific circumstances, in-place updates to a text-indexed field may result in incorrect/incomplete results SERVER-14738 [1234]
Check the size of the split point before performing a manual split chunk operation SERVER-14431 [1235]
Ensure read preferences are re-evaluated by drawing secondary connections from a global pool and releasing back to the pool at the end of a query/command SERVER-9788 [1236]
Allow read from secondaries when both audit and authorization are enabled in a sharded cluster SERVER-14170 [1237]
All issues closed in 2.6.4 [1238]
Equality queries on _id with projection may return no results on sharded collections SERVER-14302 [1239].
Equality queries on _id with projection on _id may return orphan documents on sharded collections SERVER-14304 [1240].
All issues closed in 2.6.3 [1241].
Query plans with differing performance can tie during plan ranking SERVER-13675 [1242].
mongod may terminate if x.509 authentication certificate is invalid SERVER-13753 [1243].
Temporary map/reduce collections are incorrectly replicated to secondaries SERVER-13981 [1244].
mongos incorrectly targets multiple shards for nested field shard key predicates SERVER-14138 [1245].
rs.stepDown() during mapReduce causes fassert when writing to op log SERVER-14186 [1246].
1231 https://jira.mongodb.org/browse/SERVER-14961
1232 https://jira.mongodb.org/browse/SERVER-10642
1233 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.6.5%22%20AND%20project%20%3D%20SERVER
1234 https://jira.mongodb.org/browse/SERVER-14738
1235 https://jira.mongodb.org/browse/SERVER-14431
1236 https://jira.mongodb.org/browse/SERVER-9788
1237 https://jira.mongodb.org/browse/SERVER-14170
1238 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.6.4%22%20AND%20project%20%3D%20SERVER
1239 https://jira.mongodb.org/browse/SERVER-14302
1240 https://jira.mongodb.org/browse/SERVER-14304
1241 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.6.3%22%20AND%20project%20%3D%20SERVER
1242 https://jira.mongodb.org/browse/SERVER-13675
1243 https://jira.mongodb.org/browse/SERVER-13753
1244 https://jira.mongodb.org/browse/SERVER-13981
1245 https://jira.mongodb.org/browse/SERVER-14138
1246 https://jira.mongodb.org/browse/SERVER-14186
Fix to install MongoDB service on Windows with the --install option SERVER-13515 [1248].
Allow direct upgrade from 2.4.x to 2.6.0 via yum SERVER-13563 [1249].
Fix issues with background index builds on secondaries: SERVER-13589 [1250] and SERVER-13620 [1251].
Redact credential information passed as startup options SERVER-13644 [1252].
2.6.1 Changelog (page 980).
All issues closed in 2.6.1 [1253].
Major Changes
The following changes in MongoDB affect both the standard and Enterprise editions:
Aggregation Enhancements
The aggregation pipeline adds the ability to return result sets of any size, either by returning a cursor or writing the
output to a collection. Additionally, the aggregation pipeline supports variables and adds new operations to handle sets
and redact data.
The db.collection.aggregate() method now returns a cursor, which enables the aggregation pipeline to return
result sets of any size.
Aggregation pipelines now support an explain operation to aid analysis of aggregation operations.
Aggregation can now use a more efficient external-disk-based sorting process.
New pipeline stages:
$out stage to output to a collection.
$redact stage to allow additional control over access to the data.
New or modified operators:
set expression operators.
$let and $map operators to allow for the use of variables.
$literal operator and $size operator.
$cond expression now accepts either an object or an array.
1247 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.6.2%22%20AND%20project%20%3D%20SERVER
1248 https://jira.mongodb.org/browse/SERVER-13515
1249 https://jira.mongodb.org/browse/SERVER-13563
1250 https://jira.mongodb.org/browse/SERVER-13589
1251 https://jira.mongodb.org/browse/SERVER-13620
1252 https://jira.mongodb.org/browse/SERVER-13644
1253 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.6.1%22%20AND%20project%20%3D%20SERVER
Text search is now enabled by default, and the query system, including the aggregation pipeline $match stage,
includes the $text operator, which resolves text-search queries.
MongoDB 2.6 includes an updated text index (page 508) format and deprecates the text command.
Improvements to the update and insert systems include additional operations and improvements that increase
consistency of modified data.
MongoDB preserves the order of the document fields following write operations except for the following cases:
The _id field is always the first field in the document.
Updates that include renaming of field names may result in the reordering of fields in the document.
New or enhanced update operators:
$bit operator supports bitwise xor operation.
$min and $max operators that perform conditional update depending on the relative size of the specified
value and the current value of a field.
$push operator has enhanced support for the $sort, $slice, and $each modifiers and supports a new
$position modifier.
$currentDate operator to set the value of a field to the current date.
The $mul operator for multiplicative increments for insert and update operations.
See also:
Update Operator Syntax Validation (page 996)
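The conditional behavior of $min and $max can be modeled in plain JavaScript. The sketch below is illustrative only: applyMin and applyMax are hypothetical helpers, not MongoDB APIs, and they assume numeric values (the server compares mixed types using BSON comparison order, which is not modeled here).

```javascript
// Hypothetical model of { $min: { <field>: <value> } } for numeric values.
// The field changes only when the specified value is less than the current
// value; a missing field is set to the specified value.
function applyMin(doc, field, value) {
    if (!(field in doc) || value < doc[field]) {
        doc[field] = value;
    }
    return doc;
}

// Hypothetical model of { $max: { <field>: <value> } } for numeric values.
// The field changes only when the specified value is greater than the current
// value; a missing field is set to the specified value.
function applyMax(doc, field, value) {
    if (!(field in doc) || value > doc[field]) {
        doc[field] = value;
    }
    return doc;
}
```

In the mongo shell, the corresponding update would look like db.scores.update( { _id: 1 }, { $min: { lowScore: 150 } } ), where scores and lowScore are assumed names.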
A new write protocol integrates write operations with write concerns. The protocol also provides improved support
for bulk operations.
MongoDB 2.6 adds the write commands insert, update, and delete, which provide the basis for the improved
bulk insert. All officially supported MongoDB drivers support the new write commands.
The mongo shell now includes methods to perform bulk-write operations. See Bulk() for more information.
See also:
Write Method Acknowledgements (page 992)
MongoDB now distributes MSI packages for Microsoft Windows. This is the recommended method for MongoDB
installation under Windows.
Security Improvements
MongoDB 2.6 enhances support for secure deployments through improved SSL support, x.509-based authentication,
an improved authorization system with more granular controls, as well as centralized credential storage, and improved
user management tools.
Specifically these changes include:
A new authorization model (page 331) that provides the ability to create custom User-Defined Roles (page 335)
and the ability to specify user privileges at a collection-level granularity.
Global user management, which stores all user and user-defined role data in the admin database and provides
a new set of commands for managing users and roles.
x.509 certificate authentication for client authentication (page 353) as well as for internal authentication
(page 355) of sharded and/or replica set cluster members. x.509 authentication is only available for
deployments using SSL.
Enhanced SSL Support:
Rolling upgrades of clusters (page 390) to use SSL.
MongoDB Tools (page 389) support connections to mongod and mongos instances using SSL connections.
Prompt for passphrase (page 386) by mongod or mongos at startup.
Require the use of strong SSL ciphers, with a minimum 128-bit key length for all connections. The
strong-cipher requirement prevents an old or malicious client from forcing use of a weak cipher.
MongoDB disables the http interface by default, limiting network exposure (page 343). To enable the interface,
see the net.http.enabled setting.
See also:
New Authorization Model (page 994), SSL Certificate Hostname Validation (page 994), and Security Checklist
(page 315).
MongoDB can now use index intersection (page 524) to fulfill queries supported by more than one index.
Index Filters (page 73) to limit which indexes can become the winning plan for a query.
Query plan cache methods (https://docs.mongodb.org/manual/reference/method/js-plan-cache) to view
and clear the query plans (page 72) cached by the query optimizer.
MongoDB can now use count() with hint(). See count() for details.
Improvements
Geospatial Enhancements
See also:
2dsphere Index Version 2 (page 995), $maxDistance Changes (page 997), Deprecated $uniqueDocs (page 998),
Stronger Validation of Geospatial Queries (page 998)
Background index build (page 522) allowed on secondaries. If you initiate a background index build on a
primary, the secondaries will replicate the index build in the background.
Automatic rebuild of interrupted index builds after a restart.
If a standalone or a primary instance terminates during an index build without a clean shutdown, mongod
now restarts the index build when the instance restarts. If the instance shuts down cleanly or if a user kills
the index build, the interrupted index builds do not automatically restart upon the restart of the server.
If a secondary instance terminates during an index build, the mongod instance will now restart the
interrupted index build when the instance restarts.
To disable this behavior, use the --noIndexBuildRetry command-line option.
ensureIndex() now wraps a new createIndex command.
The dropDups option to ensureIndex() and createIndex is deprecated.
See also:
Enforce Index Key Length Limit (page 991)
MongoDB 2.6 supports a YAML-based configuration file format in addition to the previous configuration file format.
See the documentation of the Configuration File for more information.
Operational Changes
Storage
usePowerOf2Sizes is now the default allocation strategy for all new collections. The new allocation strategy uses
more storage relative to total document size but results in lower levels of storage fragmentation and more predictable
storage capacity planning over time.
To use the previous exact-fit allocation strategy:
For a specific collection, use collMod with usePowerOf2Sizes set to false.
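As a minimal sketch of the per-collection setting, the command document below could be passed to db.runCommand() in the mongo shell. The collection name mycollection is an assumption chosen for illustration.

```javascript
// Hypothetical collection name, for illustration only.
var collectionName = "mycollection";

// Command document that turns off usePowerOf2Sizes for one collection.
var disablePowerOf2 = { collMod: collectionName, usePowerOf2Sizes: false };

// In the mongo shell, against the database that holds the collection:
// db.runCommand( disablePowerOf2 );
```

The setting applies to new allocations only, so existing records keep the space they were already given.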
Networking
Removed the upper limit for the maxIncomingConnections setting for mongod and mongos. Previous versions
capped the maximum possible maxIncomingConnections setting at 20,000 connections.
Connection pools for a mongos instance may be used by multiple MongoDB servers. This can reduce the
number of connections needed for high-volume workloads and reduce resource consumption in sharded clusters.
The C++ driver now monitors replica set health with the isMaster command instead of
replSetGetStatus. This allows the C++ driver to support systems that require authentication.
New cursor.maxTimeMS() and corresponding maxTimeMS option for commands to specify a time limit.
Tool Improvements
MongoDB Enterprise for Windows (page 52) is now available. It includes support for Kerberos, SSL, and SNMP.
MongoDB Enterprise for Windows does not include LDAP support for authentication. However, MongoDB Enterprise
for Linux supports using LDAP authentication with an ActiveDirectory server.
MongoDB Enterprise for Windows includes OpenSSL version 1.0.1g.
1254 https://docs.mongodb.org/v2.6/core/storage
Auditing
MongoDB Enterprise adds auditing (page 340) capability for mongod and mongos instances. See Auditing
(page 340) for details.
MongoDB Enterprise provides support for proxy authentication of users. This allows administrators to configure a
MongoDB cluster to authenticate users by proxying authentication requests to a specified Lightweight Directory
Access Protocol (LDAP) service. See Authenticate Using SASL and LDAP with OpenLDAP (page 370) and Authenticate
Using SASL and LDAP with ActiveDirectory (page 367) for details.
MongoDB Enterprise for Windows does not include LDAP support for authentication. However, MongoDB Enterprise
for Linux supports using LDAP authentication with an ActiveDirectory server.
MongoDB does not support LDAP authentication in mixed sharded cluster deployments that contain both version 2.4
and version 2.6 shards. See Upgrade MongoDB to 2.6 (page 1001) for upgrade instructions.
MongoDB Enterprise has greatly expanded its SNMP support to provide SNMP access to nearly the full range of
metrics provided by db.serverStatus().
See also:
SNMP Changes (page 995)
Additional Information
Compatibility Changes in MongoDB 2.6
On this page
Index Changes (page 991)
Write Method Acknowledgements (page 992)
db.collection.aggregate() Change (page 993)
Write Concern Validation (page 994)
Security Changes (page 994)
2dsphere Index Version 2 (page 995)
Log Messages (page 995)
Package Configuration Changes (page 995)
Remove Method Signature Change (page 996)
Update Operator Syntax Validation (page 996)
Updates Enforce Field Name Restrictions (page 996)
Query and Sort Changes (page 996)
Replica Set/Sharded Cluster Validation (page 1000)
Time Format Changes (page 1000)
Other Resources (page 1000)
The following 2.6 changes can affect the compatibility with older versions of MongoDB. See Release Notes for
MongoDB 2.6 (page 959) for the full list of the 2.6 changes.
Index Changes
db.collection.ensureIndex() now errors if you specify an index name that already exists but the key
specifications differ; e.g. in the following example, the second db.collection.ensureIndex() will error.
db.mycollection.ensureIndex( { a: 1 }, { name: "myIdx" } )
db.mycollection.ensureIndex( { z: 1 }, { name: "myIdx" } )
Previous versions did not create the index but did not error.
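A script could check for this kind of conflict before calling ensureIndex(). The sketch below is one possible approach, not part of the mongo shell: findIndexNameConflict is a hypothetical helper, and existingIndexes is assumed to be an array of index documents with name and key fields, as returned by db.collection.getIndexes().

```javascript
// Hypothetical helper: returns the existing index document that has the same
// name but a different key specification, or null when there is no conflict.
function findIndexNameConflict(existingIndexes, keySpec, name) {
    for (var i = 0; i < existingIndexes.length; i++) {
        var idx = existingIndexes[i];
        // JSON.stringify preserves field order, which matters for index keys.
        if (idx.name === name &&
            JSON.stringify(idx.key) !== JSON.stringify(keySpec)) {
            return idx;
        }
    }
    return null;
}
```

In the shell this might be called as findIndexNameConflict( db.mycollection.getIndexes(), { z: 1 }, "myIdx" ); a non-null result means the ensureIndex() call would error in 2.6.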
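In previous versions, the mongo shell, when used interactively, automatically called the getLastError command after a
write method to provide acknowledgment of the write. Scripts, however, would observe fire-and-forget behavior in previous versions unless the
scripts included an explicit call to the getLastError command after a write method.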
Solution Scripts that used these mongo shell methods for bulk write operations with fire-and-forget behavior should
use the Bulk() methods.
In sharded environments, applications using any driver or mongo shell should use Bulk() methods for optimal
performance when inserting or modifying groups of documents.
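For example, instead of:
for (var i = 1; i <= 1000000; i++) {
    db.test.insert( { x : i } );
}
in 2.6, use a Bulk() operation:
var bulk = db.test.initializeUnorderedBulkOp();
for (var i = 1; i <= 1000000; i++) {
    bulk.insert( { x : i } );
}
bulk.execute( { w: 1 } );
The Bulk.execute() method returns a BulkWriteResult object that contains the result of the operation.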
See also:
New Write Operation Protocol (page 986), Bulk(), Bulk.execute(),
db.collection.initializeUnorderedBulkOp(), db.collection.initializeOrderedBulkOp()
db.collection.aggregate() Change
Description The db.collection.aggregate() method in the mongo shell defaults to returning a cursor to
the result set. This change enables the aggregation pipeline to return result sets of any size and requires cursor
iteration to access the result set. For example:
var myCursor = db.orders.aggregate( [
    {
        $group: {
            _id: "$cust_id",
            total: { $sum: "$price" }
        }
    }
] );
// Iterate the cursor to access each result document.
myCursor.forEach( function(x) { printjson(x); } );
Previous versions returned a single document with a field result that contained an array of the result set,
subject to the BSON Document size limit. Accessing the result set in the previous versions of MongoDB required
accessing the result field and iterating the array. For example:
var returnedDoc = db.orders.aggregate( [
    {
        $group: {
            _id: "$cust_id",
            total: { $sum: "$price" }
        }
    }
] );
// Access the array in the result field, then iterate it.
var myArray = returnedDoc.result;
myArray.forEach( function(x) { printjson(x); } );
Solution Update scripts that currently expect db.collection.aggregate() to return a document with a
results array to handle cursors instead.
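Scripts that must run against both shell versions could normalize the return value first. The following is a sketch under stated assumptions: aggregateToArray is a hypothetical helper, the cursor is assumed to expose toArray(), and the pre-2.6 document is assumed to carry its array in a field named result (results is also accepted defensively).

```javascript
// Hypothetical helper: convert either return shape of aggregate() to an array.
function aggregateToArray(res) {
    // 2.6 shell: a cursor, which exposes toArray().
    if (res && typeof res.toArray === "function") {
        return res.toArray();
    }
    // Pre-2.6 shell: a single document wrapping the array of results.
    if (res && Array.isArray(res.result)) {
        return res.result;
    }
    if (res && Array.isArray(res.results)) {
        return res.results;
    }
    throw new Error("unrecognized aggregate() return value");
}
```

Because the pre-2.6 form is bound by the BSON document size limit, a helper like this is only a transition aid; iterating the cursor directly avoids holding the whole result set in memory.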
See also:
Aggregation Enhancements (page 985), db.collection.aggregate()
Security Changes
Important: Before upgrading the authorization model, you should first upgrade MongoDB binaries to 2.6.
For sharded clusters, ensure that all cluster components are 2.6. If there are users in any database, be sure you
have at least one user in the admin database with the role userAdminAnyDatabase (page 421) before
upgrading the MongoDB binaries.
See also:
Security Improvements (page 987)
When using the allowInvalidCertificates setting, MongoDB logs as a warning the use of the invalid
certificates.
Warning: The allowInvalidCertificates setting bypasses the other certificate validation, such as
checks for expiration and valid signatures.
Log Messages
SNMP Changes
Description
The IANA enterprise identifier for MongoDB changed from 37601 to 34601.
MongoDB changed the MIB field name globalopcounts to globalOpcounts.
Solution
Users of SNMP monitoring must modify their SNMP configuration (i.e. MIB) from 37601 to 34601.
Update references to globalopcounts to globalOpcounts.
Update operators (e.g. $set) cannot repeat in the update statement. For example, the following
expression is invalid:
{ $set: { a: 5 }, $set: { b: 5 } }
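One way to avoid a repeated operator is to list all field-value pairs under a single operator, e.g. { $set: { a: 5, b: 5 } }. When update documents are assembled from several parts, a script could merge them first. The helper below, combineUpdates, is hypothetical and shown only as a sketch; later values overwrite earlier ones for the same field.

```javascript
// Hypothetical helper: merge several update documents so that each update
// operator appears exactly once in the combined document.
function combineUpdates(updates) {
    var combined = {};
    updates.forEach(function (update) {
        Object.keys(update).forEach(function (op) {
            combined[op] = combined[op] || {};
            Object.keys(update[op]).forEach(function (field) {
                combined[op][field] = update[op][field];
            });
        });
    });
    return combined;
}

// Yields a document with a single $set that carries both fields.
var merged = combineUpdates( [ { $set: { a: 5 } }, { $set: { b: 5 } } ] );
```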
Geospatial Changes
$maxDistance Changes
Description
For $near queries on GeoJSON data, if the queries specify a $maxDistance, $maxDistance must
be inside of the $near document.
In previous versions, $maxDistance could be either inside or outside the $near document.
$maxDistance must be a positive value.
Solution
Update any existing $near queries on GeoJSON data that currently have the $maxDistance outside
the $near document.
Update any existing queries where $maxDistance is a negative value.
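A query document that satisfies both requirements could look like the sketch below. The field name loc, the coordinates, and the collection name places are assumptions chosen for illustration.

```javascript
// Hypothetical field name ("loc") and coordinates, for illustration only.
var nearQuery = {
    loc: {
        $near: {
            $geometry: { type: "Point", coordinates: [ -73.9667, 40.78 ] },
            // $maxDistance sits inside the $near document and is positive.
            // For GeoJSON points the distance is expressed in meters.
            $maxDistance: 1000
        }
    }
};

// In the mongo shell (collection name "places" is assumed):
// db.places.find( nearQuery );
```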
Deprecated $uniqueDocs
Description MongoDB 2.6 deprecates $uniqueDocs, and geospatial queries no longer return duplicated results
when a document matches the query multiple times.
The following query uses the index to search for documents where price is not greater than or equal to
50:
db.orders.find( { price: { $not: { $gte: 50 } } } )
In previous versions, indexed plans would only return matching documents where the type of the field
matches the type of the query predicate:
{ "_id" : 1, "status" : "A", "cust_id" : "123", "price" : 40 }
If using a collection scan, previous versions would return the same results as those in 2.6.
MongoDB 2.6 allows chaining of $not expressions.
$exists and notablescan If the MongoDB server has disabled collection scans, i.e. notablescan, then
$exists queries that have no indexed solution will error.
In 2.6, the following $elemMatch query does not match the document:
db.test.find( { a: { $elemMatch: { $gt: 1, $lt: 5 } } } )
Solution Update existing queries that rely upon the old behavior.
Text Search Compatibility MongoDB does not support the use of the $text query operator in mixed sharded
cluster deployments that contain both version 2.4 and version 2.6 shards. See Upgrade MongoDB to 2.6 (page 1001)
for upgrade instructions.
Time Format Changes MongoDB now uses iso8601-local when formatting time data in many out-
puts. This format follows the template YYYY-MM-DDTHH:mm:ss.mmm<+/-Offset>. For example,
2014-03-04T20:13:38.944-0500.
This change impacts all clients using Extended JSON in Strict mode, such as mongoexport and the REST and
HTTP Interfaces [1256].
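To make the template concrete, the sketch below builds a string in the YYYY-MM-DDTHH:mm:ss.mmm<+/-Offset> form from a Date and a UTC offset in minutes. formatIso8601Local and pad are hypothetical helpers written for illustration; they are not how MongoDB itself produces the output.

```javascript
// Left-pad a non-negative integer with zeros to the given width.
function pad(n, width) {
    var s = String(n);
    while (s.length < width) {
        s = "0" + s;
    }
    return s;
}

// Hypothetical formatter: offsetMinutes is the local offset from UTC,
// e.g. -300 for UTC-05:00.
function formatIso8601Local(date, offsetMinutes) {
    // Shift the instant by the offset, then read the UTC fields of the
    // shifted value as the local wall-clock time.
    var shifted = new Date(date.getTime() + offsetMinutes * 60000);
    var sign = offsetMinutes < 0 ? "-" : "+";
    var abs = Math.abs(offsetMinutes);
    return pad(shifted.getUTCFullYear(), 4) + "-" +
        pad(shifted.getUTCMonth() + 1, 2) + "-" +
        pad(shifted.getUTCDate(), 2) + "T" +
        pad(shifted.getUTCHours(), 2) + ":" +
        pad(shifted.getUTCMinutes(), 2) + ":" +
        pad(shifted.getUTCSeconds(), 2) + "." +
        pad(shifted.getUTCMilliseconds(), 3) +
        sign + pad(Math.floor(abs / 60), 2) + pad(abs % 60, 2);
}
```

With the instant 2014-03-05 01:13:38.944 UTC and an offset of -300 minutes, the function returns 2014-03-04T20:13:38.944-0500, matching the example above.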
Other Resources
All backwards incompatible changes (JIRA) [1257].
Release Notes for MongoDB 2.6 (page 959).
1256 https://docs.mongodb.org/ecosystem/tools/http-interfaces
1257 https://jira.mongodb.org/issues/?jql=project%20%3D%20SERVER%20AND%20fixVersion%20in%20(%222.5.0%22%2C%20%222.5.1%22%2C%20%222.5.2%22
rc1%22%2C%20%222.6.0-rc2%22)%20AND%20%22Backwards%20Compatibility%22%20in%20%20(%22Major%20Change%22%2C%20%22Minor%20Change%2
Upgrade Process
In the general case, the upgrade from MongoDB 2.4 to 2.6 is a binary-compatible drop-in upgrade: shut down the
mongod instances and replace them with mongod instances running 2.6. However, before you attempt any upgrade,
familiarize yourself with the content of this document, particularly the Upgrade Recommendations and Checklists
(page 1001), the procedure for upgrading sharded clusters (page 1003), and the considerations for reverting to 2.4
after running 2.6 (page 1007).
Upgrade Requirements To upgrade an existing MongoDB deployment to 2.6, you must be running 2.4. If you're
running a version of MongoDB before 2.4, you must upgrade to 2.4 before upgrading to 2.6. See Upgrade MongoDB
to 2.4 (page 1028) for the procedure to upgrade from 2.2 to 2.4.
If you use MongoDB Cloud Manager [1259] Backup, ensure that you're running at least version v20131216.1 of the
Backup agent before upgrading. Version 1.4.0 of the backup agent followed v20131216.1.
Preparedness Before upgrading MongoDB, always test your application in a staging environment before deploying
the upgrade to your production environment.
To begin the upgrade procedure, connect a 2.6 mongo shell to your MongoDB 2.4 mongos or mongod and run
db.upgradeCheckAllDBs() to check your data set for compatibility. This is a preliminary automated check.
Assess and resolve all issues identified by db.upgradeCheckAllDBs().
Some changes in MongoDB 2.6 require manual checks and intervention. See Compatibility Changes in MongoDB 2.6
(page 990) for an explanation of these changes. Resolve all incompatibilities in your deployment before continuing.
For a deployment that uses authentication and authorization, be sure you have at least one user in the admin database
with the role userAdminAnyDatabase (page 421) before upgrading the MongoDB binaries. For deployments
currently using authentication and authorization, see the consideration for deployments that use authentication and
authorization (page 1002).
1258 https://jira.mongodb.org/issues/?jql=project%20%3D%20SERVER%20AND%20fixVersion%20in%20(%222.5.0%22%2C%20%222.5.1%22%2C%20%222.5.2%22
rc1%22%2C%20%222.6.0-rc2%22%2C%20%222.6.0-rc3%22)%20AND%20%22Backwards%20Compatibility%22%20in%20(%20%22Minor%20Change%22%2C%2
1259 https://cloud.mongodb.com/?jmp=docs
Authentication MongoDB 2.6 includes significant changes to the authorization model, which requires changes to
the way that MongoDB stores user credentials. As a result, in addition to upgrading MongoDB processes, if your
deployment uses authentication and authorization, after upgrading all MongoDB processes to 2.6 you must also upgrade
the authorization model.
Before beginning the upgrade process for a deployment that uses authentication and authorization:
Ensure that at least one user exists in the admin database with the role userAdminAnyDatabase
(page 421).
If your application performs CRUD operations on the <database>.system.users collection or uses a
db.addUser()-like method, then you must upgrade those drivers (i.e. client libraries) before mongod or
mongos instances.
You must fully complete the upgrade procedure for all MongoDB processes before upgrading the authorization
model.
After you begin to upgrade a MongoDB deployment that uses authentication to 2.6, you cannot modify existing user
data until you complete the authorization user schema upgrade (page 1005).
See Upgrade User Authorization Data to 2.6 Format (page 1005) for a complete discussion of the upgrade procedure
for the authorization model including additional requirements and procedures.
Downgrade Limitations Once upgraded to MongoDB 2.6, you cannot downgrade to any version earlier than
MongoDB 2.4. If you created text or 2dsphere indexes while running 2.6, you can only downgrade to MongoDB
2.4.10 or later.
Package Upgrades If you installed MongoDB from the MongoDB apt or yum repositories, upgrade to 2.6 using
the package manager.
For Debian, Ubuntu, and related operating systems, type these commands:
sudo apt-get update
sudo apt-get install mongodb-org
If you did not install the mongodb-org package and instead installed a subset of MongoDB components, replace
mongodb-org in the commands above with the appropriate package names.
See installation instructions for Ubuntu (page 17), RHEL (page 7), Debian (page 20), or other Linux Systems (page 23)
for a list of the available packages and complete MongoDB installation instructions.
Upgrade Standalone mongod Instance to MongoDB 2.6 The following steps outline the procedure to upgrade a
standalone mongod from version 2.4 to 2.6. To upgrade from version 2.2 to 2.6, upgrade to version 2.4 (page 1028)
first, and then follow the procedure to upgrade from 2.4 to 2.6.
1. Download binaries of the latest release in the 2.6 series from the MongoDB Download Page [1260]. See Install
MongoDB (page 5) for more information.
2. Shut down your mongod instance. Replace the existing binary with the 2.6 mongod binary and restart
mongod.
1260 http://www.mongodb.org/downloads
Upgrade a Replica Set to 2.6 The following steps outline the procedure to upgrade a replica set from MongoDB
2.4 to MongoDB 2.6. To upgrade from MongoDB 2.2 to 2.6, upgrade all members of the replica set to version 2.4
(page 1028) first, and then follow the procedure to upgrade from MongoDB 2.4 to 2.6.
You can upgrade from MongoDB 2.4 to 2.6 using a rolling upgrade to minimize downtime by upgrading the
members individually while the other members are available:
Step 1: Upgrade secondary members of the replica set. Upgrade the secondary members of the set one at a time
by shutting down the mongod and replacing the 2.4 binary with the 2.6 binary. After upgrading a mongod instance,
wait for the member to recover to SECONDARY state before upgrading the next instance. To check the member's state,
issue rs.status() in the mongo shell.
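The state check in Step 1 can be sketched as a small helper. This is only an illustration: it assumes a status document shaped like the output of rs.status(), with a members array whose entries carry name and stateStr fields. The function name isSecondary and the sample host names are hypothetical.

```javascript
// Hypothetical helper: returns true when the named member reports SECONDARY.
// `status` is assumed to look like the document returned by rs.status().
function isSecondary(status, memberName) {
  var members = status.members || [];
  for (var i = 0; i < members.length; i++) {
    if (members[i].name === memberName) {
      return members[i].stateStr === "SECONDARY";
    }
  }
  return false; // unknown member: treat it as not ready
}

// Sample document for illustration only; in the mongo shell you would
// pass the result of rs.status() instead.
var sampleStatus = {
  members: [
    { name: "db1.example.net:27017", stateStr: "PRIMARY" },
    { name: "db2.example.net:27017", stateStr: "RECOVERING" },
    { name: "db3.example.net:27017", stateStr: "SECONDARY" }
  ]
};
```

A wrapper script could call isSecondary repeatedly for the member it has just restarted and move on to the next member only once the call returns true.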
Step 2: Step down the replica set primary. Use rs.stepDown() in the mongo shell to step down the primary
and force the set to failover (page 635). rs.stepDown() expedites the failover procedure and is preferable to
shutting down the primary directly.
Step 3: Upgrade the primary. When rs.status() shows that the primary has stepped down and another member
has assumed PRIMARY state, shut down the previous primary, replace the mongod binary with the 2.6 binary,
and start the new instance.
Replica set failover is not instant: the set cannot accept writes until the failover process completes.
This typically takes 30 seconds or more, so schedule the upgrade procedure during a maintenance window.
Upgrade a Sharded Cluster to 2.6 Only upgrade sharded clusters to 2.6 if all members of the cluster are currently
running instances of 2.4. The only supported upgrade path for sharded clusters running 2.2 is via 2.4. The upgrade
process checks all components of the cluster and will produce warnings if any component is running version 2.2.
Considerations The upgrade process does not require any downtime. However, while you upgrade the sharded
cluster, ensure that clients do not make changes to the collection meta-data. For example, during the upgrade, do not
do any of the following:
sh.enableSharding()
sh.shardCollection()
sh.addShard()
db.createCollection()
db.collection.drop()
db.dropDatabase()
any operation that creates a database
any other operation that modifies the cluster metadata in any way. See Sharding Reference (page 814) for a
complete list of sharding commands. Note, however, that not all commands on the Sharding Reference (page 814)
page modify the cluster metadata.
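As a rough illustration of the list above, a maintenance script could screen its planned operations before the upgrade starts. The sketch below treats operation names as plain strings. The names BLOCKED_DURING_UPGRADE and operationsToPostpone are hypothetical, and the list covers only the methods named explicitly above, not every operation that creates a database or otherwise modifies cluster metadata.

```javascript
// Methods named explicitly in the list above. This is NOT exhaustive:
// any operation that creates a database or changes cluster metadata
// must also wait until the upgrade is finished.
var BLOCKED_DURING_UPGRADE = [
  "sh.enableSharding",
  "sh.shardCollection",
  "sh.addShard",
  "db.createCollection",
  "db.collection.drop",
  "db.dropDatabase"
];

// Hypothetical check: returns the planned operations that should be postponed.
function operationsToPostpone(plannedOps) {
  return plannedOps.filter(function (op) {
    return BLOCKED_DURING_UPGRADE.indexOf(op) !== -1;
  });
}
```

For example, operationsToPostpone(["db.collection.find", "sh.addShard"]) returns an array containing only "sh.addShard".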
Upgrade Sharded Clusters Optional but Recommended. As a precaution, take a backup of the config database
before upgrading the sharded cluster.
Step 1: Disable the Balancer. Turn off the balancer (page 750) in the sharded cluster, as described in Disable the
Balancer (page 794).
Step 2: Upgrade the cluster's metadata. Start a single 2.6 mongos instance with the configDB pointing to the
cluster's config servers and with the --upgrade option.
To run a mongos with the --upgrade option, you can upgrade an existing mongos instance to 2.6, or if you need
to avoid reconfiguring a production mongos instance, you can use a new 2.6 mongos that can reach all the config
servers.
To upgrade the metadata, run:
mongos --configdb <configDB string> --upgrade
You can include the --logpath option to output the log messages to a file instead of the standard output. Also
include any other options required to start mongos instances in your cluster, such as --sslOnNormalPorts or
--sslPEMKeyFile.
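If the one-off upgrade run is launched from a script, the argument list can be assembled as sketched below. The helper name mongosUpgradeArgs is hypothetical. It uses only the flags named above (--configdb, --upgrade and --logpath), and any SSL or other options your cluster needs would have to be appended separately.

```javascript
// Hypothetical helper: builds the argument list for the single
// "mongos --upgrade" run. The log path is optional.
function mongosUpgradeArgs(configDB, logPath) {
  var args = ["--configdb", configDB, "--upgrade"];
  if (logPath) {
    args.push("--logpath", logPath);
  }
  return args;
}
```

With a placeholder config string such as "cfg1.example.net:27019,cfg2.example.net:27019,cfg3.example.net:27019", the helper returns the three base arguments, followed by the --logpath pair when a log path is supplied.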
The mongos will exit upon completion of the --upgrade process.
The upgrade will prevent any chunk moves or splits from occurring during the upgrade process. If the data files have
many sharded collections or if failed processes hold stale locks, acquiring the locks for all collections can take seconds
or minutes. Watch the log for progress updates.
Step 3: Ensure mongos --upgrade process completes successfully. The mongos will exit upon completion
of the metadata upgrade process. If successful, the process will log the following messages:
upgrade of config server to v5 successful
Config database is at version v5
After a successful upgrade, restart the mongos instance. If mongos fails to start, check the log for more information.
If the mongos instance loses its connection to the config servers during the upgrade or if the upgrade is otherwise
unsuccessful, you may always safely retry the upgrade.
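A script that captures the mongos log can look for the two success messages quoted above. The following is a minimal sketch, assuming the log is available as one string. The names SUCCESS_MESSAGES and upgradeSucceeded are hypothetical, and substring matching is used because real log lines carry prefixes such as timestamps.

```javascript
// The two messages quoted in Step 3.
var SUCCESS_MESSAGES = [
  "upgrade of config server to v5 successful",
  "Config database is at version v5"
];

// Hypothetical check: true only if every success message appears in the log text.
function upgradeSucceeded(logText) {
  return SUCCESS_MESSAGES.every(function (msg) {
    return logText.indexOf(msg) !== -1;
  });
}
```

If the check returns false, the text above applies: inspect the log for details and retry the upgrade.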
Step 4: Upgrade the remaining mongos instances to v2.6. Upgrade the other mongos instances in the sharded
cluster and restart them without the --upgrade option. After upgrading all the mongos, see Complete Sharded Cluster
Upgrade (page 1004) for information on upgrading the other cluster components.
Complete Sharded Cluster Upgrade After you have successfully upgraded all mongos instances, you can upgrade
the other instances in your MongoDB deployment.
Warning: Do not upgrade mongod instances until after you have upgraded all mongos instances.
While the balancer is still disabled, upgrade the components of your sharded cluster in the following order:
Upgrade all 3 mongod config server instances, leaving the first system in the mongos --configdb argument
to upgrade last.
Upgrade each shard, one at a time, upgrading the mongod secondaries before running replSetStepDown
and upgrading the primary of each shard.
When this process is complete, re-enable the balancer (page 795).
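The config server ordering described above (the first system in the --configdb argument goes last) can be derived from the config string itself. This is a sketch under the assumption that the string is a comma-separated host list. The function name configServerUpgradeOrder is hypothetical.

```javascript
// Hypothetical helper: given the --configdb string, returns the hosts in
// the order they should be upgraded, with the first listed host moved last.
function configServerUpgradeOrder(configDB) {
  var hosts = configDB.split(",").map(function (h) {
    return h.trim();
  }).filter(function (h) {
    return h.length > 0;
  });
  return hosts.slice(1).concat(hosts.slice(0, 1));
}
```

For a placeholder string "cfg1:27019,cfg2:27019,cfg3:27019" the result lists cfg2 and cfg3 first and cfg1 last.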
Upgrade Procedure Once upgraded to MongoDB 2.6, you cannot downgrade to any version earlier than MongoDB
2.4. If you have text or 2dsphere indexes, you can only downgrade to MongoDB 2.4.10 or later.
Except as described on this page, moving between 2.4 and 2.6 is a drop-in replacement:
Step 1: Stop the existing mongod instance. For example, on Linux, run 2.4 mongod with the --shutdown
option as follows:
mongod --dbpath /var/mongod/data --shutdown
Replace /var/mongod/data with your MongoDB dbPath. See also the Stop mongod Processes (page 246) for
alternate methods of stopping a mongod instance.
Step 2: Start the new mongod instance. Ensure you start the 2.6 mongod with the same dbPath:
mongod --dbpath /var/mongod/data
Upgrade User Authorization Data to 2.6 Format
On this page
Considerations (page 1005)
Requirements (page 1006)
Procedure (page 1006)
Result (page 1006)
MongoDB 2.6 includes significant changes to the authorization model, which require changes to the way that
MongoDB stores user credentials. As a result, if your deployment uses authentication and authorization, you must
upgrade the authorization model in addition to the MongoDB processes, after upgrading all MongoDB processes
to 2.6.
Considerations
Complete all other Upgrade Requirements Before upgrading the authorization model, you should first upgrade
MongoDB binaries to 2.6. For sharded clusters, ensure that all cluster components are 2.6. If there are users in
any database, be sure you have at least one user in the admin database with the role userAdminAnyDatabase
(page 421) before upgrading the MongoDB binaries.
Timing Because downgrades are more difficult after you upgrade the user authorization model, once you upgrade
the MongoDB binaries to version 2.6, allow your MongoDB deployment to run a day or two without upgrading the
user authorization model.
This allows 2.6 some time to burn in and decreases the likelihood of downgrades occurring after the user privilege
model upgrade. The user authentication and access control will continue to work as it did in 2.4, but it will be
impossible to create or modify users or to use user-defined roles until you run the authorization upgrade.
If you decide to upgrade the user authorization model immediately instead of waiting the recommended burn in
period, then for sharded clusters, you must wait at least 10 seconds after upgrading the sharded clusters to run the
authorization upgrade script.
Replica Sets For a replica set, it is only necessary to run the upgrade process on the primary as the changes will
automatically replicate to the secondaries.
Sharded Clusters For a sharded cluster, connect to a mongos and run the upgrade procedure to upgrade the cluster's
authorization data. By default, the procedure will upgrade the authorization data of the shards as well.
To override this behavior, run the upgrade command with the additional parameter upgradeShards: false. If
you choose to override, you must run the upgrade procedure on the mongos first, and then run the procedure on the
primary members of each shard.
For a sharded cluster, do not run the upgrade process directly against the config servers (page 734). Instead, perform
the upgrade process using one mongos instance to interact with the config database.
Requirements To upgrade the authorization model, you must have a user in the admin database with the role
userAdminAnyDatabase (page 421).
Procedure
Step 1: Connect to MongoDB instance. Connect and authenticate to the mongod instance for a single deployment
or a mongos for a sharded cluster as an admin database user with the role userAdminAnyDatabase (page 421).
Step 2: Upgrade authorization schema. Use the authSchemaUpgrade command in the admin database to
update the user data using the mongo shell.
Sharded cluster authSchemaUpgrade consideration. For a sharded cluster, authSchemaUpgrade upgrades the
authorization data of the shards as well, which completes the upgrade. You can, however, override this behavior
by including upgradeShards: false in the command, as in the following example:
db.getSiblingDB("admin").runCommand({authSchemaUpgrade: 1,
upgradeShards: false });
If you override the behavior, after running authSchemaUpgrade on a mongos instance, you must connect
to the primary of each shard and repeat the upgrade process there.
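The two forms of the command (default, and with the shard upgrade overridden) can be produced by one small builder. This is a sketch only: the function name authSchemaUpgradeCommand is hypothetical, while the field names authSchemaUpgrade and upgradeShards are the ones shown above.

```javascript
// Hypothetical builder for the command document. The command name must be
// the first key, so it is set before any option is added.
function authSchemaUpgradeCommand(upgradeShards) {
  var cmd = { authSchemaUpgrade: 1 };
  if (upgradeShards === false) {
    cmd.upgradeShards = false; // override: shards are upgraded separately
  }
  return cmd;
}
```

In the mongo shell, the resulting document would be passed to db.getSiblingDB("admin").runCommand(), in the same way as the literal document in the example above.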
Result All users in a 2.6 system are stored in the admin.system.users (page 300) collection. To manipulate
these users, use the user management methods.
The upgrade procedure copies the version 2.4 admin.system.users collection to
admin.system.backup_users.
The upgrade procedure leaves the version 2.4 <database>.system.users collection(s) intact.
Downgrade MongoDB from 2.6
On this page
Downgrade Recommendations and Checklist (page 1007)
Downgrade 2.6 User Authorization Model (page 1007)
Downgrade Updated Indexes (page 1010)
Downgrade MongoDB Processes (page 1011)
Downgrade Procedure (page 1012)
Before you attempt any downgrade, familiarize yourself with the content of this document, particularly the Downgrade
Recommendations and Checklist (page 1007) and the procedure for downgrading sharded clusters (page 1011).
Downgrade Path Once upgraded to MongoDB 2.6, you cannot downgrade to any version earlier than MongoDB
2.4. If you created text or 2dsphere indexes while running 2.6, you can only downgrade to MongoDB 2.4.10 or
later.
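The downgrade path rule above reduces to a single decision, sketched here. The function name earliestDowngradeTarget is hypothetical, and the input is assumed to be a list of the index types you created while running 2.6.

```javascript
// Hypothetical helper: returns the earliest release you may downgrade to.
// "2.4" stands for any release in the 2.4 series.
function earliestDowngradeTarget(indexTypesCreatedIn26) {
  var needsPatchRelease = indexTypesCreatedIn26.some(function (type) {
    return type === "text" || type === "2dsphere";
  });
  return needsPatchRelease ? "2.4.10" : "2.4";
}
```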
Preparedness
Remove or downgrade version 2 text indexes (page 1010) before downgrading MongoDB 2.6 to 2.4.
Remove or downgrade version 2 2dsphere indexes (page 1010) before downgrading MongoDB 2.6 to 2.4.
Downgrade 2.6 User Authorization Model (page 1007). If you have upgraded to the 2.6 user authorization
model, you must downgrade the user model to 2.4 before downgrading MongoDB 2.6 to 2.4.
Downgrade 2.6 User Authorization Model If you have upgraded to the 2.6 user authorization model, you must
first downgrade the user authorization model to 2.4 before downgrading MongoDB 2.6 to 2.4.
Considerations
For a replica set, it is only necessary to run the downgrade process on the primary as the changes will
automatically replicate to the secondaries.
For sharded clusters, although the procedure lists the downgrade of the cluster's authorization data first, you
may downgrade the authorization data of the cluster or shards first.
You must have the admin.system.backup_users and admin.system.new_users collections
created during the upgrade process.
Important. The downgrade process returns the user data to its state prior to upgrading to the 2.6 authorization
model. Any changes made to the user/role data using the 2.6 users model will be lost.
Access Control Prerequisites To downgrade the authorization model, you must connect as a user with the following
privileges:
{ resource: { db: "admin", collection: "system.new_users" }, actions: [ "find", "insert", "update" ] }
{ resource: { db: "admin", collection: "system.backup_users" }, actions: [ "find" ] }
{ resource: { db: "admin", collection: "system.users" }, actions: [ "find", "remove", "insert"] }
{ resource: { db: "admin", collection: "system.version" }, actions: [ "find", "update" ] }
If no user exists with the appropriate privileges, create an authorization model downgrade user:
Step 1: Connect as user with privileges to manage users and roles. Connect and authenticate as a user with
userAdminAnyDatabase (page 421).
Step 2: Create a role with required privileges. Using the db.createRole method, create a role (page 335)
with the required privileges.
use admin
db.createRole(
  {
    role: "downgradeAuthRole",
    privileges: [
      { resource: { db: "admin", collection: "system.new_users" }, actions: [ "find", "insert", "update" ] },
      { resource: { db: "admin", collection: "system.backup_users" }, actions: [ "find" ] },
      { resource: { db: "admin", collection: "system.users" }, actions: [ "find", "remove", "insert" ] },
      { resource: { db: "admin", collection: "system.version" }, actions: [ "find", "update" ] }
    ],
    roles: [ ]
  }
)
Step 3: Create a user with the new role. Create a user and assign the user the downgradeAuthRole.
use admin
db.createUser(
{
user: "downgradeAuthUser",
pwd: "somePass123",
roles: [ { role: "downgradeAuthRole", db: "admin" } ]
}
)
Note: Instead of creating a new user, you can also grant the role to an existing user. See
db.grantRolesToUser() method.
Step 4: Authenticate as the new user. Authenticate as the newly created user.
use admin
db.auth( "downgradeAuthUser", "somePass123" )
Procedure The following downgrade procedure requires the <database>.system.users collections used in
version 2.4 to be intact for non-admin databases.
Step 1: Connect and authenticate to MongoDB instance. Connect and authenticate to the mongod instance for a
single deployment or a mongos for a sharded cluster with the appropriate privileges. See Access Control Prerequisites
(page 1008) for details.
Step 2: Create backup of 2.6 admin.system.users collection. Copy all documents in the
admin.system.users (page 300) collection to the admin.system.new_users collection:
db.getSiblingDB("admin").system.users.find().forEach( function(userDoc) {
status = db.getSiblingDB("admin").system.new_users.save( userDoc );
if (status.hasWriteError()) {
print(status.writeError);
}
}
);
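Step 3: Update the version document for the authSchema. Update the authSchema version document in the
admin.system.version collection. The update method returns a WriteResult object with the status of the
operation. Upon successful update, the WriteResult object should have "nModified" equal to 1.
Step 4: Remove existing documents from the admin.system.users collection. Remove all documents from the
admin.system.users (page 300) collection. The remove method returns a WriteResult object with the number of
documents removed in the "nRemoved" field.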
Step 5: Copy documents from the admin.system.backup_users collection. Copy all documents from the
admin.system.backup_users, created during the 2.6 upgrade, to admin.system.users.
db.getSiblingDB("admin").system.backup_users.find().forEach(
function (userDoc) {
status = db.getSiblingDB("admin").system.users.insert( userDoc );
if (status.hasWriteError()) {
print(status.writeError);
}
}
);
For a sharded cluster, repeat the downgrade process by connecting to the primary replica set member for each shard.
Note: The cluster's mongos instances will fail to detect the authorization model downgrade until the user cache
is refreshed. You can run invalidateUserCache on each mongos instance to refresh immediately, or you can
wait until the cache is refreshed automatically at the end of the user cache invalidation interval. To
run invalidateUserCache, you must have a privilege that includes the invalidateUserCache (page 431) action,
which is granted by the userAdminAnyDatabase (page 421) and hostManager (page 419) roles.
Result The downgrade process returns the user data to its state prior to upgrading to the 2.6 authorization model.
Any changes made to the user/role data using the 2.6 users model will be lost.
Text Index Version Check If you have version 2 text indexes (i.e. the default version for text indexes in MongoDB
2.6), drop the version 2 text indexes before downgrading MongoDB. After the downgrade, enable text search and
recreate the dropped text indexes.
To determine the version of your text indexes, run db.collection.getIndexes() to view index
specifications. For text indexes, the method returns the version information in the field textIndexVersion. For example,
the following shows that the text index on the quotes collection is version 2.
{
"v" : 1,
"key" : {
"_fts" : "text",
"_ftsx" : 1
},
"name" : "quote_text_translation.quote_text",
"ns" : "test.quotes",
"weights" : {
"quote" : 1,
"translation.quote" : 1
},
"default_language" : "english",
"language_override" : "language",
"textIndexVersion" : 2
}
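The check described above can be applied to a whole collection's index list. The sketch below assumes an array of index specification documents shaped like the output of db.collection.getIndexes(). The function name version2TextIndexes is hypothetical, and the sample data is a shortened copy of the example above plus a default _id index.

```javascript
// Hypothetical helper: returns the names of all version 2 text indexes
// found in a list of index specifications.
function version2TextIndexes(indexSpecs) {
  return indexSpecs.filter(function (spec) {
    return spec.textIndexVersion === 2;
  }).map(function (spec) {
    return spec.name;
  });
}

// Sample input for illustration; in the mongo shell you would pass
// db.quotes.getIndexes() instead.
var sampleTextSpecs = [
  { v: 1, key: { _id: 1 }, name: "_id_", ns: "test.quotes" },
  {
    v: 1,
    key: { _fts: "text", _ftsx: 1 },
    name: "quote_text_translation.quote_text",
    ns: "test.quotes",
    textIndexVersion: 2
  }
];
```

Each name returned identifies an index to drop before the downgrade and to recreate afterwards.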
2dsphere Index Version Check If you have version 2 2dsphere indexes (i.e. the default version for 2dsphere
indexes in MongoDB 2.6), drop the version 2 2dsphere indexes before downgrading MongoDB. After the
downgrade, recreate the 2dsphere indexes.
To determine the version of your 2dsphere indexes, run db.collection.getIndexes() to view
index specifications. For 2dsphere indexes, the method returns the version information in the field
2dsphereIndexVersion. For example, the following shows that the 2dsphere index on the locations
collection is version 2.
{
"v" : 1,
"key" : {
"geo" : "2dsphere"
},
"name" : "geo_2dsphere",
"ns" : "test.locations",
"sparse" : true,
"2dsphereIndexVersion" : 2
}
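Because the dropped 2dsphere indexes have to be recreated after the downgrade, it can help to record their definitions first. The following sketch assumes index specifications shaped like the example above. The function name geoIndexesToRecreate and the shape of the returned records are hypothetical.

```javascript
// Hypothetical helper: collects the name, key and sparse setting of every
// version 2 2dsphere index so that it can be recreated after the downgrade.
function geoIndexesToRecreate(indexSpecs) {
  var plan = [];
  indexSpecs.forEach(function (spec) {
    // The field name starts with a digit, so bracket notation is required.
    if (spec["2dsphereIndexVersion"] === 2) {
      plan.push({
        name: spec.name,
        key: spec.key,
        sparse: spec.sparse === true
      });
    }
  });
  return plan;
}

// Sample input for illustration, copied from the example above.
var sampleGeoSpecs = [
  {
    v: 1,
    key: { geo: "2dsphere" },
    name: "geo_2dsphere",
    ns: "test.locations",
    sparse: true,
    "2dsphereIndexVersion": 2
  }
];
```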
Downgrade 2.6 Standalone mongod Instance The following steps outline the procedure to downgrade a
standalone mongod from version 2.6 to 2.4.
1. Download binaries of the latest release in the 2.4 series from the MongoDB Download Page [1261]. See Install
MongoDB (page 5) for more information.
2. Shut down your mongod instance. Replace the existing binary with the 2.4 mongod binary and restart
mongod.
Downgrade a 2.6 Replica Set The following steps outline a rolling downgrade process for the replica set. The
rolling downgrade process minimizes downtime by downgrading the members individually while the other members
are available:
Step 1: Downgrade each secondary member, one at a time. For each secondary in a replica set:
Replace and restart secondary mongod instances. First, shut down the mongod, then replace the 2.6 binary with
the 2.4 binary and restart mongod. See Stop mongod Processes (page 246) for instructions on safely terminating
mongod processes.
Allow secondary to recover. Wait for the member to recover to SECONDARY state before downgrading the next
secondary.
To check the member's state, use the rs.status() method in the mongo shell.
Step 2: Step down the primary. Use rs.stepDown() in the mongo shell to step down the primary and force
the normal failover (page 635) procedure.
rs.stepDown()
rs.stepDown() expedites the failover procedure and is preferable to shutting down the primary directly.
Step 3: Replace and restart former primary mongod. When rs.status() shows that the primary has stepped
down and another member has assumed PRIMARY state, shut down the previous primary, replace the mongod
binary with the 2.4 binary, and start the new instance.
Replica set failover is not instant: the set is unavailable for writes, and reads are interrupted, until the failover
process completes. Typically this takes 10 seconds or more. You may wish to plan the downgrade during a predetermined
maintenance window.
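The order of the rolling downgrade (every secondary first, the primary last) can be read off a status document. This is a minimal sketch, assuming input shaped like the output of rs.status(). The function name rollingDowngradeOrder and the host names in the usage note are hypothetical, and members in other states, such as arbiters, are left out of the sketch.

```javascript
// Hypothetical helper: lists member names in downgrade order,
// secondaries first and the current primary last.
function rollingDowngradeOrder(status) {
  var secondaries = [];
  var primaries = [];
  (status.members || []).forEach(function (member) {
    if (member.stateStr === "PRIMARY") {
      primaries.push(member.name);
    } else if (member.stateStr === "SECONDARY") {
      secondaries.push(member.name);
    }
  });
  return secondaries.concat(primaries);
}
```

For a set in which db1.example.net:27017 is the primary and db2 and db3 are secondaries, the result lists db2, db3 and then db1. The primary still has to be stepped down with rs.stepDown() before its binary is replaced.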
Requirements While the downgrade is in progress, you cannot make changes to the collection meta-data. For
example, during the downgrade, do not do any of the following:
sh.enableSharding()
sh.shardCollection()
sh.addShard()
db.createCollection()
db.collection.drop()
db.dropDatabase()
any operation that creates a database
any other operation that modifies the cluster metadata in any way. See Sharding Reference (page 814) for a
complete list of sharding commands. Note, however, that not all commands on the Sharding Reference (page 814)
page modify the cluster metadata.
1261 http://www.mongodb.org/downloads
Procedure The downgrade procedure for a sharded cluster reverses the order of the upgrade procedure.
1. Turn off the balancer (page 750) in the sharded cluster, as described in Disable the Balancer (page 794).
2. Downgrade each shard, one at a time. For each shard,
(a) Downgrade the mongod secondaries before downgrading the primary.
(b) To downgrade the primary, run replSetStepDown and downgrade.
3. Downgrade all 3 mongod config server instances, leaving the first system in the mongos --configdb
argument to downgrade last.
4. Downgrade and restart each mongos, one at a time. The downgrade process is a binary drop-in replacement.
5. Turn on the balancer, as described in Enable the Balancer (page 795).
Downgrade Procedure Once upgraded to MongoDB 2.6, you cannot downgrade to any version earlier than
MongoDB 2.4. If you have text or 2dsphere indexes, you can only downgrade to MongoDB 2.4.10 or later.
Except as described on this page, moving between 2.4 and 2.6 is a drop-in replacement:
Step 1: Stop the existing mongod instance. For example, on Linux, run 2.6 mongod with the --shutdown
option as follows:
mongod --dbpath /var/mongod/data --shutdown
Replace /var/mongod/data with your MongoDB dbPath. See also the Stop mongod Processes (page 246) for
alternate methods of stopping a mongod instance.
Step 2: Start the new mongod instance. Ensure you start the 2.4 mongod with the same dbPath:
mongod --dbpath /var/mongod/data
On this page
Minor Releases (page 1013)
Major New Features (page 1020)
Security Enhancements (page 1021)
Performance Improvements (page 1021)
Enterprise (page 1027)
Additional Information (page 1028)
MongoDB 2.4 includes enhanced geospatial support, a switch to the V8 JavaScript engine, security enhancements,
text search (beta), and hashed indexes.
Minor Releases
2.4 Changelog
On this page
2.4.14 (page 1013)
2.4.13 - Changes (page 1014)
2.4.12 - Changes (page 1014)
2.4.11 - Changes (page 1014)
2.4.10 - Changes (page 1014)
Previous Releases (page 1016)
2.4.14
Packaging: Init script sets process ulimit to different value compared to documentation (SERVER-17780 [1265])
Security: Compute BinData length in v8 (SERVER-17647 [1266])
Build: Upgrade PCRE Version from 8.30 to Latest (SERVER-17252 [1267])
1263 https://jira.mongodb.org/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+SERVER+AND+fixVersion+in+%28%222.5.0%22%2C+%222.5.1%22%2
rc1%22%2C+%222.6.0-rc2%22%2C+%222.6.0-rc3%22%29
1264 https://github.com/mongodb/mongo/blob/v2.6/distsrc/THIRD-PARTY-NOTICES
1265 https://jira.mongodb.org/browse/SERVER-17780
1266 https://jira.mongodb.org/browse/SERVER-17647
1267 https://jira.mongodb.org/browse/SERVER-17252
2.4.13 - Changes
Security: Enforce BSON BinData length validation (SERVER-17278 [1268])
Security: Disable SSLv3 ciphers (SERVER-15673 [1269])
Networking: Improve BSON validation (SERVER-17264 [1270])
2.4.12 - Changes
Sharding: Sharded connection cleanup on setup error can crash mongos (SERVER-15056 [1271])
Sharding: type 7 (OID) error when acquiring distributed lock for first time (SERVER-13616 [1272])
Storage: explicitly zero .ns files on creation (SERVER-15369 [1273])
Storage: partially written journal last section causes recovery to fail (SERVER-15111 [1274])
2.4.11 - Changes
Security: Potential information leak (SERVER-14268 [1275])
Replication: _id with $prefix field causes replication failure due to unvalidated insert (SERVER-12209 [1276])
Sharding: Invalid access: seg fault in SplitChunkCommand::run (SERVER-14342 [1277])
Indexing: Creating descending index on _id can corrupt namespace (SERVER-14833 [1278])
Text Search: Updates to documents with text-indexed fields may lead to incorrect entries (SERVER-14738 [1279])
Build: Add SCons flag to override treating all warnings as errors (SERVER-13724 [1280])
Packaging: Fix mongodb enterprise 2.4 init script to allow multiple processes per host (SERVER-14336 [1281])
JavaScript: Do not store native function pointer as a property in function prototype (SERVER-14254 [1282])
2.4.10 - Changes
Indexes: Fixed issue that can cause index corruption when building indexes concurrently (SERVER-12990 [1283])
Indexes: Fixed issue that can cause index corruption when shutting down secondary node during index build
(SERVER-12956 [1284])
Indexes: Mongod now recognizes incompatible future text and geo index versions and exits gracefully
(SERVER-12914 [1285])
1268 https://jira.mongodb.org/browse/SERVER-17278
1269 https://jira.mongodb.org/browse/SERVER-15673
1270 https://jira.mongodb.org/browse/SERVER-17264
1271 https://jira.mongodb.org/browse/SERVER-15056
1272 https://jira.mongodb.org/browse/SERVER-13616
1273 https://jira.mongodb.org/browse/SERVER-15369
1274 https://jira.mongodb.org/browse/SERVER-15111
1275 https://jira.mongodb.org/browse/SERVER-14268
1276 https://jira.mongodb.org/browse/SERVER-12209
1277 https://jira.mongodb.org/browse/SERVER-14342
1278 https://jira.mongodb.org/browse/SERVER-14833
1279 https://jira.mongodb.org/browse/SERVER-14738
1280 https://jira.mongodb.org/browse/SERVER-13724
1281 https://jira.mongodb.org/browse/SERVER-14336
1282 https://jira.mongodb.org/browse/SERVER-14254
1283 https://jira.mongodb.org/browse/SERVER-12990
1284 https://jira.mongodb.org/browse/SERVER-12956
1285 https://jira.mongodb.org/browse/SERVER-12914
Indexes: Fixed issue that can cause secondaries to fail replication when building the same index multiple times
concurrently (SERVER-12662 [1286])
Indexes: Fixed issue that can cause index corruption on the tenth index in a collection if the index build fails
(SERVER-12481 [1287])
Indexes: Introduced versioning for text and geo indexes to ensure backwards compatibility
(SERVER-12175 [1288])
Indexes: Disallowed building indexes on the system.indexes collection, which can lead to initial sync failure on
secondaries (SERVER-10231 [1289])
Sharding: Avoid frequent immediate balancer retries when config servers are out of sync (SERVER-12908 [1290])
Sharding: Add indexes to locks collection on config servers to avoid long queries in case of large numbers of
collections (SERVER-12548 [1291])
Sharding: Fixed issue that can corrupt the config metadata cache when sharding collections concurrently
(SERVER-12515 [1292])
Sharding: Don't move chunks created on collections with a hashed shard key if the collection already contains
data (SERVER-9259 [1293])
Replication: Fixed issue where node appears to be down in a replica set during a compact operation
(SERVER-12264 [1294])
Replication: Fixed issue that could cause delays in elections when a node is not vetoing an election
(SERVER-12170 [1295])
Replication: Step down all primaries if multiple primaries are detected in replica set to ensure correct election
result (SERVER-10793 [1296])
Replication: Upon clock skew detection, secondaries will switch to sync directly from the primary to avoid sync
cycles (SERVER-8375 [1297])
Runtime: The SIGXCPU signal is now caught and mongod writes a log message and exits gracefully
(SERVER-12034 [1298])
Runtime: Fixed issue where mongod fails to start on Linux when /sys/dev/block directory is not readable
(SERVER-9248 [1299])
Windows: No longer zero-fill newly allocated files on systems other than Windows 7 or Windows Server 2008
R2 (SERVER-8480 [1300])
GridFS: Chunk size is decreased to 255 kB (from 256 kB) to avoid overhead with usePowerOf2Sizes option
(SERVER-13331 [1301])
1286 https://jira.mongodb.org/browse/SERVER-12662
1287 https://jira.mongodb.org/browse/SERVER-12481
1288 https://jira.mongodb.org/browse/SERVER-12175
1289 https://jira.mongodb.org/browse/SERVER-10231
1290 https://jira.mongodb.org/browse/SERVER-12908
1291 https://jira.mongodb.org/browse/SERVER-12548
1292 https://jira.mongodb.org/browse/SERVER-12515
1293 https://jira.mongodb.org/browse/SERVER-9259
1294 https://jira.mongodb.org/browse/SERVER-12264
1295 https://jira.mongodb.org/browse/SERVER-12170
1296 https://jira.mongodb.org/browse/SERVER-10793
1297 https://jira.mongodb.org/browse/SERVER-8375
1298 https://jira.mongodb.org/browse/SERVER-12034
1299 https://jira.mongodb.org/browse/SERVER-9248
1300 https://jira.mongodb.org/browse/SERVER-8480
1301 https://jira.mongodb.org/browse/SERVER-13331
Previous Releases
All 2.4.9 improvements [1305].
All 2.4.8 improvements [1306].
All 2.4.7 improvements [1307].
All 2.4.6 improvements [1308].
All 2.4.5 improvements [1309].
All 2.4.4 improvements [1310].
All 2.4.3 improvements [1311].
All 2.4.2 improvements [1312].
All 2.4.1 improvements [1313].
Init script sets process ulimit to different value compared to documentation SERVER-17780 [1314]
Compute BinData length in v8 SERVER-17647 [1315]
Upgrade PCRE Version from 8.30 to Latest SERVER-17252 [1316]
2.4.14 Changelog (page 1013).
All 2.4.14 improvements [1317].
Fix for instances where mongos incorrectly reports a successful write SERVER-12146 [1337].
Make non-primary read preferences consistent with slaveOK versioning logic SERVER-11971 [1338].
Allow new sharded cluster connections to read from secondaries when primary is down SERVER-7246 [1339].
All 2.4.9 improvements [1340].
Fix for possible loss of documents during the chunk migration process if a document in the chunk is very large
SERVER-10478 [1349].
Fix for C++ client shutdown issues SERVER-8891 [1350].
Improved replication robustness in presence of high network latency SERVER-10085 [1351].
1336 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.4.10%22%20AND%20project%20%3D%20SERVER
1337 https://jira.mongodb.org/browse/SERVER-12146
1338 https://jira.mongodb.org/browse/SERVER-11971
1339 https://jira.mongodb.org/browse/SERVER-7246
1340 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.4.9%22%20AND%20project%20%3D%20SERVER
1341 https://jira.mongodb.org/browse/SERVER-11478
1342 https://jira.mongodb.org/browse/SERVER-11421
1343 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.4.8%22%20AND%20project%20%3D%20SERVER
1344 https://jira.mongodb.org/browse/SERVER-10596
1345 https://jira.mongodb.org/browse/SERVER-9907
1346 https://jira.mongodb.org/browse/SERVER-11021
1347 https://jira.mongodb.org/browse/SERVER-10554
1348 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.4.7%22%20AND%20project%20%3D%20SERVER
1349 https://jira.mongodb.org/browse/SERVER-10478
1350 https://jira.mongodb.org/browse/SERVER-8891
1351 https://jira.mongodb.org/browse/SERVER-10085
Fix for CVE-2013-4650 Improperly grant user system privileges on databases other than local SERVER-
99831356 .
Fix for CVE-2013-3969 Remotely triggered segmentation fault in Javascript engine SERVER-98781357 .
Fix to prevent identical background indexes from being built SERVER-98561358 .
Config server performance improvements SERVER-98641359 and SERVER-54421360 .
Improved initial sync resilience to network failure SERVER-98531361 .
All 2.4.5 improvements1362 .
Fix for mongo shell ignoring modified object's _id field SERVER-93851367 .
Fix for race condition in log rotation SERVER-47391368 .
Fix for copydb command with authorization in a sharded cluster SERVER-90931369 .
All 2.4.3 improvements1370 .
1352 https://jira.mongodb.org/browse/SERVER-9832
1353 https://jira.mongodb.org/browse/SERVER-9786
1354 https://jira.mongodb.org/browse/SERVER-7080
1355 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.4.6%22%20AND%20project%20%3D%20SERVER
1356 https://jira.mongodb.org/browse/SERVER-9983
1357 https://jira.mongodb.org/browse/SERVER-9878
1358 https://jira.mongodb.org/browse/SERVER-9856
1359 https://jira.mongodb.org/browse/SERVER-9864
1360 https://jira.mongodb.org/browse/SERVER-5442
1361 https://jira.mongodb.org/browse/SERVER-9853
1362 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.4.5%22%20AND%20project%20%3D%20SERVER
1363 https://jira.mongodb.org/browse/SERVER-9721
1364 https://jira.mongodb.org/browse/SERVER-9661
1365 https://jira.mongodb.org/browse/SERVER-8813
1366 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.4.4%22%20AND%20project%20%3D%20SERVER
1367 https://jira.mongodb.org/browse/SERVER-9385
1368 https://jira.mongodb.org/browse/SERVER-4739
1369 https://jira.mongodb.org/browse/SERVER-9093
1370 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.4.3%22%20AND%20project%20%3D%20SERVER
The following changes in MongoDB affect both standard and Enterprise editions:
Text Search
Add support for text search of content in MongoDB databases as a beta feature. See Text Indexes (page 508) for more
information.
Add new 2dsphere index (page 503). The new index supports GeoJSON1378 objects Point, LineString,
and Polygon. See 2dsphere Indexes (page 503) and Geospatial Indexes and Queries (page 500).
Introduce operators $geometry, $geoWithin and $geoIntersects to work with the GeoJSON data.
Hashed Index
Add new hashed index (page 512) to index documents using hashes of field values. When used to index a shard key,
the hashed index ensures an evenly distributed shard key. See also Hashed Shard Keys (page 740).
Improve support for geospatial queries. See the $geoWithin operator and the $geoNear pipeline stage.
Improve sort efficiency when the $sort stage immediately precedes a $limit in the pipeline.
Add new operators $millisecond and $concat and modify how $min operator processes null values.
1371 https://jira.mongodb.org/browse/SERVER-9267
1372 https://jira.mongodb.org/browse/SERVER-9230
1373 https://jira.mongodb.org/browse/SERVER-9125
1374 https://jira.mongodb.org/browse/SERVER-9014
1375 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.4.2%22%20AND%20project%20%3D%20SERVER
1376 https://jira.mongodb.org/browse/SERVER-9087
1377 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.4.1%22%20AND%20project%20%3D%20SERVER
1378 http://geojson.org/geojson-spec.html
The mapReduce command, group command, and the $where operator expressions cannot access certain global
functions or properties, such as db, that are available in the mongo shell. See the individual command or operator for
details.
Provide additional metrics and customization for the serverStatus command. See db.serverStatus() and
serverStatus for more information.
Security Enhancements
Introduce a role-based access control system. User Privileges1379 now use a new format for Privilege
Documents.
Enforce uniqueness of the user in user privilege documents per database. Previous versions of MongoDB did
not enforce this requirement, and existing databases may have duplicates.
Support encrypted connections using SSL certificates signed by a Certificate Authority. See Configure mongod
and mongos for TLS/SSL (page 382).
For more information on security and risk management strategies, see MongoDB Security Practices and Procedures
(page 315).
Performance Improvements
V8 JavaScript Engine
Consider the following impacts of V8 JavaScript Engine (page 1021) in MongoDB 2.4:
Tip
Use the new interpreterVersion() method in the mongo shell and the javascriptEngine field in the
output of db.serverBuildInfo() to determine which JavaScript engine a MongoDB binary uses.
1379 https://docs.mongodb.org/v2.4/reference/user-privileges
Improved Concurrency Previously, MongoDB operations that required the JavaScript interpreter had to acquire
a lock, and a single mongod could only run a single JavaScript operation at a time. The switch to V8 improves
concurrency by permitting multiple JavaScript operations to run at the same time.
Modernized JavaScript Implementation (ES5) The 5th edition of ECMAScript1380 , abbreviated as ES5, adds many
new language features, including:
standardized JSON1381 ,
strict mode1382 ,
function.bind()1383 ,
array extensions1384 , and
getters and setters.
With V8, MongoDB supports the ES5 implementation of JavaScript with the following exceptions.
Note: The following features do not work as expected on documents returned from MongoDB queries:
Object.seal() throws an exception on documents returned from MongoDB queries.
Object.freeze() throws an exception on documents returned from MongoDB queries.
Object.preventExtensions() incorrectly allows the addition of new properties on documents returned
from MongoDB queries.
enumerable properties, when added to documents returned from MongoDB queries, are not saved during
write operations.
See SERVER-82161385 , SERVER-82231386 , SERVER-82151387 , and SERVER-82141388 for more information.
For objects that have not been returned from MongoDB queries, the features work as expected.
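As a small illustration of the last point, the following sketch freezes a plain object that was created in JavaScript rather than returned from a query. The settings object and the tryToChange() helper are made up for this example.

```javascript
// A plain object created in JavaScript, not returned from a MongoDB query.
var settings = { name: 'MongoDB', version: 2.4 };

Object.freeze(settings);

// Attempt to modify the frozen object and report the resulting value.
function tryToChange(obj) {
    try {
        // Ignored in non-strict code; throws a TypeError in strict mode.
        obj.version = 2.2;
    } catch (e) {
        // Only reached in strict mode.
    }
    return obj.version;
}

var frozen = Object.isFrozen(settings);   // true
var version = tryToChange(settings);      // still 2.4
```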
Removed Non-Standard SpiderMonkey Features V8 does not support the following non-standard SpiderMonkey1389
JavaScript extensions, previously supported by MongoDB's use of SpiderMonkey as its JavaScript engine.
E4X Extensions V8 does not support the non-standard E4X1390 extensions. E4X provides a native XML1391 object
to the JavaScript language and adds the syntax for embedding literal XML documents in JavaScript code.
You need to use alternative XML processing if you used any of the following constructors/methods:
XML()
Namespace()
QName()
1380 http://www.ecma-international.org/publications/standards/Ecma-262.htm
1381 http://www.ecma-international.org/ecma-262/5.1/#sec-15.12.1
1382 http://www.ecma-international.org/ecma-262/5.1/#sec-4.2.2
1383 http://www.ecma-international.org/ecma-262/5.1/#sec-15.3.4.5
1384 http://www.ecma-international.org/ecma-262/5.1/#sec-15.4.4.16
1385 https://jira.mongodb.org/browse/SERVER-8216
1386 https://jira.mongodb.org/browse/SERVER-8223
1387 https://jira.mongodb.org/browse/SERVER-8215
1388 https://jira.mongodb.org/browse/SERVER-8214
1389 https://developer.mozilla.org/en-US/docs/SpiderMonkey
1390 https://developer.mozilla.org/en-US/docs/E4X
1391 https://developer.mozilla.org/en-US/docs/E4X/Processing_XML_with_E4X
XMLList()
isXMLName()
Destructuring Assignment V8 does not support the non-standard destructuring assignments. Destructuring assignment
"extract[s] data from arrays or objects using a syntax that mirrors the construction of array and object literals." -
Mozilla docs1392
Example
The following destructuring assignment is invalid with V8 and throws a SyntaxError:
var a = [4, 8, 15];
var [b, ,c] = a; // <== destructuring assignment
print(b) // 4
print(c) // 15
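One possible rewrite that avoids destructuring is to read the elements by index. This is only a sketch of the idea; it reads the first and third elements of an array.

```javascript
var a = [4, 8, 15];

// Read the first and third elements by index instead of destructuring.
var b = a[0];   // 4
var c = a[2];   // 15
```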
Iterator(), StopIteration(), and Generators V8 does not support Iterator(), StopIteration(), and generators1393 .
for each...in Construct V8 does not support the for each...in1394 construct. Use the for (var x in
y) construct instead.
Example
The following for each (var x in y) construct is invalid with V8:
var o = { name: 'MongoDB', version: 2.4 };
for each (var value in o) { print(value); }
Instead, in version 2.4, you can use the for (var x in y) construct:
var o = { name: 'MongoDB', version: 2.4 };
for (var prop in o) { print(o[prop]); }
You can also use the array instance method forEach() with the ES5 method Object.keys():
Object.keys(o).forEach(function (key) {
var value = o[key];
print(value);
});
1392 https://developer.mozilla.org/en-US/docs/JavaScript/New_in_JavaScript/1.7#Destructuring_assignment_(Merge_into_own_page.2Fsection)
1393 https://developer.mozilla.org/en-US/docs/JavaScript/Guide/Iterators_and_Generators
1394 https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Statements/for_each...in
Example
With V8, the following array comprehension is invalid:
var a = { w: 1, x: 2, y: 3, z: 4 }
Instead, you can implement using the Array instance method forEach() and the ES5 method Object.keys()
:
var a = { w: 1, x: 2, y: 3, z: 4 }
Note: The new logic uses the Array instance method forEach() and not the generic method
Array.forEach(); V8 does not support Array generic methods. See Array Generic Methods (page 1026) for
more information.
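The following sketch shows what such a replacement might look like. The comprehension shown in the comment is an assumed example chosen for illustration; the replacement uses only Object.keys() and the Array instance method forEach().

```javascript
var a = { w: 1, x: 2, y: 3, z: 4 };

// Hypothetical SpiderMonkey-only comprehension being replaced:
//   var arr = [i * i for each (i in a) if (i > 2)];

// Equivalent logic using Object.keys() and the instance method forEach().
var arr = [];
Object.keys(a).forEach(function (key) {
    var val = a[key];
    if (val > 2) {
        arr.push(val * val);
    }
});
// arr is now [9, 16]
```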
Multiple Catch Blocks V8 does not support multiple catch blocks and will throw a SyntaxError.
Example
The following use of multiple catch blocks is invalid with V8 and will throw "SyntaxError: Unexpected token
if":
try {
something()
} catch (err if err instanceof SomeError) {
print('some error')
} catch (err) {
print('standard error')
}
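A common way to get the same behavior in standard JavaScript is to use a single catch block and branch on the error type inside it. The sketch below assumes a hypothetical SomeError type and a classify() helper, neither of which comes from MongoDB.

```javascript
// SomeError is a hypothetical error type used only for this sketch.
function SomeError(message) {
    this.name = 'SomeError';
    this.message = message;
}
SomeError.prototype = Object.create(Error.prototype);
SomeError.prototype.constructor = SomeError;

// Run fn and report which kind of error, if any, it threw.
function classify(fn) {
    try {
        fn();
        return 'no error';
    } catch (err) {
        // Branch inside one catch block instead of using conditional catch clauses.
        if (err instanceof SomeError) {
            return 'some error';
        }
        return 'standard error';
    }
}
```

For example, classify(function () { throw new SomeError('x'); }) returns 'some error', while a function that throws a plain Error yields 'standard error'.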
Conditional Function Definition V8 will produce different outcomes than SpiderMonkey with conditional function
definitions1396 .
Example
The following conditional function definition produces different outcomes in SpiderMonkey versus V8:
function test () {
if (false) {
function go () {};
}
print(typeof go)
}
1395 https://developer.mozilla.org/en-US/docs/JavaScript/Guide/Predefined_Core_Objects#Array_comprehensions
1396 https://developer.mozilla.org/en-US/docs/JavaScript/Guide/Functions
With SpiderMonkey, the conditional function outputs undefined, whereas with V8, the conditional function outputs
function.
If your code defines functions this way, it is highly recommended that you refactor the code. The following example
refactors the conditional function definition to work in both SpiderMonkey and V8.
function test () {
var go;
if (false) {
go = function () {}
}
print(typeof go)
}
Note: ECMAScript prohibits conditional function definitions. To force V8 to throw an Error, enable strict mode1397 .
function test () {
'use strict';
if (false) {
function go () {}
}
}
String Generic Methods V8 does not support String generics1398 . String generics are a set of methods on the
String class that mirror instance methods.
Example
The following use of the generic method String.toLowerCase() is invalid with V8:
var name = 'MongoDB';
var lower = String.toLowerCase(name);
With V8, use the String instance method toLowerCase() available through an instance of the String class
instead:
var name = 'MongoDB';
var lower = name.toLowerCase();
With V8, use the String instance methods instead of the following generic methods:
String.charAt() String.quote() String.toLocaleLowerCase()
String.charCodeAt() String.replace() String.toLocaleUpperCase()
String.concat() String.search() String.toLowerCase()
String.endsWith() String.slice() String.toUpperCase()
String.indexOf() String.split() String.trim()
String.lastIndexOf() String.startsWith() String.trimLeft()
String.localeCompare() String.substr() String.trimRight()
String.match() String.substring()
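As a rough illustration, code that called the generic methods can usually call the instance method on the string itself. For a value that is not already a string, one option is to borrow the standard method from String.prototype with call(). The variable names below are made up for the example.

```javascript
var name = '  MongoDB  ';

// Instance methods replace the generic String.trim() and String.toLowerCase().
var trimmed = name.trim();
var lower = trimmed.toLowerCase();   // 'mongodb'

// For a non-string value, borrow the standard prototype method with call().
var version = 2.4;
var parts = String.prototype.split.call(version, '.');   // ['2', '4']
```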
Array Generic Methods V8 does not support Array generic methods1399 . Array generics are a set of methods on
the Array class that mirror instance methods.
Example
The following use of the generic method Array.every() is invalid with V8:
var arr = [4, 8, 15, 16, 23, 42];
function isEven (val) { return 0 === val % 2; }
var allEven = Array.every(arr, isEven);
With V8, use the Array instance method every() available through an instance of the Array class instead:
var allEven = arr.every(isEven);
print(allEven);
With V8, use the Array instance methods instead of the following generic methods:
Array.concat() Array.lastIndexOf() Array.slice()
Array.every() Array.map() Array.some()
Array.filter() Array.pop() Array.sort()
Array.forEach() Array.push() Array.splice()
Array.indexOf() Array.reverse() Array.unshift()
Array.join() Array.shift()
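Array generics were often used on array-like objects such as arguments, which have no instance methods of their own. A possible replacement, sketched here with a made-up allEven() helper, is to borrow the instance method from Array.prototype with call().

```javascript
function isEven(val) {
    return val % 2 === 0;
}

// Returns true when every argument passed to the function is even.
function allEven() {
    // Instead of the generic Array.every(arguments, isEven):
    return Array.prototype.every.call(arguments, isEven);
}
```

With these definitions, allEven(4, 8, 16) returns true and allEven(4, 15) returns false.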
Array Instance Method toSource() V8 does not support the Array instance method toSource()1400 . Use the
Array instance method toString() instead.
uneval() V8 does not support the non-standard method uneval(). Use the standardized JSON.stringify()1401
method instead.
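The following sketch serializes a sample object with JSON.stringify() and reads it back with JSON.parse(). The doc object is invented for the example. Note that, unlike uneval(), JSON.stringify() omits functions and undefined values, so it is a replacement only for plain data.

```javascript
var doc = { name: 'MongoDB', tags: ['v8', 'es5'], version: 2.4 };

// Serialize to a JSON string.
var text = JSON.stringify(doc);
// text is '{"name":"MongoDB","tags":["v8","es5"],"version":2.4}'

// Parse the string back into a new object.
var copy = JSON.parse(text);
```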
Change default JavaScript engine from SpiderMonkey to V8. The change provides improved concurrency for
JavaScript operations, a modernized JavaScript implementation, and the removal of non-standard SpiderMonkey
features, and affects all JavaScript behavior including the commands mapReduce, group, and eval and the query
operator $where.
1399 https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Array#Array_generic_methods
1400 https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Array/toSource
1401 https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/JSON/stringify
See JavaScript Changes in MongoDB 2.4 (page 1021) for more information about all changes.
Enable basic BSON object validation for mongod and mongorestore when writing to MongoDB data files. See
wireObjectCheck for details.
Add support for multiple concurrent index builds in the background by a single mongod instance. See building
indexes in the background (page 522) for more information on background index builds.
Allow the db.killOp() method to terminate a foreground index build.
Improve index validation during index creation. See Compatibility and Index Type Changes in MongoDB 2.4
(page 1035) for more information.
Provide --setParameter as a command line option for mongos and mongod. See mongod and mongos for a
list of available options for setParameter.
By default, each document move during chunk migration (page 752) in a sharded cluster propagates to at least one
secondary before the balancer proceeds with its next operation. See Chunk Migration and Replication (page 753).
Increase performance for moving multiple chunks off an overloaded shard. The balancer no longer waits for the
current migration's delete phase to complete before starting the next chunk migration. See Chunk Migration Queuing
(page 753) for details.
Enterprise
In 2.4.4, MongoDB Enterprise uses Cyrus SASL. Earlier 2.4 Enterprise versions use GNU SASL (libgsasl). To
upgrade to 2.4.4 MongoDB Enterprise or greater, you must install all package dependencies related to this change,
including the appropriate Cyrus SASL GSSAPI library. See Install MongoDB Enterprise (page 33) for details of the
dependencies.
In 2.4, MongoDB Enterprise now supports authentication via a Kerberos mechanism. See Configure MongoDB
with Kerberos Authentication on Linux (page 359) for more information. For drivers that provide support for Kerberos
authentication to MongoDB, refer to Driver Support (page 327).
For more information on security and risk management strategies, see MongoDB Security Practices and Procedures
(page 315).
Additional Information
Platform Notes
For OS X, MongoDB 2.4 only supports OS X versions 10.6 (Snow Leopard) and later. There are no other platform
support changes in MongoDB 2.4. See the downloads page1402 for more information on platform support.
Upgrade Process
Upgrade MongoDB to 2.4
On this page
Upgrade Recommendations and Checklist (page 1028)
Upgrade Standalone mongod Instance to MongoDB 2.4 (page 1029)
Upgrade a Replica Set from MongoDB 2.2 to MongoDB 2.4 (page 1029)
Upgrade a Sharded Cluster from MongoDB 2.2 to MongoDB 2.4 (page 1029)
Rolling Upgrade Limitation for 2.2.0 Deployments Running with auth Enabled (page 1033)
Upgrade from 2.3 to 2.4 (page 1033)
Downgrade MongoDB from 2.4 to Previous Versions (page 1033)
In the general case, the upgrade from MongoDB 2.2 to 2.4 is a binary-compatible drop-in upgrade: shut down the
mongod instances and replace them with mongod instances running 2.4. However, before you attempt any upgrade
please familiarize yourself with the content of this document, particularly the procedure for upgrading sharded clusters
(page 1029) and the considerations for reverting to 2.2 after running 2.4 (page 1033).
Upgrade a Replica Set from MongoDB 2.2 to MongoDB 2.4 You can upgrade to 2.4 by performing a rolling upgrade
of the set: upgrade the members individually while the other members remain available, which minimizes downtime.
Use the following procedure:
1. Upgrade the secondary members of the set one at a time by shutting down the mongod and replacing the 2.2
binary with the 2.4 binary. After upgrading a mongod instance, wait for the member to recover to SECONDARY
state before upgrading the next instance. To check the member's state, issue rs.status() in the mongo
shell.
2. Use the mongo shell method rs.stepDown() to step down the primary to allow the normal failover
(page 635) procedure. rs.stepDown() expedites the failover procedure and is preferable to shutting down
the primary directly.
Once the primary has stepped down and another member has assumed PRIMARY state, as observed in the output
of rs.status(), shut down the previous primary, replace the mongod binary with the 2.4 binary, and start
the new process.
Note: Replica set failover is not instant and will render the set unavailable for reads and writes until the
failover process completes. Typically this takes 10 seconds or more. You may wish to plan the upgrade during
a predefined maintenance window.
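The wait in step 1 can be checked programmatically. The sketch below assumes a document shaped like the output of rs.status(), with a members array whose entries carry name and stateStr fields; the membersNotReady() helper and the sample host names are hypothetical.

```javascript
// Returns the names of members that are not yet in PRIMARY or SECONDARY state.
// `status` is expected to have the shape of the document returned by rs.status().
function membersNotReady(status) {
    return status.members.filter(function (m) {
        return m.stateStr !== 'PRIMARY' && m.stateStr !== 'SECONDARY';
    }).map(function (m) {
        return m.name;
    });
}

// Sample document with made-up host names, for illustration only.
var sampleStatus = {
    set: 'rs0',
    members: [
        { name: 'db0.example.net:27017', stateStr: 'PRIMARY' },
        { name: 'db1.example.net:27017', stateStr: 'SECONDARY' },
        { name: 'db2.example.net:27017', stateStr: 'RECOVERING' }
    ]
};

var waitingOn = membersNotReady(sampleStatus);
```

In the mongo shell you might call membersNotReady(rs.status()) and proceed to the next member only when the result is an empty array.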
Overview Upgrading a sharded cluster from MongoDB version 2.2 to 2.4 (or 2.3) requires that you run a 2.4
mongos with the --upgrade option, described in this procedure. The upgrade process does not require downtime.
The upgrade to MongoDB 2.4 adds epochs to the meta-data for all collections and chunks in the existing cluster.
MongoDB 2.2 processes are capable of handling epochs, even though 2.2 did not require them. This procedure applies
only to upgrades from version 2.2. Earlier versions of MongoDB do not correctly handle epochs. See Cluster Meta-data
Upgrade (page 1029) for more information.
After completing the meta-data upgrade you can fully upgrade the components of the cluster. With the balancer
disabled:
Upgrade all mongos instances in the cluster.
Upgrade all 3 mongod config server instances.
Upgrade the mongod instances for each shard, one at a time.
See Upgrade Sharded Cluster Components (page 1033) for more information.
Meta-data Upgrade Procedure Changes to the meta-data format for sharded clusters, stored in the config database
(page 816), require a special meta-data upgrade procedure when moving to 2.4.
Do not perform operations that modify meta-data while performing this procedure. See Upgrade a Sharded Cluster
from MongoDB 2.2 to MongoDB 2.4 (page 1029) for examples of prohibited operations.
1. Before you start the upgrade, ensure that the amount of free space on the filesystem for the config database
(page 816) is at least 4 to 5 times the amount of space currently used by the config database (page 816) data
files.
Additionally, ensure that all indexes in the config database (page 816) are {v:1} indexes. If a critical index is
a {v:0} index, chunk splits can fail due to known issues with the {v:0} format. {v:0} indexes are present
on clusters created with MongoDB 2.0 or earlier.
The duration of the metadata upgrade depends on the network latency between the node performing the upgrade
and the three config servers. Ensure low latency between the upgrade process and the config servers.
To check the version of your indexes, use db.collection.getIndexes().
If any index on the config database is {v:0}, you should rebuild those indexes by connecting to the mongos
and either: rebuild all indexes using the db.collection.reIndex() method, or drop and rebuild specific
Optional
For additional security during the upgrade, you can make a backup of the config database using mongodump
or other backup tools.
3. Ensure there are no version 2.0 mongod or mongos processes still active in the sharded cluster. The automated
upgrade process checks for 2.0 processes, but network availability can prevent a definitive check. Wait 5 minutes
after stopping or upgrading version 2.0 mongos processes to confirm that none are still active.
4. Start a single 2.4 mongos process with configDB pointing to the sharded cluster's config servers (page 734)
and with the --upgrade option. The upgrade process happens before the process becomes a daemon (i.e.
before --fork.)
You can upgrade an existing mongos instance to 2.4 or you can start a new mongos instance that can reach all
config servers if you need to avoid reconfiguring a production mongos.
Start the mongos with a command that resembles the following:
mongos --configdb <config servers> --upgrade
Without the --upgrade option, 2.4 mongos processes will fail to start until the upgrade process is complete.
The upgrade will prevent any chunk moves or splits from occurring during the upgrade process. If there are
very many sharded collections or there are stale locks held by other failed processes, acquiring the locks for all
collections can take seconds or minutes. See the log for progress updates.
5. When the mongos process starts successfully, the upgrade is complete. If the mongos process fails to start,
check the log for more information.
If the mongos terminates or loses its connection to the config servers during the upgrade, you may always
safely retry the upgrade.
However, if the upgrade failed during the short critical section, the mongos will exit and report that the upgrade
will require manual intervention. To continue the upgrade process, you must follow the Resync after an
Interruption of the Critical Section (page 1032) procedure.
Optional
If the mongos logs show the upgrade waiting for the upgrade lock, a previous upgrade process may still be
active or may have ended abnormally. After 15 minutes of no remote activity mongos will force the upgrade
lock. If you can verify that there are no running upgrade processes, you may connect to a 2.2 mongos process
and force the lock manually:
mongo <mongos.example.net>
If the process specified in the process field of this document is verifiably offline, run the following operation
to force the lock.
db.getMongo().getCollection("config.locks").update({ _id : "configUpgrade" }, { $set : { state :
It is always safer to wait for the mongos to verify that the lock is inactive if you have any doubts about
the activity of another upgrade operation. In addition to the configUpgrade, the mongos may need to wait
for specific collection locks. Do not force the specific collection locks.
6. Upgrade and restart other mongos processes in the sharded cluster, without the --upgrade option.
See Upgrade Sharded Cluster Components (page 1033) for more information.
7. Re-enable the balancer (page 794). You can now perform operations that modify cluster meta-data.
Once you have upgraded, do not introduce version 2.0 MongoDB processes into the sharded cluster. This can
reintroduce old meta-data formats into the config servers. The meta-data change made by this upgrade process will help
prevent errors caused by cross-version incompatibilities in future versions of MongoDB.
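The index check in step 1 of the procedure above can be sketched as a small helper. It assumes index documents shaped like the output of db.collection.getIndexes(), each with v and name fields, and it treats any index whose v field is not 1 (including a missing v field) as old format. The helper name and the sample index documents are hypothetical.

```javascript
// Returns the names of indexes that are not {v:1} indexes.
// `indexes` is expected to be an array like the one returned by getIndexes().
function oldFormatIndexNames(indexes) {
    return indexes.filter(function (idx) {
        // Assumption: anything other than v: 1 needs to be rebuilt.
        return idx.v !== 1;
    }).map(function (idx) {
        return idx.name;
    });
}

// Sample index documents, for illustration only.
var sampleIndexes = [
    { v: 1, key: { _id: 1 }, ns: 'config.chunks', name: '_id_' },
    { v: 0, key: { ns: 1, min: 1 }, ns: 'config.chunks', name: 'ns_1_min_1' }
];

var toRebuild = oldFormatIndexNames(sampleIndexes);
```

In the mongo shell, the helper could be applied to each collection of the config database, for example oldFormatIndexNames(db.getSiblingDB('config').chunks.getIndexes()).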
Resync after an Interruption of the Critical Section During the short critical section of the upgrade that applies
changes to the meta-data, it is unlikely but possible that a network interruption can prevent all three config servers
from verifying or modifying data. If this occurs, the config servers (page 734) must be re-synced, and there may be
problems starting new mongos processes. The sharded cluster will remain accessible, but avoid all cluster meta-data
changes until you resync the config servers. Operations that change meta-data include: adding shards, dropping
databases, and dropping collections.
Note: Only perform the following procedure if something (e.g. network, power, etc.) interrupts the upgrade process
during the short critical section of the upgrade. Remember, you may always safely attempt the meta-data upgrade
procedure (page 1030).
mongo <mongos.example.net>
11. Finally retry the upgrade process, as in Upgrade a Sharded Cluster from MongoDB 2.2 to MongoDB 2.4
(page 1029).
Upgrade Sharded Cluster Components After you have successfully completed the meta-data upgrade process
described in Meta-data Upgrade Procedure (page 1030), and the 2.4 mongos instance starts, you can upgrade the
other processes in your MongoDB deployment.
While the balancer is still disabled, upgrade the components of your sharded cluster in the following order:
Upgrade all mongos instances in the cluster, in any order.
Upgrade all 3 mongod config server instances, upgrading the first system in the mongos --configdb
argument last.
Upgrade each shard, one at a time, upgrading the mongod secondaries before running replSetStepDown
and upgrading the primary of each shard.
When this process is complete, you can now re-enable the balancer (page 794).
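The ordering rule for config servers (the first system in the --configdb argument goes last) can be expressed as a short helper. This is only a sketch: the function name and host names are invented, and it assumes the --configdb value is a comma-separated list of hosts.

```javascript
// Given the value passed to mongos --configdb, return the config servers
// in the order they should be upgraded: the first listed host goes last.
function configServerUpgradeOrder(configdb) {
    var hosts = configdb.split(',').map(function (h) {
        return h.trim();
    });
    return hosts.slice(1).concat(hosts.slice(0, 1));
}

// Hypothetical host names, for illustration only.
var order = configServerUpgradeOrder(
    'cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019'
);
// order is [cfg1..., cfg2..., cfg0...]
```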
Rolling Upgrade Limitation for 2.2.0 Deployments Running with auth Enabled MongoDB cannot support
deployments that mix 2.2.0 and 2.4.0, or greater, components. MongoDB version 2.2.1 and later processes can exist in
mixed deployments with 2.4-series processes. Therefore you cannot perform a rolling upgrade from MongoDB 2.2.0
to MongoDB 2.4.0. To upgrade a cluster with 2.2.0 components, use one of the following procedures.
1. Perform a rolling upgrade of all 2.2.0 processes to the latest 2.2-series release (e.g. 2.2.3) so that there are no
processes in the deployment that predate 2.2.1. When there are no 2.2.0 processes in the deployment, perform a
rolling upgrade to 2.4.0.
2. Stop all processes in the cluster. Upgrade all processes to a 2.4-series release of MongoDB, and start all
processes at the same time.
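The version rule above can be captured in a small check. The sketch below assumes you have collected the version string of every process in the deployment (for example "2.2.3"); the helper names are hypothetical. It reports whether a rolling upgrade to 2.4 is possible, that is, whether every process is 2.2.1 or later in the 2.2 series or already in the 2.4 series.

```javascript
// Split a version string such as "2.2.3" into numeric parts.
function parseVersion(v) {
    return v.split('.').map(function (part) {
        return parseInt(part, 10);
    });
}

// True when a process at this version can coexist with 2.4-series processes.
function canMixWithTwoFour(v) {
    var p = parseVersion(v);
    var isTwoFour = p[0] === 2 && p[1] === 4;
    var isLateTwoTwo = p[0] === 2 && p[1] === 2 && p[2] >= 1;
    return isTwoFour || isLateTwoTwo;
}

// True when every process in the deployment allows a rolling upgrade to 2.4.
function canRollingUpgrade(versions) {
    return versions.every(canMixWithTwoFour);
}
```

For example, canRollingUpgrade(['2.2.3', '2.2.1']) returns true, while canRollingUpgrade(['2.2.0', '2.2.3']) returns false because of the 2.2.0 process.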
Upgrade from 2.3 to 2.4 If you used a mongod from the 2.3 or 2.4-rc (release candidate) series, you can safely
transition these databases to 2.4.0 or later; however, if you created 2dsphere or text indexes using a mongod
before v2.4-rc2, you will need to rebuild these indexes. For example:
db.records.dropIndex( { loc: "2dsphere" } )
db.records.dropIndex( "records_text" )
db.records.ensureIndex( { loc: "2dsphere" } )
db.records.ensureIndex( { records: "text" } )
Downgrade MongoDB from 2.4 to Previous Versions In some cases the on-disk format of data files used by 2.4
and 2.2 mongod is compatible, and you can upgrade and downgrade if needed. However, several new features in 2.4
are incompatible with previous versions:
2dsphere indexes are incompatible with 2.2 and earlier mongod instances.
text indexes are incompatible with 2.2 and earlier mongod instances.
using a hashed index as a shard key is incompatible with 2.2 and earlier mongos instances.
hashed indexes are incompatible with 2.0 and earlier mongod instances.
Important: Collections sharded using hashed shard keys should not use 2.2 mongod instances, which cannot
correctly support cluster operations for these collections.
If you completed the meta-data upgrade for a sharded cluster (page 1029), you can safely downgrade to 2.2 MongoDB
processes. Do not use 2.0 processes after completing the upgrade procedure.
Note: In sharded clusters, once you have completed the meta-data upgrade procedure (page 1029), you cannot use
2.0 mongod or mongos instances in the same cluster.
If you complete the meta-data upgrade, you can safely downgrade components in any order. When you upgrade again,
always upgrade mongos instances before mongod instances.
Do not create 2dsphere or text indexes in a cluster that has 2.2 components.
Considerations and Compatibility If you upgrade to MongoDB 2.4, and then need to run MongoDB 2.2 with the
same data files, consider the following limitations.
If you use a hashed index as the shard key index, which is only possible under 2.4, you will not be able to
query data in this sharded collection. Furthermore, a 2.2 mongos cannot properly route an insert operation
for a collection sharded using a hashed index for the shard key index: any data that you insert using a 2.2
mongos will not arrive on the correct shard and will not be reachable by future queries.
If you never create a 2dsphere or text index, you can move between a 2.4 and 2.2 mongod for a given
data set; however, after you create the first 2dsphere or text index with a 2.4 mongod, you will need to run
a 2.2 mongod with the --upgrade option and drop any 2dsphere or text index.
Basic Downgrade and Upgrade Except as described below, moving between 2.2 and 2.4 is a drop-in replacement:
stop the existing mongod, using the --shutdown option as follows:
mongod --dbpath /var/mongod/data --shutdown
Downgrade to 2.2 After Creating a 2dsphere or text Index If you have created 2dsphere or text indexes
while running a 2.4 mongod instance, you can downgrade at any time by starting the 2.2 mongod with the
--upgrade option as follows:
mongod --dbpath /var/mongod/data/ --upgrade
Then, you will need to drop any existing 2dsphere or text indexes using db.collection.dropIndex(),
for example:
db.records.dropIndex( { loc: "2dsphere" } )
db.records.dropIndex( "records_text" )
Warning: --upgrade will run repairDatabase on any database where you have created a 2dsphere or
text index, which will rebuild all indexes.
Troubleshooting Upgrade/Downgrade Operations If you do not use --upgrade, when you attempt to start a
2.2 mongod and you have created a 2dsphere or text index, mongod will return the following message:
'need to upgrade database index_plugin_upgrade with pdfile version 4.6, new version: 4.5 Not upgradin
While running 2.4, to check the data file version of a MongoDB database, use the following operation in the shell:
db.getSiblingDB('<databasename>').stats().dataFileVersion
The major data file 1404 version for both 2.2 and 2.4 is 4; the minor data file version for 2.2 is 5, and the minor data file
version for 2.4 is 6 after you create a 2dsphere or text index.
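The version numbers above can be turned into a simple check. The sketch assumes a dataFileVersion document with numeric major and minor fields, as returned by the stats() operation shown earlier; the helper name and sample values are illustrative.

```javascript
// Returns true when a 2.2 mongod can open the data files without --upgrade,
// i.e. the data file version is still 4.5.
function usableBy22WithoutUpgrade(dataFileVersion) {
    return dataFileVersion.major === 4 && dataFileVersion.minor === 5;
}

// Sample values, for illustration only.
var before = usableBy22WithoutUpgrade({ major: 4, minor: 5 });   // true
var after = usableBy22WithoutUpgrade({ major: 4, minor: 6 });    // false
```

In the mongo shell the argument would come from db.getSiblingDB('<databasename>').stats().dataFileVersion.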
Compatibility and Index Type Changes in MongoDB 2.4
On this page
New Index Types (page 1035)
Index Type Validation (page 1035)
In 2.4 MongoDB includes two new features related to indexes that users upgrading to version 2.4 must consider,
particularly with regard to possible downgrade paths. For more information on downgrades, see Downgrade MongoDB
from 2.4 to Previous Versions (page 1033).
New Index Types In 2.4 MongoDB adds two new index types: 2dsphere and text. These index types do not
exist in 2.2, and for each database, creating a 2dsphere or text index will upgrade the data-file version and make
that database incompatible with 2.2.
If you intend to downgrade, you should always drop all 2dsphere and text indexes before moving to 2.2.
You can use the downgrade procedure (page 1033) to downgrade these databases and run 2.2 if needed; however, this
will run a full database repair (as with repairDatabase) for all affected databases.
Index Type Validation In MongoDB 2.2 and earlier you could specify invalid index types that did not exist. In
these situations, MongoDB would create an ascending (e.g. 1) index. Invalid indexes include index types specified by
strings that do not refer to an existing index type, and all numbers other than 1 and -1. 1405
In 2.4, creating any invalid index will result in an error. Furthermore, you cannot create a 2dsphere or text index
on a collection if its containing database has any invalid index types. 1405
Example
If you attempt to add an invalid index in MongoDB 2.4, as in the following:
db.coll.ensureIndex( { field: "1" } )
MongoDB will return the following error document:
{
"err" : "Unknown index plugin '1' in index { field: \"1\" }",
"code": 16734,
"n": <number>,
"connectionId": <number>,
"ok": 1
}
1404 The data file version (i.e. pdfile version) is independent and unrelated to the release version of MongoDB.
1405 In 2.4, indexes that specify a type of "1" or "-1" (the strings "1" and "-1") will continue to exist, despite a warning on start-up. However,
a secondary in a replica set cannot complete an initial sync from a primary that has a "1" or "-1" index. Avoid all indexes with invalid types.
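To find such indexes before upgrading, the validation rule can be approximated in a few lines. The sketch below is not MongoDB's own validation code. It assumes the valid index types in 2.4 are the numbers 1 and -1 plus the named types listed in NAMED_INDEX_TYPES, and the helper name is hypothetical.

```javascript
// Index types assumed valid for this sketch: ascending (1), descending (-1),
// and the named index types listed below.
var NAMED_INDEX_TYPES = ['2d', '2dsphere', 'text', 'hashed', 'geoHaystack'];

// Returns the fields of an index key specification whose type is invalid.
function invalidIndexFields(keySpec) {
    return Object.keys(keySpec).filter(function (field) {
        var type = keySpec[field];
        if (type === 1 || type === -1) {
            return false;
        }
        return NAMED_INDEX_TYPES.indexOf(type) === -1;
    });
}

var bad = invalidIndexFields({ field: '1' });        // ['field']
var good = invalidIndexFields({ a: 1, b: -1 });      // []
```

The key specification would typically come from the key field of each document returned by db.collection.getIndexes().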
See Upgrade MongoDB to 2.4 (page 1028) for full upgrade instructions.
Other Resources
MongoDB Downloads [1406].
All JIRA issues resolved in 2.4 [1407].
All Backwards incompatible changes [1408].
All Third Party License Notices [1409].
Upgrading
MongoDB 2.2 is a production release series and succeeds the 2.0 production release series.
MongoDB 2.0 data files are compatible with 2.2-series binaries without any special migration process. However,
always perform the upgrade process for replica sets and sharded clusters using the procedures that follow.
1408 https://jira.mongodb.org/issues/?jql=project%20%3D%20SERVER%20AND%20fixVersion%20in%20(%222.3.2%22%2C%20%222.3.1%22%2C%20%222.3.0%22
rc0%22%2C%20%222.4.0-rc1%22%2C%20%222.4.0-rc2%22%2C%20%222.4.0-rc3%22)%20AND%20%22Backwards%20Compatibility%22%20in%20(%22Major%
1409 https://github.com/mongodb/mongo/blob/v2.4/distsrc/THIRD-PARTY-NOTICES
Synopsis
For all deployments using authentication, upgrade the drivers (i.e. client libraries) before upgrading the
mongod instance or instances.
For all upgrades of sharded clusters:
turn off the balancer during the upgrade process. See the Disable the Balancer (page 794) section for more
information.
upgrade all mongos instances before upgrading any mongod instances.
Other than the above restrictions, 2.2 processes can interoperate with 2.0 and 1.8 tools and processes. You can safely
upgrade the mongod and mongos components of a deployment one by one while the deployment is otherwise
operational. Be sure to read the detailed upgrade procedures below before upgrading production systems.
1. Download binaries of the latest release in the 2.2 series from the MongoDB Download Page [1410].
2. Shut down your mongod instance. Replace the existing binary with the 2.2 mongod binary and restart
MongoDB.
You can upgrade to 2.2 by performing a rolling upgrade of the set: upgrade the members individually while the
other members remain available, to minimize downtime. Use the following procedure:
1. Upgrade the secondary members of the set one at a time by shutting down the mongod and replacing the 2.0
binary with the 2.2 binary. After upgrading a mongod instance, wait for the member to recover to SECONDARY
state before upgrading the next instance. To check the member's state, issue rs.status() in the mongo
shell.
2. Use the mongo shell method rs.stepDown() to step down the primary to allow the normal failover
(page 635) procedure. rs.stepDown() expedites the failover procedure and is preferable to shutting down
the primary directly.
Once the primary has stepped down and another member has assumed PRIMARY state, as observed in the output
of rs.status(), shut down the previous primary, replace the mongod binary with the 2.2 binary, and start
the new process.
Note: Replica set failover is not instant, and the set will be unavailable for reads and writes until the
failover process completes. Typically this takes 10 seconds or more. You may wish to plan the upgrade during
a predefined maintenance window.
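The ordering used by the rolling upgrade above can be sketched as a small helper. This is an illustrative sketch only: planUpgradeOrder is a hypothetical name, the host names are made up, and the input is assumed to be an array shaped like the members field of rs.status() output (objects with name and stateStr).

```javascript
// Hypothetical helper: order replica set members for a rolling upgrade.
// Assumes each entry has `name` and `stateStr`, as in rs.status().members.
function planUpgradeOrder(members) {
  var nonPrimaries = [];
  var primaries = [];
  members.forEach(function (m) {
    if (m.stateStr === "PRIMARY") {
      primaries.push(m.name);
    } else {
      nonPrimaries.push(m.name);
    }
  });
  // Non-primary members are upgraded first, one at a time; the primary
  // is upgraded last, after it has been stepped down with rs.stepDown().
  return nonPrimaries.concat(primaries);
}

// Made-up host names for illustration.
var order = planUpgradeOrder([
  { name: "a.example.net:27017", stateStr: "PRIMARY" },
  { name: "b.example.net:27017", stateStr: "SECONDARY" },
  { name: "c.example.net:27017", stateStr: "SECONDARY" }
]);
```

The helper only computes an order; waiting for each member to return to SECONDARY state before moving on remains a manual check with rs.status().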
Note: Balancing is not currently supported in mixed 2.0.x and 2.2.0 deployments. Thus you will want to reach a
consistent version for all shards within a reasonable period of time, e.g. same-day. See SERVER-6902 [1411] for more
information.
Changes
Major Features
Aggregation Framework The aggregation framework makes it possible to do aggregation operations without
needing to use map-reduce. The aggregate command exposes the aggregation framework, and the aggregate()
helper in the mongo shell provides an interface to these operations. Consider the following resources for background
on the aggregation framework and its use:
Documentation: Aggregation (page 443)
Reference: Aggregation Reference (page 473)
TTL Collections TTL collections remove expired data from a collection, using a special index and a background
thread that deletes expired documents every minute. These collections are useful as an alternative to capped collections
in some cases, such as for data warehousing and caching cases, including: machine generated event data, logs, and
session information that needs to persist in a database for only a limited period of time.
For more information, see the Expire Data from Collections by Setting TTL (page 231) tutorial.
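The expiry rule that the background thread applies can be sketched as follows. This is a minimal illustration, not the server's implementation: isExpired is a hypothetical helper, and the strict comparison at the exact boundary is an assumption.

```javascript
// Hypothetical helper sketching the TTL rule: a document expires once the
// date in its indexed field is older than `expireAfterSeconds`.
function isExpired(doc, field, expireAfterSeconds, now) {
  var value = doc[field];
  // Documents whose indexed field does not hold a date never expire.
  if (!(value instanceof Date)) {
    return false;
  }
  // Strictly older than the cutoff (boundary handling is an assumption).
  return value.getTime() + expireAfterSeconds * 1000 < now.getTime();
}
```

Because the background thread runs once a minute, a document may remain in the collection for a short time after this rule first evaluates to true.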
Concurrency Improvements MongoDB 2.2 increases the server's capacity for concurrent operations with the
following improvements:
1. DB Level Locking [1412]
2. Improved Yielding on Page Faults [1413]
3. Improved Page Fault Detection on Windows [1414]
To reflect these changes, MongoDB now provides changed and improved reporting for concurrency and use. See locks,
recordStats [1415], db.currentOp(), mongotop, and mongostat.
Improved Data Center Awareness with Tag Aware Sharding MongoDB 2.2 adds support for geographic
distribution or other custom partitioning for sharded collections in clusters. By using this tag aware
sharding, you can automatically ensure that data in a sharded database system is always on specific shards. For example,
with tag aware sharding, you can ensure that data is closest to the application servers that use that data most frequently.
Shard tagging controls data location, and is complementary to but separate from replica set tagging, which controls
read preference (page 641) and write concern (page 141). For example, shard tagging can pin all USA data to
one or more logical shards, while replica set tagging can control which mongod instances (e.g. production or
reporting) the application uses to service requests.
See the documentation for the following helpers in the mongo shell that support tagged sharding configuration:
sh.addShardTag()
sh.addTagRange()
sh.removeShardTag()
1411 https://jira.mongodb.org/browse/SERVER-6902
1412 https://jira.mongodb.org/browse/SERVER-4328
1413 https://jira.mongodb.org/browse/SERVER-3357
1414 https://jira.mongodb.org/browse/SERVER-4538
1415 https://docs.mongodb.org/v2.2/reference/server-status
Also, see Tag Aware Sharding (page 748) and Manage Shard Tags (page 808).
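How a tag range maps a shard key value to a tag can be sketched with an in-memory model. This is only an illustration: tagForKey, the sample ranges, and the tag names are hypothetical, and the model assumes a single numeric shard key with ranges that include the lower bound and exclude the upper bound.

```javascript
// Hypothetical in-memory model of tag ranges on a single-field shard key.
// Each range covers `min` (inclusive) up to `max` (exclusive), mirroring
// the minimum, maximum, and tag arguments given to sh.addTagRange().
function tagForKey(ranges, key) {
  for (var i = 0; i < ranges.length; i++) {
    if (key >= ranges[i].min && key < ranges[i].max) {
      return ranges[i].tag;
    }
  }
  return null; // no tag range covers this key
}

// Made-up ranges and tags for illustration.
var ranges = [
  { min: 10000, max: 20000, tag: "NYC" },
  { min: 90000, max: 97000, tag: "SFO" }
];
```

In this model, data whose key falls in a tagged range would be kept on shards carrying that tag, while keys outside every range are not constrained.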
Fully Supported Read Preference Semantics All MongoDB clients and drivers now support full read preferences
(page 641), including consistent support for a full range of read preference modes (page 721) and tag sets (page 644).
This support extends to the mongos and applies identically to single replica sets and to the replica sets for each shard
in a sharded cluster.
Additional read preference support now exists in the mongo shell using the readPref() cursor method.
Compatibility Changes
Authentication Changes MongoDB 2.2 provides more reliable and robust support for authentication clients,
including drivers and mongos instances.
If your cluster runs with authentication:
For all drivers, use the latest release of your driver and check its release notes.
In sharded environments, to ensure that your cluster remains available during the upgrade process you must use
the upgrade procedure for sharded clusters (page 1037).
findAndModify Returns Null Value for Upserts that Perform Inserts In version 2.2, for upserts that perform
inserts with the new option set to false, findAndModify commands will now return the following output:
{ 'ok': 1.0, 'value': null }
In the mongo shell, upsert findAndModify operations that perform inserts (with new set to false) only output a
null value.
In version 2.0 these operations would return an empty document, e.g. { }.
See: SERVER-6226 [1416] for more information.
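Application code that must run against both 2.0 and 2.2 could treat the two return shapes alike. The following is a minimal sketch under that assumption; noPreviousDocument is a hypothetical helper that inspects the value returned by the shell's findAndModify helper.

```javascript
// Hypothetical helper: true when findAndModify returned no pre-existing
// document, covering both the 2.2 shape (null) and the 2.0 shape ({ }).
function noPreviousDocument(value) {
  if (value === null || value === undefined) {
    return true; // 2.2 behavior for an upsert that performed an insert
  }
  return Object.keys(value).length === 0; // 2.0 returned an empty document
}
```

A caller can then branch on noPreviousDocument(result) instead of comparing against null or an empty document directly.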
mongodump 2.2 Output Incompatible with Pre-2.2 mongorestore If you use the mongodump tool from the
2.2 distribution to create a dump of a database, you must use a 2.2 (or later) version of mongorestore to restore
that dump.
See: SERVER-6961 [1417] for more information.
If compatibility between versions 2.0 and 2.2 is required, use ObjectId().str (page 192), which holds the hexadecimal
string value in both versions.
ObjectId().valueOf() Returns hexadecimal string In version 2.2, the valueOf() method returns the
value of the ObjectId() (page 192) object as a lowercase hexadecimal string.
Consider the following example that calls the valueOf() method on the
ObjectId("507c7f79bcf86cd7994f6c0e") object:
ObjectId("507c7f79bcf86cd7994f6c0e").valueOf()
Behavioral Changes
Restrictions on Database Names for Windows On Windows, database names can no longer contain the
following characters:
/\. "*<>:|?
The names of the data files include the database name. If you attempt to upgrade a database instance whose database
names contain one or more of these characters, mongod will refuse to start.
Change the name of these databases before upgrading. See SERVER-4584 [1419] and SERVER-6729 [1420] for more
information.
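A pre-upgrade check for the characters listed above can be sketched as follows. forbiddenCharsIn is a hypothetical helper, and the sketch takes the space that appears in the character list literally.

```javascript
// Characters that database names on Windows can no longer contain,
// copied from the list above (including the space character).
var WINDOWS_FORBIDDEN_CHARS = '/\\. "*<>:|?';

// Hypothetical helper: returns the distinct forbidden characters
// found in a database name, in order of first appearance.
function forbiddenCharsIn(dbName) {
  var found = [];
  for (var i = 0; i < dbName.length; i++) {
    var ch = dbName.charAt(i);
    if (WINDOWS_FORBIDDEN_CHARS.indexOf(ch) !== -1 && found.indexOf(ch) === -1) {
      found.push(ch);
    }
  }
  return found;
}
```

An empty result means the name passes this check; any other result identifies the characters that must be removed before upgrading.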
1418 https://jira.mongodb.org/browse/SERVER-4442
1419 https://jira.mongodb.org/browse/SERVER-4584
1420 https://jira.mongodb.org/browse/SERVER-6729
_id Fields and Indexes on Capped Collections All capped collections now have an _id field by default, if they
exist outside of the local database, and now have indexes on the _id field. This change only affects capped
collections created with 2.2 instances and does not affect existing capped collections.
See: SERVER-5516 [1421] for more information.
New $elemMatch Projection Operator The $elemMatch operator allows applications to narrow the data
returned from queries so that the query operation will only return the first matching element in an array. See the
$elemMatch reference and the SERVER-2238 [1422] and SERVER-828 [1423] issues for more information.
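The "first matching element" behavior can be sketched in plain JavaScript. This is a simplified model, not the server's matcher: firstMatchingElement is a hypothetical helper, it supports only exact equality on top-level fields, and returning null when nothing matches is an assumption of the sketch.

```javascript
// Hypothetical helper: returns an array holding only the first element
// whose fields equal every field in `criteria`, or null if none match.
function firstMatchingElement(array, criteria) {
  for (var i = 0; i < array.length; i++) {
    var element = array[i];
    var matches = Object.keys(criteria).every(function (key) {
      return element[key] === criteria[key];
    });
    if (matches) {
      return [element]; // later matches are ignored
    }
  }
  return null;
}
```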
Windows XP is Not Supported As of 2.2, MongoDB does not support Windows XP. Please upgrade to a more
recent version of Windows to use the latest releases of MongoDB. See SERVER-5648 [1424] for more information.
Service Support for mongos.exe You may now run mongos.exe instances as a Windows Service. See the
mongos.exe reference and Configure a Windows Service for MongoDB Community Edition (page 31) and
SERVER-1589 [1425] for more information.
Log Rotate Command Support MongoDB for Windows now supports log rotation by way of the logRotate
database command. See SERVER-2612 [1426] for more information.
New Build Using SlimReadWrite Locks for Windows Concurrency Labeled 2008+ on the Downloads
Page [1427], this build for 64-bit versions of Windows Server 2008 R2 and for Windows 7 or newer offers increased
performance over the standard 64-bit Windows build of MongoDB. See SERVER-3844 [1428] for more information.
Tool Improvements
Index Definitions Handled by mongodump and mongorestore When you specify the --collection option
to mongodump, mongodump will now back up the definitions for all indexes that exist on the source database. When
you attempt to restore this backup with mongorestore, the target mongod will rebuild all indexes. See
SERVER-808 [1429] for more information.
mongorestore now includes the --noIndexRestore option to provide the previous behavior. Use
--noIndexRestore to prevent mongorestore from rebuilding those indexes.
mongooplog for Replaying Oplogs The mongooplog tool makes it possible to pull oplog entries from one mongod
instance and apply them to another mongod instance. You can use mongooplog to achieve point-in-time backup of
a MongoDB data set. See the SERVER-3873 [1430] case and the mongooplog reference.
1421 https://jira.mongodb.org/browse/SERVER-5516
1422 https://jira.mongodb.org/browse/SERVER-2238
1423 https://jira.mongodb.org/browse/SERVER-828
1424 https://jira.mongodb.org/browse/SERVER-5648
1425 https://jira.mongodb.org/browse/SERVER-1589
1426 https://jira.mongodb.org/browse/SERVER-2612
1427 http://www.mongodb.org/downloads
1428 https://jira.mongodb.org/browse/SERVER-3844
1429 https://jira.mongodb.org/browse/SERVER-808
1430 https://jira.mongodb.org/browse/SERVER-3873
Authentication Support for mongotop and mongostat mongotop and mongostat now contain support for
username/password authentication. See SERVER-3875 [1431] and SERVER-3871 [1432] for more information regarding
this change. Also consider the documentation of the following options for additional information:
mongotop --username
mongotop --password
mongostat --username
mongostat --password
Write Concern Support for mongoimport and mongorestore mongoimport now provides an option to
halt the import if the operation encounters an error, such as a network interruption, a duplicate key exception, or a
write error. The --stopOnError option will produce an error rather than silently continue importing data. See
SERVER-3937 [1433] for more information.
In mongorestore, the --w option provides support for configurable write concern.
mongodump Support for Reading from Secondaries You can now run mongodump when connected to a
secondary member of a replica set. See SERVER-3854 [1434] for more information.
mongoimport Support for full 16MB Documents Previously, mongoimport would only import documents
that were less than 4 megabytes in size. This issue is now corrected, and you may use mongoimport to import
documents of up to 16 megabytes in size. See SERVER-4593 [1435] for more information.
Timestamp() Extended JSON format MongoDB extended JSON now includes a new Timestamp() type to
represent the Timestamp type that MongoDB uses for timestamps in the oplog among other contexts.
This permits tools like mongooplog and mongodump to query for specific timestamps. Consider the following
mongodump operation:
mongodump --db local --collection oplog.rs --query '{"ts":{"$gt":{"$timestamp" : {"t": 1344969612000,
Shell Improvements
Improved Shell User Interface 2.2 includes a number of changes that improve the overall quality and consistency
of the user interface for the mongo shell:
Full Unicode support.
Bash-like line editing features. See SERVER-4312 [1437] for more information.
Multi-line command support in shell history. See SERVER-3470 [1438] for more information.
Windows support for the edit command. See SERVER-3998 [1439] for more information.
1431 https://jira.mongodb.org/browse/SERVER-3875
1432 https://jira.mongodb.org/browse/SERVER-3871
1433 https://jira.mongodb.org/browse/SERVER-3937
1434 https://jira.mongodb.org/browse/SERVER-3854
1435 https://jira.mongodb.org/browse/SERVER-4593
1436 https://jira.mongodb.org/browse/SERVER-3483
1437 https://jira.mongodb.org/browse/SERVER-4312
1438 https://jira.mongodb.org/browse/SERVER-3470
1439 https://jira.mongodb.org/browse/SERVER-3998
Helper to load Server-Side Functions The db.loadServerScripts() helper loads the contents of the current
database's system.js collection into the current mongo shell session. See SERVER-1651 [1440] for more information.
Support for Bulk Inserts If you pass an array of documents to the insert() method, the mongo shell will now
perform a bulk insert operation. See SERVER-3819 [1441] and SERVER-2395 [1442] for more information.
Note: For bulk inserts on sharded clusters, the getLastError command alone is insufficient to verify success.
Applications must verify the success of bulk inserts in application logic.
Operations
Support for Logging to Syslog See the SERVER-2957 [1443] case and the documentation of the syslogFacility
run-time option or the mongod --syslog and mongos --syslog command-line options.
touch Command Added the touch command to read the data and/or indexes from a collection into memory. See
SERVER-2023 [1444] and touch for more information.
indexCounters No Longer Report Sampled Data indexCounters now report actual counters that reflect
index use and state. In previous versions, these data were sampled. See SERVER-5784 [1445] and indexCounters
for more information.
Padding Specifiable on compact Command See the documentation of the compact command and the SERVER-4018 [1446]
issue for more information.
Added Build Flag to Use System Libraries The Boost library, version 1.49, is now embedded in the MongoDB
code base.
If you want to build MongoDB binaries using system Boost libraries, you can pass scons the
--use-system-boost flag, as follows:
scons --use-system-boost
When building MongoDB, you can also pass scons a flag to compile MongoDB using only system libraries rather
than the included versions of the libraries. For example:
scons --use-system-all
Memory Allocator Changed to TCMalloc To improve performance, MongoDB 2.2 uses the TCMalloc memory
allocator from Google Perftools. For more information about this change see SERVER-188 [1449] and
SERVER-4683 [1450]. For more information about TCMalloc, see the documentation of TCMalloc [1451] itself.
Replication
Improved Logging for Replica Set Lag When secondary members of a replica set fall behind in replication,
mongod now provides better reporting in the log. This makes it possible to track replication in general and
identify what process may produce errors or halt replication. See SERVER-3575 [1452] for more information.
Replica Set Members can Sync from Specific Members The new replSetSyncFrom command and new
rs.syncFrom() helper in the mongo shell make it possible for you to manually configure from which
member of the set a replica will poll oplog entries. Use these commands to override the default selection logic if needed.
Always exercise caution with replSetSyncFrom when overriding the default behavior.
Replica Set Members will not Sync from Members Without Indexes Unless buildIndexes: false To
prevent inconsistency between members of replica sets, if a member of a replica set has buildIndexes set to
false, other members of the replica set will not sync from this member, unless they also have buildIndexes set
to false. See SERVER-4160 [1453] for more information.
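The rule in the heading above (members do not sync from members without indexes unless they themselves have buildIndexes: false) can be sketched as a predicate. canSyncFrom is a hypothetical helper, not part of MongoDB; it assumes member configuration objects where a missing buildIndexes field means the default of true.

```javascript
// Hypothetical helper: may `member` use `source` as its sync source,
// considering only the buildIndexes rule?
function canSyncFrom(member, source) {
  // buildIndexes defaults to true when it is not set explicitly.
  var sourceBuildsIndexes = source.buildIndexes !== false;
  var memberBuildsIndexes = member.buildIndexes !== false;
  // A member that builds indexes must not sync from one that does not.
  return sourceBuildsIndexes || !memberBuildsIndexes;
}
```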
New Option To Configure Index Pre-Fetching during Replication By default, when replicating operations,
secondaries will pre-fetch Indexes (page 487) associated with a query to improve replication throughput in most cases. The
replication.secondaryIndexPrefetch setting and --replIndexPrefetch option allow
administrators to disable this feature or allow the mongod to pre-fetch only the index on the _id field. See SERVER-6718 [1454]
for more information.
Sharding Improvements
Index on Shard Keys Can Now Be a Compound Index If your shard key uses the prefix of an existing index,
then you do not need to maintain a separate index for your shard key in addition to your existing index. This index,
however, cannot be a multi-key index. See the Shard Key Indexes (page 755) documentation and SERVER-1506 [1457]
for more information.
1449 https://jira.mongodb.org/browse/SERVER-188
1450 https://jira.mongodb.org/browse/SERVER-4683
1451 http://goog-perftools.sourceforge.net/doc/tcmalloc.html
1452 https://jira.mongodb.org/browse/SERVER-3575
1453 https://jira.mongodb.org/browse/SERVER-4160
1454 https://jira.mongodb.org/browse/SERVER-6718
1455 https://jira.mongodb.org/browse/SERVER-4521
1456 https://jira.mongodb.org/browse/SERVER-4158
1457 https://jira.mongodb.org/browse/SERVER-1506
Migration Thresholds Modified The migration thresholds (page 751) have changed in 2.2 to permit more even
distribution of chunks in collections that have smaller quantities of data. See the Migration Thresholds (page 751)
documentation for more information.
Licensing Changes
Added License notice for Google Perftools (TCMalloc Utility). See the License Notice [1458] and
SERVER-4683 [1459] for more information.
Resources
MongoDB Downloads [1460].
All JIRA issues resolved in 2.2 [1461].
All backwards incompatible changes [1462].
All third party license notices [1463].
What's New in MongoDB 2.2 Online Conference [1464].
Upgrading
Although the major version number has changed, MongoDB 2.0 is a standard, incremental production release and
works as a drop-in replacement for MongoDB 1.8.
Preparation
Read through all release notes before upgrading, and ensure that no changes will affect your deployment.
If you create new indexes in 2.0, then downgrading to 1.8 is possible but you must reindex the new collections.
mongoimport and mongoexport now correctly adhere to the CSV spec for handling CSV input/output. This
may break existing import/export workflows that relied on the previous behavior. For more information see
SERVER-1097 [1465].
1458 https://github.com/mongodb/mongo/blob/v2.2/distsrc/THIRD-PARTY-NOTICES#L231
1459 https://jira.mongodb.org/browse/SERVER-4683
1460 http://mongodb.org/downloads
1461 https://jira.mongodb.org/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+SERVER+AND+fixVersion+in+%28%222.1.0%22%2C+%222.1.1%22%2
rc0%22%2C+%222.2.0-rc1%22%2C+%222.2.0-rc2%22%29+ORDER+BY+component+ASC%2C+key+DESC
1462 https://jira.mongodb.org/issues/?filter=11225&jql=project%20%3D%20SERVER%20AND%20fixVersion%20in%20(10483%2C%2010893%2C%2010894%2C%20
1463 https://github.com/mongodb/mongo/blob/v2.2/distsrc/THIRD-PARTY-NOTICES
1464 http://www.mongodb.com/events/webinar/mongodb-online-conference-sept
1465 https://jira.mongodb.org/browse/SERVER-1097
Journaling (page 598) is enabled by default in 2.0 for 64-bit builds. If you still prefer to run without journaling, start
mongod with the --nojournal run-time option. Otherwise, MongoDB creates journal files during startup. The
first time you start mongod with journaling, you will see a delay as mongod creates new files. In addition, you may
see reduced write throughput.
2.0 mongod instances are interoperable with 1.8 mongod instances; however, for best results, upgrade your
deployments using the following procedures:
1. Upgrade the secondary members of the set one at a time by shutting down the mongod and replacing the 1.8
binary with the 2.0.x binary from the MongoDB Download Page [1467].
2. To avoid losing the last few updates on failover you can temporarily halt your application (failover should take
less than 10 seconds), or you can set write concern (page 141) in your application code to confirm that each
update reaches multiple servers.
3. Use rs.stepDown() to step down the primary to allow the normal failover (page 635) procedure.
rs.stepDown() and replSetStepDown provide for shorter and more consistent failover procedures than
simply shutting down the primary directly.
When the primary has stepped down, shut down its instance and upgrade by replacing the mongod binary with
the 2.0.x binary.
1. Upgrade all config server instances first, in any order. Since config servers use two-phase commit, shard
configuration metadata updates will halt until all are up and running.
2. Upgrade mongos routers in any order.
Changes
Compact Command
A compact command is now available for compacting a single collection and its indexes. Previously, the only way
to compact was to repair the entire database.
Concurrency Improvements
When going to disk, the server will yield the write lock when writing data that is not likely to be in memory. The
initial implementation of this feature now exists:
1466 http://downloads.mongodb.org/
1467 http://downloads.mongodb.org/
MongoDB 2.0 reduces the default stack size. This change can reduce total memory usage when there are many (e.g.,
1000+) client connections, as there is a thread per connection. While portions of a thread's stack can be swapped out
if unused, some operating systems do this slowly enough that it might be an issue. The default stack size is the lesser of
the system setting or 1MB.
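The arithmetic behind this change can be sketched as follows. The helper names are hypothetical, and the totals are an upper bound on stack space reserved across connection threads, not a measurement of resident memory.

```javascript
var ONE_MB = 1024 * 1024;

// Default per-thread stack size in 2.0: the lesser of the system
// setting and 1MB, as described above.
function defaultStackSize(systemStackBytes) {
  return Math.min(systemStackBytes, ONE_MB);
}

// Hypothetical helper: stack space reserved for `connections` client
// connections, assuming one thread per connection.
function totalStackBytes(connections, systemStackBytes) {
  return connections * defaultStackSize(systemStackBytes);
}
```

For instance, with an assumed system setting of 8MB and 1000 connections, totalStackBytes(1000, 8 * ONE_MB) evaluates to 1000MB of reserved stack, compared with 8000MB if each thread used the full system setting.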
v2.0 includes significant improvements to the index (page 540). Indexes are often 25% smaller and 25% faster (depends
on the use case). When upgrading from previous versions, the benefits of the new index type are realized only if you
create a new index or re-index an old one.
Dates are now signed, and the max index key size has increased slightly from 819 to 1024 bytes.
All operations that create a new index will result in a 2.0 index by default. For example:
Reindexing an older-version index results in a 2.0 index. However, reindexing on a secondary does
not work in versions prior to 2.0. Do not reindex on a secondary. For a workaround, see SERVER-3866 [1469].
The repairDatabase command converts indexes to 2.0 indexes.
To convert all indexes for a given collection to the 2.0 type (page 1047), invoke the compact command.
Once you create new indexes, downgrading to 1.8.x will require a re-index of any indexes created using 2.0. See Build
Old Style Indexes (page 540).
Sharding Authentication
Replica Sets
Hidden Nodes in Sharded Clusters In 2.0, mongos instances can now determine when a member of a replica set
becomes hidden without requiring a restart. In 1.8, if you reconfigured a member as hidden, you had to
restart mongos to prevent queries from reaching the hidden member.
1468 https://jira.mongodb.org/browse/SERVER-2563
1469 https://jira.mongodb.org/browse/SERVER-3866
Priorities Each replica set member can now have a priority value consisting of a floating-point number from 0 to 1000,
inclusive. Priorities let you control which member of the set you prefer to have as primary: the member with the
highest priority that can see a majority of the set will be elected primary.
For example, suppose you have a replica set with three members, A, B, and C, and suppose that their priorities are set
as follows:
A's priority is 2.
B's priority is 3.
C's priority is 1.
During normal operation, the set will always choose B as primary. If B becomes unavailable, the set will elect A as
primary.
For more information, see the priority documentation.
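The A, B, C example above can be sketched as a small selection function. This is a simplified model, not the election protocol: preferredPrimary is a hypothetical helper, and the available flag stands in for "can see a majority of the set".

```javascript
// Hypothetical helper: picks the name of the available member with the
// highest priority. Members with priority 0 are never chosen.
function preferredPrimary(members) {
  var best = null;
  members.forEach(function (m) {
    if (!m.available || m.priority <= 0) {
      return;
    }
    if (best === null || m.priority > best.priority) {
      best = m;
    }
  });
  return best === null ? null : best.name;
}

// The three-member example from the text.
var normal = preferredPrimary([
  { name: "A", priority: 2, available: true },
  { name: "B", priority: 3, available: true },
  { name: "C", priority: 1, available: true }
]);
var withoutB = preferredPrimary([
  { name: "A", priority: 2, available: true },
  { name: "B", priority: 3, available: false },
  { name: "C", priority: 1, available: true }
]);
```

Here normal evaluates to "B" and withoutB to "A", matching the outcome described in the example.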
Data-Center Awareness You can now tag replica set members to indicate their location. You can use these tags
to design custom write rules (page 141) across data centers, racks, specific servers, or any other architecture choice.
For example, an administrator can define rules such as "very important write" or "customerData" or "audit-trail" to
replicate to certain servers, racks, data centers, etc. Then in the application code, the developer would say:
db.foo.insert(doc, {w : "very important write"})
which would succeed if it fulfilled the conditions the DBA defined for "very important write".
For more information, see Data Center Awareness (page 226).
Drivers may also support tag-aware reads. Instead of specifying slaveOk, you specify slaveOk with tags indicating
which data-centers to read from. For details, see the Drivers [1470] documentation.
w : majority You can also set w to majority to ensure that the write propagates to a majority of nodes,
effectively committing it. The value for majority will automatically adjust as you add or remove nodes from the
set.
For more information, see Write Concern (page 141).
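How the majority value adjusts with set size can be sketched with one line of arithmetic. majorityOf is a hypothetical helper, and the sketch assumes that every member of the set counts toward the majority.

```javascript
// Hypothetical helper: smallest number of members that forms a majority
// of a set with `memberCount` members.
function majorityOf(memberCount) {
  return Math.floor(memberCount / 2) + 1;
}
```

For example, majorityOf(3) is 2 and majorityOf(5) is 3, so adding two members to a three-member set raises the number of acknowledgements that w: majority waits for from two to three.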
Reconfiguration with a Minority Up If the majority of servers in a set has been permanently lost, you can now
force a reconfiguration of the set to bring it back online.
For more information see Reconfigure a Replica Set with Unavailable Members (page 695).
Primary Checks for a Caught up Secondary before Stepping Down To minimize time without a primary, the
rs.stepDown() method will now fail if the primary does not see a secondary within 10 seconds of its latest
optime. You can force the primary to step down anyway, but by default it will return an error message.
See also Force a Member to Become Primary (page 688).
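The freshness check that rs.stepDown() performs can be sketched as follows. This is an illustration only: hasCaughtUpSecondary is a hypothetical helper, optimes are simplified to plain seconds, and treating a lag of exactly 10 seconds as acceptable is an assumption.

```javascript
var FRESHNESS_WINDOW_SECONDS = 10;

// Hypothetical helper: true when at least one secondary's optime is
// within the freshness window of the primary's latest optime.
function hasCaughtUpSecondary(primaryOptimeSecs, secondaryOptimesSecs) {
  return secondaryOptimesSecs.some(function (optime) {
    return primaryOptimeSecs - optime <= FRESHNESS_WINDOW_SECONDS;
  });
}
```

In this model a step-down would proceed only when hasCaughtUpSecondary returns true, unless the operator forces it.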
Extended Shutdown on the Primary to Minimize Interruption When you call the shutdown command, the
primary will refuse to shut down unless there is a secondary whose optime is within 10 seconds of the primary. If such
a secondary isn't available, the primary will step down and wait up to a minute for the secondary to be fully caught up
before shutting down.
Note that to get this behavior, you must issue the shutdown command explicitly; sending a signal to the process will
not trigger this behavior.
You can also force the primary to shut down, even without an up-to-date secondary available.
Maintenance Mode When repair or compact runs on a secondary, the secondary will automatically drop into
recovering mode until the operation finishes. This prevents clients from trying to read from it while it's busy.
1470 https://docs.mongodb.org/ecosystem/drivers
Geospatial Features
Multi-Location Documents Indexing is now supported on documents which have multiple location objects,
embedded either inline or in embedded documents. Additional command options are also supported, allowing results to
return with not only distance but the location used to generate the distance.
For more information, see Multi-location Documents for 2d Indexes (page 507).
Polygon searches Polygonal $within queries are also now supported for simple polygon shapes. For details, see
the $within operator documentation.
Journaling Enhancements
Journaling is now enabled by default for 64-bit platforms. Use the --nojournal command line option to
disable it.
The journal is now compressed for faster commits to disk.
A new --journalCommitInterval run-time option exists for specifying your own group commit interval.
The default settings do not change.
A new { getLastError: { j: true } } option is available to wait for the group commit. The
group commit will happen sooner when a client is waiting on {j: true}. If journaling is disabled, {j:
true} is a no-op.
Set the continueOnError option for bulk inserts, in the driver, so that bulk insert will continue to insert any
remaining documents even if an insert fails, as is the case with duplicate key exceptions or network interruptions. The
getLastError command will report whether any inserts have failed, not just the last one. If multiple errors occur,
the client will only receive the most recent getLastError results.
Note: For bulk inserts on sharded clusters, the getLastError command alone is insufficient to verify success.
Applications must verify the success of bulk inserts in application logic.
Map Reduce
Output to a Sharded Collection Using the new sharded flag, it is possible to send the result of a map/reduce to
a sharded collection. Combined with the reduce or merge flags, it is possible to keep adding data to very large
collections from map/reduce jobs.
For more information, see Map-Reduce (page 462) and the mapReduce reference.
Additional regex options: s Allows the dot (.) to match all characters including new lines. This is in addition to
the currently supported i, m and x. See $regex.
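The effect of the s option can be illustrated with JavaScript's own regular expressions, whose s flag behaves analogously. This is a stand-in for illustration, not the server's regex engine; the sample text and pattern are made up.

```javascript
// Sample text with a newline between the two words of interest.
var text = "first line\nsecond line";
var pattern = "line.second";

// Without the s flag, the dot does not match the newline.
var matchesWithoutS = new RegExp(pattern).test(text);

// With the s flag, the dot matches any character, including newlines.
var matchesWithS = new RegExp(pattern, "s").test(text);
```

Here matchesWithoutS is false and matchesWithS is true, because the only character between "line" and "second" is a newline.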
The output of the validate command and the documents in the system.profile collection have both been
enhanced to return information as BSON objects with keys for each value rather than as free-form strings.
Shell Features
Custom Prompt You can define a custom prompt for the mongo shell. You can change the prompt at any time by
setting the prompt variable to a string or a custom JavaScript function returning a string. For examples, see
shell-use-a-custom-prompt.
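A prompt function of the kind described above might look like the following sketch. The counter variable cmdCount and the "cmd N> " format are made up for illustration; only the prompt variable itself comes from the text.

```javascript
// Counts how many times the prompt has been displayed.
var cmdCount = 1;

// A custom prompt function returning a string such as "cmd 1> ".
var prompt = function () {
  return "cmd " + (cmdCount++) + "> ";
};
```

Assigning a plain string, for example prompt = "dev> ", is the simpler alternative when the prompt does not need to change between commands.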
Default Shell Init Script On startup, the shell will check for a .mongorc.js file in the user's home directory.
The shell will execute this file after connecting to the database and before displaying the prompt.
If you would like the shell not to run the .mongorc.js file automatically, start the shell with --norc.
For more information, see the mongo reference.
In 2.0, when running with authentication (e.g. authorization) all database commands require authentication,
except the following commands.
isMaster
authenticate
getnonce
buildInfo
ping
isdbgrid
Resources
MongoDB Downloads [1471]
All JIRA Issues resolved in 2.0 [1472]
All Backward Incompatible Changes [1473]
1471 http://mongodb.org/downloads
1472 https://jira.mongodb.org/secure/IssueNavigator.jspa?mode=hide&requestId=11002
1473 https://jira.mongodb.org/issues/?filter=11023&jql=project%20%3D%20SERVER%20AND%20fixVersion%20in%20(10889%2C%2010886%2C%2010784%2C%20
Upgrading
MongoDB 1.8 is a standard, incremental production release and works as a drop-in replacement for MongoDB 1.6,
except:
Replica set members should be upgraded in a particular order, as described in Upgrading a Replica Set
(page 1051).
The mapReduce command has changed in 1.8, causing incompatibility with previous releases. mapReduce
no longer generates temporary collections (thus, keepTemp has been removed). Now, you must always supply
a value for out. See the out field options in the mapReduce document. If you use MapReduce, this also
likely means you need a recent version of your client driver.
Preparation
Read through all release notes before upgrading and ensure that no changes will affect your deployment.
5. Shut down the primary (the final 1.6 server), and then restart it with the 1.8.x binary from the MongoDB
Download Page [1477].
Returning to 1.6
If for any reason you must move back to 1.6, follow the steps above in reverse. Please be careful that you have not
inserted any documents larger than 4MB while running on 1.8 (where the max size has increased to 16MB). If you
have, you will get errors when the server tries to read those documents.
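One way to check for such documents before moving back is sketched below. findOversized is a hypothetical helper; the size function is supplied by the caller, and in the mongo shell Object.bsonsize is a likely candidate (an assumption about your shell version, so verify it is available).

```javascript
// 4MB document size limit that applies when running 1.6.
var OLD_LIMIT_BYTES = 4 * 1024 * 1024;

// Returns the documents whose size, as measured by `sizeOf`, exceeds `limitBytes`.
// `sizeOf` is caller-supplied; in the mongo shell you might pass Object.bsonsize.
function findOversized(docs, sizeOf, limitBytes) {
  var oversized = [];
  for (var i = 0; i < docs.length; i++) {
    if (sizeOf(docs[i]) > limitBytes) {
      oversized.push(docs[i]);
    }
  }
  return oversized;
}

// Possible shell usage (collection name `things` is hypothetical):
//   findOversized(db.things.find().toArray(), Object.bsonsize, OLD_LIMIT_BYTES);
```

Loading a whole collection with toArray() is only reasonable for small collections; for larger ones, iterating the cursor and applying the same size test per document would be the safer variant.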
1476 http://downloads.mongodb.org/
1477 http://downloads.mongodb.org/
1478 http://downloads.mongodb.org/
1479 http://downloads.mongodb.org/
1480 http://downloads.mongodb.org/
Journaling Returning to 1.6 after using 1.8 Journaling (page 598) works fine, as journaling does not change anything
about the data file format. Suppose you are running 1.8.x with journaling enabled and you decide to switch back to
1.6. There are two scenarios:
If you shut down cleanly with 1.8.x, just restart with the 1.6 mongod binary.
If 1.8.x shut down uncleanly, start 1.8.x up again and let the journal files run to fix any damage (incomplete
writes) that may have existed at the crash. Then shut down 1.8.x cleanly and restart with the 1.6 mongod binary.
Changes
Journaling
MongoDB now supports write-ahead Journaling (page 598) to facilitate fast crash recovery and durability in the
storage engine. With journaling enabled, a mongod can be quickly restarted following a crash without needing to
repair the collections.
Sparse Indexes (page 519) are indexes that only include documents that contain the fields specified in the index.
Documents missing the field will not appear in the index at all. This can significantly reduce index size for indexes on
fields that appear in only a subset of the documents within a collection.
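The effect of a sparse index can be illustrated with a small simulation. This is a simplified model, not MongoDB's index implementation: buildSparseIndex is a hypothetical helper that creates one entry per document containing the field and skips the rest.

```javascript
// Builds a simplified sparse index: one entry per document that contains `field`.
// Documents without the field produce no entry at all.
function buildSparseIndex(docs, field) {
  var entries = [];
  for (var i = 0; i < docs.length; i++) {
    if (Object.prototype.hasOwnProperty.call(docs[i], field)) {
      entries.push({ key: docs[i][field], id: docs[i]._id });
    }
  }
  // Keep entries ordered by key, as an index would.
  entries.sort(function (a, b) {
    return a.key < b.key ? -1 : (a.key > b.key ? 1 : 0);
  });
  return entries;
}

// In the mongo shell, a real sparse index on a hypothetical `people` collection
// would be created with: db.people.ensureIndex({ nick: 1 }, { sparse: true });
```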
Covered Indexes (page 70) enable MongoDB to answer queries entirely from the index when the query only selects
fields that the index contains.
The mapReduce command supports new options that enable incrementally updating existing collections. Previously,
a MapReduce job could output either to a temporary collection or to a named permanent collection, which it would
overwrite with new data.
You now have several options for the output of your MapReduce jobs:
You can merge MapReduce output into an existing collection. Output from the reduce phase replaces any keys
that already exist in the output collection. Other keys remain in the collection.
You can now re-reduce your output with the contents of an existing collection. Each key output by the reduce
phase will be reduced with the existing document in the output collection.
You can replace the existing output collection with the new results of the MapReduce job (equivalent to setting
a permanent output collection in previous releases).
You can compute MapReduce inline and return results to the caller without persisting the results of the job. This
is similar to the temporary collections generated in previous releases, except results are limited to 8MB.
For more information, see the out field options in the mapReduce document.
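The difference between the merge, reduce, and replace output modes can be sketched with a small simulation. This is an illustration under stated assumptions, not the server's implementation: collections are modelled as plain objects mapping key to value, and applyOutput is a hypothetical helper.

```javascript
// Simulates how an output mode combines new MapReduce results with an
// existing output collection. Collections are modelled as key -> value objects.
function applyOutput(mode, existing, results, reduceFn) {
  var out = {};
  var key;
  if (mode !== "merge" && mode !== "reduce" && mode !== "replace") {
    throw new Error("unknown output mode: " + mode);
  }
  if (mode !== "replace") {
    // merge and reduce both keep keys that the new results do not mention.
    for (key in existing) { out[key] = existing[key]; }
  }
  for (key in results) {
    if (mode === "reduce" && Object.prototype.hasOwnProperty.call(existing, key)) {
      // Re-reduce the new value with the value already in the collection.
      out[key] = reduceFn(key, [existing[key], results[key]]);
    } else {
      out[key] = results[key];
    }
  }
  return out;
}

// The corresponding `out` documents for mapReduce take the forms
// { merge: "name" }, { reduce: "name" }, { replace: "name" }, and { inline: 1 }.
```

With an existing collection {a: 1, b: 2}, new results {b: 10, c: 5}, and a summing reduce function, merge yields {a: 1, b: 10, c: 5}, reduce yields {a: 1, b: 12, c: 5}, and replace yields {b: 10, c: 5}. The inline mode returns the results to the caller and leaves any collection untouched.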
1.8.1
Sharding migrate fix when moving larger chunks.
Durability fix with background indexing.
1.8.0
All changes from 1.7.x series.
1.7.6
Bug fixes.
1.7.5
Journaling (page 598).
Extent allocation improvements.
Improved replica set connectivity for mongos.
getLastError improvements for sharding.
1.7.4
mongos routes slaveOk queries to secondaries in replica sets.
New mapReduce output options.
Sparse Indexes (page 519).
1.7.3
Initial covered index (page 70) support.
Distinct can use data from indexes when possible.
mapReduce can merge or reduce results into an existing collection.
mongod tracks and mongostat displays network usage. See mongostat.
Sharding stability improvements.
1.7.2
$rename operator allows renaming of fields in a document.
db.eval() not to block.
Geo queries with sharding.
mongostat --discover option
Chunk splitting enhancements.
Replica sets network enhancements for servers behind a NAT.
1.7.1
Many sharding performance enhancements.
Better support for $elemMatch on primitives in embedded arrays.
Query optimizer enhancements on range queries.
Windows service enhancements.
Replica set setup improvements.
$pull works on primitives in arrays.
1.7.0
Sharding performance improvements for heavy insert loads.
Slave delay support for replica sets.
getLastErrorDefaults for replica sets.
Auto completion in the shell.
Spherical distance for geo search.
All fixes from 1.6.1 and 1.6.2.
1.8.1 [1481], 1.8.0 [1482]
1.7.6 [1483], 1.7.5 [1484], 1.7.4 [1485], 1.7.3 [1486], 1.7.2 [1487], 1.7.1 [1488], 1.7.0 [1489]
Resources
MongoDB Downloads [1490]
All JIRA Issues resolved in 1.8 [1491]
Upgrading
MongoDB 1.6 is a drop-in replacement for 1.4. To upgrade, simply shut down mongod, then restart with the new
binaries.
Please note that you should upgrade to the latest version of whichever driver you're using. Certain drivers, including
the Ruby driver, will require the upgrade, and all the drivers will provide extra features for connecting to replica sets.
Sharding
Sharding (page 725) is now production-ready, making MongoDB horizontally scalable, with no single point of failure.
A single instance of mongod can now be upgraded to a distributed cluster with zero downtime when the need arises.
Sharding (page 725)
Deploy a Sharded Cluster (page 757)
Convert a Replica Set to a Sharded Cluster (page 767)
Replica Sets
Replica sets (page 613), which provide automated failover among a cluster of n nodes, are also now available.
Please note that replica pairs are now deprecated; we strongly recommend that replica pair users upgrade to replica
sets.
Replication (page 613)
Deploy a Replica Set (page 657)
Convert a Standalone to a Replica Set (page 669)
Other Improvements
The w option (and wtimeout) forces writes to be propagated to n servers before returning success (this works
especially well with replica sets)
$or queries
Improved concurrency
$slice operator for returning subsets of arrays
64 indexes per collection (formerly 40 indexes per collection)
64-bit integers can now be represented in the shell using NumberLong
The findAndModify command now supports upserts. It also allows you to specify fields to return
$showDiskLoc option to see disk location of a document
Support for IPv6 and UNIX domain sockets
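As a sketch of the w and wtimeout options from the list above, the helper below builds a command document for the getLastError command. buildGetLastError is a hypothetical helper, and passing the options through getLastError is an assumption based on how write acknowledgement worked in releases of this era; the values 2 and 5000 are illustrative only.

```javascript
// Builds a getLastError command document that waits until the write has
// reached `w` servers, giving up after `wtimeoutMs` milliseconds.
function buildGetLastError(w, wtimeoutMs) {
  return { getLastError: 1, w: w, wtimeout: wtimeoutMs };
}

// Possible mongo shell usage after a write:
//   db.runCommand(buildGetLastError(2, 5000));
```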
Installation
1.6.5 [1492]
1.5.8 [1493]
1.5.7 [1494]
1.5.6 [1495]
1.5.5 [1496]
1.5.4 [1497]
1.5.3 [1498]
1.5.2 [1499]
1.5.1 [1500]
1.5.0 [1501]
You can see a full list of all changes on JIRA [1502].
Thank you everyone for your support and suggestions!
Upgrading
We're pleased to announce the 1.4 release of MongoDB. 1.4 is a drop-in replacement for 1.2. To upgrade you just
need to shut down mongod, then restart with the new binaries. (Users upgrading from release 1.0 should review the
1.2 release notes (page 1060), in particular the instructions for upgrading the DB format.)
Release 1.4 includes the following improvements over release 1.2:
Geo
New Features
DB Upgrade Required
There are some changes that will require doing an upgrade if your previous version is <= 1.0.x. If you're already using
a version >= 1.1.x then these changes aren't required. There are 2 ways to do it:
--upgrade
stop your mongod process
Replication Changes
There have been minor changes in replication. If you are upgrading a master/slave setup from <= 1.1.2 you have
to update the slave first.
mongoimport
mongoimportjson has been removed and is replaced with mongoimport, which can import json/csv/tsv.
We've changed the semantics of the field filter a little bit. Previously only objects with those fields would be
returned. Now the field filter only changes the output, not which objects are returned. If you need the previous behavior,
you can use $exists.
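The change in field filter semantics can be illustrated with a small simulation. This is a simplified model (it ignores details such as the _id field), and findWithFieldFilter is a hypothetical helper: with requireFields set to true it mimics the previous behavior, and with false it mimics the new one.

```javascript
// Returns a copy of `doc` that contains only the listed fields.
function project(doc, fields) {
  var out = {};
  for (var i = 0; i < fields.length; i++) {
    if (Object.prototype.hasOwnProperty.call(doc, fields[i])) {
      out[fields[i]] = doc[fields[i]];
    }
  }
  return out;
}

// Reports whether `doc` contains every listed field.
function hasAllFields(doc, fields) {
  for (var i = 0; i < fields.length; i++) {
    if (!Object.prototype.hasOwnProperty.call(doc, fields[i])) {
      return false;
    }
  }
  return true;
}

// requireFields = true: old semantics (documents lacking the fields are skipped).
// requireFields = false: new semantics (the filter only shapes the output).
function findWithFieldFilter(docs, fields, requireFields) {
  var results = [];
  for (var i = 0; i < docs.length; i++) {
    if (!requireFields || hasAllFields(docs[i], fields)) {
      results.push(project(docs[i], fields));
    }
  }
  return results;
}

// In the mongo shell, the old behavior on a hypothetical `things` collection
// can be requested explicitly: db.things.find({ a: { $exists: true } }, { a: 1 });
```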
For MongoDB 2.4.1, 2.4 refers to the release series and .1 refers to the revision. The second component of the
release series (e.g. 4 in 2.4.1) describes the type of release series. Release series ending with even numbers (e.g. 4
above) are stable and ready for production, while odd numbers are for development and testing only.
Generally, changes in the release series (e.g. 2.2 to 2.4) mark the introduction of new features that may break
backwards compatibility. Changes to the revision number mark the release of bug fixes and backwards-compatible changes.
Important: Always upgrade to the latest stable revision of your release series.
The version numbering system for MongoDB differs from the system used for the MongoDB drivers. Drivers use only
the first number to indicate a major version. For details, see drivers-version-numbers.
Example
Version numbers
2.0.0 : Stable release.
2.0.1 : Revision.
2.1.0 : Development release for testing only. Includes new features and changes for testing. Interfaces and
stability may not be compatible in development releases.
2.2.0 : Stable release. This is a culmination of the 2.1.x development series.
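The numbering scheme described above can be sketched in code. The helper names are hypothetical, and the sketch assumes version strings in the three-part form shown in the examples.

```javascript
// Splits a version string such as "2.4.1" into its numeric components.
function parseVersion(version) {
  var parts = version.split(".");
  return {
    major: parseInt(parts[0], 10),
    minor: parseInt(parts[1], 10),
    revision: parseInt(parts[2], 10)
  };
}

// Returns the release series, e.g. "2.4" for "2.4.1".
function releaseSeries(version) {
  var v = parseVersion(version);
  return v.major + "." + v.minor;
}

// A series is stable when its second component is even.
function isStableSeries(version) {
  return parseVersion(version).minor % 2 === 0;
}
```

For example, isStableSeries("2.4.1") is true, while isStableSeries("2.1.0") is false because 2.1 is a development series.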
The MongoDB Manual [1] contains comprehensive documentation on MongoDB. This page describes the manual's
licensing, editions, and versions, and describes how to make a change request and how to contribute to the manual.
14.1 License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License [2].
© MongoDB, Inc. 2008-2016
14.2 Editions
In addition to the MongoDB Manual [3], you can also access this content in the following editions:
PDF Format [4] (without reference).
HTML tar.gz [5]
ePub Format [6]
You also can access PDF files that contain subsets of the MongoDB Manual:
MongoDB Reference Manual [7]
1 http://docs.mongodb.org/manual/#
2 http://creativecommons.org/licenses/by-nc-sa/3.0/us/
3 http://docs.mongodb.org/manual/#
4 http://docs.mongodb.org/master/MongoDB-manual.pdf
5 http://docs.mongodb.org/master/manual.tar.gz
6 http://docs.mongodb.org/master/MongoDB-manual.epub
7 http://docs.mongodb.org/master/MongoDB-reference-manual.pdf
To report an issue with this manual or to make a change request, file a ticket at the MongoDB DOCS Project on Jira [20].
8 http://docs.mongodb.org/master/MongoDB-crud-guide.pdf
9 http://docs.mongodb.org/master/MongoDB-data-models-guide.pdf
10 http://docs.mongodb.org/master/MongoDB-aggregation-guide.pdf
11 http://docs.mongodb.org/master/MongoDB-replication-guide.pdf
12 http://docs.mongodb.org/master/MongoDB-sharding-guide.pdf
13 http://docs.mongodb.org/master/MongoDB-administration-guide.pdf
14 http://docs.mongodb.org/master/MongoDB-security-guide.pdf
15 http://kapeli.com/dash
16 http://docs.mongodb.org/master/manpages.tar.gz
17 http://docs.mongodb.org
18 https://github.com/mongodb/docs
19 http://docs.mongodb.org/master/release.txt
20 https://jira.mongodb.org/browse/DOCS
The original language of all MongoDB documentation is American English. However, it is of critical importance to
the documentation project to ensure that speakers of other languages can read and understand the documentation.
To this end, the MongoDB Documentation Project is preparing to launch a translation effort to allow the community
to help bring the documentation to speakers of other languages.
If you would like to express interest in helping to translate the MongoDB documentation once this project is opened
to the public, please:
complete the MongoDB Contributor Agreement [21], and
join the mongodb-translators [22] user group.
The mongodb-translators [23] user group exists to facilitate collaboration between translators and the documentation
team at large. You can join the group without signing the Contributor Agreement, but you will not be allowed to
contribute translations.
See also:
Contribute to the Documentation (page 1065)
Style Guide and Documentation Conventions (page 1066)
MongoDB Manual Organization (page 1075)
MongoDB Documentation Practices and Processes (page 1072)
MongoDB Documentation Build System (page 1076)
The entire documentation source for this manual is available in the mongodb/docs repository [24], which is one of the
MongoDB project repositories on GitHub [25].
To contribute to the documentation, you can open a GitHub account [26], fork the mongodb/docs repository [27], make a
change, and issue a pull request.
In order for the documentation team to accept your change, you must complete the MongoDB Contributor
Agreement [28].
You can clone the repository by issuing the following command at your system shell:
git clone git://github.com/mongodb/docs.git
The MongoDB Manual uses Sphinx [29], a sophisticated documentation engine built upon Python Docutils [30]. The
original reStructured Text [31] files, as well as all necessary Sphinx extensions and build tools, are available in the same
21 http://www.mongodb.com/legal/contributor-agreement
22 http://groups.google.com/group/mongodb-translators
23 http://groups.google.com/group/mongodb-translators
24 https://github.com/mongodb/docs
25 http://github.com/mongodb
26 https://github.com/
27 https://github.com/mongodb/docs
28 http://www.mongodb.com/contributor
29 http://sphinx-doc.org//
30 http://docutils.sourceforge.net/
31 http://docutils.sourceforge.net/rst.html
This document provides an overview of the style for the MongoDB documentation stored in this repository. The
overarching goal of this style guide is to provide an accessible base style to ensure that our documentation is easy to
read, simple to use, and straightforward to maintain.
For information regarding the MongoDB Manual organization, see MongoDB Manual Organization (page 1075).
Document History
2011-09-27: Document created with a (very) rough list of style guidelines, conventions, and questions.
2012-01-12: Document revised based on slight shifts in practice, and as part of an effort of making it easier for people
outside of the documentation team to contribute to documentation.
2012-03-21: Merged in content from the Jargon, and cleaned up style in light of recent experiences.
2012-08-10: Addition to the Referencing section.
2013-02-07: Migrated this document to the manual. Added map-reduce terminology convention. Other edits.
2013-11-15: Added new table of preferred terms.
2016-01-05: Standardizing on embedded document
Naming Conventions
This section contains guidelines on naming files, sections, documents and other document elements.
File naming convention:
For Sphinx, all files should have a .txt extension.
Separate words in file names with hyphens (i.e. -).
For most documents, file names should have a terse one- or two-word name that describes
the material covered in the document. Allow the path of the file within the document
tree to add some of the required context/categorization. For example, it's acceptable
to have https://docs.mongodb.org/manual/core/sharding.rst and
https://docs.mongodb.org/manual/administration/sharding.rst.
For tutorials, the full title of the document should be in the file name. For example,
https://docs.mongodb.org/manual/tutorial/replace-one-configuration-server-in-a-shar
Phrase headlines and titles so users can determine what questions the text will answer, and what material it will
address, without needing to read the content. This shortens the amount of time that people spend
looking for answers, and improves search/scanning, and possibly SEO.
Prefer titles and headers in the form of "Using foo" over "How to Foo."
When using target references (i.e. :ref: references in documents), use names that include enough context to
be intelligible through all documentation. For example, use replica-set-secondary-only-node as
opposed to secondary-only-node. This makes the source more usable and easier to maintain.
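The file naming rules above for tutorials (full title in the file name, words separated by hyphens, .txt extension) can be sketched as a small function. toFileName is a hypothetical helper, and lowercasing the title is an assumption drawn from the example file names rather than a stated rule.

```javascript
// Derives a file name from a document title: lowercase (assumed), words
// separated by hyphens, with the .txt extension used for Sphinx sources.
function toFileName(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")   // runs of other characters become one hyphen
    .replace(/^-+|-+$/g, "")       // no leading or trailing hyphens
    + ".txt";
}
```

For instance, toFileName("Convert a Replica Set to a Sharded Cluster") gives "convert-a-replica-set-to-a-sharded-cluster.txt".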
Style Guide
This section covers the typesetting, English usage, and grammatical conventions and preferences that all documents in the manual
should use. The goal here is to choose good standards that are clear and have a stylistic minimalism that does not
interfere with or distract from the content. A uniform style will improve user experience and minimize the effect of a
multi-authored document.
Punctuation
Use the Oxford comma.
Oxford commas are the commas in a list of things (e.g. "something, something else, and another thing") before
the conjunction (e.g. "and" or "or").
Do not add two spaces after terminal punctuation, such as periods.
Place commas and periods inside quotation marks.
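The Oxford comma rule above can be illustrated with a short function that joins list items. joinWithOxfordComma is a hypothetical helper written for this illustration.

```javascript
// Joins items into a phrase, placing a comma before the conjunction
// whenever the list has three or more items.
function joinWithOxfordComma(items, conjunction) {
  if (items.length === 0) { return ""; }
  if (items.length === 1) { return items[0]; }
  if (items.length === 2) {
    return items[0] + " " + conjunction + " " + items[1];
  }
  return items.slice(0, -1).join(", ") + ", " + conjunction + " " + items[items.length - 1];
}
```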
Headings Use title case for headings and document titles. Title case capitalizes the first letter of the first, last, and
all significant words.
Referencing
To refer to future or planned functionality in MongoDB or a driver, always link to the Jira case. The Manual's
conf.py provides an :issue: role that links directly to a Jira case (e.g. :issue:`SERVER-9001`).
For non-object references (i.e. functions, operators, methods, database commands, settings) always reference
only the first occurrence of the reference in a section. You should always reference objects, except in section
headings.
Structure references with the why first; the link second.
For example, instead of this:
Use the Convert a Replica Set to a Sharded Cluster (page 767) procedure if you have an existing replica set.
Type this:
To deploy a sharded cluster for an existing replica set, see Convert a Replica Set to a Sharded Cluster (page 767).
General Formulations
Contractions are acceptable insofar as they are necessary to increase readability and flow. Avoid otherwise.
Make lists grammatically correct.
Do not use a period after every item unless the list item completes the unfinished sentence before the list.
Use appropriate commas and conjunctions in the list items.
Typically begin a bulleted list with an introductory sentence or clause, with a colon or comma.
The following terms are one word:
standalone
workflow
Use "unavailable," "offline," or "unreachable" to refer to a mongod instance that cannot be accessed. Do not
use the colloquialism "down."
Always write out units (e.g. megabytes) rather than using abbreviations (e.g. MB).
Structural Formulations
There should be at least two headings at every nesting level. Within an h2 block, there should be either: no
h3 blocks, 2 h3 blocks, or more than 2 h3 blocks.
Section headers are in title case (capitalize first, last, and all important words) and should effectively describe
the contents of the section. In a single document you should strive to have section titles that are not redundant
and grammatically consistent with each other.
Use paragraphs and paragraph breaks to increase clarity and flow. Avoid burying critical information in the
middle of long paragraphs. Err on the side of shorter paragraphs.
Prefer shorter sentences to longer sentences. Use complex formations only as a last resort, if at all (e.g.
compound complex structures that require semi-colons).
Avoid paragraphs that consist of single sentences as they often represent a sentence that has unintentionally
become too complex or incomplete. However, sometimes such paragraphs are useful for emphasis, summary,
or introductions.
As a corollary, most sections should have multiple paragraphs.
For longer lists and more complex lists, use bulleted items rather than integrating them inline into a sentence.
Do not expect that the content of any example (inline or blocked) will be self-explanatory. Even when it feels
redundant, make sure that the function and use of every example is clearly described.
Place footnotes and other references, if you use them, at the end of a section rather than the end of a file.
Use the footnote format that includes automatic numbering and a target name for ease of use. For instance a
footnote tag may look like: [#note]_ with the corresponding directive holding the body of the footnote that
resembles the following: .. [#note].
Do not include .. code-block:: [language] in footnotes.
As it makes sense, use the .. code-block:: [language] form to insert literal blocks into the text.
While the double colon, ::, is functional, the .. code-block:: [language] form makes the source
easier to read and understand.
For all mentions of referenced types (i.e. commands, operators, expressions, functions, statuses, etc.) use the
reference types to ensure uniform formatting and cross-referencing.
Other Terms
Use example.net (and .org or .com if needed) for all examples and samples.
Hyphenate map-reduce in order to avoid ambiguous reference to the command name. Do not camel-case.
Geo-Location
1. While MongoDB is capable of storing coordinates in embedded documents, in practice, users should only
store coordinates in arrays. (See: DOCS-41 [32].)
Commits
When relevant, include a Jira case identifier in a commit message. Reference documentation cases when applicable,
but feel free to reference other cases from jira.mongodb.org [33].
Err on the side of creating a larger number of discrete commits rather than bundling a large set of changes into one
commit.
32 https://jira.mongodb.org/browse/DOCS-41
33 http://jira.mongodb.org/
For the sake of consistency, remove trailing whitespaces in the source file.
Hard wrap files to between 72 and 80 characters per line.
At least two people should vet all non-trivial changes to the documentation before publication. One of the
reviewers should have significant technical experience with the material covered in the documentation.
All development and editorial work should transpire on GitHub branches or forks that editors can then merge
into the publication branches.
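The whitespace and line-length conventions in the list above lend themselves to an automated check. The sketch below is one possible approach, not a tool from the documentation repository: checkSourceLines is a hypothetical helper that reports trailing whitespace and lines longer than 80 characters.

```javascript
var MAX_LINE_LENGTH = 80;

// Returns a list of { line, problem } records for a source file's text.
function checkSourceLines(text) {
  var problems = [];
  var lines = text.split("\n");
  for (var i = 0; i < lines.length; i++) {
    if (/[ \t]+$/.test(lines[i])) {
      problems.push({ line: i + 1, problem: "trailing whitespace" });
    }
    if (lines[i].length > MAX_LINE_LENGTH) {
      problems.push({ line: i + 1, problem: "longer than 80 characters" });
    }
  }
  return problems;
}
```

A check like this would likely need exceptions for lines that cannot be wrapped, such as long URLs.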
Collaboration
Builds
Building the documentation is useful because Sphinx [37] and docutils can catch numerous errors in the format and
syntax of the documentation. Additionally, having access to an example of the documentation as it will appear to users
provides a more effective basis for the review process. Besides Sphinx, Pygments, and Python-Docutils,
the documentation repository contains all requirements for building the documentation resource.
Talk to someone on the documentation team if you are having problems running builds yourself.
Publication
The makefile for this repository contains targets that automate the publication process. Use make html to publish
a test build of the documentation in the build/ directory of your repository. Use make publish to build the full
contents of the manual from the current branch in the ../public-docs/ directory relative to the docs repository.
Other targets include:
man - builds UNIX Manual pages for all MongoDB utilities.
push - builds and deploys the contents of the ../public-docs/.
pdfs - builds a PDF version of the manual (requires LaTeX dependencies).
Branches
This section provides an overview of the git branches in the MongoDB documentation repository and their use.
34 https://jira.mongodb.org/browse/DOCS
35 https://github.com/
36 https://github.com/mongodb/docs
37 http://sphinx.pocoo.org/
At the present time, future work transpires in the master, with the main publication being current. As the
documentation stabilizes, the documentation team will begin to maintain branches of the documentation for specific
MongoDB releases.
The MongoDB.org Wiki contains a wealth of information. As the transition to the Manual (i.e. this project and
resource) continues, it's critical that no information disappears or goes missing. The following process outlines how
to migrate a wiki page to the manual:
1. Read the relevant sections of the Manual, and see what the new documentation has to offer on a specific topic.
In this process you should follow cross references and gain an understanding of both the underlying information
and how the parts of the new content relate to one another.
2. Read the wiki page you wish to redirect, and take note of all of the factual assertions and examples presented by the
wiki page.
3. Test the factual assertions of the wiki page to the greatest extent possible. Ensure that example output is accurate.
In the case of commands and reference material, make sure that documented options are accurate.
4. Make corrections to the manual page or pages to reflect any missing pieces of information.
The target of the redirect need not contain every piece of information on the wiki page, if the manual as a
whole does, and relevant section(s) with the information from the wiki page are accessible from the target of the
redirection.
5. As necessary, get these changes reviewed by another writer and/or someone familiar with the area of the
information in question.
At this point, update the relevant Jira case with the target that you've chosen for the redirect, and make the ticket
unassigned.
6. When someone has reviewed the changes and published those changes to Manual, you, or preferably someone
else on the team, should make a final pass at both pages with fresh eyes and then make the redirect.
Steps 1-5 should ensure that no information is lost in the migration, and that the final review in step 6 is
trivial to complete.
Review Process
Types of Review The content in the Manual undergoes many types of review, including the following:
Initial Technical Review Review by an engineer familiar with MongoDB and the topic area of the documentation.
This review focuses on technical content, and correctness of the procedures and facts presented, but can improve any
aspect of the documentation that may still be lacking. When both the initial technical review and the content review
are complete, the piece may be published.
Content Review Textual review by another writer to ensure stylistic consistency with the rest of the manual.
Depending on the content, this may precede or follow the initial technical review. When both the initial technical review
and the content review are complete, the piece may be published.
Consistency Review This occurs post-publication and is content focused. The goals of consistency reviews are to
increase the internal consistency of the documentation as a whole. Insert relevant cross-references, update the style as
needed, and provide background fact-checking.
When possible, consistency reviews should be as systematic as possible and we should avoid encouraging stylistic and
information drift by editing only small sections at a time.
Subsequent Technical Review If the documentation needs to be updated following a change in functionality of the
server or following the resolution of a user issue, changes may be significant enough to warrant additional technical
review. These reviews follow the same form as the initial technical review, but are often less involved and cover a
smaller area.
Review Methods If you're not a usual contributor to the documentation and would like to review something, you
can submit reviews using any of the following methods:
If you're reviewing an open pull request in GitHub, the best way to comment is on the "overview diff," which
you can find by clicking on the "diff" button in the upper left portion of the screen. You can also use the
following URL to reach this interface:
https://github.com/mongodb/docs/pull/[pull-request-id]/files
Replace [pull-request-id] with the identifier of the pull request. Make all comments inline, using
GitHub's comment system.
You may also provide comments directly on commits, or on the pull request itself but these commit-comments
are archived in less coherent ways and generate less useful emails, while comments on the pull request lead to
less specific changes to the document.
Leave feedback on Jira cases in the DOCS [38] project. These are better for more general changes that aren't
necessarily tied to a specific line, or that affect multiple files.
Create a fork of the repository in your GitHub account, make any required changes and then create a pull request
with your changes.
If you insert lines that begin with any of the following annotations:
.. TODO:
TODO:
.. TODO
TODO
followed by your comments, it will be easier for the original writer to locate your comments. The two dots ..
format is a comment in reStructured Text, which will hide your comments from Sphinx and publication if you're
worried about that.
This format is often easier for reviewers with larger portions of content to review.
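To show how a writer might locate review comments left with the TODO annotations listed above, here is a minimal sketch. findTodoLines is a hypothetical helper, not part of the documentation tooling; it returns the 1-based numbers of lines that begin with one of the four annotation forms.

```javascript
// Matches lines that start with "TODO" or ".. TODO", with or without a colon.
var TODO_PATTERN = /^\s*(?:\.\.\s+)?TODO\b/;

// Returns the 1-based line numbers of all TODO annotations in `text`.
function findTodoLines(text) {
  var matches = [];
  var lines = text.split("\n");
  for (var i = 0; i < lines.length; i++) {
    if (TODO_PATTERN.test(lines[i])) {
      matches.push(i + 1);
    }
  }
  return matches;
}
```

Lines that merely mention TODO in the middle of a sentence are not reported, since the annotation must begin the line.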
This document provides an overview of the global organization of the documentation resource. Refer to the notes
below if you are having trouble understanding the reasoning behind a file's current location, or if you want to add new
documentation but aren't sure how to integrate it into the existing resource.
If you have questions, don't hesitate to open a ticket in the Documentation Jira Project [39] or contact the documentation
team [40].
38 http://jira.mongodb.org/browse/DOCS
39 https://jira.mongodb.org/browse/DOCS
40 docs@mongodb.com
Global Organization
Indexes and Experience The documentation project has two index files:
https://docs.mongodb.org/manual/contents.txt and https://docs.mongodb.org/manual/index.txt.
The contents file provides the documentation's tree structure, which Sphinx uses to create the left-pane navigational
structure, to power the Next and Previous page functionality, and to provide all overarching outlines of the
resource. The index file is not included in the contents file (and thus builds will produce a warning here) and is
the page that users first land on when visiting the resource.
Having separate contents and index files provides a bit more flexibility with the organization of the resource while
also making it possible to customize the primary user experience.
Topical Organization The placement of files in the repository depends on the type of documentation rather than the
topic of the content. Like the difference between contents.txt and index.txt, by decoupling the organization
of the files from the organization of the information the documentation can be more flexible and can more adequately
address changes in the product and in users' needs.
Files in the source/ directory represent the tip of a logical tree of documents, while directories are containers of
types of content. The administration and applications directories, however, are legacy artifacts and with a
few exceptions contain sub-navigation pages.
With several exceptions in the reference/ directory, there is only one level of sub-directories in the source/
directory.
Tools
The organization of the site, like all Sphinx sites, derives from the toctree structure. However, in order to annotate
the table of contents and provide additional flexibility, the MongoDB documentation generates toctree structures
using data from YAML files stored in the source/includes/ directory. These files start with ref-toc or toc
and generate output in the source/includes/toc/ directory. Briefly, this system has the following behavior:
files that start with ref-toc refer to the documentation of API objects (i.e. commands, operators and methods),
and the build system generates files that hold toctree directives as well as files that hold tables that list objects
and a brief description.
files that start with toc refer to all other documentation and the build system generates files that hold toctree
directives as well as files that hold definition lists that contain links to the documents and short descriptions of the
content.
file names that have spec following toc or ref-toc will generate aggregated tables or definition lists and
allow ad-hoc combinations of documents for landing pages and quick reference guides.
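The naming rules above can be summarized as a small classifier. This is a sketch under assumptions: classifyTocFile is a hypothetical helper, the hyphen separators and the example file names are guesses about the exact naming, and the real build system may apply different rules.

```javascript
// Classifies a YAML file name from source/includes/ by its prefix.
// Returns null for names that do not start with "ref-toc" or "toc".
function classifyTocFile(fileName) {
  var match = /^(ref-toc|toc)(-spec)?/.exec(fileName);
  if (!match) {
    return null;
  }
  return {
    // ref-toc files describe API objects; toc files describe all other documents.
    documents: match[1] === "ref-toc" ? "api-objects" : "other",
    // "spec" files produce aggregated tables or definition lists.
    aggregated: match[2] === "-spec"
  };
}
```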
This document contains more direct instructions for building the MongoDB documentation.
Getting Started
Install Dependencies The MongoDB Documentation project depends on the following tools:
Python
Git
Inkscape (Image generation.)
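Before installing anything, it can help to see which of these tools are already present. The following is a minimal sketch; the function name is hypothetical, and it assumes the executables are called python, git, and inkscape on your system, which may not hold on every platform.

```shell
# Print the name of each listed tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumption: these are the executable names on your system.
missing_tools python git inkscape
```

If the command prints nothing, all three tools were found.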
OS X Install Sphinx, Docutils, and their dependencies with easy_install using the following command:
easy_install giza
Feel free to use pip rather than easy_install to install Python packages.
To generate the images used in the documentation, download and install Inkscape42.
Optional
To generate PDFs for the full production build, install a TeX distribution. If you do not have a
LaTeX installation, use MacTeX43. This is only required to build PDFs.
Arch Linux Install packages from the system repositories with the following command:
pacman -S inkscape python2-pip
Optional
To generate PDFs for the full production build, install the following packages from the system repository:
pacman -S texlive-bin texlive-core texlive-latexextra
Debian/Ubuntu Install the required system packages with the following command:
apt-get install inkscape python-pip
Optional
To generate PDFs for the full production build, install the following packages from the system repository:
apt-get install texlive-latex-recommended
The MongoDB documentation build system is entirely accessible via make targets. For example, to build an HTML
version of the documentation, issue the following command:
make html
You can find the build output in build/<branch>/html, where <branch> is the name of the current branch.
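As a rough sketch, the output path can be computed from the branch name. The function name is hypothetical, and the fallback assumes the build's branch placeholder matches the branch name that git reports, which this section does not state explicitly.

```shell
# Print the HTML output directory for a branch.
# With no argument, fall back to the branch git reports (an assumption
# about how the build names its output directory).
html_output_dir() {
    branch="${1:-$(git rev-parse --abbrev-ref HEAD)}"
    printf 'build/%s/html\n' "$branch"
}

html_output_dir master   # prints build/master/html
```

Inside a checkout, calling html_output_dir with no argument uses the current branch, so the result can be passed to ls or a browser after make html finishes.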
In addition to the html target, the build system provides the following targets:
publish Builds and integrates all output for the production build. Build output is in
build/public/<branch>/. When you run publish on the master branch, the build will generate
some output in build/public/.
push; stage Uploads the production build to the production or staging web servers. Depends on publish.
Requires access to the production or staging environment.
push-all; stage-all Uploads the entire content of build/public/ to the web servers. Depends on
publish. Not used in common practice.
push-with-delete; stage-with-delete Modifies the action of push and stage to remove remote files
that don't exist in the local build. Use with caution.
html; latex; dirhtml; epub; texinfo; man; json These are standard targets derived from the default
Sphinx Makefile, with adjusted dependencies. Additionally, for all of these targets you can append -nitpick
to increase Sphinx's verbosity, or -clean to remove all Sphinx build artifacts.
latex performs several additional post-processing steps on .tex output generated by Sphinx. This target will
also compile PDFs using pdflatex.
html and man also generate a .tar.gz file of the build outputs for inclusion in the final releases.
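The suffix convention for the standard Sphinx targets can be illustrated with a small helper that composes a target name and rejects combinations the list above does not mention. The helper name and its error messages are hypothetical; only the target names and the -nitpick and -clean suffixes come from the text.

```shell
# Compose a make target from a standard Sphinx target and an optional
# suffix ("nitpick" or "clean"); return non-zero for anything else.
sphinx_target() {
    case "$1" in
        html|latex|dirhtml|epub|texinfo|man|json) ;;
        *) echo "unknown target: $1" >&2; return 1 ;;
    esac
    case "${2:-}" in
        "")            echo "$1" ;;
        nitpick|clean) echo "$1-$2" ;;
        *) echo "unknown suffix: $2" >&2; return 1 ;;
    esac
}

sphinx_target html nitpick   # prints html-nitpick
sphinx_target latex clean    # prints latex-clean
```

The result can then be handed to make, for example make "$(sphinx_target html nitpick)".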
If you have any questions, please feel free to open a Jira Case44.
44 https://jira.mongodb.org/browse/DOCS